Re: security.bsd.map_at_zero=0 problem with samba33 (including solution)
On Sat, Oct 03, 2009 at 10:27:39PM +, Bjoern A. Zeeb wrote:

... As we will try to keep the default in 8.x and 9.x to disallow user mappings at virtual address 0, we are interested in further issues that were not yet mentioned in either this thread or the Errata Notice.

quagga 0.99.15 (built from ports) has the same issue as samba.

-- 
Richard Perini                       Internet: r...@ci.com.au
Corinthian Engineering Pty Ltd       PHONE: +61 2 9552 5500
Sydney, Australia                    FAX: +61 2 9552 5549
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: ral(4) on 8-RC1
On Monday 05 Oct 2009 00:28:20 you wrote:

maxpower are expressed as dBm.

Thanks for that and the pointers to get the channels right. A temporary hack in net80211 was trivial and I can take my time with the ral end of things.

I believe Kip Macy was the last to look at ral in-depth, around the time support was added for gen 2 chipset, although I can't seem to find the code that was in p4 at that time. It was a while ago...

I had problems w/ the iwi firmware on 64-bit so set the build to i386 only. The problems I had were relocation errors and no one could help; if those are gone then building the fw image for amd64 should be fine. Whether the driver works is another matter...

The iwi driver seems to work for normal operation with the caveat that 802.11s is never going to work, but that's noted on the wiki anyway. I haven't been able to get Kismet to work (it used to on i386 on the iwi), although that may be down to PEBKAC in not fully understanding the 802.11 architectural changes in 8.

I'll hammer the thing for a few more days, see if I can find any regression tests to apply to this setup and maybe move it to a different machine to ensure it works in multiple environments. This is, however, exactly the same hardware that failed to work with 7.1 amd64, so I'm pretty confident that it should be consistent.

I'm torn between decent support for most things on ral, and dual band plus a more sensitive radio (to the order of around 5dB in the same location) on iwi. I'm tempted to just get an Atheros a/b/g card and be done with it. Oxford Tech have some AR5414/AR5006XS cards for reasonable money.

Thanks for your help, Sam.

Best regards,
-- 
Matt Dawson
MTD15-RIPE
Re: [setup] no floppies FreeBSD 8 ?
I am not sure whether we dropped floppy support. But I can imagine that the release candidates do not have floppies.

On Oct 4, 2009, at 1:54 PM, luca wrote:

hi, I'm looking for the floppy images to install FreeBSD 8 on a PC which can't boot from CD-ROM, but the images are not available (i.e. there is no floppies directory). Did FreeBSD 8 drop floppy install?

Regards, Luca

-- 
/\   Best regards,              | re...@freebsd.org
\ /  Remko Lodder               | re...@efnet
 X   http://www.evilcoder.org/  |
/ \  ASCII Ribbon Campaign      | Against HTML Mail and News
Re: em0 watchdog timeouts
Hi,

I've been struggling with watchdog timeouts in 7.1/7.2-RELEASE for the past 6 months too. It looks related. I've tried replacing the hardware 3 times (2 different IBM x3755 chassis, one IBM x3650 chassis). I tried first with the onboard Broadcom NICs (bce-based, PCI-X), until I had issues with watchdog timeouts. I tried replacing them with a 4-port PCI-X Intel NIC, which gave me the same problems. I was told that the 4-port Intel NICs had an onboard bus controller that could cause trouble, so I replaced this with a 2-port PCI-e Intel NIC, which I was told by Sepherosa Ziehau was the best performing gig-e NIC (rx/tx).

Still getting watchdog timeouts, I tried adjusting all sorts of sysctls I found in mailing-list threads (disable msi/msix interrupts, adjust rx/tx processing, etc.). I tried upgrading BIOS and firmware on all kinds of stuff (disks, BMC, etc.) to the newest version. I also tried using a different qlogic isp(4) FC controller (PCI-e). No matter what I tried, I could not diagnose this problem, or at least fix it. It also happened rarely enough that it was not easy to debug.

I would get a series of "watchdog timeout -- resetting" messages, until the NIC would go completely offline, at which point I'd reboot it from console. This happened about once every 1-10 days, usually between about 11:00 and 13:00.

This machine has now been replaced with Linux, unfortunately, just to avoid more customer complaints and downtime. The IBM x3755 with FreeBSD 7.2 which was replaced with Linux is still online, and can be put at the disposal of any developers who would like to debug this further.

Like Stefan Krueger mentioned, this machine is also running as an NFS server, with a mix of BSD and Linux clients, and it's getting hit pretty hard by clients.

Hope we can iron this bug out in the future.

Best regards,
Daniel Bond.

On Oct 2, 2009, at 10:36 PM, Rudy wrote:

Ah, I'll stop messing with them. I just set them all to 0 to see if that will help and noticed the card was leaving tx_int_delay=1.

# sysctl dev.em.4.debug=1
Oct 2 13:26:07 mango kernel: em4: tx_int_delay = 1, tx_abs_int_delay = 0
Oct 2 13:26:07 mango kernel: em4: rx_int_delay = 0, rx_abs_int_delay = 0
# sysctl dev.em.4
dev.em.4.%desc: Intel(R) PRO/1000 Network Connection 6.9.12
dev.em.4.rx_int_delay: 0
dev.em.4.tx_int_delay: 0
dev.em.4.rx_abs_int_delay: 0
dev.em.4.tx_abs_int_delay: 0

Splitting traffic to different ports has brought down the watchdog events to once a day. ... essentially, I have a quad 30Mbps (not quad 1Gbps) card. heheh.

Would turning off net.inet.ip.fastforwarding or any other setting help? Today, I set net.inet.ip.fw.enable=0 and I'll see if that helps. I have a feeling that isn't related to the NIC at all, but I'm not sure what else to try.

Rudy

Jack Vogel wrote:

Watchdog resets the adapter. Messing with these values is of dubious value anyway.

Jack

On Fri, Oct 2, 2009 at 11:36 AM, Rudy cra...@monkeybrains.net wrote:

I noticed something interesting. I set the rx_int_delay to 0:

sysctl dev.em.5.rx_int_delay=0

Checking via sysctl dev.em.5.debug=1 shows rx_int_delay is indeed 0:

Oct 1 17:32:41 mango kernel: em5: rx_int_delay = 0, rx_abs_int_delay = 66

After a watchdog event, sysctl dev.em.5.debug=1 shows rx_int_delay is now 32:

Oct 2 11:29:49 mango kernel: em5: rx_int_delay = 32, rx_abs_int_delay = 66

However, running sysctl dev.em.5 shows it as 0:

dev.em.5.rx_int_delay: 0
dev.em.5.tx_int_delay: 66

Seems like the adapter and the kernel don't agree on the rx_int_delay value.

Rudy
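The disagreement Rudy describes, where the adapter's debug dump and the sysctl tree report different rx_int_delay values, can be checked mechanically. The following is only a minimal sketch: it assumes the kernel log and sysctl output look exactly like the lines quoted in this thread, and the function names (`parse_debug_log`, `parse_sysctl`, `mismatches`) are hypothetical helpers, not part of the em(4) driver or any FreeBSD tool.

```python
import re

# Matches pairs inside kernel log lines such as:
#   "Oct 2 11:29:49 mango kernel: em5: rx_int_delay = 32, rx_abs_int_delay = 66"
LOG_PAIR = re.compile(r"\b(\w+_int_delay) = (\d+)")
# Matches sysctl lines such as: "dev.em.5.rx_int_delay: 0"
SYSCTL_LINE = re.compile(r"^dev\.em\.\d+\.(\w+_int_delay): (\d+)$")


def parse_debug_log(text):
    """Return {name: value} as printed by the adapter debug dump."""
    values = {}
    for line in text.splitlines():
        for name, value in LOG_PAIR.findall(line):
            values[name] = int(value)  # a later line overrides an earlier one
    return values


def parse_sysctl(text):
    """Return {name: value} as reported by the sysctl tree."""
    values = {}
    for line in text.splitlines():
        match = SYSCTL_LINE.match(line.strip())
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def mismatches(log_text, sysctl_text):
    """Return {name: (adapter_value, sysctl_value)} for values that differ."""
    adapter = parse_debug_log(log_text)
    kernel = parse_sysctl(sysctl_text)
    shared = adapter.keys() & kernel.keys()
    return {n: (adapter[n], kernel[n]) for n in shared if adapter[n] != kernel[n]}


# Sample data copied from the messages above (state after a watchdog event).
LOG_AFTER_WATCHDOG = (
    "Oct 2 11:29:49 mango kernel: em5: rx_int_delay = 32, rx_abs_int_delay = 66"
)
SYSCTL_OUTPUT = "dev.em.5.rx_int_delay: 0\ndev.em.5.tx_int_delay: 66"

if __name__ == "__main__":
    print(mismatches(LOG_AFTER_WATCHDOG, SYSCTL_OUTPUT))
```

Only names present in both outputs are compared, so with the sample data just rx_int_delay is checked. Running something like this periodically against real output would at best show when the two views drift apart; it says nothing about why.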
Re: security.bsd.map_at_zero=0 problem with samba33 (including solution)
At 12:47 PM 10/4/2009, Andre Albsmeier wrote:

On Sat, 03-Oct-2009 at 22:27:39 +, Bjoern A. Zeeb wrote:

On Sat, 3 Oct 2009, Andre Albsmeier wrote:

Hi,

On Sat, 03-Oct-2009 at 16:27:32 -0400, jhell wrote:

On Sat, 3 Oct 2009 14:42 -, Andre.Albsmeier wrote:

FYI, after setting security.bsd.map_at_zero to 0 on 7.2-STABLE all samba33 programmes did abort() immediately after start. The solution was to use

CONFIGURE_ARGS+= --disable-pie

-Andre

To add an additional note: samba33, even when not running (not enabled by a rcvar), also runs a tdbcleanup routine on shutdown and/or start that also does abort().

Yes, every samba programme is linked with -pie per default (so all abort()).

Thanks for reporting the issue. People are aware of the problem now and we'll try to present a solution within the next days for better position-independent executable (PIE) handling. Meanwhile there are multiple solutions for people affected:

(1) recompile the port; but as more than just samba might be affected and we generally do not want to flip the pie switch everywhere, that's probably only a temporary, private solution.

I'll stick to this since I am happy about having the map_at_zero option and want to continue to try it out on 7.2-STABLE. And I see no reason why samba has to be linked with -pie (without -pie it is also 4% smaller).

Hi,

What are the impacts (if any) of compiling all the ports with PIE disabled that are affected by setting security.bsd.map_at_zero=0?

---Mike

Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet since 1994 www.sentex.net
Cambridge, Ontario Canada www.sentex.net/mike
Re: em0 watchdog timeouts
On Oct 2, 2009, at 4:36 PM, Rudy wrote:

Today, I set net.inet.ip.fw.enable=0 and I'll see if that helps. I have a feeling that isn't related to the NIC at all, but I'm not sure what else to try.

Just curious, have you tried (or are you using) device polling?

-- 
Robert Blayzor, BOFH
INOC, LLC
rblay...@inoc.net
http://www.inoc.net/~rblayzor/
Re: security.bsd.map_at_zero=0 problem with samba33 (including solution)
On Mon, 5 Oct 2009, Mike Tancsa wrote:

Hi Mike,

Thanks for reporting the issue. People are aware of the problem now and we'll try to present a solution within the next days for better position-independent executable (PIE) handling. Meanwhile there are multiple solutions for people affected:

(1) recompile the port; but as more than just samba might be affected and we generally do not want to flip the pie switch everywhere, that's probably only a temporary, private solution.

I'll stick to this since I am happy about having the map_at_zero option and want to continue to try it out on 7.2-STABLE. And I see no reason why samba has to be linked with -pie (without -pie it is also 4% smaller).

Hi,

What are the impacts (if any) of compiling all the ports with PIE disabled that are affected by setting security.bsd.map_at_zero=0?

Basically, compared to yesterday, there is no impact if you do it privately: it will not make much, if any, difference, as PIE support in FreeBSD has so far been basically non-existent and worked more out of luck, according to my current understanding.

Actually there is a slight difference that I should mention. With PIE, valid user code is currently mapped at virtual address 0. That's why it started to fail for people setting map_at_zero to 0. So NULL pointer dereferences in applications like samba will not lead to the obvious error but will point at something, which in that case will usually be garbage, and the application will either not work as intended (well, that's already the case with a NULL dereference ;) or crash in random code (as a later consequence of the NULL deref).

In the future, though, it seems we will support PIE, and in that case you'll get mappings at a different place, in the end ideally slightly random, so that it'll be hard to exploit the code itself, as people can no longer easily pre-guess where things are in virtual memory. Disabling PIE now means that this will not happen later and you'll have the fixed (entry point) address, unless you recompile the ports again.

I said above "no impact if you do it privately"; the problems for the ports crew here are:

1) the entire set of ports affected by PIE is unidentified.
2) they build packages for 6/7 and 8, if not yet 9 as well soon, for multiple architectures and cannot just rebuild everything.
3) they especially cannot just rebuild the package set for the upcoming 8.0-RELEASE (don't ask, I don't know when it'll happen ;).
4) basically they shouldn't need to care about which way the port (read: the package as released by its developers) chooses to be compiled by default.

I do not have strong arguments as to whether PIE makes a lot of sense, but it seems that for the long term we'll have to support it, as parts of the other world support it as well, and once we support it even old packages will continue to work even if people disable map_at_zero or if the release does that by default.

/bz

-- 
Bjoern A. Zeeb
It will not break if you know what you are doing.
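Point 1) above notes that the set of ports affected by PIE is unidentified. One way to get a first approximation, sketched here under the assumption that the installed programs are ELF binaries, is to read the `e_type` field of each ELF header: position-independent executables are recorded as `ET_DYN` (3) rather than `ET_EXEC` (2). Shared libraries are `ET_DYN` as well, so this only narrows the list of candidates. The function name `elf_type` is hypothetical and not part of any FreeBSD tool.

```python
import struct
import sys

ET_EXEC = 2  # executable linked at a fixed address
ET_DYN = 3   # shared object or position-independent executable


def elf_type(path):
    """Return the ELF e_type of the file at path, or None if it is not ELF."""
    with open(path, "rb") as f:
        header = f.read(18)  # 16 bytes of e_ident followed by 2 bytes of e_type
    if len(header) < 18 or header[:4] != b"\x7fELF":
        return None
    # e_ident[EI_DATA] is byte 5: 1 = little-endian, 2 = big-endian
    if header[5] == 1:
        byte_order = "<"
    elif header[5] == 2:
        byte_order = ">"
    else:
        return None
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return e_type


if __name__ == "__main__":
    labels = {
        ET_EXEC: "ET_EXEC (fixed address)",
        ET_DYN: "ET_DYN (PIE or shared object)",
    }
    for path in sys.argv[1:]:
        print(path, labels.get(elf_type(path), "other or not ELF"))
```

Passing the files under a directory such as /usr/local/sbin as arguments would list which programs are candidates for the abort() described in this thread; whether a given `ET_DYN` program really fails with map_at_zero set to 0 still has to be verified by running it.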
Re: em0 watchdog timeouts
This posting just muddies the issue, first you talk about having a problem that involves Broadcom, ok, so post about that on something other than em :) Then you make some references to hardware that you might have bought but didn't, I'm not about debugging 'possible worlds problems' though so can't help you there either :) Finally you never say what the actual hardware is, other than a person who I do not know told you it was the best performer... so, what exactly is it?

You have a problem once every 10 days, and at a specific time no less, this almost always means something in your environment, a cron job run amok, a piece of hardware that resets, I dunno, but the last thing I would suspect given this description is the driver. You need a good sysadmin for this debugging I would venture, not a driver developer.

Jack
Re: security.bsd.map_at_zero=0 problem with samba33 (including solution)
On Sun, 4 Oct 2009 12:07 -0700, dougb@ wrote:

Bjoern A. Zeeb wrote:

On Sat, 3 Oct 2009, Andre Albsmeier wrote:

Hi,

On Sat, 03-Oct-2009 at 16:27:32 -0400, jhell wrote:

On Sat, 3 Oct 2009 14:42 -, Andre.Albsmeier wrote:

FYI, after setting security.bsd.map_at_zero to 0 on 7.2-STABLE all samba33 programmes did abort() immediately after start. The solution was to use

CONFIGURE_ARGS+= --disable-pie

-Andre

To add an additional note: samba33, even when not running (not enabled by a rcvar), also runs a tdbcleanup routine on shutdown and/or start that also does abort().

Yes, every samba programme is linked with -pie per default (so all abort()).

Thanks for reporting the issue. People are aware of the problem now and we'll try to present a solution within the next days for better position-independent executable (PIE) handling. Meanwhile there are multiple solutions for people affected:

(1) recompile the port;

Just to be clear, you have to recompile the port with --disable-pie added to the CONFIGURE_ARGS in the Makefile. It would also be nice if there were a __FreeBSD_version bump for this new feature.

Doug

Just to add on to this, for those who may be wondering what they can do to solve this for just the ports infrastructure in the meantime: you may add the following to /etc/make.conf

.if ${.CURDIR:M/usr/ports*}
CONFIGURE_ARGS+= --disable-pie
.endif

This assumes that you have your ports installed in the standard place, /usr/ports. If not, you may adjust the match accordingly. This could also be restricted to individual ports or subtrees of your liking, so that you are not adding those configure arguments to every port under the sun.

Keep in mind that this should be applied carefully and should not be expected to be a full workaround, as a more complete solution is still pending.

Best regards.

-- 
%{+
| dataix.net!jhell 2048R/89D8547E 2009-09-30 |
| BSD since FreeBSD 4.2 Linux since Slackware 2.1 |
| 85EF E26B 07BB 3777 76BE B12A 9057 8789 89D8 547E |
+%}
Re: glabel+gmirror (8.0-RC1 problem)
Pawel Jakub Dawidek wrote:

On Mon, Sep 28, 2009 at 08:37:56PM +0200, Oliver Lehmann wrote:

Hi Pawel,

Pawel Jakub Dawidek wrote:

Did anything change when you upgraded from BETA3 to RC1? For example, gmirror was compiled into the kernel before and is now loaded as a module, or something similar?

Nope, it was a clean BETA3 installation with the default GENERIC kernel, which afaik has geom_label in the kernel but not geom_mirror (nevertheless I loaded geom_label.ko at boot time as well as geom_mirror). The same with RC1: a clean and fresh installation with the default GENERIC kernel and geom_label in the kernel (default), but still loaded as a module at boot time, as well as geom_mirror.

Could you test this patch: http://people.freebsd.org/~pjd/patches/improved_taste.patch

This makes gmirror+glabel work again on RC1.

Thanks for confirmation.

gjournal is also affected. I tried to use one partition of my gmirror disk as a journal device for my 3ware raid-5 device, which works until I reboot; the journal is then gone as well. Is this patch likely to fix this as well? Will it be included in a future RC? Until now I've stayed away from using glabel+gmirror, but I didn't know that gjournal is affected as well, so I'm now left with a warning that the journal provider is gone while booting, and more tragically I'm left without journaling at all (which hurts on a 2.7TB partition when the system was not cleanly shut down).

-- 
Oliver Lehmann
http://www.pofo.de/
http://wishlist.ans-netz.de/
Still possible to panic RELENG_7 with ZFS (kmem exhaustion)?
(Note: please keep me CC'd, as I am not subscribed to freebsd-stable)

Is it still possible with ZFS to panic a RELENG_7 amd64 box (kernel/world from recent[1] source) with "kmem map too small" or similar conditions?

Why I ask: Our production SQL/backup box kernel panic'd a couple of days ago. Sadly, the box also acts as a serial console server, so I don't have the exact message spit back from the kernel prior to being dumped to DDB, but the backtrace looks very much like the historic problem of the ZFS ARC exhausting all kernel memory, so I'm betting the message prior to being dumped to DDB was "kmem map too small":

db> bt
Tracing pid 40738 tid 100168 td 0xff001f078720
kdb_enter_why() at kdb_enter_why+0x3d
panic() at panic+0x176
kmem_malloc() at kmem_malloc+0x548
uma_large_malloc() at uma_large_malloc+0x3c
malloc() at malloc+0xc1
arc_get_data_buf() at arc_get_data_buf+0x1bb
arc_buf_alloc() at arc_buf_alloc+0xa1
arc_read_nolock() at arc_read_nolock+0xd1
arc_read() at arc_read+0x71
dbuf_prefetch() at dbuf_prefetch+0x135
dmu_zfetch_dofetch() at dmu_zfetch_dofetch+0xe3
dmu_zfetch() at dmu_zfetch+0xa58
dbuf_read() at dbuf_read+0x433
dmu_buf_hold_array_by_dnode() at dmu_buf_hold_array_by_dnode+0x119
dmu_buf_hold_array() at dmu_buf_hold_array+0x57
dmu_read_uio() at dmu_read_uio+0x3f
zfs_freebsd_read() at zfs_freebsd_read+0x55a
vn_read() at vn_read+0x1ef
dofileread() at dofileread+0x88
kern_readv() at kern_readv+0x43
read() at read+0x4d
syscall() at syscall+0x247
Xfast_syscall() at Xfast_syscall+0xab

The machine in question has absolutely no loader.conf tuning applied, and kernel/world was built from RELENG_7 dated 2009/06/11. The ZFS pool consisted of a single (entire) disk; nothing special.

I do not have sysctl counters from the box before it panic'd, but these are what are active presently:

hw.physmem: 4286558208
vm.kmem_size_max: 329853485875
vm.kmem_size_min: 0
vm.kmem_size: 1381478400

With regards to the above counters: ZFS is not in use. I had to switch back to UFS2 (zpool destroy + newfs -O2 -U...) because of stability concerns relating to the question at hand.

If someone familiar with the FreeBSD ZFS internals and/or the VM could make a statement, I think it would be beneficial to those who are considering using/migrating to ZFS on FreeBSD.

The only semi-official statements I've read as of late are here:

http://lists.freebsd.org/pipermail/freebsd-stable/2009-September/051810.html
http://lists.freebsd.org/pipermail/freebsd-stable/2009-September/051830.html

And what's in src/UPDATING:

20090207: ZFS users on amd64 machines with 4GB or more of RAM should reevaluate their need for setting vm.kmem_size_max and vm.kmem_size manually. In fact, after recent changes to the kernel, the default value of vm.kmem_size is larger than the suggested manual setting in most ZFS/FreeBSD tuning guides.

Thanks!

[1]: Recent means post-February 2009, specifically after Alan Cox's commits listed here:

http://svn.freebsd.org/changeset/base/188291
http://svn.freebsd.org/changeset/base/187523
http://svn.freebsd.org/changeset/base/187522
http://svn.freebsd.org/changeset/base/187520
http://svn.freebsd.org/changeset/base/187485
http://svn.freebsd.org/changeset/base/187466
http://svn.freebsd.org/changeset/base/187465
http://svn.freebsd.org/changeset/base/187464
http://svn.freebsd.org/changeset/base/187458
http://svn.freebsd.org/changeset/base/187428
http://svn.freebsd.org/changeset/base/187425
http://svn.freebsd.org/changeset/base/187420
http://svn.freebsd.org/changeset/base/187419
http://svn.freebsd.org/changeset/base/187416
http://svn.freebsd.org/changeset/base/187414
http://svn.freebsd.org/changeset/base/187408
http://svn.freebsd.org/changeset/base/187407
http://svn.freebsd.org/changeset/base/187404
http://svn.freebsd.org/changeset/base/187400

-- 
| Jeremy Chadwick                                   j...@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
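To put the sysctl counters quoted in the message above into perspective, the following sketch converts them to MiB and computes what share of physical memory vm.kmem_size represents. The numbers are copied from the message; the helper name `to_mib` is hypothetical, and nothing here claims to reproduce how the kernel derives its default.

```python
MIB = 1024 * 1024

# Values as posted (bytes); vm.kmem_size_min was 0 and is left out.
counters = {
    "hw.physmem": 4286558208,
    "vm.kmem_size_max": 329853485875,
    "vm.kmem_size": 1381478400,
}


def to_mib(value_in_bytes):
    """Convert a byte count to mebibytes."""
    return value_in_bytes / MIB


# Share of physical memory currently available to the kernel memory map.
share = counters["vm.kmem_size"] / counters["hw.physmem"]

if __name__ == "__main__":
    for name, value in counters.items():
        print(f"{name}: {to_mib(value):.1f} MiB")
    print(f"vm.kmem_size is {share:.1%} of hw.physmem")
```

On this untuned 4GB machine vm.kmem_size works out to about 1317 MiB, roughly 32% of physical memory, which is the ceiling the ARC allocations in the backtrace would have been running into if the panic was indeed kmem exhaustion.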
Re: em0 watchdog timeouts
Hi Jack,

I'll comment on your mail inline:

On Oct 5, 2009, at 6:57 PM, Jack Vogel wrote:

This posting just muddies the issue, first you talk about having a problem that involves Broadcom, ok, so post about that on something other than em :)

I only meant to indicate that the problem might exist outside the Intel driver. I'm also indicating that it happens with several drivers (bge, bce and em) on several different machines, on both PCI-X and PCI-e. I'm sorry if this is confusing to you, but I still think it's relevant to mention.

Then you make some references to hardware that you might have bought but didn't, I'm not about debugging 'possible worlds problems' though so can't help you there either :)

No. I only made references to hardware I actually used, and had real-world issues with.

Finally you never say what the actual hardware is, other than a person who I do not know told you it was the best performer... so, what exactly is it?

Sepherosa is a guy who writes drivers for BSD-based operating systems, including FreeBSD. He has a lot of knowledge in this area. http://people.freebsd.org/~sephe/

The NIC you are referring to, the one sephe recommended to me, is an 82571EB. I didn't mention specific hardware, as I think it's more important to note that this is an issue I'm experiencing across different sets of hardware and drivers.

You have a problem once every 10 days, and at a specific time no less, this almost always means something in your environment, a cron job run amok, a piece of hardware that resets, I dunno, but the last thing I would suspect given this description is the driver.

This is not what I wrote. I wrote that I had a problem every 1-10 days, but it would usually happen once every 3-4 days. At worst, every day for periods. It's not at any specific time. If you read my email correctly, I say it *usually* happens around 11-13:00, but it has happened at random times too. This is my point exactly. I don't think it's the Intel driver, I think the problem is elsewhere. I had a suspicion it had to do with the combination of NIC + qlogic FC controller, but I have no evidence of this.

You need a good sysadmin for this debugging I would venture, not a driver developer.

What I need is useful advice/help. I never stated I needed a driver developer. I'd like to be able to run my favorite OS on cool hardware, in the future, for a high-performing NFS server, without problems like I've experienced over the past 6 months on a production system.

Please note that I'm managing a server park almost completely based on FreeBSD, and I'm running many NFS servers on other hardware, for other services, without issues.

I've seen several other FreeBSD users having problems with this too, so I think it's of importance for the project. As I mentioned originally, I'm happy to make the hardware available to any FreeBSD developer who might want to look further into this. Debugging it further is beyond my skill set; I don't even know where to begin looking, especially since I can't produce any panics.

I'm sorry to say, but your reply was 0% useful, Jack.

Jack

- Daniel
Re: em0 watchdog timeouts
Sorry, it's a Monday morning, I was being kinda facetious, guess it didn't work very well :) I apologize.

I know it must be annoying for you, it's as much so for me when it's something I can't just fix because it's not reproducible. So, I feel your pain.

Will try to restrain my Monday blues in the future.

Jack
Re: em0 watchdog timeouts
On Mon, Oct 05, 2009 at 08:32:14PM +0200, Daniel Bond wrote:

> What I need is useful advice/help. I never stated I needed a driver developer. I'd like to be able to run my favorite OS on cool hardware, in the future, for a high-performing NFS server, without problems like I've experienced the past 6 months, on a production system. Please note that I'm managing a server park almost completely based on FreeBSD, and I'm running many NFS servers on other hardware, for other services, without issues.
>
> I've seen several other FreeBSD users having problems with this too, so I think it's of importance for the project. As I mentioned originally, I'm happy to make the hardware available to any FreeBSD developer who might want to look further into this. Debugging it further is above my skill set; I don't even know where to begin looking, especially since I can't produce any panics.

I can give one bit of advice that helped me in a similar situation: check your motherboards.

I run about a dozen fileservers on FreeBSD, and have always been very happy with their performance, but some months ago I began to experience problems with one of them. These problems were 'watchdog timeout' errors. I tried all manner of things (different NICs of different types, changing settings, etc.), but nothing helped over the long term. At some point, when very heavy I/O was going on to our Beowulf cluster, the 'watchdog timeouts' would begin. What was strange is that other (supposedly identical) machines handled _more_ I/O without a problem.

Finally, while doing some comparisons, I realized that the motherboard having the problem was _not_ the same as the others; it was similar, but not identical. I changed the motherboard and all the problems went away, never to reappear.

I don't know if it was a specific problem with that particular motherboard, or something about that model, but for whatever reason, it appears that the buses just couldn't handle a RAID card and three active NICs.

--
greg byshenk - gbysh...@byshenk.net - Leiden, NL
Re: em0 watchdog timeouts
> Finally, while doing some comparisons, I realized that the motherboard having the problem was _not_ the same as the others; it was similar, but not identical.

This is a good piece of info. I can try swapping out the MB and see what happens.

I do want to add: thank you Jack for all your help, and if it does turn out to be the MB, then double thanks. Viva Monday! :)

What would be nice would be MORE info for a watchdog timeout... maybe a sysctl dev.watchdog.debug=1 or something where when a watchdog event happened --- for whatever driver --- a bunch of stats were dumped relating to the event.

Rudy
Re: em0 watchdog timeouts
Hmmm, I did have one of the drivers print more info at watchdog time, but I just looked and that's not em; time to add that, I guess. Since you're in the driver there isn't a huge amount of info that you can print, so it still may not be enough to help.

BTW, I've always been somewhat dissatisfied with the watchdog design and think it's kinda flawed. I could try and make you an experimental version with debug and some changes that you can try if you'd like.

Jack

On Mon, Oct 5, 2009 at 1:54 PM, Rudy cra...@monkeybrains.net wrote:

>> Finally, while doing some comparisons, I realized that the motherboard having the problem was _not_ the same as the others; it was similar, but not identical.
>
> This is a good piece of info. I can try swapping out the MB and see what happens.
>
> I do want to add: thank you Jack for all your help, and if it does turn out to be the MB, then double thanks. Viva Monday! :)
>
> What would be nice would be MORE info for a watchdog timeout... maybe a sysctl dev.watchdog.debug=1 or something where when a watchdog event happened --- for whatever driver --- a bunch of stats were dumped relating to the event.
>
> Rudy
Re: openssh concerns
Hi.

I explained my opinion quite well (imo) a bit further down in my previous email. I'm not sure what to answer. I don't necessarily think it's relevant for every computer running sshd.

I see a tendency to change the sshd port to 2022 and other port numbers. I'm not sure everyone doing it is aware that using unprivileged ports also has consequences, compared to what is (often) just a few harmless log entries. I'd much rather use a privileged port, or mac_portacl(4), as mentioned earlier.

Best regards,
Daniel.

I've noticed quite a few suggestions to use 2022, and such.

On Oct 5, 2009, at 11:58 PM, Doug Barton wrote:

> Daniel Bond wrote:
>
>> However, I'm concerned about the suggestion of using an unprivileged port
>
> Please explain your reasoning, and how it's relevant in a world where the vast majority of Internet users have complete administrative control over the systems they use.
>
> Doug
>
> --
> This .signature sanitized for your protection

___ freebsd-secur...@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-security To unsubscribe, send any mail to freebsd-security-unsubscr...@freebsd.org
libthr and daemon()
I have some code that tries to use pthread_cond_wait() and it's getting back EPERM. Upon further investigation, here's what I've found:

When the app starts, libthr's _libpthread_init calls init_main_thread() to set the thread id in struct pthread's tid.
The app opens a log file then calls daemon().
daemon() calls fork().
fork() does not appear to be linked to _fork() in libthr; see below.
The app creates a thread to handle signals.
The app attempts to wait on a condition variable (pthread_cond_wait(); this gives EPERM).

Looking into libthr's cond_wait_common(), it does a THR_UMUTEX_LOCK on the cv's c_lock using the struct pthread from _get_curthread(). Here, curthread points to the pthread struct that got the tid from thr_self on startup. Because of fork() this is the same address in the daemonized app as the original. But curthread->tid is the tid of the original app, not the daemonized version, hence my assumption that fork() didn't resolve to libthr's _fork().

When cond_wait_common() calls into the kernel to actually do the cv_wait, do_unlock_umutex()/do_unlock_normal() returns EPERM since the tid does not match.

AFAICT this has nothing to do with any code in the app itself. The two things I don't know:

1) What utilities can I use to show me which version of fork will be used at runtime? ldd just shows me that the app is linked against libc and libthr.

2) Why would fork resolve to the one in libc (presumably; I'm not sure how to prove this) instead of the one in libthr?

Thanks,
matthew
Re: libthr and daemon()
On Mon, 5 Oct 2009, Matthew Fleming wrote:

> I have some code that tries to use pthread_cond_wait() and it's getting back EPERM. Upon further investigation, here's what I've found:
>
> When the app starts, libthr's _libpthread_init calls init_main_thread() to set the thread id in struct pthread's tid.

Is the application threaded before calling daemon()?

> The app opens a log file then calls daemon().
> daemon() calls fork().
> fork() does not appear to be linked to _fork() in libthr; see below.
> The app creates a thread to handle signals.
> The app attempts to wait on a condition variable (pthread_cond_wait(); this gives EPERM).

Was the condition variable created before daemon() was called? The picture is not clear to me.

POSIX states that only async-signal-safe function calls can be made from a child fork()'d from a threaded application. The intent is that the child should soon after call a function in the exec() family. Certainly, any more threaded calls in the child are invalid.

--
DE
Re: em0 watchdog timeouts
> BTW, I've always been somewhat dissatisfied with the watchdog design and think it's kinda flawed. I could try and make you an experimental version with debug and some changes that you can try if you'd like.

I'm game -- it would be nice if the machine still reset the watchdog in 3 seconds and didn't cause any more damage from the debug code (e.g. a panic). :)

My frequency of watchdog events is about 2 or 3 times per day. I am running: Intel(R) PRO/1000 Network Connection 6.9.12

Rudy