MFC: Distributed audit daemon committed (was: svn commit: r243752 - in head: etc etc/defaults etc/mail etc/mtree etc/rc.d share/man/man4 usr.sbin usr.sbin/auditdistd (fwd)) (fwd)

2012-12-18 Thread Robert Watson
are enforced for both targets.) More details on the daemon below. Robert N M Watson Computer Laboratory University of Cambridge -- Forwarded message -- Date: Sat, 1 Dec 2012 15:15:11 + (GMT) From: Robert Watson rwat...@freebsd.org To: curr...@freebsd.org Cc: secur...@freebsd.org

FYI: Userspace DTrace MFC to stable/8

2011-02-28 Thread Robert Watson
Feb 2011 23:28:35 + (UTC) From: Robert Watson rwat...@freebsd.org To: src-committ...@freebsd.org, svn-src-...@freebsd.org, svn-src-sta...@freebsd.org, svn-src-stabl...@freebsd.org Subject: svn commit: r219107 - in stable/8/sys: amd64/amd64 amd64/include boot/common cddl/compat/opensolaris

Re: HEADS UP: FreeBSD 6.4 and 8.0 EoLs coming soon

2010-09-19 Thread Robert Watson
On Wed, 8 Sep 2010, Vadim Goncharov wrote: Which part of support for the Giant lock *over the network stack* was removed [emphasis mine] do you not understand? No, component removed was (1), I've underlined. The reason is performance for overall network stack, not ideology. For a

Re: HEADS UP: FreeBSD 6.4 and 8.0 EoLs coming soon

2010-09-05 Thread Robert Watson
On Wed, 1 Sep 2010, Hans Petter Selasky wrote: - Or whatever other method to get ISDN back in kernel ? It seems code exists :-) http://old.nabble.com/ISDN4BSD-on-8-current-td23919925.html ISDN4BSD package has been updated to compile on FreeBSD 8-current

Re: 8.1 speed issues

2010-06-18 Thread Robert Watson
On Fri, 18 Jun 2010, William D. Colburn (Schlake) wrote: So I've just upgraded from whatever was stable in 2004 to 8.1 (it's a private file server in my house, I pay no attention to it until it crashes), and uh, the speed difference is very noticeable. In short, it's like I bought a brand

Re: Results of BIND RFC

2010-04-02 Thread Robert Watson
On Fri, 2 Apr 2010, Poul-Henning Kamp wrote: The result of the RFC was that bind is not a mandatory component to make a usable system, so your argument suffers from bad logic. With an eye on the date of Doug's suggestive e-mail, I actually am concerned that we maintain support for DNSSEC

Survey results very helpful, thanks! (was: Re: net.inet.tcp.timer_race: does anyone have a non-zero value?)

2010-03-08 Thread Robert Watson
On Sun, 7 Mar 2010, Robert Watson wrote: If your system shows a non-zero value, please send me a *private e-mail* with the output of that command, plus also the output of sysctl kern.smp, uptime, and a brief description of the workload and network interface configuration. For example: it's

Re: Survey results very helpful, thanks! (was: Re: net.inet.tcp.timer_race: does anyone have a non-zero value?)

2010-03-08 Thread Robert Watson
On Mon, 8 Mar 2010, Doug Hardie wrote: I run a number of 4 core systems with em interfaces. These are production systems that are unmanned and located a long way from me. Under unusual conditions it can take up to 6 hours to get there. I have been waiting to switch to 8.0 because of the

net.inet.tcp.timer_race: does anyone have a non-zero value?

2010-03-07 Thread Robert Watson
Dear all: I'm embarking on some new network stack locking work, which requires me to address a number of loose ends in the current model. A few years ago, my attention was drawn to a largely theoretical race, which had existed in the BSD code since inception. It is detected and handled in

Re: is dtrace usable?

2010-03-06 Thread Robert Watson
On Sat, 6 Mar 2010, Daniel Braniss wrote: link_elf_obj: symbol lapic_cyclic_clock_func undefined when trying kldload dtraceall this is with a fairly recent 8-stable I'm trying to help Rick Macklem debug the NFS/UDP problem, and I thought it would be a good chance to learn

Re: is dtrace usable?

2010-03-06 Thread Robert Watson
On Sat, 6 Mar 2010, Alexander Leidinger wrote: Take a look at the DTrace configuration information here: http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/dtrace.html I've just reread it (despite the fact that I already used it). Some comments: Last time I tried, I didn't see any

Re: FreeBSD-8.0 802.11n support with ath

2010-02-27 Thread Robert Watson
On Sat, 27 Feb 2010, Spil Oss wrote: Thanks for the confirmation! Is anything known re. a timeline for implementation of wireless-N? (8.1? 9.0?) I know that Rui Paulo is working on this actively; I've added him to the CC line as I'm not sure if he follows freebsd-stable. Robert N M

The press release (was: Re: 8.0-RELEASE completed...)

2009-11-27 Thread Robert Watson
On Thu, 26 Nov 2009, Ken Smith wrote: Just a quick note in case there are people here who aren't subscribed to the freebsd-announce@ mailing list. We have completed the 8.0-RELEASE cycle. Details about the release are available from the main web site, in particular the announcement itself

[libdispatch-dev] FreeBSD 8-STABLE now supports GCD, libdispatch port updated (fwd)

2009-11-13 Thread Robert Watson
: Fri, 13 Nov 2009 12:21:40 + (GMT) From: Robert Watson rob...@fledge.watson.org To: libdispatch-...@lists.macosforge.org Subject: [libdispatch-dev] FreeBSD 8-STABLE now supports GCD, libdispatch port updated Dear all: Just an FYI that all the parts are now in place to use GCD

Re: Extreme console latency during disk IO (8.0-RC1, previous releases also affected according to others)

2009-10-13 Thread Robert Watson
On Tue, 13 Oct 2009, Ivan Voras wrote: Thomas Backman wrote: I'm copying this over from the freebsd-performance list, as I'm looking for a few more opinions - not on the problems *I* am having, but rather to check whether the problem is universal or not, and if not, find a possible common

Re: openssh concerns

2009-10-11 Thread Robert Watson
On Thu, 8 Oct 2009, Oliver Fromme wrote: Are you sure? The majority of BSD machines in my vicinity have multiple accounts. And even if there's only one account, there is no reason to be careless with potential port-takeover risks. Therefore I advise against running critical daemons on

Re: samba - SIGABRT

2009-10-08 Thread Robert Watson
On Thu, 8 Oct 2009, Oliver Lehmann wrote: This was caused by your setting of the following: security.bsd.map_at_zero=0 You can reset that value to 1 and you should be alright to operate like normal otherwise you will have to compile samba over again with the above mentioned configure

Re: samba - SIGABRT

2009-10-08 Thread Robert Watson
On Thu, 8 Oct 2009, Daniel Eischen wrote: While it's probably a bug that the Samba port compiles --pie, it's also a bug that our linking bits aren't handling PIE properly either. The goal is to fix PIE with the non-NULL mapping feature in the immediate future, so with any luck the abort

Re: 8.0-RC1: kernel page fault in NLM master thread (VIMAGE or ZFS related?)

2009-09-26 Thread Robert Watson
On Fri, 25 Sep 2009, Jamie Gritton wrote: It seems to be NFS related. I think the null pointer in question is from the export's anonymous credential. Try the patch below and see if it helps (which I guess means run it overnight and see if it crashes again). I've also patched a similar

Quick 8.0 update: BETA3 builds in progress

2009-08-22 Thread Robert Watson
For those tracking the 8.0 release process, BETA3 builds have now started. They were held up for a few days while a few critical issues were resolved: - Changes to make newbus MPSAFE were reverted as they led to reports of a number of WITNESS warnings and panics during device driver

Re: Upgrade FreeBSD 7.1 to 7.2

2009-08-22 Thread Robert Watson
On Fri, 21 Aug 2009, Miroslav Lachman wrote: I would like to do a binary upgrade from 7.1 to 7.2. I've seen the instructions here: http://www.freebsd.org/releases/7.2R/announce.html I've heard that it's safest to start the machine in single user mode when doing upgrades, but I see no notice

Re: Quick 8.0 update: BETA3 builds in progress

2009-08-22 Thread Robert Watson
On Sat, 22 Aug 2009, Pete French wrote: I don't have a specific ETA on BETA3 going out the door, except to say that so far several architectures have reported back on successful builds, so probably quite soon. Is there any point in making bug reports for BETA2 at this point ? I only got to

Re: Blocked process

2009-08-22 Thread Robert Watson
On Sat, 22 Aug 2009, Daniel O'Connor wrote: On Fri, 21 Aug 2009, CmdLnKid wrote: came back or the machine was rebooted. I continued for a while using /var/mail over NFS while setting or unset mail variables for the shell. You may also want to check into whether something is trying to acquire

Update - RELENG_8 open for business (but you probably noticed)

2009-08-13 Thread Robert Watson
Just a quick status update from the release engineering team: As was discussed on this mailing list, problems with the Subversion-CVS exporter arose during the RELENG_8 branching process. These have now been resolved, and for the last day or so, pending bug fixes have been rushing into the

Re: FW: 8.0-BETA2 sysinstall ignoring setting of nonInteractive

2009-08-09 Thread Robert Watson
On Mon, 3 Aug 2009, Rink Springer wrote: On Mon, Aug 03, 2009 at 11:04:31AM -0400, David Boyd wrote: Can someone PLEASE commit this fix. This fix looks OK to me; I'll ask re@ for permission. Just a status update: this is in the re@ queue but approval is pending completion of the stable/7

Re: Loading ng_socket at runtime?

2009-07-30 Thread Robert Watson
On Wed, 29 Jul 2009, Matthew Fleming wrote: I'm doing a migration from releng/6.1 to stable/7, and one of the many new things is that I get a warning when doing things with ng_socket that didn't used to happen. WARNING: attempt to net_add_domain(netgraph) after domainfinalize() I've

Re: smbfs panic when lost connection or unmount --force

2009-07-10 Thread Robert Watson
On Fri, 10 Jul 2009, Oliver Pinter wrote: It is a kernel panic, when force unmount the smbfs volume or lost the connection with the samba server. This is a NULL pointer dereference in the kernel. Per Attilio's e-mail, a stack trace should help us track it down. Thanks! Robert N M Watson

Re: trap 12

2009-07-04 Thread Robert Watson
On Fri, 3 Jul 2009, Ian J Hart wrote: Is this likely to be hardware? Details will follow if not. This looks like a kernel NULL pointer dereference (faulting address 0x0), which means it is most likely a kernel bug, although it could be triggered by a hardware problem. If this early in the

Re: RELENG_7 crash

2009-04-25 Thread Robert Watson
On Tue, 21 Apr 2009, Ruslan Ermilov wrote: Is it possible I am running into some of the interface lock fixes rwatson has been working on ? This box has a lot of ng interfaces which come and go. Perhaps snmp asking about an interface that just went away caused the panic ? I disabled bsnmp

Re: RELENG_7 crash

2009-04-23 Thread Robert Watson
On Tue, 21 Apr 2009, Mike Tancsa wrote: At 04:53 PM 4/21/2009, Mikolaj Golub wrote: Just FYI, the same problem has already been registered in pr database as kern/132734. Thanks, http://www.freebsd.org/cgi/query-pr.cgi?pr=132734 does look familiar :) If you disable the snmpwalk, is

Re: RELENG_7 crash

2009-04-21 Thread Robert Watson
On Tue, 21 Apr 2009, Mike Tancsa wrote: At 11:31 AM 4/21/2009, Ruslan Ermilov wrote: : : Note that these changes simply close races around use of ifindex_table, : and make no attempt to solve the problem of disappearing ifnets. Further

Re: RELENG_7 crash

2009-04-21 Thread Robert Watson
On Wed, 22 Apr 2009, Mikolaj Golub wrote: RW There are several bugs here, one difficult to fix (lack of RW refcounting), but also stuff like ifp being derived from an interface RW number twice, but checked against NULL only the first time (line 85 RW checked for NULL, re-queried but no check

Re: FreeBSD 7.2 Release process starting...

2009-03-21 Thread Robert Watson
On Wed, 18 Mar 2009, kama wrote: What I meant was the todo page on www.freebsd.org. Like: http://www.freebsd.org/releases/7.2R/TODO.html Where problems and showstoppers were brought up. I found that information very valuable. Especially when the release went overdue I could easily see

Re: FreeBSD 7.2 Release process starting...

2009-03-19 Thread Robert Watson
On Thu, 19 Mar 2009, Jack Raats wrote: One of the most important things for us to keep an eye on in this release is that the boot loader now works on a number of pieces of hardware on which it regressed for 6.4/7.1. If it proves successful, we'll likely also do errata notes and roll new ISOs

Re: FreeBSD 7.2 Release process starting...

2009-03-18 Thread Robert Watson
On Wed, 18 Mar 2009, kama wrote: Since it's often the case that developers process quite a few outstanding MFCs during the last couple days before a code freeze starts I have changed RELENG_7 to say it is 7.2-PRERELEASE now as a bit of a heads-up that the release cycle is imminent. You

Re: NICs locking up, *tcp_sc_h

2009-03-18 Thread Robert Watson
On Sun, 15 Mar 2009, Nick Withers wrote: I'll need to think a bit about a proper fix for this, but you'll find the problem likely goes away if you eliminate all uid/gid/jail rules from your firewall. You could also tweak the syncache logic not to use a retransmit timer, which might slightly

Re: FreeBSD 7.2 Release process starting...

2009-03-18 Thread Robert Watson
On Wed, 18 Mar 2009, Ken Smith wrote: On Wed, 2009-03-18 at 10:23 +0100, kama wrote: Is it possible to get back the todo page during this release phase? During the last couple of releases I simply didn't have time to do everything and this was one of the things that fell by the wayside.

Appeal for active bug reports relating to TCP, UDP, routing locking in 7-STABLE

2009-03-17 Thread Robert Watson
Dear all: With 7.2 approaching, I wanted to review the set of known network bug reports (especially panics, hangs, lock order reversals) relating to TCP, UDP, sockets, and routing in 7-STABLE. If you are aware of problems along these lines that you can confirm definitely occur with 7-STABLE

Re: NICs locking up, *tcp_sc_h

2009-03-14 Thread Robert Watson
On Sat, 14 Mar 2009, Nick Withers wrote: Right, here we go! ... Turns out that the problem is a lock cycle triggered by the syncache calling, indirectly, the firewall during output, and the firewall trying to look up the connection for the packet. Thread one: Tracing PID 31 tid 100030

Re: NICs locking up, *tcp_sc_h

2009-03-13 Thread Robert Watson
On Fri, 13 Mar 2009, Nick Withers wrote: I recently installed my first amd64 system (currently running RELENG_7 from 2009-03-11) to replace an aged ppc box and have been having dramas with the network locking up. Breaking into the debugger manually and ps-ing shows the network card (e.g.,

Re: NICs locking up, *tcp_sc_h

2009-03-13 Thread Robert Watson
On Fri, 13 Mar 2009, Nick Withers wrote: Sorry for the original double-post, by the way, not quite sure how that happened... I can reproduce this problem relatively easily, by the way (every 3 days, on average). I meant to say this before, too, but it seems to happen a lot more often on the

Re: NICs locking up, *tcp_sc_h

2009-03-13 Thread Robert Watson
On Fri, 13 Mar 2009, Robert Watson wrote: Sounds like a lock leak -- if you're running INVARIANTS, then show alllocks should read WITNESS and show allchains would be useful. I've had a report of a TCP lock leak possibly in tcp_input

Panics involving ppp following routing fixes

2009-03-08 Thread Robert Watson
On Sun, 8 Mar 2009, Ruben van Staveren wrote: Just a minor heads up: I've merged both Kip Macy's lock order fixes to the kernel routing code, and the route locking and reference counting fixes from kern/130652 to stable/7. These fixes should correct a number of reported network-related

Re: Where is nfsiod now?

2009-03-08 Thread Robert Watson
On Sun, 8 Mar 2009, Yoshihiro Ota wrote: I thought rc used to start nfsiod if you set nfs_client_enable back years ago. Now, on my 7.1-RELEASE machine, it sets up a couple of sysctls in /etc/rc.d/nfsclient script but not nfsiod. Is nfsiod obsolete by now? It is still on the system; does it

Re: Big problems with 7.1 locking up :-(

2009-02-25 Thread Robert Watson
On Wed, 25 Feb 2009, Pete French wrote: FYI, I'm currently awaiting testing results from Pete on the MFC of a number of routing table locking fixes, and once that's merged (hopefully tomorrow?) I'll start on the patches in the above PR. I've taken a crash-course in routing table locking in

Various route locking fixes merged to stable/7 (was: Re: Big problems with 7.1 locking up :-()

2009-02-25 Thread Robert Watson
of Cambridge On Wed, 25 Feb 2009, Robert Watson wrote: On Wed, 25 Feb 2009, Pete French wrote: FYI, I'm currently awaiting testing results from Pete on the MFC of a number of routing table locking fixes, and once that's merged (hopefully tomorrow?) I'll start on the patches in the above PR

Re: Big problems with 7.1 locking up :-(

2009-02-24 Thread Robert Watson
On Mon, 23 Feb 2009, aneeth wrote: http://www.freebsd.org/cgi/query-pr.cgi?pr=130652cat= OK, will give this a try, unless anyone else wants any traces from this locked machine ? Is there a known way to tickle this bug when I've rebooted, to make sure it's fixed ? We've been having similar

Re: The machdep.hyperthreading_allowed ULE weirdness in 7.1

2009-02-23 Thread Robert Watson
On Sun, 22 Feb 2009, Maxim Sobolev wrote: Hi Jeff, I have a single-CPU system with P4 HTT-enabled processor (7.1-RELEASE-p3), kernel compiled with SCHED_ULE. This is because machdep.hlt_logical_cpus doesn't do what you think it does. It causes HTT cores to invoke the hlt instruction in

Re: The machdep.hyperthreading_allowed ULE weirdness in 7.1

2009-02-23 Thread Robert Watson
On Mon, 23 Feb 2009, Maxim Sobolev wrote: Robert Watson wrote: In the mean time, it sounds like the sysctl does need to be reimplemented or removed, but one question is how far to take it -- caches are shared to varying degrees at varying levels of the topology. However, I believe

Re: The machdep.hyperthreading_allowed ULE weirdness in 7.1

2009-02-23 Thread Robert Watson
On Mon, 23 Feb 2009, Robert Watson wrote: It's not quite that simple -- in a world of device drivers pinning threads to CPUs for workload distribution, callout threads and sched_bind()/sched_pin() for crypto load distribution, etc, you need a whole infrastructure for software-disabled CPUs

Re: The machdep.hyperthreading_allowed ULE weirdness in 7.1

2009-02-23 Thread Robert Watson
On Mon, 23 Feb 2009, Maxim Sobolev wrote: Unfortunately access to BIOS is not always an option and also some BIOSes don't even provide a feature to turn HTT off. It's not quite that simple -- in a world of device drivers pinning threads to CPUs for workload distribution, callout threads and

Re: Big problems with 7.1 locking up :-(

2009-02-21 Thread Robert Watson
On Tue, 17 Feb 2009, Mike Tancsa wrote: Do you have any other details about these issues ? Were the fixes ever MFC'd Earlier today I handed off some patches for Pete to test (attached below), which he's running alongside the patches in kern/130652. When I run with the patches,

Re: Big problems with 7.1 locking up :-(

2009-02-18 Thread Robert Watson
On Tue, 17 Feb 2009, Mike Tancsa wrote: At 05:38 PM 1/29/2009, Robert Watson wrote: On Fri, 9 Jan 2009, Pete French wrote: I have a number of HP 1U servers, all of which were running 7.0 perfectly happily. I have been testing 7.1 in its various incarnations for the last couple of months

Re: FreeBSD 7.1-Stable only support 16 CPUs !?

2009-02-12 Thread Robert Watson
On Thu, 12 Feb 2009, James Chang wrote: Does any ever try FreeBSD 7.1-stable on box that has more than 16 CPU? I got a HP ProLiant DL 785 G5 with 32 core (Quad-Core AMD Opteron(tm) Processor 8356 (2300.10-MHz K8-class CPU) and 256G memory. When I boot this machine, it could detect 32

Re: impossible packet length ...

2009-02-08 Thread Robert Watson
On Sun, 8 Feb 2009, Peter Jeremy wrote: On 2009-Feb-08 11:31:45 +0200, Danny Braniss da...@cs.huji.ac.il wrote: Q: with rxcsum on, and a bad checksum packet is received, is it dropped by the NIC? if not, then it somewhat explains the behaviour If checksum offloading is working correctly

Re: impossible packet length ...

2009-02-08 Thread Robert Watson
On Sun, 8 Feb 2009, Danny Braniss wrote: looking at the bce source, it's not clear (to me :-). If errors are detected in bce_rx_intr(), the packet gets dropped, which I would expect to be the treatment of an offloaded checksum error, but it seems that is not the case. I think we're thinking

Re: impossible packet length ...

2009-02-08 Thread Robert Watson
On Sun, 8 Feb 2009, Danny Braniss wrote: On Sun, 8 Feb 2009, Danny Braniss wrote: looking at the bce source, it's not clear (to me :-). If errors are detected in bce_rx_intr(), the packet gets dropped, which I would expect to be the treatment of an offloaded checksum error, but it seems that

Re: jail: external and localhost distinction

2009-02-07 Thread Robert Watson
On Sat, 7 Feb 2009, Dmitry Morozovsky wrote: On Fri, 6 Feb 2009, Robert Watson wrote: RW Thank you for clarification, now I see this is actually expected behaviour RW :) RW RW Would then starting second jail with the same root and, say, 127.10.0.1 as RW an address be a workaround? RW RW

Re: To John Birrell: weird behaviors of DTrace on amd64

2009-02-06 Thread Robert Watson
On Thu, 5 Feb 2009, Klapper Zhu wrote: I am exploring DTrace on 7.1-STABLE FreeBSD amd64 and I found several weird behaviors: 1) Not all kernel functions show up in fbt provider. Take isp(4) as example: dtrace -l shows static void isp_freeze_loopdown(ispsoftc_t *, int, char *);

Re: jail: external and localhost distinction

2009-02-06 Thread Robert Watson
On Thu, 29 Jan 2009, Dmitry Morozovsky wrote: Thank you for clarification, now I see this is actually expected behaviour :) Would then starting second jail with the same root and, say, 127.10.0.1 as an address be a workaround? There's no technical reason you can't have more than one jail

Re: Puzzling change in performance

2009-01-31 Thread Robert Watson
On Fri, 30 Jan 2009, Borja Marcos wrote: The attached graphs are from a server running FreeBSD 7.1-i386 (now) with the typical Apache2+MySQL with forums, Joomla... I just cannot explain this. Disk I/O bandwidth was suffering a lot, and after the update the disks are almost idle. Any

Re: jail: external and localhost distinction

2009-01-29 Thread Robert Watson
On Thu, 29 Jan 2009, Dmitry Morozovsky wrote: am I right concluding that under FreeBSD jail there is no way to attach two processes to the same port of external interface address and localhost? I tried to move rather standard two-tier nginx(ip:80)+apache(127.1:80) scheme into a jail and on

Re: Big problems with 7.1 locking up :-(

2009-01-29 Thread Robert Watson
On Fri, 9 Jan 2009, Pete French wrote: I have a number of HP 1U servers, all of which were running 7.0 perfectly happily. I have been testing 7.1 in its various incarnations for the last couple of months on our test server and it has performed perfectly. So the last two days I have been

Re: unkillable proceess

2009-01-21 Thread Robert Watson
On Wed, 21 Jan 2009, dikshie wrote: Hi, how to kill unkillable process: # ps axuf |grep http www 66005 73.4 1.3 87656 13164 ?? R 4:58PM 62:24.41 /usr/local/sbin/httpd -DSSL -DNOHTTPACCEPT www 4277 71.6 1.4 88680 13964 ?? R 4:12PM 48:23.40 /usr/local/sbin/httpd -DSSL

Re: Big problems with 7.1 locking up :-(

2009-01-16 Thread Robert Watson
On Fri, 16 Jan 2009, Pete French wrote: hi, please type: show lock 0xff0001254d20 and then show thread 0xXXX where X is 'owner' of previous output. http://toybox.twisted.org.uk/~pete/71_pdns_lock.png That's in Power DNS - which is interesting because the one difference

Re: Big problems with 7.1 locking up :-(

2009-01-16 Thread Robert Watson
On Fri, 16 Jan 2009, Pete French wrote: I rather feared as much. Let's run down the path of perhaps there's a problem with the new UDP locking code for a bit and see where it takes us. Is it possible to run those boxes with WITNESS -- I believe that the fact that show alllocks is failing is

Re: Big problems with 7.1 locking up :-(

2009-01-15 Thread Robert Watson
On Thu, 15 Jan 2009, Pete French wrote: Just an update on this - I tried the various kernels, but now the machine is not locking up at all. As I haven't actually changed anything then this does not make me as happy as you might expect. I don't know what to do now - I dare not upgrade the

Re: Big problems with 7.1 locking up :-(

2009-01-15 Thread Robert Watson
On Thu, 15 Jan 2009, Pete French wrote: In any case, if it starts to reproducibly recur, send out mail and we can see if we can track it down some more. BTW, did you establish if the version of iLo you have has a remote NMI? I seem to recall that some do, and being able to deliver an NMI

Re: Big problems with 7.1 locking up :-(

2009-01-15 Thread Robert Watson
On Thu, 15 Jan 2009, Pete French wrote: desirable. You might want to give the NMI a test run just to make sure it behaves as you think it should, though -- be aware that if DDB/KDB aren't compiled into the kernel, then an NMI will panic the box. Unfortunately it does this...

Re: Big problems with 7.1 locking up :-(

2009-01-14 Thread Robert Watson
On Wed, 14 Jan 2009, Pete French wrote: If you have BREAK_TO_DEBUGGER compiled into the kernel, then try pressing ctrl-alt-break on the console to see if you can drop into the debugger, or issue a serial break on a serial console. Well, I added BREAK_TO_DEBUGGER to the kernel config I had

Re: Big problems with 7.1 locking up :-(

2009-01-13 Thread Robert Watson
On Tue, 13 Jan 2009, Pete French wrote: Features like WITNESS and INVARIANTS may change the timing of the kernel making certain race conditions less likely; I'd run with them for a bit and see if you can reproduce the hang with them present, as they will make debugging the problem a lot

Re: Big problems with 7.1 locking up :-(

2009-01-13 Thread Robert Watson
On Tue, 13 Jan 2009, Pete French wrote: I can't (fortunately) make it lock up. I have a DL360 G5 which is unused atm. and can test on it if needed. Would it be possible to install that under amd64 and hammer it with DNS requests ? I have been trying to think what the difference might be

Re: Big problems with 7.1 locking up :-(

2009-01-12 Thread Robert Watson
On Fri, 9 Jan 2009, Garance A Drosihn wrote: At 2:39 PM -0500 1/9/09, Robert Blayzor wrote: On Jan 8, 2009, at 8:58 PM, Pete French wrote: I have a number of HP 1U servers, all of which were running 7.0 perfectly happily. I have been testing 7.1 in its various incarnations for the last

Re: Big problems with 7.1 locking up :-(

2009-01-12 Thread Robert Watson
On Sat, 10 Jan 2009, Pete French wrote: FWIW, the other guy I know who is having this problem had already switched to using ULE under 7.0-release, and did not have any problems with it. So *his* problem was probably not related to SCHED_ULE, unless something has recently changed there.

Re: Big problems with 7.1 locking up :-(

2009-01-12 Thread Robert Watson
/09, Robert Watson wrote: On Fri, 9 Jan 2009, Garance A Drosihn wrote: At 2:39 PM -0500 1/9/09, Robert Blayzor wrote: On Jan 8, 2009, at 8:58 PM, Pete French wrote: I have a number of HP 1U servers, all of which were running 7.0 perfectly happily. I have been testing 7.1 in its various

Re: Big problems with 7.1 locking up :-(

2009-01-12 Thread Robert Watson
On Mon, 12 Jan 2009, Pete French wrote: I'm not sure if you've done this already, but the normal suggestions apply: have you compiled with INVARIANTS/WITNESS/DDB/KDB/BREAK_TO_DEBUGGER, and do any results / panics / etc result? Sometimes these debugging tools are able to convert hangs into

Re: Big problems with 7.1 locking up :-(

2009-01-12 Thread Robert Watson
On Mon, 12 Jan 2009, Garance A Drosihn wrote: He is not eager to do a whole lot of experiments to track down the problem, since this is happening on busy production machines and he can't afford to have a lot of downtime on them (especially now that the semester at RPI has started up). The

Re: Panic in RELENG_7_1 with fxp(4)

2009-01-07 Thread Robert Watson
On Tue, 6 Jan 2009, Brandon Weisz wrote: http://people.freebsd.org/~yongari/fxp/if_fxp.c http://people.freebsd.org/~yongari/fxp/if_fxpreg.h http://people.freebsd.org/~yongari/fxp/if_fxpvar.h With this version, the system still panics as before. After the system panic with this patch, I went

Re: TCP packet out-of-order problem

2009-01-06 Thread Robert Watson
the output of uname -a on the system? Thanks, Robert N M Watson Computer Laboratory University of Cambridge On Mon, Jan 5, 2009 at 9:13 PM, Robert Watson rwat...@freebsd.org wrote: On Fri, 2 Jan 2009, Lin Jui-Nan Eric wrote: After running netstat -s -p tcp, we found that lots of packets

Re: rdump stuck in sbwait state (RELENG_7)

2009-01-05 Thread Robert Watson
On Mon, 5 Jan 2009, Terry Kennedy wrote: I may have missed this earlier in the thread, but I don't see a kernel stack trace of the stuck thread/process. Could you grab one using procstat -k, DDB, or KGDB? I'd like to confirm that the 'sbwait' really reflects waiting to send, rather than

Re: TCP packet out-of-order problem

2009-01-05 Thread Robert Watson
On Fri, 2 Jan 2009, Lin Jui-Nan Eric wrote: After running netstat -s -p tcp, we found that lots of packets are discarded due to memory problems. We googled for it, and found that sysctl oid net.inet.tcp.reass.maxsegments became 0, therefore packets never reassembled. Then we checked our

Re: rdump stuck in sbwait state (RELENG_7)

2009-01-05 Thread Robert Watson
On Sat, 3 Jan 2009, Terry Kennedy wrote: Sorry, I can't think of any - by the time you see it hung, whatever went wrong has already happened. You might glean some insight from the TCP socket state (on the FreeBSD side, use 'netstat -A' to print the PCB address and gdb to dump the contents

Re: MFC ZFS: when?

2008-11-24 Thread Robert Watson
On Fri, 21 Nov 2008, Zaphod Beeblebrox wrote: In several of the recent ZFS posts, multiple people have asked when this will be MFC'd to 7.x. This query has been studiously ignored as other chatter about whatever ZFS issue is discussed. Presumably the MFC schedule is largely up to Pawel, who

Re: Install issues with 7.x

2008-11-02 Thread Robert Watson
On Wed, 29 Oct 2008, Ryan wrote: Hello, I purchased a new Clevo M860TU on the account that it ran linux very well and was hoping it would fare the same on FreeBSD. Not so much, little help? I posted this in mobile originally but thought stable would be a better choice. Don't know if it is more

Re: 7.x and multiple IPs in jails

2008-10-29 Thread Robert Watson
On Tue, 28 Oct 2008, Chris St Denis wrote: Serious question here (not trolling). These patches have been around for years, why have they never been committed to trunk/stable? Network stacks are incredibly complicated pieces of software, and some of the short-cuts jail took to accomplish

Re: UDP LOR with the latest RELENG_7

2008-10-12 Thread Robert Watson
On Fri, 10 Oct 2008, Robert Watson wrote: On Fri, 10 Oct 2008, Jeremy Chadwick wrote: I'll see whether the system still locks up or not though.. Okay, I'm bringing rwatson@ into the thread since this is specific to UDP. I've now fixed the bug leading to the lock order reversal; I'd

Re: UDP LOR with the latest RELENG_7

2008-10-10 Thread Robert Watson
On Fri, 10 Oct 2008, Jeremy Chadwick wrote: I'll see whether the system still locks up or not though.. Okay, I'm bringing rwatson@ into the thread since this is specific to UDP. Crumbs. It looks like the tunable fetch got dropped into the wrong function of udp_inpcb_init() and

Re: UDP LOR with the latest RELENG_7

2008-10-10 Thread Robert Watson
On Fri, 10 Oct 2008, Jeremy Chadwick wrote: I'll see whether the system still locks up or not though.. Okay, I'm bringing rwatson@ into the thread since this is specific to UDP. I've now fixed the bug leading to the lock order reversal; I'd be interested in knowing if it also corrects

Re: stable 7.0 and nslookup help command

2008-10-07 Thread Robert Watson
On Tue, 7 Oct 2008, Jeremy Chadwick wrote: Not to dissuade you from what you're trying to accomplish, but nslookup has been deprecated (this has been stated a few times by the BIND folks), and host is probably on its way out as well (though I remember somewhere, sometime, nslookup used to

Re: Is FreeBSD a suitable choice for a MacBook? --- WHY?

2008-10-06 Thread Robert Watson
On Mon, 6 Oct 2008, Dr. Aharon Friedman wrote: Sorry, I meant BSD. Here is the link: http://www.freebsd.org/news/press-rel-3.html Aharon Friedman I don't see the original message you replied to on the list, so am replying to it via your post... I'm just a lurker, but even I know that

Re: bad NFS/UDP performance

2008-10-05 Thread Robert Watson
On Sat, 4 Oct 2008, Danny Braniss wrote: at the moment, the best I can do is run it on a different hardware that has if_em, the results are in ftp://ftp.cs.huji.ac.il/users/danny/lock.prof/7.1-1000.em the benchmark ran better with the Intel NIC, averaged UDP 54MB/s, TCP 53MB/s (I get the

Re: bad NFS/UDP performance

2008-10-03 Thread Robert Watson
On Fri, 3 Oct 2008, Danny Braniss wrote: it more difficult than I expected. for one, the kernel date was misleading, the actual source update is the key, so the window of changes is now 28/July to 19/August. I have the diffs, but nothing yet seems relevant. on the other hand, I tried

Re: bad NFS/UDP performance

2008-10-03 Thread Robert Watson
On Fri, 3 Oct 2008, Danny Braniss wrote: gladly, but have no idea how to do LOCK_PROFILING, so some pointers would be helpful. The LOCK_PROFILING(9) man page isn't a bad starting point -- I find that the defaults work fine most of the time, so just use them. Turn the enable sysctl on just

Re: bad NFS/UDP performance

2008-10-03 Thread Robert Watson
On Fri, 3 Oct 2008, Danny Braniss wrote: OK, so it looks like this was almost certainly the rwlock change. What happens if you pretty much universally substitute the following in udp_usrreq.c: Currently Change to - - INP_RLOCK

Re: bad NFS/UDP performance

2008-10-03 Thread Robert Watson
On Fri, 3 Oct 2008, Danny Braniss wrote: On Fri, 3 Oct 2008, Danny Braniss wrote: gladly, but have no idea how to do LOCK_PROFILING, so some pointers would be helpful. The LOCK_PROFILING(9) man page isn't a bad starting point -- I find that the defaults work fine most of the time, so

Re: resource leak

2008-10-02 Thread Robert Watson
On Wed, 1 Oct 2008, Stephen Clark wrote: A big part of the problem is this seems to take about 100 days of uptime to occur. We have some inhouse test boxes but have never seen the problem, probably because none of them have been up more than about 45 days. The units in the field, of which there

Re: resource leak

2008-10-01 Thread Robert Watson
On Wed, 1 Oct 2008, Gary Palmer wrote: Periodically logging ps -auxw output to a file would be useful, as ideally you'd gradually see the list get longer and longer over time; it's possible you have many zombie processes as a result of a parent which is not reaping its children (calling

Re: jails and mac_seeotheruids problems in 6-STABLE

2008-09-30 Thread Robert Watson
On Tue, 30 Sep 2008, George Mamalakis wrote: I have 3 servers in my lab. 2 of them are running 6-STABLE and one of them is running 7-STABLE. All three have services running in jails. I noticed a very peculiar behavior in 6-STABLE when I set the sysctl security.mac.seeotheruids.enabled=1. The

Re: system hangup - I'm lost

2008-09-30 Thread Robert Watson
On Tue, 30 Sep 2008, Gavin Atkinson wrote: On Mon, 2008-09-29 at 22:14 +0200, Oliver Lehmann wrote: Any idea what I could do to shed some more light on this behaviour? Why it is happening and what really is causing it? Would enabling the kernel debugger really help here? I mean the

Re: 7.1-PRERELEASE : bad network performance (nfe0)

2008-09-30 Thread Robert Watson
On Tue, 30 Sep 2008, Arno J. Klaassen wrote: However, the request/response tests are awful for my notebook (test repeated on the notebook for the sake of conviction) : Is it possible to rerun these tests with a 7.0 kernel of the same general configuration? That would help us determine if
