Re: amd(8) cores dump when load high
On Tue, 23.12.2008 at 00:44:53 +0800, Lin Jui-Nan Eric wrote:
> Dear listers,
>
> We recently found that amd frequently dumps core while the load is high
> (about 4~5) after we upgraded world and kernel from 7.0-RELEASE to
> 7.1-PRERELEASE. I have read -stable and the svn log of 7-STABLE, but
> could not find a report or a solution. Does anyone have the same issue?
>
> Thank you very much.

Ever since I switched from file-based NSS to LDAP, amd(8) has been crashing on me almost every day, especially if there's no LDAP server available during boot (i.e. the laptop is not on the home network). It looks like the error handling in NSS requests could be improved, but I've yet to investigate the whole matter.

Load plays no role in amd(8) crashing (at least for me).

Cheers,
Ulrich Spoerlein
--
It is better to remain silent and be thought a fool, than to speak, and remove all doubt.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
moused(8) ate my umass(4) devices, it's true!
Hey all,

I've observed a very weird behaviour with my USB stick for quite a while now (probably 4 months; sadly, I don't have any dates handy).

Anyway, I have this weird Sun keyboard-to-USB adapter, which offers a ukbd(4) and a ums(4) device to the system, although there is no mouse attached to the Sun keyboard I'm using.

ukbd0: vendor 0x0430 PS/2 KB MS, class 0/0, rev 1.00/0.04, addr 3 on uhub4
kbd2 at ukbd0
ums0: vendor 0x0430 PS/2 KB MS, class 0/0, rev 1.00/0.04, addr 3 on uhub4
ums0: 3 buttons.

This worked fine on RELENG_7 until somewhere around summer. Now, whenever there is a moused(8) listening on this fake ums(4) port, the umass(4) device gets stuck somewhere in CAM-land. It probes fine:

Dec 14 10:24:49 roadrunner kernel: umass0: Samsung YP-U2, class 0/0, rev 2.00/10.01, addr 6 on uhub4

but then only BBB bulk transfer timeout messages follow every so often. The da0 device never shows up.

Dec 14 10:26:59 roadrunner kernel: umass0: BBB reset failed, TIMEOUT
Dec 14 10:27:04 roadrunner kernel: umass0: BBB bulk-in clear stall failed, IOERROR
Dec 14 10:27:04 roadrunner kernel: umass0: BBB bulk-out clear stall failed, IOERROR
Dec 14 10:28:09 roadrunner kernel: umass0: BBB reset failed, IOERROR
Dec 14 10:28:09 roadrunner kernel: umass0: BBB bulk-in clear stall failed, IOERROR
Dec 14 10:28:09 roadrunner kernel: umass0: BBB bulk-out clear stall failed, IOERROR
Dec 14 10:29:14 roadrunner kernel: umass0: BBB reset failed, IOERROR
Dec 14 10:29:14 roadrunner kernel: umass0: BBB bulk-in clear stall failed, IOERROR
Dec 14 10:29:14 roadrunner kernel: umass0: BBB bulk-out clear stall failed, IOERROR
Dec 14 10:30:19 roadrunner kernel: umass0: BBB reset failed, IOERROR
Dec 14 10:30:19 roadrunner kernel: umass0: BBB bulk-in clear stall failed, IOERROR
Dec 14 10:30:19 roadrunner kernel: umass0: BBB bulk-out clear stall failed, IOERROR
Dec 14 10:31:24 roadrunner kernel: umass0: BBB reset failed, IOERROR
Dec 14 10:31:24 roadrunner kernel: umass0: BBB bulk-in clear stall failed, IOERROR
Dec 14 10:31:24 roadrunner kernel: umass0: BBB bulk-out clear stall failed, IOERROR

I cannot unplug the USB stick (instant panic), and kldunloading umass is just as bad (instant panic). I have to reboot the system and remove the device then.

Today, I figured out that it depends wholly on moused(8) running on that unpopulated mouse port. If I kill the moused process (which starts automatically when ums0 attaches) before plugging in the umass device, everybody is happy.

I'm glad I found this workaround, but the situation sucks anyway. Other than binary searching for the offending commit, what debugging could I do? Would a ktrace of moused(8) while umass attaches be helpful? Is it perhaps polling the port too often, waiting for a mouse to appear?

Also, can I somehow blacklist the mouse port of this adapter? I do not intend to use a 3-button Sun mouse, ever.

Cheers,
Ulrich Spoerlein
Re: LORs in RELENG_7
On Thu, 20.11.2008 at 17:56:07 -0500, Michael Proto wrote:
> On Thu, Nov 20, 2008 at 4:11 PM, Ulrich Spoerlein [EMAIL PROTECTED] wrote:
> > Hi,
> >
> > I'm running my RELENG_7 kernel with WITNESS and there's always a LOR
> > when pf(4) is enabled:
> >
> > lock order reversal:
> >  1st 0xc09ca828 ifnet (ifnet) @ /usr/src/sys/net/if.c:849
> >  2nd 0xc45d604c pf task mtx (pf task mtx) @ /usr/src/sys/modules/pf/../../contrib/pf/net/pf_if.c:916
> > KDB: stack backtrace:
> > db_trace_self_wrapper(c08df797,fb671764,c0630e8e,c08e1c96,c45d604c,...) at db_trace_self_wrapper+0x26
> > kdb_backtrace(c08e1c96,c45d604c,c45d3b1c,c45d3b1c,c45d379e,...) at kdb_backtrace+0x29
> > witness_checkorder(c45d604c,9,c45d379e,394,c08e9058,...) at witness_checkorder+0x6de
> > _mtx_lock_flags(c45d604c,0,c45d379e,394,fb6717dc,...) at _mtx_lock_flags+0xbc
> > pfi_attach_group_event(0,c445,c08e9058,374,c44a920c,...) at pfi_attach_group_event+0x4e
> > if_addgroup(c441c000,c08f70d6,4,0,0,...) at if_addgroup+0x2c7
> > if_clone_createif(0,0,c08e9563,87,0,...) at if_clone_createif+0x81
> > if_clone_create(fb671943,4,0,44,180,...) at if_clone_create+0x8c
> > tunclone(0,c461e400,fb671943,4,fb67195c,...) at tunclone+0x17a
> > devfs_lookup(fb6719d0,fb6719d0,fb671b7c,c418de04,2,...) at devfs_lookup+0x50e
> > VOP_LOOKUP_APV(c0928f40,fb6719d0,c412f230,c08e77ef,2a9,...) at VOP_LOOKUP_APV+0xa5
> > lookup(fb671b7c,c08e77ef,c6,bf,c461e92c,...) at lookup+0x58e
> > namei(fb671b7c,c412f230,fb671a74,246,c0983774,...) at namei+0x34b
> > vn_open_cred(fb671b7c,fb671c78,ce8,c461e400,c4460558,...) at vn_open_cred+0x2c9
> > vn_open(fb671b7c,fb671c78,ce8,c4460558,c05e807d,...) at vn_open+0x33
> > kern_open(c412f230,80a0f18,0,3,808ecfa,...) at kern_open+0xe7
> > open(c412f230,fb671cfc,c,c08e28c3,c092c0b8,...) at open+0x30
> > syscall(fb671d38) at syscall+0x2b3
> > Xint0x80_syscall() at Xint0x80_syscall+0x20
> > --- syscall (5, FreeBSD ELF32, open), eip = 0x2835a65b, esp = 0xbfbfeafc, ebp = 0xbfbfeb38 ---
>
> Are you using user or group rules in your pf.conf? IIRC there is still
> a known LOR in the socket layer with rules using the user or group
> filters.

No, I'm aware of the problems with pf(4) and user/group rules. This LOR occurs in combination with rules on tun(4) devices, as you can see from the backtrace. I wonder what tunclone() is doing in there, though.

Cheers,
Ulrich Spoerlein
Re: LORs in RELENG_7
if xpt_async() is calling into uma (as it obviously does).

sys/dev/firewire/sbp.c:

	if (sdev->path) {
		SBP_LOCK(sdev->target->sbp);
		xpt_release_devq(sdev->path,
				 sdev->freeze, TRUE);
		sdev->freeze = 0;
		xpt_async(AC_LOST_DEVICE, sdev->path, NULL);
		xpt_free_path(sdev->path);
		sdev->path = NULL;
		SBP_UNLOCK(sdev->target->sbp);
	}

Cheers,
Ulrich Spoerlein
LORs in RELENG_7
+0x10 passcleanup(c42c6700,c08b50c7,c09eff00,4,c08db41b,...) at passcleanup+0x2e camperiphfree(c42c6700,0,f93a96b0,c04568dd,c42c6700,...) at camperiphfree+0xbb cam_periph_invalidate(c42c6700,c0983774,f93a96e4,c046a5ea,c42c6700,...) at cam_periph_invalidate+0x3e cam_periph_async(c42c6700,100,c418a250,0,f93a96e0,...) at cam_periph_async+0x2d passasync(c42c6700,100,c418a250,0,c42f8a00,...) at passasync+0xca xpt_async_bcast(0,4,c08b53c5,11a5,c404d280,...) at xpt_async_bcast+0x32 xpt_async(100,c418a250,0,89b,0,...) at xpt_async+0x194 sbp_cam_detach_sdev(c402f4c8,0,c402f484,1,f93a982c,...) at sbp_cam_detach_sdev+0xa4 sbp_cam_detach_target(c14729a8,c14729a8,c08250c6,c44263f0,10,...) at sbp_cam_detach_target+0x5b sbp_post_explore(c402f400,f93a9ce8,f93a9ce4,675,0,...) at sbp_post_explore+0xa2 fw_bus_probe_thread(c404f000,f93a9d38,c08d8d0f,31c,c402b570,...) at fw_bus_probe_thread+0x69b fork_exit(c0513500,c404f000,f93a9d38) at fork_exit+0xb8 fork_trampoline() at fork_trampoline+0x8 --- trap 0, eip = 0, esp = 0xf93a9d70, ebp = 0 --- (da1:sbp0:0:1:0): lost device (da1:sbp0:0:1:0): removing device entry I reckon these problems should appear in -STABLE ... Cheers, Ulrich Spoerlein -- It is better to remain silent and be thought a fool, than to speak, and remove all doubt. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Any working ichsmb(4) platforms out there?
On Thu, 11.09.2008 at 15:14:52 +0100, Bruce M Simpson wrote:
> Does anyone have ichsmb(4) actually seeing SMBus devices? e.g. you run
> smbmsg -p on your FreeBSD-STABLE system and see something.
>
> I just tried it again on my IBM ThinkPad T43 and saw nothing, all I get
> is:
> ichsmb0: device timeout, status=0x41
> ...in dmesg.

No luck with an ICH5, here:

ichsmb0: Intel 82801EB (ICH5) SMBus controller port 0x2400-0x241f irq 17 at device 31.3 on pci0
ichsmb0: [GIANT-LOCKED]
smbus0: System Management Bus on ichsmb0
smb0: SMBus generic I/O on smbus0
ichsmb0: device timeout, status=0x41
ichsmb0: device timeout, status=0x41
ichsmb0: device timeout, status=0x41
ichsmb0: device timeout, status=0x41
...

# uname -rsm
FreeBSD 6.3-STABLE i386

# devinfo -v|grep smb
ichsmb0 pnpinfo vendor=0x8086 device=0x24d3 subvendor=0x1734 subdevice=0x101c class=0x0c0500 at slot=31 function=3 handle=\_SB_.PCI0.PM__

# kenv|grep smb
smbios.bios.reldate=11/25/2004
smbios.bios.vendor=FUJITSU SIEMENS // Phoenix Technologies Ltd.
smbios.bios.version=5.00 R2.14.1534.01
smbios.chassis.maker=FUJITSU SIEMENS
smbios.chassis.serial=YBFC445826
smbios.chassis.tag=
smbios.chassis.version=SCEE
smbios.planar.maker=FUJITSU SIEMENS
smbios.planar.product=D1534
smbios.planar.serial=
smbios.planar.version=S26361-D1534
smbios.socket.enabled=1
smbios.socket.populated=1
smbios.system.maker=FUJITSU SIEMENS
smbios.system.product=SCENIC E
smbios.system.serial=YBFC445826
smbios.system.uuid=93D4A7A3-705F-11D9-8688-00300577E7A0
smbios.system.version=

Cheers,
Ulrich Spoerlein
Re: cpufreq(4) panic on RELENG_7 (was: Re: Call for bfe(4) testers.)
Hi John,

I have now figured out the who; the why still eludes me.

After your MFC of ichss.c on June 27th, the device now attaches on my laptop. It didn't before, so it could cause no trouble. With ichss loaded, the kernel panics 1-3 minutes after powerd has been started (if I kill powerd early enough, it seems pretty stable).

I'm now running a kernel from 2008-08-08 with hint.ichss.0.disabled=1.

Applying your patch to kern_cpu.c does not help, though. I'll be happy to try further patches to make ichss behave well, although I'll never use it on this laptop, as EST is the only useful technique on this old Pentium-M.

> > Will also disable p4tcc. This was not attaching during the RELENG_6
> > times but leads to ridiculous rates of 75 MHz.
>
> If p4tcc attaching is new, that might point to the culprit. A good
> quick test would be to disable individual cpufreq drivers to find out
> which one causes the panic.

p4tcc attaching was new relative to RELENG_6, not relative to my working 7.x kernel of 2008-06-13.

Cheers,
Ulrich Spoerlein
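The "disable individual cpufreq drivers" test suggested above can be prepared with a small helper. This is only a sketch: the function name cpufreq_hints is made up for illustration, and it assumes each driver honours a hint of the same form as the hint.ichss.0.disabled=1 setting mentioned in this message. It only prints the lines; adding them to /boot/loader.conf or /boot/device.hints is left to the reader.

```sh
#!/bin/sh
# cpufreq_hints DRIVER...
# Print one "disabled" hint per driver name given on the command line.
# Hypothetical helper; the hint form mirrors hint.ichss.0.disabled=1 above.
cpufreq_hints() {
    for drv in "$@"; do
        printf 'hint.%s.0.disabled="1"\n' "$drv"
    done
}

# Example: candidate lines for ruling out ichss and p4tcc one at a time.
cpufreq_hints ichss p4tcc
```

Disabling one driver per boot, rather than all at once, keeps the bisection meaningful: the first boot that stays up under powerd points at the driver that was switched off.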
Re: Problem with /boot/loader [A new patch]
On Sat, 09.08.2008 at 17:22:01 +0800, Eugene Grosbein wrote:
> On Fri, Aug 08, 2008 at 12:49:28PM -0400, John Baldwin wrote:
> > My realization this morning is that software interrupts ('int X') in
> > real mode disable interrupts just like hardware interrupts do. Thus,
> > my patch changes BTX to disable interrupts for both cases 1) and 2)
> > now. I think this will fix the hangs. I'm still including the code to
> > explicitly initialize the eflags for user requests to a known-good
> > value. It still has interrupts enabled which means that case 3)
> > should now always run with interrupts enabled (which is the desired
> > state), but the client can disable interrupts in the eflags in the
> > vm86 structure if desired.
> >
> > The updated patch (same URL, new patch) is at
> > http://www.FreeBSD.org/~jhb/patches/btx_hang.patch
>
> Sigh, it does not fix my problem described here:
> http://groups.google.ru/group/muc.lists.freebsd.stable/browse_thread/thread/538039f40b469e2a
>
> I've just updated my 7.0-STABLE to latest sources, applied your patch
> using cd /usr/src; patch -p6 < ~/btx_hang.patch, it has applied
> cleanly. Then I've rebuilt and reinstalled kernel and world and
> rebooted. My problem persists as it was.

I'm not sure which piece of code you are talking about here (boot0, boot1, boot2, loader?). But if it's one of the first three, you don't need to installworld; instead you need to install new boot blocks using either fdisk -B or bsdlabel -B (or both).

hth,
Ulrich Spoerlein
What is cryptosoft0?
Hi,

today I discovered the following dmesg line on my laptop:

cryptosoft0: software crypto on motherboard

I've not seen this one before, so: what is cryptosoft, and should I care? I could imagine it's a pseudo-device provided by crypto(9), so that the API is the same whether crypto hardware is installed or not.

Anyway, I think a manpage link/update would be in order:

% man -k cryptosoft
cryptosoft: nothing appropriate

Cheers,
Ulrich Spoerlein
cpufreq(4) panic on RELENG_7 (was: Re: Call for bfe(4) testers.)
On Mon, 04.08.2008 at 16:07:55 -0400, John Baldwin wrote: On Monday 04 August 2008 02:29:19 pm Ulrich Spoerlein wrote: Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x38 fault code = supervisor read, page not present instruction pointer = 0x20:0xc058ec16 stack pointer = 0x28:0xfb8b8ac8 frame pointer = 0x28:0xfb8b8ac8 code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 1176 (powerd) db:0:kdb.enter.default show pcpu cpuid= 0 curthread= 0xc4ec0aa0: pid 1176 powerd curpcb = 0xfb8b8d90 fpcurthread = none idlethread = 0xc3f80cc0: pid 10 idle: cpu0 APIC ID = 0 currentldt = 0x50 db:0:kdb.enter.default bt Tracing pid 1176 tid 100103 td 0xc4ec0aa0 device_is_attached(0,c87e6b40,fb8b8afc,0,101,...) at device_is_attached+0x6 cf_set_method(c420b600,c87e6b40,64,fb8b8ba4,c87e33b4,...) at cf_set_method+0x6a3 cpufreq_curr_sysctl(c420d840,c4207000,0,fb8b8ba4,fb8b8ba4,...) at cpufreq_curr_sysctl+0x232 sysctl_root(fb8b8ba4,4,1,c4ec0aa0,c4501d38,...) at sysctl_root+0x137 userland_sysctl(c4ec0aa0,fb8b8c14,4,0,0,...) at userland_sysctl+0x151 __sysctl(c4ec0aa0,fb8b8cfc,18,fb8b8ca0,46,...) at __sysctl+0xec syscall(fb8b8d38) at syscall+0x345 Xint0x80_syscall() at Xint0x80_syscall+0x20 --- syscall (202, FreeBSD ELF32, __sysctl), eip = 0x28161bd3, esp = 0xbfbfe8cc, ebp = 0xbfbfe8f8 --- db:0:kdb.enter.default capture off Seems like I caught RELENG_7 during a bad time. Will update again. What cpufreq drivers do you have loaded and attached? This patch might work around the issue, but I suspect there is a bug in one of the cpufreq drivers. Hi John, sorry for the slow update, please bear with me. 
This is on a first-generation Pentium-M (Banias core) with EST (and also p4tcc attached, as I just discovered):

CPU: Intel(R) Pentium(R) M processor 1.50GHz (1495.15-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0x6d6  Stepping = 6
  Features=0xafe9f9bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,TM,PBE>
  Features2=0x180<EST,TM2>
..
cpu0: ACPI CPU on acpi0
est0: Enhanced SpeedStep Frequency Control on cpu0
p4tcc0: CPU Frequency Thermal Control on cpu0

dev.cpu.0.%desc: ACPI CPU
dev.cpu.0.%driver: cpu
dev.cpu.0.%location: handle=\_PR_.CPU0
dev.cpu.0.%pnpinfo: _HID=none _UID=0
dev.cpu.0.%parent: acpi0
dev.cpu.0.freq: 300
dev.cpu.0.freq_levels: 1500/-1 1312/-1 1200/-1 1050/-1 1000/-1 875/-1 800/-1 700/-1 600/-1 525/-1 450/-1 375/-1 300/-1 225/-1 150/-1 75/-1
dev.cpu.0.cx_supported: C1/1 C2/1 C3/85 C4/185
dev.cpu.0.cx_lowest: C1
dev.cpu.0.cx_usage: 100.00% 0.00% 0.00% 0.00%
dev.acpi_perf.0.%parent: cpu0
dev.est.0.%desc: Enhanced SpeedStep Frequency Control
dev.est.0.%driver: est
dev.est.0.%parent: cpu0
dev.est.0.freq_settings: 1500/-1 1200/-1 1000/-1 800/-1 600/-1
dev.cpufreq.0.%driver: cpufreq
dev.cpufreq.0.%parent: cpu0
dev.p4tcc.0.%desc: CPU Frequency Thermal Control
dev.p4tcc.0.%driver: p4tcc
dev.p4tcc.0.%parent: cpu0
dev.p4tcc.0.freq_settings: 1/-1 8750/-1 7500/-1 6250/-1 5000/-1 3750/-1 2500/-1 1250/-1

A kernel from 2008-06-13 is the last known working one. I just had the same crash with a kernel from sources at 2008-07-01 and am now recompiling for 2008-06-24. Your MFC of est.c rev 180044 might be the problem; I'll try a backout once I have confirmed that the 2008-06-24 kernel is running stable.

> Index: kern_cpu.c
> ===================================================================
> RCS file: /usr/cvs/src/sys/kern/kern_cpu.c,v
> retrieving revision 1.27.2.2
> diff -u -r1.27.2.2 kern_cpu.c
> --- kern_cpu.c	9 May 2008 19:02:10 -0000	1.27.2.2
> +++ kern_cpu.c	4 Aug 2008 20:07:41 -0000
> @@ -329,6 +329,8 @@
>  	/* Next, set any/all relative frequencies via their drivers. */
>  	for (i = 0; i < level->rel_count; i++) {
>  		set = &level->rel_set[i];
> +		if (set->dev == NULL)
> +			continue;
>  		if (!device_is_attached(set->dev)) {
>  			error = ENXIO;
>  			goto out;

Will try that one too, hopefully tomorrow. Will also disable p4tcc. This was not attaching during the RELENG_6 times but leads to ridiculous rates of 75 MHz.

Cheers,
Ulrich Spoerlein
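The 75 MHz figure can be reproduced from the sysctl values in this message. The sketch below is an inference from that output, not a reading of the kernel source: it assumes the p4tcc settings are duty cycles in units of 1/10000, that the garbled first p4tcc entry is the unthrottled setting and can be left out, and that a throttled value is only listed when it lies above the next lower est setting. The names abs_levels, rel_levels and expand_levels are made up for illustration.

```sh
#!/bin/sh
# Absolute est(4) settings in MHz, highest first (dev.est.0.freq_settings).
abs_levels="1500 1200 1000 800 600"
# Relative p4tcc(4) settings below full speed, assumed to be in units of 1/10000.
rel_levels="8750 7500 6250 5000 3750 2500 1250"

# Print each absolute setting followed by the throttled values that still
# lie above the next lower absolute setting (inferred rule, see lead-in).
expand_levels() {
    set -- $abs_levels
    while [ $# -gt 0 ]; do
        cur=$1
        next=${2:-0}    # 0 when there is no lower absolute setting
        echo "$cur"
        for r in $rel_levels; do
            derived=$(( cur * r / 10000 ))
            if [ "$derived" -gt "$next" ]; then
                echo "$derived"
            fi
        done
        shift
    done
}

expand_levels
```

Under these assumptions the list matches dev.cpu.0.freq_levels, and the lowest entry is the 600 MHz est setting throttled to 12.5%, which is where the 75 MHz comes from.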
Re: ddb(4) scripts not working in RELENG_7?
Hi Robert,

On Sun, 03.08.2008 at 14:49:00 +0100, Robert Watson wrote:
> On Sun, 3 Aug 2008, Ulrich Spoerlein wrote:
> > I was testing a patch and getting a panic (page fault while in kernel
> > mode) in RELENG_7 running multiuser mode, but no scripts were
> > automagically run, although I configured ddb_enable=YES in rc.conf.
> > It simply dropped me to the interactive ddb(4) prompt, nothing more.
> > Do you have any idea what I could be missing?
>
> I have been using DDB scripts on 7-STABLE without any problems, but I'm
> not sure I've tried it with a page fault, just regular panics. Could
> you try entering the debugger via sysctl debug.kdb.panic=1, which
> forces a panic, and see if your scripts run then? Perhaps there's some
> inconsistency in how we're entering the debugger. If things still
> appear not to be happening, try setting up a kdb.enter.default script
> and see if that works?

Spot on! Entering via sysctl works as expected; the 'default' script is also executed after a page fault, but the panic script is not. So either page faults should call the panic script, or some sort of kdb.enter.pfault should be introduced?

Either way, I see another manpage update coming up :)

Cheers,
Ulrich Spoerlein
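As a workaround sketch for the behaviour described above, the panic script can be duplicated under kdb.enter.default so that a page fault takes the same path. The helper below only prints candidate lines in the "script name=commands" form used by /etc/ddb.conf; the function name print_ddb_conf is made up, and the script bodies are copied from the debug.ddb.scripting.scripts output in the original report of this thread. Note the assumption built into it: with "call doadump; reset" in the default script, every debugger entry without a more specific script will dump and reboot, which may not be wanted on a development box.

```sh
#!/bin/sh
# Body of the existing kdb.enter.panic script, as reported in this thread.
panic_body='textdump set; capture on; run lockinfo; show pcpu; bt; ps; alltrace; capture off; call doadump; reset'

# Print ddb.conf-style lines that run the same commands for the panic and
# the default entry point. Hypothetical helper, output only.
print_ddb_conf() {
    printf 'script lockinfo=%s\n' 'show locks; show alllocks; show lockedvnods'
    for event in kdb.enter.panic kdb.enter.default; do
        printf 'script %s=%s\n' "$event" "$panic_body"
    done
    printf 'script kdb.enter.witness=%s\n' 'run lockinfo'
}

print_ddb_conf
```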
ddb(4) scripts not working in RELENG_7?
Hi Robert,

I was testing a patch and getting a panic (page fault while in kernel mode) in RELENG_7 running multiuser mode, but no scripts were automagically run, although I configured ddb_enable=YES in rc.conf. It simply dropped me to the interactive ddb(4) prompt, nothing more. Do you have any idea what I could be missing?

Btw, you might want to update the ddb(8) manpage's History section, as the feature seems to first appear in 7.1 :)

% egrep 'ddb|dump' /etc/rc.conf
dumpdev=/dev/ad0s3
ddb_enable=YES

% sysctl debug.ddb.scripting.scripts
debug.ddb.scripting.scripts: lockinfo=show locks; show alllocks; show lockedvnods
kdb.enter.panic=textdump set; capture on; run lockinfo; show pcpu; bt; ps; alltrace; capture off; call doadump; reset
kdb.enter.witness=run lockinfo

Cheers,
Ulrich Spoerlein
RELENG_6 regression: ums0: X report 0x0002 not supported
Hi,

after updating an Intel S5000PAL system from 6.2 to 6.3, ums(4) is no longer attaching correctly. Here's a dmesg diff between 6.2 and 6.3:

 uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
 uhub3: 2 ports with 2 removable, self powered
 ehci0: EHCI (generic) USB 2.0 controller mem 0xe8d0-0xe8d003ff irq 23 at device 29.7 on pci0
 ehci0: [GIANT-LOCKED]
 usb4: EHCI version 1.0
 usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
 usb4: EHCI (generic) USB 2.0 controller on ehci0
 usb4: USB revision 2.0
 uhub4: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
 uhub4: 8 ports with 8 removable, self powered
 ukbd0: Avocent Avocent Embedded DVC 1.0, rev 2.00/0.00, addr 2, iclass 3/1
 kbd2 at ukbd0
 ums0: Avocent Avocent Embedded DVC 1.0, rev 2.00/0.00, addr 2, iclass 3/1
-ums0: 3 buttons and Z dir.
-uhid0: Avocent Avocent Embedded DVC 1.0, rev 2.00/0.00, addr 2, iclass 3/1
-uhid0: could not read endpoint descriptor
-device_attach: uhid0 attach returned 6
+ums0: X report 0x0002 not supported
+device_attach: ums0 attach returned 6

Attached is the full 6.3 dmesg. Looks weird to me; is there anything I can try on this hardware?

Uli

[attachment: dmesg.boot]
RELENG_6 regression: panic: vm_fault on nofault entry, addr: c8000000
Hi, there's a regression going from 6.2 to 6.3, where it will panic upon booting the kernel within vm_fault. This problem has been discussed before, but I'm seeing it reliably on a RELENG_6 checkout from 5th of May. It affects multiple (but identical) systems, here's an verbose boot leading to the panic. Please note that 6.2 was running fine on these machines, they also boot normally if I disable ACPI (but this is not really an option). SMAP type=01 base= len=0009d800 SMAP type=02 base=0009d800 len=2800 SMAP type=02 base=000ce000 len=2000 SMAP type=02 base=000e4000 len=0001c000 SMAP type=01 base=0010 len=cfe6 SMAP type=03 base=cff6 len=9000 SMAP type=04 base=cff69000 len=00017000 SMAP type=02 base=cff8 len=0008 SMAP type=02 base=e000 len=1000 SMAP type=02 base=fec0 len=0001 SMAP type=02 base=fee0 len=1000 SMAP type=02 base=ff00 len=0100 SMAP type=01 base=0001 len=3000 786432K of memory above 4GB ignored Copyright (c) 1992-2008 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 6.3-20080505-SNAP #0: Mon May 5 11:42:32 UTC 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC Preloaded elf kernel /boot/kernel/kernel at 0xc1051000. Preloaded mfs_root /boot/mfsroot at 0xc10511e8. Preloaded elf module /boot/modules/acpi.ko at 0xc105122c. MP Configuration Table version 1.4 found at 0xc009dd71 Table 'FACP' at 0xcff68e48 Table 'APIC' at 0xcff68ebc MADT: Found table at 0xcff68ebc APIC: Using the MADT enumerator. MADT: Found CPU APIC ID 0 ACPI ID 0: enabled MADT: Found CPU APIC ID 4 ACPI ID 1: enabled MADT: Found CPU APIC ID 2 ACPI ID 2: enabled MADT: Found CPU APIC ID 6 ACPI ID 3: enabled ACPI APIC Table: PTLTD APIC Calibrating clock(s) ... 
i8254 clock: 1193204 Hz CLK_USE_I8254_CALIBRATION not specified - using default frequency Timecounter i8254 frequency 1193182 Hz quality 0 Calibrating TSC clock ... TSC clock: 3000122064 Hz CPU: Intel(R) Xeon(TM) CPU 3.00GHz (3000.12-MHz 686-class CPU) Origin = GenuineIntel Id = 0xf64 Stepping = 4 Features=0xbfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE Features2=0xe4bdSSE3,RSVD2,MON,DS_CPL,VMX,EST,CNXT-ID,CX16,xTPR,PDCM AMD Features=0x2010NX,LM AMD Features2=0x1LAHF Cores per package: 2 Logical CPUs per core: 2 real memory = 3489005568 (3327 MB) Physical memory chunk(s): 0x1000 - 0x0009cfff, 638976 bytes (156 pages) 0x0010 - 0x003f, 3145728 bytes (768 pages) 0x01425000 - 0xcc488fff, 3406184448 bytes (831588 pages) avail memory = 3405979648 (3248 MB) bios32: Found BIOS32 Service Directory header at 0xc00f5960 bios32: Entry = 0xfd520 (c00fd520) Rev = 0 Len = 1 pcibios: PCI BIOS entry at 0xfd520+0x247 pnpbios: Found PnP BIOS data at 0xc00f59e0 pnpbios: Entry = f:af28 Rev = 1.0 Other BIOS signatures found: APIC: CPU 0 has ACPI ID 0 MADT: Found IO APIC ID 8, Interrupt 0 at 0xfec0 ioapic0: Routing external 8259A's - intpin 0 MADT: Found IO APIC ID 9, Interrupt 24 at 0xfec8 lapic0: Routing NMI - LINT1 lapic0: LINT1 trigger: edge lapic0: LINT1 polarity: high lapic4: Routing NMI - LINT1 lapic4: LINT1 trigger: edge lapic4: LINT1 polarity: high lapic2: Routing NMI - LINT1 lapic2: LINT1 trigger: edge lapic2: LINT1 polarity: high lapic6: Routing NMI - LINT1 lapic6: LINT1 trigger: edge lapic6: LINT1 polarity: high MADT: Interrupt override: source 0, irq 2 ioapic0: Routing IRQ 0 - intpin 2 MADT: Interrupt override: source 9, irq 9 ioapic0: intpin 9 trigger: level ioapic0 Version 2.0 irqs 0-23 on motherboard ioapic1 Version 2.0 irqs 24-47 on motherboard cpu0 BSP: ID: 0x VER: 0x00050014 LDR: 0xff00 DFR: 0x lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff timer: 0x000100ef therm: 0x0200 err: 
0x0001 pcm: 0x0001 ath_rate: version 1.2 SampleRate bit-rate selection algorithm wlan: 802.11 Link Layer null: null device, zero device random: entropy source, Software, Yarrow nfslock: pseudo-device io: I/O kbd: new array size 4 kbd1 at kbdmux0 mem: memory Pentium Pro MTRR support enabled ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) rr232x: RocketRAID 232x controller driver v1.02 (May 5 2008 11:42:16) hptrr: HPT RocketRAID controller driver v1.1 (May 5 2008 11:42:14) npx0: INT 16 interface acpi0: PTLTD RSDT on motherboard ioapic0: routing intpin 9 (ISA IRQ 9) to vector 48 acpi0: [MPSAFE] pci_open(1):mode 1 addr port (0x0cf8) is 0x80008058
Re: $HOME changed from 6.2 to 6.3 and 7.0 ?!
On Fri, 29.02.2008 at 13:58:27 -0800, Jeremy Chadwick wrote:
> On Fri, Feb 29, 2008 at 10:07:23PM +0100, Ulrich Spoerlein wrote:
> > # $FreeBSD: src/etc/crontab,v 1.32 2002/11/22 16:13:39 tom Exp $
> > ...
> > HOME=/var/log
> >
> > If this has changed from before, I guess it would be due to a new
> > shell forking which always reset $HOME. Thus, it only worked before
> > by sheer luck :)
>
> The HOME=/var/log entry in /etc/crontab was set **14 years ago**, so I
> don't know what the OP is talking about. Nothing has changed there.

Yes, I wasn't implying the problem was with a change to /etc/crontab. I checked daily/999.local and it hasn't been touched in years, either.

A (very!) wild guess would be that it has to do with the *env() changes done to the shell. Or were they not merged in time for 6.3-RELEASE?

Cheers,
Ulrich Spoerlein
Re: $HOME changed from 6.2 to 6.3 and 7.0 ?!
On Fri, 29.02.2008 at 12:52:11 +0100, Thomas Krause wrote:
> Dear list,
>
> after upgrading from 6.2R to 6.3R my daily jobs, which are normally
> executed from /etc/daily.local, are no longer started. The entry in
> daily.local is
>
> $HOME/bin/save-conf.sh
>
> 6.2R executed /root/bin/save-conf.sh
> 6.3R (and 7.0R) tries to start /var/log/bin/save-conf.bin
>
> Why? I cannot find such a homedir in /etc/passwd!

Wrong place to look, it is set via /etc/crontab:

% more /etc/crontab
# /etc/crontab - root's crontab for FreeBSD
#
# $FreeBSD: src/etc/crontab,v 1.32 2002/11/22 16:13:39 tom Exp $
#
SHELL=/bin/sh
PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin
HOME=/var/log
#
# ...
#
# Perform daily/weekly/monthly maintenance.
1	3	*	*	*	root	periodic daily
15	4	*	*	6	root	periodic weekly
30	5	1	*	*	root	periodic monthly

If this has changed from before, I guess it would be due to a new shell forking which always resets $HOME. Thus, it only worked before by sheer luck :)

Cheers,
Ulrich Spoerlein
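The effect is easy to reproduce outside of cron. The sketch below simulates the environment a job inherits from the crontab shown above by starting a shell with nothing but HOME set; resolve_job is a made-up helper name, and the script path is the one from the original question.

```sh
#!/bin/sh
# resolve_job HOMEDIR
# Show what "$HOME/bin/save-conf.sh" expands to when HOME is HOMEDIR.
# Hypothetical helper for illustration only.
resolve_job() {
    env -i HOME="$1" /bin/sh -c 'echo "$HOME/bin/save-conf.sh"'
}

resolve_job /var/log   # HOME as set in /etc/crontab
resolve_job /root      # what the job author expected
```

Writing the absolute path (/root/bin/save-conf.sh) in daily.local sidesteps the question of which HOME the job happens to inherit.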
Re: finstall alpha3
On Wed, 06.02.2008 at 11:48:22 +0100, Ivan Voras wrote:
> On the other hand, here's what it *can* do currently:
>
> - it's a live CD environment, completely like an already installed
>   FreeBSD system, only running from a read-only medium (e.g. it's
>   usable as a FixIt system)

This is great, and I think it's the way to go. Since I had to repair my system over the last few days with a 'FixIt' CD, I think this mode could be improved even further. It would be nice if there were some additional system repair tools available on this CD (license permitting, of course).

You'd just have to make sure that it still installs a clean FreeBSD. This could be accomplished by using unionfs for the 'enhanced fixit overlay' or something like that.

Cheers,
Ulrich Spoerlein
Re: Reconstruct disklabel for UFS and GELI volumes
On Feb 7, 2008 1:05 AM, Torfinn Ingolfsen [EMAIL PROTECTED] wrote:
> Ulrich Spoerlein [EMAIL PROTECTED] wrote:
> > There were three labels
>
> Actually, it is one label per slice, unless you are doing something
> unusual?

s/labels/partitions/, that's what I meant.

> > - ad0s4a: UFS, exact size unknown. Is it possible to infer this from
> >   the UFS partition size? I can mount this already, as I simply wrote
> >   an 'a' label of maximum size to the disklabel
> > - ad0s4b: GELI encrypted swap
> > - ad0s4d: GELI encrypted ZVOL
>
> FWIW, I have had success with scan_ffs[1] as documented in this short
> article[2]. It will recover lost labels, or at least try to. If you are
> downloading binary packages from somewhere, be sure to double check
> that you get the one that fits your platform (i386 / amd64 or whatever)
> and version. Take it slowly, and double check all steps before
> committing anything.

I already made some progress. The GEOM classes place a label into the last sector (GEOM::GELI in my case), so I could use this string to scan the whole slice overnight. Sadly, the geli swap partition has no such signature; only the geli ad0s4d partition has one. However, using geli dump I can get the original partition size.

I now only have to adjust the offset/length in the bsdlabel and figure out the original size for ad0s4a (which I guess was 512MB). I should have this back up and running quickly, once I get home to the machine.

Thanks for all the suggestions so far. Since I can't install any packages (I'm using the 6.2 fixit CD), how can I calculate the file system size from the ffsinfo output?

Uli
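The offset arithmetic described above can be sketched as follows. It assumes that the signature found by the scan sits at the start of the provider's last sector, and that the provider size in bytes has been read from the geli dump output; the function name part_start_sector and the example numbers are made up for illustration. All offsets are relative to whatever device was scanned.

```sh
#!/bin/sh
# part_start_sector META_BYTE_OFFSET PROVSIZE_BYTES [SECTOR_SIZE]
# META_BYTE_OFFSET: byte offset at which the metadata signature was found
# PROVSIZE_BYTES:   size of the provider in bytes, as shown by geli dump
# Prints the first sector of the partition. Hypothetical helper.
part_start_sector() {
    meta_off=$1
    provsize=$2
    secsize=${3:-512}
    # The metadata occupies the last sector, so the partition ends right after it.
    end=$(( meta_off + secsize ))
    echo $(( (end - provsize) / secsize ))
}

# Made-up example: signature at sector 3047, provider of 1000 sectors.
part_start_sector $(( 3047 * 512 )) $(( 1000 * 512 ))
```

With the start of ad0s4d known, the unlabeled swap partition is simply the gap between the end of ad0s4a and that start sector.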
Re: Reconstruct disklabel for UFS and GELI volumes
On Thu, 07.02.2008 at 20:49:10 +0100, Nikola Lečić wrote:
> > Thanks for all the suggestions so far. Since I can't install any packages (I'm using the 6.2 fixit cd), how can I calculate the file system size from the ffsinfo output?
>
> I hate to be boring (since I already suggested the use of sysutils/testdisk), but it would be really helpful to give it a try. Please read this success story (5 mails):

I would, if installing packages were easier in fixit mode. Looks like I need to take up my own live CD project again.

> http://lists.freebsd.org/pipermail/freebsd-questions/2007-December/164901.html
>
> Among other things, it contains some notes on how to use packages that are not included on the rescue CD and how to recalculate your bsdlabel offsets and other parameters. And yes, with or without GELI your swap will appear just as a hole between normal partitions, so its dimensions are probably the last thing you will reconstruct.

Why is there no metadata for the GELI swap? Is it because the 'label' command is not used (would make sense to me)?

Anyway, I reconstructed my disklabel and everything is back to normal. A nice exercise it was, though I'd rather have done it on non-critical data :)

Cheers,
Ulrich Spoerlein
-- 
It is better to remain silent and be thought a fool, than to speak, and remove all doubt.
Reconstruct disklabel for UFS and GELI volumes
Hi,

Somehow[TM] an installation of 4.11 to ad0s3 managed to wipe out my existing disklabel for 7.0 on ad0s4. I now need to recover the disklabel to get my system to boot!

There were three labels

- ad0s4a: UFS, exact size unknown. Is it possible to infer this from the UFS partition size? I can mount this already, as I simply wrote an 'a' label of maximum size to the disklabel
- ad0s4b: GELI encrypted swap
- ad0s4d: GELI encrypted ZVOL

I only need to find out the start of ad0s4d. Is the consumer size of a GELI device stored in the metadata in the last 512 bytes? Or are there some magic bytes in these 512 bytes, so I could find out the exact end of ad0s4b and thus the start of ad0s4d?

Any help or advice would be highly appreciated!

Thanks,
Uli
Re: problems with LC_ALL
On Sun, 20.01.2008 at 12:53:54 -0800, Javier Elizondo wrote:
> Hi, I am using Darwin on a MacBook Pro. When I open Terminal I get the following message, which appears only in my account. I would like to get help in order to fix it.
>
> Last login: Sun Jan 20 14:32:18 on ttys001
> perl: warning: Setting locale failed.
> perl: warning: Please check that your locale settings:
>         LC_ALL = (unset),
>         LANG = UTF-8
>     are supported and installed on your system.
> perl: warning: Falling back to the standard locale (C).
>
> I have tried but without success. The language is EN_US with ISO and the keyboard is Spanish, but there is no problem with it.

Your LANG setting of UTF-8 is plain wrong. There is no UTF-8 language. Please check the output of locale and then set LANG to something that can be found in the output of locale -a. The keyboard is not affected by LANG, so if you want English error messages and are using UTF-8, you should place the following in your shell startup file:

    export LANG=en_US.UTF-8

Cheers,
Ulrich Spoerlein
-- 
It is better to remain silent and be thought a fool, than to speak, and remove all doubt.
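The check against locale -a suggested above can be scripted. This is a small sketch only: lang_supported is a hypothetical helper name, and it assumes the system prints locale names in the same spelling that is used in LANG (some systems list en_US.utf8 rather than en_US.UTF-8).

```sh
#!/bin/sh
# Sketch: test a LANG value against a list of installed locale names.
# lang_supported is a hypothetical helper, not a system utility.

# lang_supported NAME
#   Reads locale names on stdin (one per line, as printed by
#   `locale -a`) and succeeds only if NAME matches a whole line.
lang_supported() {
    grep -Fxq -- "$1"
}

# Typical use on a live system:
#   locale -a | lang_supported "$LANG" || echo "LANG=$LANG is not installed"
```

Matching whole lines is the point here: a bare UTF-8 is rejected even though it is a substring of en_US.UTF-8.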
Re: Backup solution suggestions [ggated]
On Jan 18, 2008 9:11 AM, Johan Ström [EMAIL PROTECTED] wrote:
> Your "no, barely, bad, hell no" seems to fit pretty well. I did some testing during the night with the above (non-production) setup. What I did was some rsyncing over the night:
>
> while true ; do
>     echo `date` Clearing vmail >> logfile
>     rm -rf vmail
>     echo `date` Starting rsync >> logfile
>     rsync -vr /usr/var/vmail . | tee -a logfile
>     echo `date` Rsync finished >> logfile
> done
>
> I started this at ~02.0. The results? A freshly rebooted 6.2 (6.2-RELEASE-p6 FreeBSD 6.2-RELEASE-p6 #0: Fri Jul 27 15:47:50 UTC 2007) box in the morning.
>
> [...]
>
> What I don't have is a coredump; judging from dmesg -a, savecore wasn't even run. Running it now, 5 hours later, didn't find any cores. The other end (7.0 server) wasn't affected at all. Not really sure what it had been doing, because looking at my bandwidth graphs from the switch, nothing was done at all. It didn't even go through one iteration of rsync. ~7.5k files/directories seem to have been transferred, then the log doesn't say more. But according to the BW graph, after ~03.00 no traffic was sent at all.
>
> Some known bug with 6.2?

There were some ggatec problems with TCP and/or sockets; I think they have been mostly resolved post-6.2. If you want to pursue this further (it *would* be a cool setup, no doubt) I'd suggest three things:

- Update to 6.3
- Leave GELI out of the loop for now (only do ggate, with random data perhaps)
- Build a kernel *without* options PREEMPTION

hth,
Uli
Re: Backup solution suggestions [ggated]
On Jan 17, 2008 1:31 AM, Johan Ström [EMAIL PROTECTED] wrote:
> > Export the disk on the backup server with ggated. Bind it on the client with ggatec. Slap a GELI or GBDE encryption on top of it and then put a ZFS on top of it. You can mount/import this remote ZFS at will and do your zfs send/receive on your local box. Nothing ever leaves your box unencrypted.
>
> Now that is a cool solution! That actually sounds like something doable. I tried it out at home between a 6.2 box (client) and a 7.0 box (server), hosting the system in a ZFS sparse volume with a predefined size, exported that via ggated and connected ggatec on the client box. I then did some experimentation with just newfs, and it worked great! The only downside with this would be that the size is fixed. So I played around a bit with setting the volsize property in ZFS and it seemed to work just fine. zfs list reported the new, bigger size. Restarted ggatec and did a growfs, and then remounted. Yay, bigger disk :)
>
> Then I went on to do some geli tests: geli'ed /dev/ggate0 and newfs'ed, mounted and played around a bit. All fine. Now came the problem: I unmounted it, expanded the zfs volume a bit more, restarted ggatec and tried to attach it using geli again (note, I have no idea if this is supposed to work at all, I'm just testing. Haven't read such things anywhere). Now I got Invalid argument. I'm not really sure about how GEOM works, but if I recall correctly it uses the last sectors of the disk? If I moved X bytes of data from the old end of the disk to the new end of the disk, would that make GELI work?
>
> If I can get that to work, then this would be a kickass solution (all encryption stuff works great, I don't have to allocate all space immediately, I can expand it later without destroying data and starting from scratch etc).

I'm pretty certain that GELI cannot handle variable sized disks. You could add GVIRSTOR into the mix, but I'd just allocate the necessary space and be done with it. Adding yet another layer is asking for trouble, imho.

> Some other questions, more related to ggated/c. Is this stable? Good working? How does it handle failure situations? Anyone using it for production systems?

From my personal experience (which is rather limited): No, barely, bad, hell no. There were/are some open PRs about ggate. I had troubles with gmirror+ggate in that it would deadlock every other hour on SMP systems (try removing option PREEMPTION if that bug hits you).

> Yes this is for backup only so minor glitches might be acceptable for me, but I'd rather know about those beforehand.

Give it a shot; if your systems stay up and stable, good. If the link breaks from time to time, I think ZFS should be able to recover most of it. Since it is your backup, I'd try to break it in interesting ways first, to get a feel for how robust it is.

Uli
Re: Backup solution suggestions
On Wed, 16.01.2008 at 00:26:34 +0100, Johan Ström wrote:
> I create a regular tarball (gzipped maybe) with some files I want to back up. Then I encrypt this file with e.g. gpg. Then I send off this file using some unspecified network protocol to the storage server. Encrypted all the way, from my end to the remote disk. The downside is that it is a static file, not a dynamic filesystem, nothing I can mount and have easy access to individual files from. *That's* what I'm looking for.

Export the disk on the backup server with ggated. Bind it on the client with ggatec. Slap a GELI or GBDE encryption on top of it and then put a ZFS on top of it. You can mount/import this remote ZFS at will and do your zfs send/receive on your local box. Nothing ever leaves your box unencrypted.

Cheers,
Ulrich Spoerlein
-- 
It is better to remain silent and be thought a fool, than to speak, and remove all doubt.
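The layering described above (ggate, then GELI, then ZFS) might look roughly like the following on the client. This is a sketch under assumptions, not a tested recipe: the host name, device path and pool name are hypothetical, the server is assumed to already export the device via /etc/gg.exports with ggated(8) running, and by default the script only prints the commands instead of executing them.

```sh
#!/bin/sh
# Sketch of the client side of a ggate + GELI + ZFS backup stack.
# All names below are hypothetical placeholders.
# Commands are printed only; set DRY_RUN=0 to really execute them.

DRY_RUN=${DRY_RUN:-1}
SERVER=${SERVER:-backup.example.org}   # hypothetical backup server
REMOTE_DEV=${REMOTE_DEV:-/dev/da0}     # device exported by ggated there
POOL=${POOL:-backup}                   # hypothetical pool name

# run CMD...: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# First-time setup. This initialises GELI and creates the pool, so it
# destroys whatever is on the exported device.
first_time_setup() {
    run ggatec create -o rw -u 0 "$SERVER" "$REMOTE_DEV"  # gives /dev/ggate0
    run geli init /dev/ggate0          # asks for a passphrase
    run geli attach /dev/ggate0        # gives /dev/ggate0.eli
    run zpool create "$POOL" ggate0.eli
}

# Every later session: bind, decrypt, import. Nothing is initialised.
later_attach() {
    run ggatec create -o rw -u 0 "$SERVER" "$REMOTE_DEV"
    run geli attach /dev/ggate0
    run zpool import "$POOL"
}
```

Keeping the destructive first-time steps in their own function is deliberate, so that the routine attach path can never re-run geli init or zpool create by accident.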
Re: 7.0BETA4 desktop system also periodically freezes
On Wed, 09.01.2008 at 14:18:59 -0500, J.R. Oldroyd wrote:
> I believe this same problem may also be present on 7.0, at least on the BETA releases; BETA4 is the latest I have here.
>
> I have the same problem on two systems here: periodically the systems will stop dead (no mouse action, no ping responses from other systems, processes with windows on the screen also freeze); the hangs can be anything from a few seconds to several MINUTES; then it all comes back as if nothing happened except that keyboard input during the freeze is lost. Most of my freezes are a few seconds long, some are in the 15-60 second range, but (fortunately, rarely) I have seen some that lasted 10-15 MINUTES!
>
> [...]
>
> But the phone might ring, so I'll stop doing things, the system will become pretty much idle, then I'll go to move the mouse and it might be frozen. When it comes back, the small load peak shows, but top and ps show nothing unusual.

Sorry for the late reply, I'm behind on reading mail. Just to let you know that you are not the only one: I have witnessed pretty much the same thing. It happens *very* rarely, but it does happen from time to time.

I'm running 7.0-PRERELEASE with SCHED_ULE on i386 with 1GB RAM and ZFS on a GELI provider. The system is running Xorg 7.3 with the radeon driver (I've come to blame Xorg 7.3 for all my recent problems ...)

Anyway, it *never* happened while I was playing MP3s, but after leaving mplayer paused for a couple of minutes I would return to find the system basically frozen. Sometimes it would unfreeze after a couple of minutes and replay all keystrokes and mouse movements, but most of the time I'm too impatient to wait. I press the power button and nothing happens. A few minutes later it would do a regular shutdown. Strange things, indeed.

The only way I can think of to track this down would be to hook up remote debugging via firewire and break to ddb, but I lack a second firewire laptop :(

Cheers,
Ulrich Spoerlein
-- 
It is better to remain silent and be thought a fool, than to speak, and remove all doubt.
Re: RELENG_7 2008/01/10 desktop system also periodically freezes
On Sun, 13.01.2008 at 17:25:24 -0500, J.R. Oldroyd wrote:
> David's suggestion re powerd may be relevant. I'd noticed that the problem seems to happen when the system is idle. I posted earlier that it seems like I can do all sorts of work without a problem, then I stop for a phone call and when I resume it hangs. I tend to notice a lot of hangs when typing an email.

Try running/looping some MP3 or WAV files. My system never, ever froze during sound playback, only when idle. But since I'm running multiple wmdocklets that update periodically, 'idle' is not really accurate.

Cheers,
Ulrich Spoerlein
-- 
It is better to remain silent and be thought a fool, than to speak, and remove all doubt.
sbp(4) write error wedging GEOM mirror
shutting down, though GEOM_MIRROR: Device gm2: rebuilding provider da1 stopped. Waiting (max 60 seconds) for system process `vnlru' to stop...done Waiting (max 60 seconds) for system process `bufdaemon' to stop...done Waiting (max 60 seconds) for system process `syncer' to stop... Syncing disks, vnodes remaining...13 13 7 2 5 3 3 1 1 0 0 0 0 done All buffers synced. sbp0:1:0 request timeout(cmd orb:0x1195d28c) ... agent reset fwohci0: txd err=1e ack type_err sbp0:1:0 sbp_agent_reset_callback: resp=22 fwohci0: txd err=1e ack type_err sbp_orb_pointer_callback: xfer-resp = 22 sbp0:1:0 request timeout(cmd orb:0x1195d3c4) ... target reset fwohci0: txd err=1e ack type_err sbp0:1:0 sbp_agent_reset_callback: resp=22 fwohci0: txd err=1e ack type_err sbp_orb_pointer_callback: xfer-resp = 22 sbp0:1:0 request timeout(cmd orb:0x1195d634) ... reset start here I unplugged my laptop again Uptime: 2h57m3s fwohci0: BUS reset fwohci0: node_id=0xc800ffc2, gen=8, CYCLEMASTER mode firewire0: 3 nodes, maxhop = 2, cable IRM = 2 (me) firewire0: bus manager 2 (me) fwohci0: BUS reset fwohci0: node_id=0xc800ffc2, gen=8, CYCLEMASTER mode firewire0: 3 nodes, maxhop = 2, cable IRM = 2 (me) firewire0: bus manager 2 (me) GEOM_MIRROR: Device gm0: provider mirror/gm0 destroyed. GEOM_MIRROR: Device gm0 destroyed. Powering system off using ACPI Anything I can do to help debugging this Firewire issue? Cheers, Ulrich Spoerlein -- It is better to remain silent and be thought a fool, than to speak, and remove all doubt. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: sbp(4) write error wedging GEOM mirror
On Fri, 28.12.2007 at 13:54:37 +0100, Ulrich Spoerlein wrote: [Ramblings about sbp(4) wedging geom mirror] Ok, it looks like sbp(4) is off the hook. I tried the rebuilding again, this time attaching da0 via umass(4) instead of sbp(4) and while it also eventually wedges, umass can recover from this situation by its own umass0: Prolific PL-3507C USB Storage Device, rev 2.00/0.01, addr 2 da0 at umass-sim0 bus 0 target 0 lun 0 da0: SAMSUNG SP2514N VF10 Fixed Direct Access SCSI-0 device da0: 40.000MB/s transfers da0: 238475MB (488397168 512 byte sectors: 255H 63S/T 30401C) GEOM_MIRROR: Component da0s1 (device gm0) broken, skipping. GEOM_MIRROR: Cannot add disk da0s1 to gm0 (error=22). GEOM_MIRROR: Component da0s2 (device gm1) broken, skipping. GEOM_MIRROR: Cannot add disk da0s2 to gm1 (error=22). GEOM_MIRROR: Component da0s1 (device gm0) broken, skipping. GEOM_MIRROR: Cannot add disk da0s1 to gm0 (error=22). GEOM_MIRROR: Component da0s1 (device gm0) broken, skipping. GEOM_MIRROR: Cannot add disk da0s1 to gm0 (error=22). GEOM_MIRROR: Device gm0: provider da0s1 detected. GEOM_MIRROR: Device gm0: provider da0s1 is stale. GEOM_MIRROR: Device gm1: provider da0s2 detected. GEOM_MIRROR: Device gm1: provider da0s2 is stale. GEOM_MIRROR: Device gm0: provider da0s1 disconnected. GEOM_MIRROR: Device gm0: provider da0s1 detected. GEOM_MIRROR: Device gm0: rebuilding provider da0s1. 
fwohci0: BUS reset fwohci0: node_id=0xc800ffc1, gen=2, CYCLEMASTER mode firewire0: 2 nodes, maxhop = 1, cable IRM = 1 (me) firewire0: bus manager 1 (me) fwohci0: txd err=14 ack busy_X fwohci0: txd err=14 ack busy_X fwohci0: txd err=14 ack busy_X fwohci0: BUS reset fwohci0: node_id=0xc800ffc1, gen=3, CYCLEMASTER mode firewire0: 2 nodes, maxhop = 1, cable IRM = 1 (me) firewire0: bus manager 1 (me) firewire0: New S400 device ID:0050770e013023f0 da1 at sbp0 bus 0 target 0 lun 0 da1: Prolific PL-3507C Drive 2804 Fixed Simplified Direct Access SCSI-4 device da1: 50.000MB/s transfers da1: 381554MB (781422768 512 byte sectors: 255H 63S/T 48641C) GEOM_MIRROR: Device gm2: provider da1 detected. GEOM_MIRROR: Device gm2: rebuilding provider da1. GEOM_MIRROR: Device gm0: rebuilding provider da0s1 finished. GEOM_MIRROR: Device gm0: provider da0s1 activated. GEOM_MIRROR: Device gm1: provider da0s2 disconnected. GEOM_MIRROR: Device gm1: provider da0s2 detected. GEOM_MIRROR: Device gm1: rebuilding provider da0s2. (14:08:27) [EMAIL PROTECTED]: ~# gmirror status umass0: BBB reset failed, IOERROR umass0: BBB bulk-in clear stall failed, IOERROR umass0: BBB bulk-out clear stall failed, IOERROR umass0: BBB reset failed, IOERROR umass0: BBB bulk-in clear stall failed, IOERROR umass0: BBB bulk-out clear stall failed, IOERROR umass0: BBB reset failed, IOERROR umass0: BBB bulk-in clear stall failed, IOERROR umass0: BBB bulk-out clear stall failed, IOERROR umass0: BBB reset failed, IOERROR umass0: BBB bulk-in clear stall failed, IOERROR umass0: BBB bulk-out clear stall failed, IOERROR umass0: BBB reset failed, IOERROR umass0: BBB bulk-in clear stall failed, IOERROR umass0: BBB bulk-out clear stall failed, IOERROR GEOM_MIRROR: CannotGEOM_MIRROR: Synchronization request failed (error=5). da0s2[WRITE(offset=23111270 write metadata on da0s1 (device=gm0, error=5). GEOM_MIRROR: Cannot update metada400, length=131072)] GEOM_MIRROR: Device gm1: provider da0s2 disconnected. 
GEOta on disk da0s1 (error=5). M_MIRROR: Device gm1: rebuilding provider da0s2 stopped. GEOM_MIRROR: Device gm0: provider da0s1 disconnected. umass0: BBB reset failed, IOERROR umass0: BBB bulk-in clear stall failed, IOERROR umass0: BBB bulk-out clear stall failed, IOERROR umass0: BBB reset failed, IOERROR umass0: BBB bulk-in clear stall failed, IOERROR umass0: BBB bulk-out clear stall failed, IOERROR umass0: BBB reset failed, IOERROR umass0: BBB bulk-in clear stall failed, IOERROR umass0: BBB bulk-out clear stall failed, IOERROR umass0: BBB reset failed, IOERROR umass0: BBB bulk-in clear stall failed, IOERROR umass0: BBB bulk-out clear stall failed, IOERROR Expumass0: BBB reset failed, IOERROR eumass0: BBB bulk-in clear stall failed, IOERROR nsumass0: BBB bulk-out clear stall failed, IOERROR ive timeout(9) function: 0xc09623a9(0xc32de800) 0.006188295 s umass0: BBB reset failed, IOERROR umass0: BBB bulk-in clear stall failed, IOERROR umass0: BBB bulk-out clear stall failed, IOERROR umass0: BBB reset failed, IOERROR umass0: BBB bulk-in clear stall failed, IOERROR ... (multiple pages) umass0: BBB bulk-in clear stall failed, IOERROR umass0: BBB bulk-out clear stall failed, IOERROR (da0:umass-sim0:0:0:0): Synchronize cache failed, status == 0x4, scsi status == 0x0 umass0: BBB reset failed, IOERROR umass0: BBB bulk-in clear stall failed, IOERROR ... (multiple pages) umass0: BBB bulk-in clear stall failed, IOERROR umass0: BBB bulk-out clear stall failed, IOERROR NameStatus Components mirror/gm2 DEGRADED ad1 da1 (12%) mirror/gm0 DEGRADED ad0s1 mirror/gm1 DEGRADED ad0s2 (14:14
Re: SMP on FreeBSD 6.x and 7.0: Worth doing?
On Fri, 21.12.2007 at 22:31:24 -0700, Brett Glass wrote:
> As has been reported in some other messages on this list, Linux is currently blowing FreeBSD away. It's taking as much as 20% less time to get through the benchmark, depending on exactly how the random shuffle came out. This is with 4 GB RAM, the GENERIC FreeBSD SMP kernel (using SCHED_ULE), and aufs as the storage schema for Squid.

Apples and oranges, I know, but if you're building a simple reverse caching proxy, have you considered Varnish? It would be very interesting to see how it compares to a) FreeBSD+Squid, b) Linux+Squid and c) Linux+Varnish.

Cheers,
Ulrich Spoerlein
-- 
It is better to remain silent and be thought a fool, than to speak, and remove all doubt.
Threads stuck in sbwait
Hi all, we are running the Jabber server Openfire on FreeBSD 6.1 and it doesn't close its sockets, forcing use to periodically recycle the java process. Here's some interesting output: # ps alxHp 51002 UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND 314 51002 1 0 20 0 492556 104812 ksesig Ss?? 10:03.35 /usr/local/jdk1.5.0/bin/java -server -jar -Xmx256M -Dopenfire.lib.dir= 314 51002 1 17 4 0 492556 104812 sbwait Ss?? 10:03.35 /usr/local/jdk1.5.0/bin/java -server -jar -Xmx256M -Dopenfire.lib.dir= 314 51002 1 17 4 0 492556 104812 sbwait Ss?? 10:03.35 /usr/local/jdk1.5.0/bin/java -server -jar -Xmx256M -Dopenfire.lib.dir= 314 51002 1 17 4 0 492556 104812 accept Ss?? 10:03.35 /usr/local/jdk1.5.0/bin/java -server -jar -Xmx256M -Dopenfire.lib.dir= 314 51002 1 17 4 0 492556 104812 sbwait Ss?? 10:03.35 /usr/local/jdk1.5.0/bin/java -server -jar -Xmx256M -Dopenfire.lib.dir= 314 51002 1 17 4 0 492556 104812 sbwait Ss?? 10:03.35 /usr/local/jdk1.5.0/bin/java -server -jar -Xmx256M -Dopenfire.lib.dir= 314 51002 1 17 4 0 492556 104812 sbwait Ss?? 10:03.35 /usr/local/jdk1.5.0/bin/java -server -jar -Xmx256M -Dopenfire.lib.dir= 314 51002 1 17 4 0 492556 104812 sbwait Ss?? 10:03.35 /usr/local/jdk1.5.0/bin/java -server -jar -Xmx256M -Dopenfire.lib.dir= 314 51002 1 17 4 0 492556 104812 sbwait Ss?? 10:03.35 /usr/local/jdk1.5.0/bin/java -server -jar -Xmx256M -Dopenfire.lib.dir= 314 51002 1 17 4 0 492556 104812 accept Ss?? 10:03.35 /usr/local/jdk1.5.0/bin/java -server -jar -Xmx256M -Dopenfire.lib.dir= 314 51002 1 17 4 0 492556 104812 sbwait Ss?? 10:03.35 /usr/local/jdk1.5.0/bin/java -server -jar -Xmx256M -Dopenfire.lib.dir= 314 51002 1 17 4 0 492556 104812 sbwait Ss?? 10:03.35 /usr/local/jdk1.5.0/bin/java -server -jar -Xmx256M -Dopenfire.lib.dir= 314 51002 1 17 4 0 492556 104812 sbwait Ss?? 10:03.35 /usr/local/jdk1.5.0/bin/java -server -jar -Xmx256M -Dopenfire.lib.dir= 314 51002 1 17 4 0 492556 104812 sbwait Ss?? 
10:03.35 /usr/local/jdk1.5.0/bin/java -server -jar -Xmx256M -Dopenfire.lib.dir= 314 51002 1 17 4 0 492556 104812 sbwait Ss?? 10:03.35 /usr/local/jdk1.5.0/bin/java -server -jar -Xmx256M -Dopenfire.lib.dir= 314 51002 1 17 4 0 492556 104812 sbwait Ss?? 10:03.35 /usr/local/jdk1.5.0/bin/java -server -jar -Xmx256M -Dopenfire.lib.dir= ... # lsof -p 51002 | grep CANT ljava51002 openfire8u IPv4 0t0 TCP no PCB, CANTSENDMORE, CANTRCVMORE java51002 openfire 25u IPv4 0t0 TCP no PCB, CANTSENDMORE, CANTRCVMORE java51002 openfire 27u IPv4 0t0 TCP no PCB, CANTSENDMORE, CANTRCVMORE java51002 openfire 33u IPv4 0t0 TCP no PCB, CANTSENDMORE, CANTRCVMORE java51002 openfire 34u IPv4 0t0 TCP no PCB, CANTSENDMORE, CANTRCVMORE java51002 openfire 38u IPv4 0t0 TCP no PCB, CANTSENDMORE, CANTRCVMORE java51002 openfire 39u IPv4 0t0 TCP no PCB, CANTSENDMORE, CANTRCVMORE java51002 openfire 40u IPv4 0t0 TCP no PCB, CANTSENDMORE, CANTRCVMORE java51002 openfire 43u IPv4 0t0 TCP no PCB, CANTSENDMORE, CANTRCVMORE java51002 openfire 45u IPv4 0t0 TCP no PCB, CANTSENDMORE, CANTRCVMORE java51002 openfire 46u IPv4 0t0 TCP no PCB, CANTSENDMORE, CANTRCVMORE java51002 openfire 47u IPv4 0t0 TCP no PCB, CANTSENDMORE, CANTRCVMORE java51002 openfire 48u IPv4 0t0 TCP no PCB, CANTSENDMORE, CANTRCVMORE java51002 openfire 49u IPv4 0t0 TCP no PCB, CANTSENDMORE, CANTRCVMORE ... A ktrace of the process shows *lots* of kse_release() calls, but I'm not sure what to look for exactly. What I would try next, is to use libmap for java to use libthr instead of libpthread(libkse). Can anyone here confirm, are there known problems with java and libthr under 6.x? Thanks, Uli ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Threads stuck in sbwait
On Dec 4, 2007 1:01 PM, Ivan Voras [EMAIL PROTECTED] wrote:
> > we are running the Jabber server Openfire on FreeBSD 6.1 and it doesn't close its sockets, forcing us to periodically recycle the java process. Here's some interesting output:
>
> Can you upgrade to FreeBSD 6.3? There were some fixes that might help you. Also, try using libthr instead of libpthread.

Not easily, no. But I'll try libthr upon the next restart of the process.

Uli
Re: pam_group vs. multiple group lines
On 8/22/07, Chuck Swiger [EMAIL PROTECTED] wrote:
> On Aug 21, 2007, at 2:02 PM, Richard Foulkes wrote:
> > Ok, so how are you supposed to control membership of the wheel group via ldap? Ok, you COULD remove the local wheel entry in /etc/group, but this would probably be a bad idea if the ldap server were unavailable.
>
> You've aptly summarized my thoughts on the matter-- I would not rely on LDAP to provide information about root or the wheel group.

That is exactly the gist of my question. Of course I know that a one-line group entry is the way to go. However, I saw people suggest splitting groups into multiple lines if the lines are too long or there are too many members per line (something to do with the /etc/group parser, I guess).

Anyway, I want the LDAP groups to *augment* system groups. Removing wheel from /etc/group and relying on a complex network service is not funny.

Besides, it *does* work for file permissions etc., so some basic system calls *do* get this right.

Uli
Re: pam_group vs. multiple group lines
On Wed, 22.08.2007 at 10:28:40 +0200, Patrick M. Hausen wrote:
> On Wed, Aug 22, 2007 at 09:53:42AM +0200, Ulrich Spoerlein wrote:
> > On 8/22/07, Chuck Swiger [EMAIL PROTECTED] wrote:
> > > On Aug 21, 2007, at 2:02 PM, Richard Foulkes wrote:
> > > > Ok, so how are you supposed to control membership of the wheel group via ldap? Ok, you COULD remove the local wheel entry in /etc/group, but this would probably be a bad idea if the ldap server were unavailable.
> > >
> > > You've aptly summarized my thoughts on the matter-- I would not rely on LDAP to provide information about root or the wheel group.
> >
> > That is exactly the gist of my question. Of course I know that a one-line group entry is the way to go. However, I saw people suggest splitting groups into multiple lines if the lines are too long or there are too many members per line (something to do with the /etc/group parser, I guess).
> >
> > Anyway, I want the LDAP groups to *augment* system groups. Removing wheel from /etc/group and relying on a complex network service is not funny.
>
> We do not use LDAP yet, but have been using NIS in our internal office network for years. If you use the magic + token to merge your NIS database with the static files for passwd and group information, then

I'm not using the compat setting, my nsswitch.conf contains

    passwd: files ldap
    group: files ldap

> _if_ the group entry in the static file does not contain any users _then_ the information from NIS is merged in.
>
> So you can keep a wheel group around as the _primary_ group for root, toor, whatnot ... and all the additional members that have wheel as an auxiliary group come from NIS.
>
> Possibly this works for LDAP, too? IMHO at least it should ;-))

THANK YOU! It is indeed working for LDAP too. But it fails for sudo(8). Luckily I could replace the %wheel directive with a few user id directives. It's still a shortcoming of some sort and I guess I'll file a PR if no one else has any more information on the issue.

getent group now has the following wheel entries

    % getent group|grep wheel
    wheel:*:0
    wheel:*:0:us,root

As I said, su(1) is happy, sudo(8) not yet.

Cheers,
Ulrich Spoerlein
-- 
It is better to remain silent and be thought a fool, than to speak, and remove all doubt.
Re: pam_group vs. multiple group lines
On Wed, 22.08.2007 at 13:47:43 -0500, Scot Hetzel wrote:
> Does the following work for you:
>
> passwd: ldap [notfound=return] files
> group: ldap [notfound=return] files
>
> This sets ldap as the authoritative source for users and groups, unless the ldap service is down, then it will use the files for the source (useful when ldap server is down). This will require that you place all of the users/groups into the ldap server.
>
> (modified from the nis example in the nsswitch.conf(5) man page)

Thanks for your suggestion! In the end, I did it the other way round, using:

    passwd: files ldap
    group: files [success=continue] ldap

This has the effect of merging the multiple group sources into one, as can be seen here

    % getent group|grep wheel
    wheel:*:0:root,us

I now have to play a little bit with bootup (no LDAP present) and what happens when LDAP goes offline, etc.

Thanks again!

Cheers,
Ulrich Spoerlein
-- 
It is better to remain silent and be thought a fool, than to speak, and remove all doubt.
pam_group vs. multiple group lines
Hi,

I think I found a deficiency wrt. pam_group (which also hits sudo(8), so this might be libc related instead). I found this while trying to migrate groups into LDAP, but you don't need LDAP to reproduce this. Simply place the following in /etc/group

    wheel:*:0:root
    wheel:*:0:us

    % getent group|grep wheel;id
    wheel:*:0:root
    wheel:*:0:us
    uid=1001(us) gid=1000(us) groups=1000(us),0(wheel),80(www)

As you can see, getent(1) and id(1) work fine. File access also works as expected. The exception is su(1) (because of pam_group group=wheel in pam.d/su)

    % su -
    su: Sorry

Combine the wheel entries back into one line and su(1) suddenly starts working again. The same problem hits sudo(8) if you are using a %wheel line. Since there is no pam.d/sudo on my system I think the bug probably lies in libc itself.

Is this expected behaviour? I'd classify it as a bug ...

Cheers,
Ulrich Spoerlein
-- 
It is better to remain silent and be thought a fool, than to speak, and remove all doubt.
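The workaround mentioned above, combining split entries back into one line, can be done mechanically. The following is only a sketch: merge_groups is a hypothetical helper that reads group(5) lines on stdin, and it is meant for inspecting or regenerating a file by hand, not for editing /etc/group in place.

```sh
#!/bin/sh
# Sketch: collapse duplicate group(5) entries into one line per group.
# merge_groups is a hypothetical helper name.

# merge_groups
#   Reads lines of the form name:password:gid:members on stdin.
#   Prints one line per group name, in order of first appearance,
#   with the member lists of all duplicates joined by commas.
#   Password and gid are taken from the first entry; members are not
#   de-duplicated.
merge_groups() {
    awk -F: '
    !($1 in seen) {
        seen[$1] = 1
        order[++n] = $1
        head[$1] = $1 ":" $2 ":" $3
    }
    $4 != "" {
        members[$1] = (members[$1] == "" ? $4 : members[$1] "," $4)
    }
    END {
        for (i = 1; i <= n; i++)
            print head[order[i]] ":" members[order[i]]
    }
    '
}
```

Fed the two wheel lines from the message above, this prints the single entry `wheel:*:0:root,us`, which is the form that su(1) accepts.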
Re: dumping large partition to USB drive fails
On Wed, 27.06.2007 at 08:12:06 +0200, Roland Smith wrote:
> Unfortunately I can't check the drives with smartctl; they produce an SCSI error. I'll try 'camcontrol defects', and see if that turns up anything.

Please try with atausb. Remove umass/da/scsi from your kernel and add atausb. Might be worth a try. Other than that, I wish FreeBSD could somehow translate those SMART commands, so it would work with USB/Firewire enclosures of all sorts.

Cheers,
Ulrich Spoerlein
-- 
The trouble with the dictionary is you have to know how the word is spelled before you can look it up to see how it is spelled. -- Will Cuppy
Re: OpenLDAP unix domain socket leak
On 6/14/07, Alexandre Biancalana [EMAIL PROTECTED] wrote:
> I changed nss_ldap.conf again to access OpenLDAP via unix domain socket.
>
> Here is the connection counter before the change:
>
> Wed Jun 13 22:35:55 BRT 2007
> unix sockets: 99
> tcp sockets: 12
>
> Here is the connection counter right before changing the connection method back to TCP socket:
>
> Wed Jun 13 22:56:01 BRT 2007
> unix sockets: 2902
> tcp sockets: 13

Hi,

It looks like it is not actually a leak, per se. When slapd is left sitting idle for a while, it starts to close the unix domain sockets. However, if you constantly hit it with requests, it never recuperates.

    misctest1# while :; do echo -n `date` ; lsof 2>/dev/null | awk '$1 ~ /imapd/{imapd+=1} $1 ~ /slapd/{slapd+=1} $3 ~ /postfix/{pf+=1} END{print imapd, pf, slapd}'; sleep 15;done
    Thu Jun 14 09:27:58 CEST 2007 1354 46 228
    Thu Jun 14 09:28:13 CEST 2007 1354 341 516
    Thu Jun 14 09:28:29 CEST 2007 1354 325 868
    Thu Jun 14 09:28:45 CEST 2007 1308 337 1192
    Thu Jun 14 09:29:01 CEST 2007 1308 323 1192
    Thu Jun 14 09:29:17 CEST 2007 1308 337 1457
    Thu Jun 14 09:29:33 CEST 2007 1308 323 1520
    Thu Jun 14 09:29:49 CEST 2007 1262 321 1748
    Thu Jun 14 09:30:04 CEST 2007 1262 329 1979
    Thu Jun 14 09:30:20 CEST 2007 1262 333 2316
    Thu Jun 14 09:30:37 CEST 2007 1262 333 2580
    Thu Jun 14 09:30:53 CEST 2007 1262 335 3044
    Thu Jun 14 09:31:09 CEST 2007 1262 393 3164
    Thu Jun 14 09:31:25 CEST 2007 1262 393 2420
    Thu Jun 14 09:31:41 CEST 2007 1262 395 2556
    Thu Jun 14 09:31:57 CEST 2007 1262 393 2556
    Thu Jun 14 09:32:13 CEST 2007 1262 393 2556
    Thu Jun 14 09:32:29 CEST 2007 1262 391 2556
    Thu Jun 14 09:32:45 CEST 2007 1262 391 2556
    Thu Jun 14 09:33:01 CEST 2007 1262 391 888
    Thu Jun 14 09:33:16 CEST 2007 1262 385 888
    Thu Jun 14 09:33:32 CEST 2007 1262 94 228
    Thu Jun 14 09:33:48 CEST 2007 1262 94 228

I think we really should take this up with the OpenLDAP guys.

Btw, why is lsof printing lines multiple times?
I ran lsof through sort and get almost every line four times: slapd 94403 ldipr 515uunix 0xc8f680000t0 /var/run/openldap/ldapi slapd 94403 ldipr 515uunix 0xc8f680000t0 /var/run/openldap/ldapi slapd 94403 ldipr 515uunix 0xc8f680000t0 /var/run/openldap/ldapi slapd 94403 ldipr 515uunix 0xc8f680000t0 /var/run/openldap/ldapi slapd 94403 ldipr 516uunix 0xc8f138580t0 /var/run/openldap/ldapi slapd 94403 ldipr 516uunix 0xc8f138580t0 /var/run/openldap/ldapi slapd 94403 ldipr 516uunix 0xc8f138580t0 /var/run/openldap/ldapi slapd 94403 ldipr 516uunix 0xc8f138580t0 /var/run/openldap/ldapi slapd 94403 ldipr 517uunix 0xc8f350000t0 /var/run/openldap/ldapi slapd 94403 ldipr 517uunix 0xc8f350000t0 /var/run/openldap/ldapi slapd 94403 ldipr 517uunix 0xc8f350000t0 /var/run/openldap/ldapi slapd 94403 ldipr 517uunix 0xc8f350000t0 /var/run/openldap/ldapi Uli ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
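Since lsof may list the same descriptor several times, raw line counts can overstate how many sockets slapd really holds. A minimal sketch of counting each descriptor only once, assuming the default lsof column order (COMMAND PID USER FD ...); the function name count_unique_fds is made up for illustration and is not part of lsof:

```sh
#!/bin/sh
# Count distinct open descriptors per command from lsof-style text on stdin.
# Assumption: columns are COMMAND PID USER FD ..., as in the listing above.
count_unique_fds() {
    awk '
        $2 ~ /^[0-9]+$/ {                 # skip the header line, if present
            key = $1 " " $2 " " $4        # command, pid, fd
            if (!(key in seen)) {
                seen[key] = 1
                total[$1]++
            }
        }
        END { for (cmd in total) print cmd, total[cmd] }
    ' | sort
}

# Usage on the affected host:
#   lsof 2>/dev/null | count_unique_fds
```

Keying on command, pid and descriptor means repeated lines for the same descriptor are counted once, so the result should be closer to the real number of open files than a plain line count.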
Unix domain socket leak in 6-STABLE
Hi,

as you are aware, there is a unix domain socket leak in 6-STABLE, which AFAIK is not yet fully fixed. I wanted to ask about the status or some possible fixes, as I know a way to reproduce the problem in a matter of minutes.

We are running Cyrus and Postfix with the user DB in OpenLDAP. When using ldapi://%2fvar%2frun%2fopenldap%2fldapi/ as the connection URL for both Postfix's user lookup and Cyrus's user lookup (via nss_ldap), slapd quickly runs out of file descriptors, as it is not closing any unix sockets (judging by the ever increasing lsof output). Using TCP sockets is just fine.

If there are patches I could try, don't hesitate to send them to me.

Cheers, Uli
Re: Unix domain socket leak in 6-STABLE
On 6/13/07, Ivan Voras wrote:

Can you perhaps isolate the bug / give more information on it? I'm asking because I'm currently using an application with unix domain sockets in production which handles lots of connects/disconnects per second and it doesn't seem to show leakage.

Ok, I'm not exactly sure what I should do. First of all, there are two LDAP consumers: postfix and cyrus-saslauthd. A fairly common setup, I suppose. If I bombard this setup with hundreds of mails, cyrus is at one point unable to process the mails further, stating that:

Jun 13 18:27:22 misctest1 lmtpunix[47460]: IOERROR: opening /data/cyrus/spool/user/ulrspoe/cyrus.cache: Too many open files

The error is misleading, though, as it is not cyrus that is out of file descriptors, but rather OpenLDAP. Restarting slapd will make cyrus work again. I logged the lsof output during the mail bomb and the slapd lines are continually rising:

misctest1# while :; do echo -n `date` ; lsof 2>/dev/null | awk '$1 ~ /imapd/{imapd+=1} $1 ~ /slapd/{slapd+=1} $3 ~ /postfix/{pf+=1} END{print imapd, pf, slapd}'; sleep 60;done
Wed Jun 13 18:21:55 CEST 2007 1378 71 272
Wed Jun 13 18:22:57 CEST 2007 1378 71 272
Wed Jun 13 18:23:58 CEST 2007 1378 216 316
Wed Jun 13 18:24:59 CEST 2007 1378 321 644
Wed Jun 13 18:26:01 CEST 2007 1378 333 1132
Wed Jun 13 18:27:02 CEST 2007 1378 329 1804
Wed Jun 13 18:28:04 CEST 2007 1378 417 2280

The third column (slapd) never goes down significantly. I have the setup now sitting at 2k open files for the slapd process and will wait until tomorrow to see whether the count ever decreases again.

Changing from ldapi://%2fvar%2frun%2fopenldap%2fldapi/ to ldap://127.0.0.1/ fixes the problem. It might be a genuine problem in OpenLDAP, though. We are using openldap-server-2.3.34_1.

Cheers, Uli
Re: Unix domain socket leak in 6-STABLE
Marc G. Fournier wrote:

'k, just to ring in here ... I can definitely attest to there being a leak here, as it was me that was originally burned by it ... in my case, I eventually was able to isolate which VPS/jail was causing it and haven't run it since, but was never able to determine exactly what was causing it, since there wasn't really anything unusual running in that jail :( But ... based on the discussions that were had at the time, it was my understanding that if all applications were shut down on the server (to the bare minimal), eventually the kernel GC should clean up all residual sockets ... when I did this (shut down all applications but the very bare minimum) and waited for 10+ minutes, socket usage never drop'd below about 4k sockets in use, or something like that ...

Hi Marc,

was your leak a kernel leak or a user leak (if it actually makes a difference)? I'm asking because I'm only hitting the problem within the slapd process itself. Restart it, and everything is good again. Other applications are not affected either.

I think what's happening to me is that slapd keeps unix domain sockets lingering too long. When blasting mails through the system, all those tiny ldap lookups then lead to slapd reaching its process limit.

I wonder though: maxfilesperproc is roughly 12k, but lsof only counts 2.5k lines of slapd output when the limit is hit. Is there a better way to check how many fds/resources are open by a certain process?

When using TCP sockets, the number of open files hardly changes.

Ulrich Spoerlein
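One possible way to count descriptors per process without lsof, sketched under assumptions: fstat(1) from the FreeBSD base system prints one line per open file, with the descriptor in the fourth column (sockets appear as e.g. `3*`, while `text`, `wd` and `root` entries are not descriptors). The helper name count_open_fds and the pid lookup in the usage comment are hypothetical.

```sh
#!/bin/sh
# Count numeric descriptors in fstat(1)-style output read from stdin.
# Assumption: header line first, then USER CMD PID FD ... columns.
count_open_fds() {
    awk 'NR > 1 && $4 ~ /^[0-9]/ { n++ } END { print n + 0 }'
}

# Usage on the affected host (pid lookup is only an example):
#   fstat -p "$(pgrep -o slapd)" | count_open_fds
#   sysctl kern.maxfilesperproc    # system-wide per-process ceiling
#   ulimit -n                      # limit in the current shell
```

Comparing the count against both limits may show whether slapd is hitting a lower per-process limit than kern.maxfilesperproc.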
Re: Change in memory tracking in recent 6-STABLE?
Peter Jeremy wrote:

On 2007-May-28 11:29:05 +0200, Ulrich Spoerlein wrote:

I'm using symon to monitor memory usage among several FreeBSD machines. After updating to a recent 6-STABLE, the amount of memory no longer adds up to the total physical memory. The inactive counter is way too small.

As well as active, inactive and free, there is cache, wired and buffers. Check the following sysctls:

vfs.bufspace (bytes)
vm.stats.vm.v_active_count (pages)
vm.stats.vm.v_inactive_count (pages)
vm.stats.vm.v_wire_count (pages)
vm.stats.vm.v_cache_count (pages)
vm.stats.vm.v_free_count (pages)

Hi Peter,

Ok, adding up vm.stats gives me the total physical RAM (roughly). One question would be: where is the buffer cache counted? Or is it spread all over the place?

Back to symon, it uses the following code to grab its values. This worked fine until some months ago. Now it is missing several MBytes. How should I fix the code?

    static int me_vm_mib[] = {CTL_VM, VM_TOTAL};
    ...
    if (sysctl(me_vm_mib, 2, &me_vmtotal, &me_vmsize, NULL, 0) < 0) {
        warning("%s:%d: sysctl failed", __FILE__, __LINE__);
        bzero(&me_vmtotal, sizeof(me_vmtotal));
    }
    /* convert memory stats to Kbytes */
    me_stats[0] = pagetob(me_vmtotal.t_arm);
    me_stats[1] = pagetob(me_vmtotal.t_rm);
    me_stats[2] = pagetob(me_vmtotal.t_free);

Why are these values not adding up to 256MB in my case?

Ulrich Spoerlein
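The "adding up vm.stats" check can be scripted. A minimal sketch, assuming the page-count sysctls named in the quoted reply plus hw.pagesize for the page size; pages_to_mb is a hypothetical helper, not something symon or FreeBSD provides:

```sh
#!/bin/sh
# Sum a list of page counts and print the total in whole megabytes.
# Usage: pages_to_mb PAGESIZE COUNT [COUNT ...]
pages_to_mb() {
    pagesize=$1
    shift
    total=0
    for count in "$@"; do
        total=$((total + count))
    done
    echo $((total * pagesize / 1048576))
}

# Usage on a 6-STABLE host:
#   pages_to_mb "$(sysctl -n hw.pagesize)" \
#       "$(sysctl -n vm.stats.vm.v_active_count)" \
#       "$(sysctl -n vm.stats.vm.v_inactive_count)" \
#       "$(sysctl -n vm.stats.vm.v_wire_count)" \
#       "$(sysctl -n vm.stats.vm.v_cache_count)" \
#       "$(sysctl -n vm.stats.vm.v_free_count)"
```

If this total is close to physical RAM while the three vmtotal fields are not, the gap is presumably memory held in the states the vmtotal structure does not break out.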
Change in memory tracking in recent 6-STABLE?
Hi there,

I'm using symon to monitor memory usage among several FreeBSD machines. After updating to a recent 6-STABLE, the amount of memory no longer adds up to the total physical memory. The inactive counter is way too small. Which recent changes could have caused this? Is it a bug in symon or in FreeBSD?

An example of the difference can be found here: http://coyote.dnsalias.net/memory.png

Ulrich Spoerlein
Re: minimizing downtime on upgrades? (for example: mysql 4.1 - 5.0 or php)
Olivier Mueller wrote:

Isn't there a better way? How do you handle such cases?

We go to extra lengths and allow only pkg installs on servers. That way we are sure that no random library pollution takes place. It also makes things more reproducible. Sadly, packages are somewhat neglected and there is still no good pkg_update tool.

What I'm going to try is to prepare packages of the ports I have to upgrade on a dev/test server, and then install them with pkg_add: is that the right way?

A good way would be to test this very update with packages on a test box. That is, install mysql4, produce your mysql5 packages somewhere else (or use a chroot or jail), then see if pkg-updating works for mysql.

Ulrich Spoerlein
Re: Known memory leak in 6-STABLE from April 1st?
On 5/14/07, Marc G. Fournier wrote:

Now after doing some heavy IMAP testing (cyrus reconstruct of big maildirs) the system froze to a complete halt. Stupid me already rebooted the machine, tomorrow I'll try to break into DDB when it happens again. I also started recording top(1) memory output and sysctl vm.zone output. The main question is: Were there any known memory leaks at the start of April? Any patches I should blindly try before spending several days on debugging this?

Hrmmm ... long shot here, but what does:

sysctl kern.ipc.numopensockets

show over that period of time ... just wondering if we are somehow related on problems here, just different symptoms ...

Sorry no, nothing suspicious there. It bounces up and down; after killing all amavis, cyrus and postfix processes it came down to about 80. Right now it's at 280 again, and so little memory is left that I can no longer grep(1) a 600MB file or do other useful stuff.

This is the last vm.zone output, anything suspicious? What commands should I run (in DDB?) to see where the memory is going?
Tue May 15 10:33:38 CEST 2007
vm.zone:
ITEM SIZE LIMIT USED FREE REQUESTS
FFS2 dinode: 256, 0, 83286, 10179, 1273390
FFS1 dinode: 128, 0, 0, 0, 0
FFS inode: 132, 0, 83286, 10761, 1273390
Mountpoints: 664, 0, 7, 11, 8
SWAPMETA: 276, 121576, 11, 17, 22
pfosfp: 28, 0, 0, 0, 0
pfospfen: 108, 0, 0, 0, 0
pfiaddrpl: 92, 0, 0, 0, 0
pfstatescrub: 28, 0, 0, 0, 0
pffrcent: 12, 50141, 0, 0, 0
pffrcache: 48, 10062, 0, 0, 0
pffrag: 48, 0, 0, 0, 0
pffrent: 16, 5075, 0, 0, 0
pfrkentry2: 156, 0, 0, 0, 0
pfrkentry: 156, 0, 0, 0, 0
pfrktable: 1240, 0, 0, 0, 0
pfpooladdrpl: 68, 0, 0, 0, 0
pfaltqpl: 128, 0, 0, 0, 0
pfstatepl: 260, 10005, 0, 0, 0
pfrulepl: 604, 0, 0, 0, 0
pfsrctrpl: 100, 0, 0, 0, 0
rtentry: 132, 0, 37, 108, 3848
ripcb: 180, 25608, 0, 110, 33
sackhole: 20, 0, 0, 676, 3672
tcpreass: 20, 1690, 0, 676, 1365
hostcache: 76, 15400, 5, 245, 317
syncache: 100, 15366, 0, 195, 27209
tcptw: 48, 5148, 0, 624, 34253
tcpcb: 464, 25600, 29, 203, 212261
inpcb: 180, 25608, 29, 389, 212261
udpcb: 180, 25608, 19, 179, 493092
ipq: 32, 904, 0, 0, 0
unpcb: 144, 25623, 213, 3405, 300681
socket: 356, 25608, 263, 5567, 1006076
KNOTE: 68, 0, 0, 616, 724767
PIPE: 408, 0, 10, 818, 1544936
DIRHASH: 1024, 0, 861, 247, 4143
NFSNODE: 460, 0, 1, 23, 7
NFSMOUNT: 480, 0, 1, 15, 2
L VFS Cache: 291, 0, 402, 326, 6124
S VFS Cache: 68, 0, 87410, 16638, 1831436
NAMEI: 1024, 0, 0, 660, 105375628
VNODEPOLL: 76, 0, 0, 250, 8
VNODE: 272, 0, 83341, 12405, 1273709
ata_composit: 196, 0, 0, 0, 0
ata_request: 204, 0, 0, 76, 34
g_bio: 132, 0, 0, 696, 16904466
ACL UMA zone: 388, 0, 0, 0, 0
mbuf_jumbo_1: 16384, 0, 0, 0, 0
mbuf_jumbo_9: 9216, 0, 0, 0, 0
mbuf_jumbo_p: 4096, 0, 0, 0, 0
mbuf_cluster: 2048, 25600, 929, 325, 1276052
mbuf: 256, 0, 931, 554, 141685396
mbuf_packet: 256, 0, 806, 679, 104799211
VMSPACE: 296, 0, 127, 328, 1617479
UPCALL: 44, 0, 5, 229, 10
KSEGRP: 88, 0, 386, 174, 556
THREAD: 376, 0, 394, 286, 292710
PROC: 536, 0, 170, 215, 1617545
Files: 72, 0, 580, 4296, 40984983
4096: 4096, 0, 229, 542, 3118296
2048: 2048, 0, 229, 547, 4814216
1024: 1024, 0, 347, 389, 9329214
512: 512, 0, 186, 398, 1319246
256: 256, 0, 919, 3941, 5484210
128: 128, 0, 2516, 6034, 28789911
64: 64, 0,
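To see which zones hold the most memory, the listing can be ranked by item size times (used + free). A minimal sketch, assuming each row has the form "NAME: size, limit, used, free, requests" as printed by sysctl vm.zone on this release; zone_footprint is a hypothetical helper name, and the byte figure is only an estimate because it ignores per-slab overhead:

```sh
#!/bin/sh
# Rank vm.zone rows (read from stdin) by approximate bytes held per zone.
# Rows without at least four comma-separated values (headers, truncated
# lines) are skipped.
zone_footprint() {
    awk -F: '
        NF >= 2 {
            n = split($NF, f, ",")
            if (n < 4) next
            bytes = f[1] * (f[3] + f[4])     # size * (used + free)
            printf "%.0f %s\n", bytes, $1
        }
    ' | sort -rn
}

# Usage on the affected host:
#   sysctl vm.zone | zone_footprint | head
```

Running this against successive vm.zone recordings and comparing the top entries may show whether any single zone is growing over time.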
Re: mfs and buildworlds on the SunFire x4600
Oliver Fromme wrote:

Mars G. Miro wrote:

now we know buildworld on mfs dont really matter on high-end machines,

No, we knew that before. I could have told you. :-) That was the first thing I tested when I first had access to a machine with sufficient RAM, about 10 years ago. I put /usr/src on an MFS disk, ran buildworld, and was disappointed.

I'm not intimately familiar with the build process, but I reckon it reads several small files several times (ie, they are cached), runs a CPU bound process, then writes a few bigger files once (objects and binaries). Not a good MFS test scenario, indeed.

Ulrich Spoerlein
Re: Socket leak (Was: Re: What triggers "No Buffer Space Available"?)
I'm slowly catching up on FreeBSD related mails and found this mail ...

Marc G. Fournier wrote:

kern.ipc.numopensockets: 7400
kern.ipc.maxsockets: 12328

ps looks like:

stuff deleted
2368 p2 Is+ Sat01PM 0:00.03 /bin/tcsh
root 2112 0.0 0.1 5220 2360 p3 Ss+ Sat01PM 0:00.04 /bin/tcsh
root 91221 0.0 0.1 5140 2440 p4 Ss+ 11:49PM 0:00.12 -tcsh (tcsh)

I don't think those processes should consume 7400 sockets.

Indeed, this really looks like a leak in the kernel.

Robert has sent me a suggestion to try that I'm in the process of putting together right now, involving backing out some work on uipc_usrreq.c ...

How did the backing out work for you?

Ulrich Spoerlein
Known memory leak in 6-STABLE from April 1st?
Hi all,

I observed something funny with our new cyrus/postfix/amavis installations running on 6.2-STABLE checked out on April 1st (no, I'm not joking). They are running symon to grab performance data and I saw the memory total becoming less and less. Now I know that adding up free+active+inactive != total RAM, BUT *all* other FreeBSD machines we are running show a more or less constant sum.

I uploaded two pictures showing the trend here (they are i386 machines with 4GB RAM, FreeBSD reports 3.3GB as usable):

http://coyote.dnsalias.net/ms1-day.png
http://coyote.dnsalias.net/ms1-week.png

Now after doing some heavy IMAP testing (cyrus reconstruct of big maildirs) the system froze to a complete halt. Stupid me already rebooted the machine, tomorrow I'll try to break into DDB when it happens again. I also started recording top(1) memory output and sysctl vm.zone output.

The main question is: Were there any known memory leaks at the start of April? Any patches I should blindly try before spending several days on debugging this?

Thanks! Uli
Re: FreeBSD vs Region Code DVDs
On 5/4/07, Scott Long wrote:

Why can I read and mount the DVD, but mplayer/xine are still unable to play the DVD? (It works fine on the internal, ATA attached, crappy NEC drive.)

No idea, sorry. Do you have umass, atapicam, and ata-usb all involved here? If so, you've made the room a little crowded, and they are all arguing with each other. I know that ata-usb was inspired by the ata author having problems with umass and not wanting to fix them there, but I don't know exactly what was broken or what was fixed.

I only tested one subsystem at a time, and it is not that one subsystem is broken per se; the problem only shows up in combination with this single external Plextor drive. I had another external DVD drive (can't remember the brand) a few months ago and it was working just fine. I'll try to sum it all up:

Internal NEC drive, attached via ata(4): can read all kinds of CD/DVD
Internal NEC drive, attached via atapicam(4): ditto
Unknown-brand external DVD drive, attached via umass(4): ditto
External Plextor, attached via umass(4): can read CDs and DVD-Rs, unable to do _anything_ with retail DVD(-Video)
External Plextor, attached via firewire/sbp(4): ditto
External Plextor, attached via atausb(4): can read CDs and DVD-Rs, can mount/read retail DVD(-Video), produces some errors, though. The CSS decoder seems to fail, as I can't watch the video on the drive. I can at least _access_ the bytes though, something not possible with umass/sbp.

I don't know the code, but it looks like this Plextor and cd(4) don't get along when DVD copy protection is involved. I also read in the OpenBSD 4.1 release notes that they made changes to their cd(4) to work better with region protected DVDs. I didn't know that the OS was involved in this; I thought this was left to the drive firmware or the DVD player software.

Anyway, how can I tell cd(4) to give me more error output? How can I access the DVD at the bottom-most layer? Something like sending a Test Unit Ready command? Or checking if the drive recognizes an inserted medium?

Uli
Re: FreeBSD vs Region Code DVDs
On 5/4/07, Craig Boston wrote:

This is a new drive, correct? It's possible that the firmware has never been told what region it's in, and is refusing to read any protected discs from outside its region (which would be all of them).

I already tried Windows and Linux, to check if the drive would actually work with retail DVDs at all. Windows told me the RC was set to 2 (IIRC) and I *can* access the media (more or less) with atausb(4), so I really think it is CAM's fault. But I'll give the region-code program from Tijl a try this weekend.

Thanks for all the suggestions so far, let's see which one will get me further on this quest :)

Uli
Re: FreeBSD vs Region Code DVDs
Tijl Coosemans wrote:

On Thursday 03 May 2007 20:16:46 Ulrich Spoerlein wrote:

I can not even read a single sector from such a DVD with the external drive, but it's working just fine with the internal one. It's really driving me nuts.

Maybe you have to change the drive region code (RPC 2). I had to do this a couple years ago with a laptop's internal drive. Either that or you need to find a patched firmware to make the drive region free (RPC 1).

Sadly, your programs don't work. Neither for the internal drive, nor for the external one. No matter which media I have inserted:

May 4 21:09:43 roadrunner kernel: (cd0:umass-sim0:0:0:0): Vendor Specific Command. CDB: a4 0 0 0 0 0 0 0 0 8 8 0
May 4 21:09:43 roadrunner kernel: (cd0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
May 4 21:09:43 roadrunner kernel: (cd0:umass-sim0:0:0:0): SCSI Status: Check Condition
May 4 21:09:43 roadrunner kernel: (cd0:umass-sim0:0:0:0): ILLEGAL REQUEST asc:24,0
May 4 21:09:43 roadrunner kernel: (cd0:umass-sim0:0:0:0): Invalid field in CDB
May 4 21:09:43 roadrunner kernel: (cd0:umass-sim0:0:0:0): Unretryable error
May 4 21:10:22 roadrunner kernel: acd0: FAILURE - REPORT_KEY ILLEGAL REQUEST asc=0x24 ascq=0x00
May 4 21:10:22 roadrunner kernel: (cd1:ata1:0:0:0): Vendor Specific Command. CDB: a4 0 0 0 0 0 0 0 0 8 8 0
May 4 21:10:22 roadrunner kernel: (cd1:ata1:0:0:0): CAM Status: SCSI Status Error
May 4 21:10:22 roadrunner kernel: (cd1:ata1:0:0:0): SCSI Status: Check Condition
May 4 21:10:22 roadrunner kernel: (cd1:ata1:0:0:0): ILLEGAL REQUEST asc:24,0
May 4 21:10:22 roadrunner kernel: (cd1:ata1:0:0:0): Invalid field in CDB
May 4 21:10:22 roadrunner kernel: (cd1:ata1:0:0:0): Unretryable error

Trying to mount an ordinary data DVD results in:

May 4 21:13:04 roadrunner kernel: (cd0:umass-sim0:0:0:0): READ TOC/PMA/ATIP {MMC Proposed}. CDB: 43 0 0 0 0 0 1 0 c 0
May 4 21:13:04 roadrunner kernel: (cd0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
May 4 21:13:04 roadrunner kernel: (cd0:umass-sim0:0:0:0): SCSI Status: Check Condition
May 4 21:13:04 roadrunner kernel: (cd0:umass-sim0:0:0:0): ILLEGAL REQUEST asc:24,0
May 4 21:13:04 roadrunner kernel: (cd0:umass-sim0:0:0:0): Invalid field in CDB
May 4 21:13:04 roadrunner kernel: (cd0:umass-sim0:0:0:0): Unretryable error
May 4 21:13:04 roadrunner kernel: g_vfs_done():cd0[READ(offset=32768, length=2048)]error = 5

Ulrich Spoerlein
FreeBSD vs Region Code DVDs
Hi all,

I'm having a hard time getting my external (USB, Firewire) Plextor PX-755UF to read any retail DVDs at all. I can read any kind of CDs and also DVD-Rs. But mastered DVDs are invisible to FreeBSD. I can not even read a single sector from such a DVD with the external drive, but it's working just fine with the internal one. It's really driving me nuts.

umass0: PLEXTOR DVDR PX-755A, class 0/0, rev 2.00/4.35, addr 126
umass0: 8070i (ATAPI) over Bulk-Only; quirks = 0x
umass0:1:0:-1: Attached to scbus1
cd0 at umass-sim0 bus 0 target 0 lun 0
cd0: PLEXTOR DVDR PX-755A 1.06 Removable CD-ROM SCSI-0 device
cd0: 40.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
cd1 at ata1 bus 0 target 0 lun 0
cd1: _NEC DVD_RW ND-5500A 1.51 Removable CD-ROM SCSI-0 device
cd1: 33.000MB/s transfers
cd1: Attempt to query device size failed: NOT READY, Medium not present

The NEC drive can read DVDs just fine (although it sucks).

% recoverdisk /dev/cd1
start size len state done remaining % done
0 1048576 7411912704 0 0 7411912704 0.000^C
(130)% recoverdisk /dev/cd0
recoverdisk: DIOCGMEDIASIZE failed: No such file or directory
(1)% dd if=/dev/cd0 bs=2048
0+0 records in
0+0 records out
0 bytes transferred in 0.93 secs (0 bytes/sec)

If I attach the device via Firewire, it's just the same. Perhaps it requires some sort of quirk? Where should I start looking for debug output? Which tests should I run? Any help would be greatly appreciated.

Bye, Ulrich Spoerlein
Re: FreeBSD vs Region Code DVDs
Sean C. Farley wrote:

On Thu, 3 May 2007, Ulrich Spoerlein wrote:

I had an issue with ripping some DVD's to my laptop before a trip I made (note: no distribution occurred (for the lawyers :))). I wanted to just use dd to do it, but dd would fail after a small amount of data was read. If I first played a little of the DVD with mplayer, then dd would work afterwards. It probably had something to do with mplayer whispering sweet nothings to the DVD player.

Wouldn't help in my case, as the disc cannot be accessed in any way. But ... atausb(4) to the rescue! I recompiled my kernel with atausb(4) to rule out problems inside CAM, lo and behold:

umass0: PLEXTOR DVDR PX-755A, class 0/0, rev 2.00/4.35, addr 121
umass0: 8070i (ATAPI) over Bulk-Only; quirks = 0x
umass0:3:0:-1: Attached to scbus3
cd1 at umass-sim0 bus 0 target 0 lun 0
cd1: PLEXTOR DVDR PX-755A 1.06 Removable CD-ROM SCSI-0 device
cd1: 40.000MB/s transfers
cd1: cd present [3614880 x 2048 byte records]

(no glabel tasting, no reading from the device possible)

umass0: at uhub3 port 1 (addr 121) disconnected
(cd1:umass-sim0:0:0:0): lost device
(cd1:umass-sim0:0:0:0): removing device entry
umass0: detached
atausb0: PLEXTOR DVDR PX-755A, class 0/0, rev 2.00/4.35, addr 121
atausb0: using ATAPI over Bulk-Only
ata2: USB lun 0 on atausb0
acd1: DEVICE_RESET unsupported
acd1: DVDR DVDR PX-755A/1.06 at ata2-master USB2
acd1: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00
cd1 at ata2 bus 0 target 0 lun 0
cd1: PLEXTOR DVDR PX-755A 1.06 Removable CD-ROM SCSI-0 device
cd1: 3.300MB/s transfers
cd1: Attempt to query device size failed: NOT READY, Logical unit is in process of becoming ready
GEOM_LABEL: Label for provider acd1 is iso9660/FIREFLY_DISC2.
acd1: FAILURE - READ_TOC ILLEGAL REQUEST asc=0x24 ascq=0x00
acd1: FAILURE - READ_TOC ILLEGAL REQUEST asc=0x24 ascq=0x00
acd1: FAILURE - READ_TOC ILLEGAL REQUEST asc=0x24 ascq=0x00
acd1: FAILURE - READ_TOC ILLEGAL REQUEST asc=0x24 ascq=0x00
acd1: FAILURE - READ_TOC ILLEGAL REQUEST asc=0x24 ascq=0x00

Might these ILLEGAL REQUESTs give a clue to what is going wrong when trying to access this device with cd(4)? Why is it only reporting/using 3.3MB/s transfers? Why can I read and mount the DVD, but mplayer/xine are still unable to play the DVD? (It works fine on the internal, ATA attached, crappy NEC drive.)

Perhaps Scott can share some SCSI wisdom on this matter. I really need to use this drive via Firewire, ie. cd(4), so atausb(4) is no permanent solution.

Ulrich Spoerlein

PS: why is iostat(1) not working for acd(4) devices?
make: parallel jobs broken when using -f -
Hi Hartmut,

there is an annoying bug in 6-STABLE make(1), where -f - seems to serialize the target making. Consider the following Makefile:

all: a b c d
a b c d:
	@echo Makeing ${.TARGET}
	@sleep 4

And observe the following behaviour:

$ make -j4
Makeing a
Makeing b
Makeing c
Makeing d
<pause>
$ make -j4 -f- < Makefile
Makeing b
Makeing d
<pause>
Makeing a
<pause>
Makeing c
<pause>
$

The make(1) on -CURRENT has this fixed already; is there any chance of this getting MFCed? AFAICS the following revisions are not up to date (wrt CURRENT):

$FreeBSD: src/usr.bin/make/job.c,v 1.122.2.1 2005/07/20 19:05:23 harti Exp $
$FreeBSD: src/usr.bin/make/main.c,v 1.155 2005/05/24 16:05:51 harti Exp $
$FreeBSD: src/usr.bin/make/parse.c,v 1.108.2.1 2005/11/16 08:25:19 ru Exp $
$FreeBSD: src/usr.bin/make/str.c,v 1.45.2.1 2006/10/16 11:51:18 ru Exp $
$FreeBSD: src/usr.bin/make/var.c,v 1.159 2005/05/24 16:05:51 harti Exp $

Ulrich Spoerlein
Weird NFS behaviour
Hi,

we have performance problems with our FreeBSD 6.2 based NFS server. Picture the following setup:

FreeBSD Client --- Samba-Server --- NFS-Server

All three machines are running FreeBSD 6.2 (the same image). The NFS server is configured with 16 nfsd. sysctl.conf has:

net.inet.tcp.sendspace=65536
net.inet.tcp.recvspace=65536

Now, what's the problem: The Samba-Server mounts shares via NFS. All servers are on Gigabit Ethernet and I get read transfer rates exceeding 50MB/s from the NFS server. This is all good and well, but if I copy a file via scp(1) (sic!) to the samba server into the NFS mounted directory, not only do I seldom exceed 12MB/s, but I also get a very strange traffic pattern on the em0 interface of the samba server. I get _twice_ as much incoming traffic on the em0 interface as outgoing traffic.

systat -if on samba:
em0 in 24.726 MB/s 25.905 MB/s 3.046 GB
    out 12.941 MB/s 13.558 MB/s 1.994 GB

systat -if on nfs-server:
em0 in 11.497 MB/s 12.999 MB/s 3.727 GB
    out 11.878 MB/s 13.423 MB/s 995.485 MB

To stress, this is running:

gigabit-client:# scp large-file [EMAIL PROTECTED]:/mnt/nfs-share/

The wicked part is this: If I copy a file from the samba server directly to the NFS share (not as a passthrough), I get these traffic patterns:

systat -if on samba:
em0 in 432.724 KB/s 432.724 KB/s 3.772 GB
    out 12.399 MB/s 12.399 MB/s 2.481 GB

systat -if on nfs:
em0 in 12.091 MB/s 15.791 MB/s 184.766 MB
    out 440.939 KB/s 562.521 KB/s 1.339 GB

This is running:

samba:# cp large-file /mnt/nfs-share/

What on earth is causing each received NFS packet to be _bounced_ to the samba server when using ssh, scp, smbd, etc., and not when generating the traffic locally?

nfsstat -s is showing an increase in READ calls similar to WRITE calls when using the samba machine as pass-through. It is showing _no_ increase in READ calls when copying the files directly.

NB: All these tests were run _without_ smbd running; it's just that this server is designated to become our samba server.

Setting vfs.nfsrv.async=1 doubled write performance, but the weird traffic pattern remains. (Am I asking for too much trouble by setting async NFS?)

Thanks for any pointers! Uli
Re: Some days, it doesn't pay to upgrade ...
Marc G. Fournier wrote:
> I don't know how critical this is, but I just thought about it ... this is my only system running gmirror ... everything seems fine according to gmirror status, but maybe something is wrong there I'm not seeing:

You should tell us in which state those processes hung. It might also be good to use DDB and 'show alllocks' to see if it is a deadlock. I for one had several deadlocks with gmirror on an SMP machine.

Ulrich Spoerlein
Re: sysutils/fusefs-ntfs working for anyone?
Wang Yi wrote:
> I'm using ntfs-3g now. The version is the same as yours. The only difference is that the disk I used is a physical disk.

I also had no luck using it on my existing NTFS partition, though I'd like to experiment on a clean partition first. Could you please run a test with mdconfig and mkfs.ntfs (you have to use the -F flag)?

Jan Henrik Sylvester wrote:
> On 6.2-RELEASE using fusefs-kmod-0.3.0_4, fusefs-libs-2.6.2, and fusefs-ntfs-0.20070207RC1, I can mount my existing (Windows XP) NTFS partition with 'ntfs-3g /dev/ad0s1 /mnt/ad0s1'. The following error messages about missing /proc/filesystems and modprobe can be ignored, since defaults are assumed in case of missing information. (I read about it on a fusefs mailing list concerning Darwin.)

The critical part seems to be the seekscript. Could one of you guys provide me with a ktrace/kdump output, so I can investigate this further? You should run ktrace with the -i flag and probably send the output off-list. Thanks!

Ulrich Spoerlein
--
A: Yes.
Q: Are you sure?
A: Because it reverses the logical flow of conversation.
Q: Why is top posting frowned upon?
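The test requested above (a scratch NTFS volume on a memory disk, mounted under ktrace) could be sketched roughly as follows. This is a dry-run sketch under assumptions: the md unit 7 follows the /dev/md7 used elsewhere in the thread, while the 64m size, the swap backing, the /mnt mount point and the trace file name are illustrative choices, not values from the posts. By default it only prints the commands.

```shell
# Dry-run sketch: prints the commands instead of executing them.
# Set RUN="" and run as root on FreeBSD to execute for real.
RUN="${RUN-echo}"
UNIT=7                    # md unit, matching /dev/md7 in the thread
SIZE=64m                  # assumed size for the scratch volume
MNT=/mnt                  # assumed mount point
TRACE=ntfs-3g.ktrace      # assumed trace file name

ntfs_md_test() {
    # Create a swap-backed memory disk to experiment on.
    $RUN mdconfig -a -t swap -s "$SIZE" -u "$UNIT"
    # -F forces mkfs.ntfs to proceed although md is not a block device.
    $RUN mkfs.ntfs -fF "/dev/md$UNIT"
    # -i makes ktrace follow child processes (sh, awk, fstat, ...).
    $RUN ktrace -i -f "$TRACE" ntfs-3g "/dev/md$UNIT" "$MNT"
    # Decode the trace for reading or for sending off-list.
    $RUN kdump -f "$TRACE"
}

ntfs_md_test
```

Keeping the commands behind a RUN prefix makes it easy to review the sequence before touching any devices.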
sysutils/fusefs-ntfs working for anyone?
Hi there,

I've been trying to mount my NTFS partitions with the NTFS-3g project's FUSE implementation but am unable to mount anything. I'm on 6-STABLE and have the latest versions of FUSE installed:

    fusefs-kmod-0.3.0_4          Kernel module for fuse
    fusefs-libs-2.6.2            FUSE allows filesystem implementation in userspace
    fusefs-ntfs-0.20070207RC1    Mount NTFS partitions and disk images

I use the sysutils/ntfsprogs port to create an NTFS filesystem. I can also mount this filesystem using mount.ntfs, yet I fail to get anywhere with ntfs-3g. What's that darn seekscript about anyway?

    # mkfs.ntfs -fF /dev/md7
    /dev/md7 is not a block device.
    mkntfs forced anyway.
    The sector size was not specified for /dev/md7 and it could not be obtained automatically. It has been set to 512 bytes.
    The partition start sector was not specified for /dev/md7 and it could not be obtained automatically. It has been set to 0.
    The number of sectors per track was not specified for /dev/md7 and it could not be obtained automatically. It has been set to 0.
    The number of heads was not specified for /dev/md7 and it could not be obtained automatically. It has been set to 0.
    Cluster size has been automatically set to 512 bytes.
    To boot from a device, Windows needs the 'partition start sector', the 'sectors per track' and the 'number of heads' to be set.
    Windows will not be able to boot from this device.
    Creating NTFS volume structures.
    mkntfs completed successfully. Have a nice day.
    # ntfs-3g /dev/md7 /mnt
    Failed to open /proc/filesystems: No such file or directory
    modprobe: not found
    Failed to open /proc/filesystems: No such file or directory
    # mount_fusefs: seekscript failed

The fuse module is loaded, of course. A ktrace of ntfs-3g is, umm, interesting, to say the least. Lots of sh(1), awk(1) and fstat(1) calls. It even tries to run modprobe, as you can see from the output above too.

So, the basic question is: Has _anybody_ used ntfs-3g successfully on RELENG_6?

Ulrich Spoerlein
--
A: Yes.
Q: Are you sure?
A: Because it reverses the logical flow of conversation.
Q: Why is top posting frowned upon?
Re: Failover-HA-Setup
Richard wrote:
>> There is no need to make any changes to the script. Put whatever other options you want for mysql in rc.conf, and set the _enable variable to no. Then you can run /usr/local/etc/rc.d/mysql-server onestart and it will start normally just one time.
> Yes, and mysql will be started at bootup time on both nodes, wouldn't it? So one node would fail miserably because of the lack of mounted disk space...

No, he wrote to set mysql_enable=NO, i.e., the usual startup procedure will NOT start it.

This doesn't work with heartbeat, however. heartbeat always calls the resource scripts with either 'start' or 'stop'; you can't make it pass 'onestart'. Only two options remain: modify the existing mysql-server script (bad idea, it will be overwritten on update) or go through a proxy script which transforms

    start|stop -> onestart|onestop

You could also alter the environment of heartbeat (it's really just a bunch of poorly written shell scripts) and set mysql_enable=YES there, but that'd be just as fragile as rewriting the existing mysql-server script.

> But the nostart-solution sounds like working...

Till you update the port and forget about your local modification ...

Ulrich Spoerlein
--
A: Yes.
Q: Are you sure?
A: Because it reverses the logical flow of conversation.
Q: Why is top posting frowned upon?
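The proxy script mentioned above might look roughly like this. It is a minimal sketch, not a tested heartbeat resource agent: the function names map_action and proxy are invented for illustration, and the only path used is the /usr/local/etc/rc.d/mysql-server script named in the thread (overridable through the hypothetical RCSCRIPT variable).

```shell
#!/bin/sh
# Hypothetical heartbeat resource wrapper: heartbeat calls it with
# start|stop, and it forwards onestart|onestop to the rc.d script,
# so mysql_enable can stay at NO in rc.conf.
RCSCRIPT="${RCSCRIPT:-/usr/local/etc/rc.d/mysql-server}"

# Translate the heartbeat action into the rc.d "one" variant.
map_action() {
    case "$1" in
        start) echo onestart ;;
        stop)  echo onestop ;;
        *)     return 1 ;;
    esac
}

# Run the rc.d script with the translated action.
proxy() {
    action=$(map_action "$1") || {
        echo "usage: $0 {start|stop}" >&2
        return 2
    }
    "$RCSCRIPT" "$action"
}

# Only act when called with an argument, so the file can be sourced.
if [ $# -gt 0 ]; then
    proxy "$@"
fi
```

Because the wrapper lives outside the port, it survives port updates, which is the advantage over editing mysql-server in place.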
Re: Page Fault in 6.2-PRE RELEASE
Christopher Harper (05056409) wrote:
> The system freezes randomly and no longer accepts any input and after a minute of being 'frozen' reboots.
>
> (kgdb) backtrace
> #0 doadump () at pcpu.h:165
> #1 0xc051a6ca in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
> #2 0xc051a9f1 in panic (fmt=0xc06d94cf %s) at /usr/src/sys/kern/kern_shutdown.c:565
> #3 0xc06a795c in trap_fatal (frame=0xe6dc6bac, eva=4) at /usr/src/sys/i386/i386/trap.c:837
> #4 0xc06a769b in trap_pfault (frame=0xe6dc6bac, usermode=0, eva=4) at /usr/src/sys/i386/i386/trap.c:745
> #5 0xc06a72d5 in trap (frame= {tf_fs = 8, tf_es = -963641304, tf_ds = -961019864, tf_edi = -963964928, tf_esi = 0, tf_ebp = -421762056, tf_isp = -421762088, tf_ebx = -963961644, tf_edx = -421655072, tf_ecx = -964085760, tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1067808807, tf_cs = 32, tf_eflags = 66118, tf_esp = -963961644, tf_ss = -963615744}) at /usr/src/sys/i386/i386/trap.c:435
> #6 0xc069342a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
> #7 0xc05a87d9 in ieee80211_free_node (ni=0x0) at /usr/src/sys/net80211/ieee80211_node.c:1605
> #8 0xc04b1923 in ural_txeof (xfer=0xc6b82d00, priv=0xc68b1cd4, status=USBD_NORMAL_COMPLETION) at /usr/src/sys/dev/usb/if_ural.c:888
> #9 0xc04c9b1a in usb_transfer_complete (xfer=0xc6b82d00) at /usr/src/sys/dev/usb/usbdi.c:863
> #10 0xc04acbae in ehci_idone (ex=0xc6b82d00) at /usr/src/sys/dev/usb/ehci.c:852
> #11 0xc04acaeb in ehci_check_intr (sc=0xc6893800, ex=0xc6b82d00) at /usr/src/sys/dev/usb/ehci.c:759
> #12 0xc04aca25 in ehci_softintr (v=0xc6893800) at /usr/src/sys/dev/usb/ehci.c:693
> #13 0xc04c6e55 in usb_schedsoftintr (bus=0x0) at /usr/src/sys/dev/usb/usb.c:871
> #14 0xc04ac806 in ehci_intr1 (sc=0xc6893800) at /usr/src/sys/dev/usb/ehci.c:593
> #15 0xc04ac746 in ehci_intr (v=0xc6893800) at /usr/src/sys/dev/usb/ehci.c:552
> #16 0xc0505059 in ithread_execute_handlers (p=0xc68ff648, ie=0xc67e3b00) at /usr/src/sys/kern/kern_intr.c:682
> #17 0xc0505169 in ithread_loop (arg=0xc68b7480) at /usr/src/sys/kern/kern_intr.c:765
> #18 0xc0503e0d in fork_exit (callout=0xc0505114 ithread_loop, arg=0xc68b7480, frame=0xe6dc6d38) at /usr/src/sys/kern/kern_fork.c:821
> #19 0xc069348c in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:208

This is the same as kern/92083 [1]. I could suggest that you try the new USB stack by Hans-Petter Selasky, but there is a different bug in his ural(4) that makes it unusable too.

[1] http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/92083

Ulrich Spoerlein
--
A: Yes.
Q: Are you sure?
A: Because it reverses the logical flow of conversation.
Q: Why is top posting frowned upon?
acquiring duplicate lock when mounting nullfs
Hi,

this is on a RELENG_6 while mounting /usr/src and /usr/obj via nullfs and doing 'make installkernel installworld'. It is similar to LOR #083, but not quite the same.

    acquiring duplicate lock of same type: vnode interlock
     1st vnode interlock @ /usr/src/sys/kern/vfs_vnops.c:806
     2nd vnode interlock @ /usr/src/sys/kern/vfs_subr.c:2036
    KDB: stack backtrace:
    kdb_backtrace(0,ff,c09816d0,c09816d0,c0907904,...) at kdb_backtrace+0x29
    witness_checkorder(c30d56dc,9,c089bd90,7f4) at witness_checkorder+0x578
    _mtx_lock_flags(c30d56dc,0,c089bd90,7f4,c218d830,...) at _mtx_lock_flags+0x78
    vrefcnt(c30d5660) at vrefcnt+0x1d
    null_checkvp(c2a8daa0,c08894b8,215) at null_checkvp+0x56
    null_lock(cd689a80) at null_lock+0x62
    VOP_LOCK_APV(c0900480,cd689a80) at VOP_LOCK_APV+0x87
    vn_lock(c2a8daa0,1002,c27a3180,c2a8daa0,c31bbc2c,...) at vn_lock+0xa8
    nullfs_root(c246d7c8,2,cd689af8,c27a3180,0,8,0,c09beca0,0,c089b632,3dd) at nullfs_root+0x26
    vfs_domount(c27a3180,c261c550,c28f7100,0,c239eb10,c09707e0,0,c089b632,2a3) at vfs_domount+0x91d
    vfs_donmount(c27a3180,0,c2a12e80,c2a12e80,0,...) at vfs_donmount+0x2ef
    nmount(c27a3180,cd689d04) at nmount+0x8b
    syscall(3b,3b,3b,bfbfe424,bfbfec7c,...) at syscall+0x25b
    Xint0x80_syscall() at Xint0x80_syscall+0x1f
    --- syscall (378, FreeBSD ELF32, nmount), eip = 0x280ba4d7, esp = 0xbfbfe3ac, ebp = 0xbfbfec28 ---

Uli
Re: acquiring duplicate lock when mounting nullfs
On 12/29/06, Ulrich Spoerlein [EMAIL PROTECTED] wrote:
> It is similar to LOR #083, but not quite the same
>
> acquiring duplicate lock of same type: vnode interlock
>  1st vnode interlock @ /usr/src/sys/kern/vfs_vnops.c:806
>  2nd vnode interlock @ /usr/src/sys/kern/vfs_subr.c:2036

It seems the issue is known: http://lists.freebsd.org/pipermail/freebsd-amd64/2006-March/007824.html

Uli
Re: kern/92785: Using exported filesystem on OS/2 NFS client causes filesystem freeze
Hi, we too, ran into this problem. OS/2 Clients kill our NFS server. It is running a RELENG_6 snapshot from 2006-11-14. rpc.lockd and rpc.statd are running. I'll conduct a test without those two services shortly. You can still log in the system with ssh and cruse around, but mountd is stuck in ufs state and is no longer serving requests. [EMAIL PROTECTED]:~# ps axl | grep ufs 0 39370 1 0 -4 0 3052 2200 ufsDs??0:00.01 /usr/sbin/mountd -r db show lockedvnods Locked vnodes 0xc87b9414: tag ufs, type VDIR usecount 0, writecount 0, refcount 4 mountedhere 0 flags (VV_ROOT) v_object 0xc8c43c60 ref 0 pages 1 lock type ufs: EXCL (count 4) by thread 0xc8bac300 (pid 6926) with 1 pending#0 0xc0668bf9 at lockmgr+0x4ed #1 0xc078572e at ffs_lock+0x76 #2 0xc0838287 at VOP_LOCK_APV+0x87 #3 0xc06d663c at vn_lock+0xac #4 0xc06ca4ca at vget+0xc2 #5 0xc06c24a9 at vfs_hash_get+0x8d #6 0xc07844af at ffs_vget+0x27 #7 0xc078b253 at ufs_lookup+0xa4b #8 0xc083641b at VOP_CACHEDLOOKUP_APV+0x9b #9 0xc06bf499 at vfs_cache_lookup+0xb5 #10 0xc0836347 at VOP_LOOKUP_APV+0x87 #11 0xc06c3626 at lookup+0x46e #12 0xc0734fba at nfs_namei+0x40e #13 0xc0726d81 at nfsrv_lookup+0x1dd #14 0xc0736765 at nfssvc_nfsd+0x3d9 #15 0xc07360b4 at nfssvc+0x18c #16 0xc0825a07 at syscall+0x25b #17 0xc0811f7f at Xint0x80_syscall+0x1f ino 2, on dev da1s2e db trace 6926 Tracing pid 6926 tid 100106 td 0xc8bac300 sched_switch(c8bac300,0,1) at sched_switch+0x177 mi_switch(1,0) at mi_switch+0x270 sleepq_switch(c8678200) at sleepq_switch+0xc1 sleepq_wait_sig(c8678200) at sleepq_wait_sig+0x1d msleep(c8678200,c09c9f00,158,c088bec9,0,...) at msleep+0x26a nfssvc_nfsd(c8bac300) at nfssvc_nfsd+0xe5 nfssvc(c8bac300,eafd4d04) at nfssvc+0x18c syscall(3b,3b,3b,1,0,...) 
at syscall+0x25b Xint0x80_syscall() at Xint0x80_syscall+0x1f --- syscall (155, FreeBSD ELF32, nfssvc), eip = 0x280bd1b7, esp = 0xbfbfe90c, ebp = 0xbfbfe928 --- db trace 39370 Tracing pid 39370 tid 100102 td 0xc8bac900 sched_switch(c8bac900,0,1) at sched_switch+0x177 mi_switch(1,0) at mi_switch+0x270 sleepq_switch(c87b946c,c0973440,0,c089798c,211,...) at sleepq_switch+0xc1 sleepq_wait(c87b946c,0,c87b94dc,b7,c08929b8,...) at sleepq_wait+0x46 msleep(c87b946c,c0972500,50,c089c1c1,0,...) at msleep+0x279 acquire(eafe094c,40,6,c8bac900,0,...) at acquire+0x76 lockmgr(c87b946c,2002,c87b94dc,c8bac900) at lockmgr+0x44e ffs_lock(eafe09a4) at ffs_lock+0x76 VOP_LOCK_APV(c0943320,eafe09a4) at VOP_LOCK_APV+0x87 vn_lock(c87b9414,2002,c8bac900,c87b9414) at vn_lock+0xac vget(c87b9414,2002,c8bac900) at vget+0xc2 vfs_hash_get(c86cf2e4,2,2,c8bac900,eafe0abc,0,0) at vfs_hash_get+0x8d ffs_vget(c86cf2e4,2,2,eafe0abc) at ffs_vget+0x27 ufs_root(c86cf2e4,2,eafe0b00,c8bac900,0,...) at ufs_root+0x19 lookup(eafe0ba0) at lookup+0x743 namei(eafe0ba0) at namei+0x39a kern_lstat(c8bac900,bfbfd2a0,0,eafe0c74) at kern_lstat+0x47 lstat(c8bac900,eafe0d04) at lstat+0x1b syscall(3b,3b,3b,281512fb,bfbfc9f1,...) at syscall+0x25b Xint0x80_syscall() at Xint0x80_syscall+0x1f --- syscall (190, FreeBSD ELF32, lstat), eip = 0x2813d427, esp = 0xbfbfc5ac, ebp = 0xbfbfd268 --- I was under the impression, that you are not allowed to sleep while holding a lock in the FreeBSD kernel. Doesn't this also apply to the lockmgr itself? Upon shutting down the system, I had a panic coming in: panic: userret: Returning with 4 locks held. cpuid = 1 KDB: stack backtrace: kdb_backtrace(100,c8bac300,c8bac3c8,c8bad218,c8bac300,...) at kdb_backtrace+0x29 panic(c089806f,4,0,c8bac300,c8bad218,...) at panic+0x114 userret(c8bac300,eafd4d38,0,2,0,...) at userret+0x183 syscall(3b,3b,3b,1,0,...) 
at syscall+0x321 Xint0x80_syscall() at Xint0x80_syscall+0x1f --- syscall (0, FreeBSD ELF32, nosys), eip = 0x280bd1b7, esp = 0xbfbfe90c, ebp = 0xbfbfe928 --- KDB: enter: panic [thread pid 6926 tid 100106 ] Stopped at kdb_enter+0x2b: nop db bt Tracing pid 6926 tid 100106 td 0xc8bac300 kdb_enter(c0894aec) at kdb_enter+0x2b panic(c089806f,4,0,c8bac300,c8bad218,...) at panic+0x127 userret(c8bac300,eafd4d38,0,2,0,...) at userret+0x183 syscall(3b,3b,3b,1,0,...) at syscall+0x321 Xint0x80_syscall() at Xint0x80_syscall+0x1f --- syscall (0, FreeBSD ELF32, nosys), eip = 0x280bd1b7, esp = 0xbfbfe90c, ebp = 0xbfbfe928 --- db show lockedvnods Locked vnodes 0xc8761c3c: tag ufs, type VDIR usecount 1, writecount 0, refcount 1 mountedhere 0xc86cf2e4 flags () lock type ufs: EXCL (count 1) by thread 0xc8bac780 (pid 59934)#0 0xc0668bf9 at lockmgr+0x4ed #1 0xc078572e at ffs_lock+0x76 #2 0xc0838287 at VOP_LOCK_APV+0x87 #3 0xc06d663c at vn_lock+0xac #4 0xc06c5eba at dounmount+0x62 #5 0xc06c5e31 at unmount+0x1e5 #6 0xc0825a07 at syscall+0x25b #7 0xc0811f7f at Xint0x80_syscall+0x1f ino 8260, on dev ufs/root 0xc87b9414: tag ufs, type VDIR usecount 0, writecount 0, refcount 4 mountedhere 0 flags (VV_ROOT) v_object 0xc8c43c60 ref 0 pages 1 lock type ufs: EXCL (count 4) by thread 0xc8bac300 (pid 6926) with 1 pending#0 0xc0668bf9 at
Re: kern/92785: Using exported filesystem on OS/2 NFS client causes filesystem freeze
On 12/15/06, Kostik Belousov [EMAIL PROTECTED] wrote:
> This looks like lock leak in nfsd. Could you supply the tcpdump of the session that causes the problem? Also, it would be very helpful if you could note exact rpc that wedges the server.

That would have been my next step. I ran only rpcbind, nfsd and mountd on the file server (no rpc.lockd/rpc.statd). I then had an OS/2 client mount the filesystem and issue a readdir, and then tried to mount the same share from a Linux client. This last mount request never came back; immediately after issuing the mount request, mountd got stuck in state 'ufs' as shown in the backtrace.

A tcpdump of the session can be found at: http://coyote.dnsalias.net/rpc.pcap (9kB)

Uli

PS: Please trim the email when responding to the GNATS DB, as untrimmed replies make the PR trail rather unreadable. Thanks!
Re: kern/92785: Using exported filesystem on OS/2 NFS client causes filesystem freeze
On 12/15/06, Kostik Belousov [EMAIL PROTECTED] wrote: Am I right that all you did was ls -l root of nfs mount ? Does OS/2 supports the notion of .. directory ? Could you do just ls -l .. from nfs client and then try stat root of exported fs on the server (i think it shall hang) ? Yes, you are right about the symptoms. We tried the following on the OS/2 Client mount export umount export mount export umount export this is all working fine, then we do a dir on the mounted FS mount i: /export/foo dir i: umount -- haning, as mountd can't process the RPC. My hypothesis is that LOOKUP RPC for .. causes directory vnode lock leak in nfs_namei. After that, mountd hang is just consequence. So, I mounted from the OS/2 Client, ran a dir on the i: drive and then an stat(1) to the exported partition on the server. This stat would hang, here's the backtraces: db ps pid ppid pgrp uid state wmesg wchancmd 33017 88035 33017 0 S+ ufs 0xc8771880 stat 23627 55476 23627 0 S+ bpf 0xc8e16c00 tcpdump 88035 87505 88035 0 S+ pause0xc882bcc4 tcsh 87505 72558 87505 1000 S+ wait 0xc86f9218 su 72558 89630 72558 1000 Ss+ pause0xc873867c tcsh 21229 1 21229 0 Ss select 0xc09c10c4 mountd 91293 79042 79042 0 S -0xc8668200 nfsd 88479 79042 79042 0 S -0xc8668600 nfsd 86952 79042 79042 0 S -0xc847cc00 nfsd 83659 79042 79042 0 S -0xc8678200 nfsd 79042 1 79042 0 Ss accept 0xc8d649f6 nfsd 55476 52005 55476 0 S+ pause0xc8bcc24c tcsh 52005 95193 52005 1000 S+ wait 0xc8734648 su ... 
db show lockedvnods Locked vnodes 0xc8771828: tag ufs, type VDIR usecount 0, writecount 0, refcount 4 mountedhere 0 flags (VV_ROOT) v_object 0xc8a8a084 ref 0 pages 1 lock type ufs: EXCL (count 1) by thread 0xc882f900 (pid 83659) with 1 pending#0 0xc0668bf9 at lockmgr+0x4ed #1 0xc078572e at ffs_lock+0x76 #2 0xc0838287 at VOP_LOCK_APV+0x87 #3 0xc06d663c at vn_lock+0xac #4 0xc06ca4ca at vget+0xc2 #5 0xc06c24a9 at vfs_hash_get+0x8d #6 0xc07844af at ffs_vget+0x27 #7 0xc078b253 at ufs_lookup+0xa4b #8 0xc083641b at VOP_CACHEDLOOKUP_APV+0x9b #9 0xc06bf499 at vfs_cache_lookup+0xb5 #10 0xc0836347 at VOP_LOOKUP_APV+0x87 #11 0xc06c3626 at lookup+0x46e #12 0xc0734fba at nfs_namei+0x40e #13 0xc0726d81 at nfsrv_lookup+0x1dd #14 0xc0736765 at nfssvc_nfsd+0x3d9 #15 0xc07360b4 at nfssvc+0x18c #16 0xc0825a07 at syscall+0x25b #17 0xc0811f7f at Xint0x80_syscall+0x1f ino 2, on dev da1s2e db tr 33017 Tracing pid 33017 tid 100125 td 0xc86fd600 sched_switch(c86fd600,0,1) at sched_switch+0x177 mi_switch(1,0) at mi_switch+0x270 sleepq_switch(c8771880,c0973440,0,c089798c,211,...) at sleepq_switch+0xc1 sleepq_wait(c8771880,0,c87718f0,b7,c08929b8,...) at sleepq_wait+0x46 msleep(c8771880,c0972590,50,c089c1c1,0,...) at msleep+0x279 acquire(eb01694c,40,6,c86fd600,0,...) at acquire+0x76 lockmgr(c8771880,2002,c87718f0,c86fd600) at lockmgr+0x44e ffs_lock(eb0169a4) at ffs_lock+0x76 VOP_LOCK_APV(c0943320,eb0169a4) at VOP_LOCK_APV+0x87 vn_lock(c8771828,2002,c86fd600,c8771828) at vn_lock+0xac vget(c8771828,2002,c86fd600) at vget+0xc2 vfs_hash_get(c87115c8,2,2,c86fd600,eb016abc,0,0) at vfs_hash_get+0x8d ffs_vget(c87115c8,2,2,eb016abc) at ffs_vget+0x27 ufs_root(c87115c8,2,eb016b00,c86fd600,0,...) at ufs_root+0x19 lookup(eb016ba0) at lookup+0x743 namei(eb016ba0) at namei+0x39a kern_lstat(c86fd600,bfbfed99,0,eb016c74) at kern_lstat+0x47 lstat(c86fd600,eb016d04) at lstat+0x1b syscall(3b,3b,3b,0,bfbfebf0,...) 
at syscall+0x25b Xint0x80_syscall() at Xint0x80_syscall+0x1f --- syscall (190, FreeBSD ELF32, lstat), eip = 0x2812d427, esp = 0xbfbfeb9c, ebp = 0xbfbfec68 --- db tr 83659 Tracing pid 83659 tid 100115 td 0xc882f900 sched_switch(c882f900,0,1) at sched_switch+0x177 mi_switch(1,0) at mi_switch+0x270 sleepq_switch(c8678200) at sleepq_switch+0xc1 sleepq_wait_sig(c8678200) at sleepq_wait_sig+0x1d msleep(c8678200,c09c9f00,158,c088bec9,0,...) at msleep+0x26a nfssvc_nfsd(c882f900) at nfssvc_nfsd+0xe5 nfssvc(c882f900,eaf8ad04) at nfssvc+0x18c syscall(3b,3b,3b,1,0,...) at syscall+0x25b Xint0x80_syscall() at Xint0x80_syscall+0x1f --- syscall (155, FreeBSD ELF32, nfssvc), eip = 0x280bd1b7, esp = 0xbfbfe90c, ebp = 0xbfbfe928 --- Do you think you can fix it? Any idea why this seems to only happen with OS/2 Clients? Uli ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: ggate still broken on 6.2-RC1 for amd64.
Craig Boston wrote:
> Have you tried increasing the send/receive buffer size? In my local ggate setup I'm running both the client and server with the options -R 196608 -S 196608. I added it a while back after discovering that the default buffer size was inadequate in certain situations and would sometimes cause large block sized I/O to hang.

Heh, this is funny. I have reports from another source who _decreases_ bufsize to 8kB, because that is giving him the most performance.

Since I'm using HPS' USB stack I can't use my uplcom device and therefore cannot usefully test some more ggate/gmirror scenarios on -CURRENT ...

But I'll whip up a ggate test case.

Ulrich Spoerlein
--
A: Yes.
Q: Are you sure?
A: Because it reverses the logical flow of conversation.
Q: Why is top posting frowned upon?
Re: ggate still broken on 6.2-RC1 for amd64.
Ulrich Spoerlein wrote: But I'll whip up a ggate test case. Very strange ... I thought I would work through different buffer sizes, starting with some low value. Here's what gives: igor# ggated -a localhost -v -R8k -S8k /tmp/ggate_exports igor# ggatec create -v -R8k -S8k localhost /tmp/ggate_test info: Reading exports file (/tmp/ggate_exports).info: Connected to the server: localhost:3080. debug: Added 127.0.0.1/32 /tmp/ggate_test RW to exports list. debug: Sending version packet. info: Exporting 1 object(s). info: Listen on port: 3080. info: Connection from: 127.0.0.1. debug: Receiving version packet. debug: Version packet received. debug: Receiving initial packet. VERY LONG PAUSE debug: Initial packet received. debug: Sending initial packet. debug: Connection created [127.0.0.1, /tmp/ggate_test]. debug: Receiving initial packet. debug: New connection created (token=226910802).debug: Received initial packet. debug: Sending initial packet. info: Connected to the server: localhost:3080. debug: Sending version packet. VERY LONG PAUSE g_gate_send: EAGAIN g_gate_send: EAGAIN g_gate_send: EAGAIN g_gate_send: EAGAIN info: Connection from: 127.0.0.1. ^C debug: Receiving version packet. ^C Now try with 16k. igor# ggated -a localhost -v -R16k -S16k /tmp/ggate_exports igor# ggatec create -v -R16k -S16k localhost /tmp/ggate_test info: Reading exports file (/tmp/ggate_exports).info: Connected to the server: localhost:3080. debug: Added 127.0.0.1/32 /tmp/ggate_test RW to exports list. debug: Sending version packet. info: Exporting 1 object(s). info: Listen on port: 3080. info: Connection from: 127.0.0.1. debug: Receiving version packet. debug: Version packet received. debug: Receiving initial packet. LONG PAUSE debug: Initial packet received. debug: Sending initial packet. debug: Connection created [127.0.0.1, /tmp/ggate_test]. debug: Receiving initial packet. debug: New connection created (token=2294332471). debug: Received initial packet. debug: Sending initial packet. 
info: Connected to the server: localhost:3080. info: Connection from: 127.0.0.1. debug: Sending version packet. debug: Receiving version packet. debug: Version packet received. debug: Receiving initial packet. LONG PAUSE debug: Initial packet received. debug: Sending initial packet. debug: Found existing connection (token=2294332471).debug: Receiving initial packet. debug: Connection added [127.0.0.1, /tmp/ggate_test]. debug: Received initial packet. debug: Sending initial packet. ggate5 debug: Connection removed [127.0.0.1 /tmp/ggate_test]. notice: send_thread: started! debug: Process created [/tmp/ggate_test]. notice: recv_thread: started! notice: disk_thread: started [/tmp/ggate_test]! notice: send_thread: started [/tmp/ggate_test]! notice: recv_thread: started [/tmp/ggate_test]! debug: Process 1140 exiting. ^C I wanted to use something like the following, for first draft of a benchmark, but I just I/O deadlocked the system (6.2 and CURRENT). Simply by running ggated/ggatec in various combinations. db show alllocks Process 1333 (ggatel) thread 0xc2767510 (100081) exclusive sx sysctl lock r = 0 (0xc078c420) locked @ /vol/src/sys/kern/kern_sysctl.c:1376 db trace 1333 Tracing pid 1333 tid 100081 td 0xc2767510 sched_switch(c2767510,0,1) at sched_switch+0xe7 mi_switch(1,0) at mi_switch+0x27c sleepq_switch(c2b3e680,c078bdd0,0,c070e413,236,...) at sleepq_switch+0xc9 sleepq_timedwait(c2b3e680) at sleepq_timedwait+0x4a msleep(c2b3e680,0,4c,c07028f3,64) at msleep+0x281 g_waitfor_event(c050d120,c2b6c300,2,0,0,0,0,1) at g_waitfor_event+0x73 sysctl_kern_geom_confxml(c07485e0,0,0,d1781b9c,c07485e0,...) at sysctl_kern_geom_confxml+0x26 sysctl_root(0,d1781c1c,3,d1781b9c) at sysctl_root+0x12f userland_sysctl(c2767510,d1781c1c,3,830,bfbfe3d8,0,0,0,d1781c18,0,c078bde8,0,c070bc1f,522) at userland_sysctl+0xf4 __sysctl(c2767510,d1781d04) at __sysctl+0x77 syscall(3b,3b,3b,3,bfbfe3d8,...) 
at syscall+0x27e Xint0x80_syscall() at Xint0x80_syscall+0x1f --- syscall (202, FreeBSD ELF32, __sysctl), eip = 0x2816ba7f, esp = 0xbfbfe2bc, ebp = 0xbfbfe2f8 --- db ps pid ppid pgrp uid state wmesg wchancmd 1348 800 800 0 S sysctl l 0xc078c444 cron 1347 800 800 0 S sysctl l 0xc078c444 cron 1346 800 800 0 S sysctl l 0xc078c444 cron 1345
Re: ggate still broken on 6.2-RC1 for amd64.
David Gilbert wrote:
> GGate is still broken on 6.2-RC1 for amd64. I have verified that the patch in kern/104829 has been applied (it's in the tree). I have also added the patch in amd64/91799 --- without it, ggated doesn't work at all.

This should definitely make it into 6.2.

> But the ggated/ggatec in 6.2-RC1 connects now (and is happy about that). In fact, the tasting on the ggatec side that happens due to new disks showing up works, too. However, any attempt to pass significant traffic causes ggatec to seemingly lock up. In my configuration, I have a gmirror running with a local disk (already) and I want to gmirror insert the ggate disk. When I do so, I get 50 write requests queued (I upped the gmirror buffer count to 50 to make synchronization happen faster) and things never move from there.

/me too. Though I tested this on two FreeBSD/i386 SMP machines with a gmirror + ggated combination. There *is* traffic going on, but it is somewhere around 50kB/s (sic! no kidding!). Also, forcefully removing the ggate0 provider (ggatec destroy -fu0), which should not impact the mirror operation in any way, panicked the system.

I can't rebuild this test scenario on -CURRENT right now, but will do so time permitting. Maybe this is related to the gmirror deadlock I reported. But I no longer have SMP hardware to play with ...

Ulrich Spoerlein
--
A: Yes.
Q: Are you sure?
A: Because it reverses the logical flow of conversation.
Q: Why is top posting frowned upon?
Re: gmirror and quota corruption
Jason Vance wrote:
> I have a FreeBSD 5.5-STABLE box that is set up with a gmirror RAID 1 using two identical hard drives.
> ...
> The system boots up but as soon as I do any disk access, i.e. 'repquota -a' or writing a file to the hard drive, the system hangs. I can still connect to the various services via telnet to their port, but none of them respond.
> ...
> Is there a known conflict between gmirror and a quota enabled filesystem? How can I properly set these up?

Could you please re-test this setup with a kernel *without* option PREEMPTION and share your results? Also, is this a UP or SMP machine?

Ulrich Spoerlein
--
A: Yes.
Q: Are you sure?
A: Because it reverses the logical flow of conversation.
Q: Why is top posting frowned upon?
geom/gstat display bug?
Hi all,

one of our servers running FreeBSD 5.5 was seriously swapping (1.9GB of 2GB swap used) and to see the performance of the ad0s1b device, I fired up gstat. This is the current output (it has stopped swapping):

    dT: 0.510  flag_I 50us  sizeof 240  i -1
          L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
             0     31      0      0    0.0     31    255    0.4    1.3| ad0
             0     31      0      0    0.0     31    255    0.4    1.3| ad0s1
             1     49      0      0    0.0     49   6274    4.5   22.7| ad2
             0      0      0      0    0.0      0      0    0.0    0.0| ad0s1a
    4294967287      0      0      0    0.0      0      0    0.0    0.0| ad0s1b
             0      0      0      0    0.0      0      0    0.0    0.0| ad0s1c
             0      0      0      0    0.0      0      0    0.0    0.0| ad0s1d
             0     31      0      0    0.0     31    255    0.5    1.4| ad0s1e
    ...

There are two possible explanations, AFAICT:

a) This is a dual CPU machine, so the L(q)++ and L(q)-- operations were not strictly atomic, causing the counter to go negative.

b) Or the L(q) is computed by some addition/multiplication (doubtful), and since the queue length was very, very long we got an integer overflow.

The interesting thing is that gstat decodes the queue length as a uint64_t value. Ah, I see now that L(q) is computed as end_count - start_count of struct devstat.

Of course, I had lots of "swap_pager_getswapspace(9): failed" errors on the console, as the system was running out of swap space. Are these transactions somehow counted wrong?

Uli
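The odd L(q) value for ad0s1b above is what a small negative number looks like when it is read as an unsigned 32-bit quantity. The sketch below only shows that arithmetic; the helper name wrap_u32 is made up, and the 32-bit width is an assumption inferred from the printed value, not from the devstat sources.

```shell
# Reinterpret a (possibly negative) integer as an unsigned 32-bit value.
# wrap_u32 is a hypothetical helper; the 32-bit width is an assumption.
wrap_u32() {
    echo $(( ($1) & 0xFFFFFFFF ))
}

# A counter difference of -9 prints as the L(q) seen for ad0s1b.
wrap_u32 -9

# The other direction: distance of the displayed value from 2^32.
echo $(( 4294967287 - 4294967296 ))
```

So the displayed queue length corresponds to a difference of -9 between the two counters, which fits explanation a) or miscounted transactions better than a genuinely huge queue.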
systat -vm output showing negative total virtual memory
Hi all,

this is on a two week old RELENG_6. The machine has 4GB RAM and is SMP:

    CPU: Intel(R) Xeon(TM) CPU 3.00GHz (3012.12-MHz 686-class CPU)
      Origin = GenuineIntel  Id = 0xf43  Stepping = 3
      Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
      Features2=0x641d<SSE3,RSVD2,MON,DS_CPL,CNTX-ID,CX16,b14>
      AMD Features=0x2010<NX,LM>
    real memory  = 3489071104 (3327 MB)
    avail memory = 3414265856 (3256 MB)
    ACPI APIC Table: <PTLTD APIC>
    FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
     cpu0 (BSP): APIC ID: 0
     cpu1 (AP): APIC ID: 6

systat -vm shows:

    Mem:KB    REAL            VIRTUAL                      VN PAGER  SWAP PAGER
            Tot   Share      Tot    Share    Free          in  out     in  out
    Act 1198620  115040  1480676   289860  153004  count
    All 3330652  116920 -956751k   293960          pages

vm.vmtotal has this to say:

    System wide totals computed every five seconds: (values in kilobytes)
    ===
    Processes:              (RUNQ: 3 Disk Wait: 1 Page Wait: 1 Sleep: 40)
    Virtual Memory:         (Total: 815944K, Active 355288K)
    Real Memory:            (Total: 2558540K Active 150424K)
    Shared Virtual Memory:  (Total: 11460K Active: 7856K)
    Shared Real Memory:     (Total: 6916K Active: 5044K)
    Free Memory Pages:      890092K

Uli
Re: systat -vm output showing negative total virtual memory
Ruslan Ermilov wrote: sysctl(8) knows that t_vm is in bytes, but for the other stats it thinks they are in pages. systat -vm thinks they are all in bytes. Here's a fix: Thanks!, I applied your patch to RELENG_6 # sysctl vm.vmtotal ; ./sysctl vm.vmtotal vm.vmtotal: System wide totals computed every five seconds: (values in kilobytes) === Processes: (RUNQ: 1 Disk Wait: 0 Page Wait: 0 Sleep: 45) Virtual Memory: (Total: 797461K, Active 92512K) Real Memory:(Total: 3327992K Active 48124K) Shared Virtual Memory: (Total: 11856K Active: 7772K) Shared Real Memory: (Total: 7644K Active: 5364K) Free Memory Pages: 145964K vm.vmtotal: System wide totals computed every five seconds: (values in kilobytes) === Processes: (RUNQ: 1 Disk Wait: 0 Page Wait: 0 Sleep: 45) Virtual Memory: (Total: 797461K, Active 22K) Real Memory:(Total: 3327992K Active 48128K) Shared Virtual Memory: (Total: 2K Active: 1K) Shared Real Memory: (Total: 7644K Active: 5364K) Free Memory Pages: 145964K 22K active VM and 1K shared? Seems pretty low to me... Here's the systat -vm output Mem:KBREALVIRTUAL Tot Share TotShareFree Act 48384542492800 7844 145692 count All 33282647704-1028565k11928 pages Mem:KBREALVIRTUAL Tot Share TotShareFree Act 484645372 221 145692 count All 33282647652 7974612 pages The total value seems more sane, but I doubt the active total value. top(1) says 106 processes: 3 running, 80 sleeping, 1 zombie, 22 waiting CPU states: 8.9% user, 0.0% nice, 11.4% system, 0.8% interrupt, 78.9% idle Mem: 48M Active, 2834M Inact, 239M Wired, 133M Cache, 112M Buf, 4680K Free Swap: 1024M Total, 36K Used, 1024M Free Yes, the system is totally idle, that may explain the values above. If your fix is correct (sorry, but I'm not in a position to judge your work), would it be possible to have a quick MFC to RELENG_6 and RELENG_6_2? Ulrich Spoerlein -- A: Yes. Q: Are you sure? A: Because it reverses the logical flow of conversation. Q: Why is top posting frowned upon? 
ntpd vs nss_ldap: Crashing in getaddrinfo
Hi,

I needed to test the ntpd from ports (net/ntp, net/ntp-devel, net/ntp-stable), but they always crashed with a SIGBUS error. Investigation led to nss_ldap being the culprit. With nss_ldap installed and NO keyword ldap in /etc/nsswitch.conf, ntpd will run fine. If you add ldap to passwd or group or both, ntpd will crash calling getaddrinfo (even though LDAP is only used for passwd/group).

/etc/nsswitch.conf:
group: files ldap
hosts: files dns
networks: files
passwd: files ldap
shells: files

[EMAIL PROTECTED]:/usr/ports/net/ntp-stable/work/ntp-4.2.2p4-RC4/ntpd# gdb ./ntpd
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB. Type show warranty for details.
This GDB was configured as i386-marcel-freebsd...
(gdb) r -d
Starting program: /usr/ports/net/ntp-stable/work/ntp-4.2.2p4-RC4/ntpd/ntpd -d
ntpd [EMAIL PROTECTED] Wed Nov 15 09:56:13 UTC 2006 (1)
addto_syslog: precision = 1.117 usec
create_sockets(123)
addto_syslog: no IPv6 interfaces found
addto_syslog: ntp_io: estimated max descriptors: 10951, initial socket boundary: 20
bind() fd 20, family 2, port 123, addr 0.0.0.0, flags=9
Added addr 0.0.0.0 to list of addresses
addto_syslog: Listening on interface wildcard, 0.0.0.0#123 Disabled
bind() fd 21, family 2, port 123, addr 16.30.58.127, flags=25
Added addr 16.30.58.127 to list of addresses
addto_syslog: Listening on interface xl0, 16.30.58.127#123 Enabled
bind() fd 22, family 2, port 123, addr 127.0.0.1, flags=21
Added addr 127.0.0.1 to list of addresses
addto_syslog: Listening on interface lo0, 127.0.0.1#123 Enabled
init_io: maxactivefd 22
local_clock: time 0 base 0.00 offset 0.00 freq 0.000 state 0

Program received signal SIGBUS, Bus error.
0x280a98c8 in memset () from /libexec/ld-elf.so.1
(gdb) bt
#0 0x280a98c8 in memset () from /libexec/ld-elf.so.1
#1 0x280c2100 in ?? ()
#2 0x2809f039 in map_object () from /libexec/ld-elf.so.1
#3 0x2809c115 in elf_hash () from /libexec/ld-elf.so.1
#4 0x2809c21c in elf_hash () from /libexec/ld-elf.so.1
#5 0x2809de8c in dlopen () from /libexec/ld-elf.so.1
#6 0x2828140c in _nsdbtaddsrc () from /lib/libc.so.6
#7 0x2827cb92 in ___toupper () from /lib/libc.so.6
#8 0x2827d1b4 in _nsyyparse () from /lib/libc.so.6
#9 0x2828179e in nsdispatch () from /lib/libc.so.6
#10 0x28271776 in getaddrinfo () from /lib/libc.so.6
#11 0x0804bfee in getnetnum (num=0xbfbfe537 ntp0..com, addr=0xbfbfe9d0, complain=0, a_type=t_UNK) at ntp_config.c:
#12 0x0804cb5f in getconfig (argc=2, argv=0xbfbfebcc) at ntp_config.c:652
#13 0x0805246e in ntpdmain (argc=2, argv=0xbfbfebcc) at ntpd.c:744
#14 0x080527bb in main (argc=2, argv=0xbfbfebcc) at ntpd.c:274
(gdb) f 11
#11 0x0804bfee in getnetnum (num=0xbfbfe537 ntp0..com, addr=0xbfbfe9d0, complain=0, a_type=t_UNK) at ntp_config.c:
 retval = getaddrinfo(num, "ntp", &hints, &ptr);
(gdb) l
2217 hints.ai_socktype = SOCK_DGRAM;
2218 #ifdef DEBUG
2219 if (debug > 3)
2220 printf("getaddrinfo %s\n", num);
2221 #endif
 retval = getaddrinfo(num, "ntp", &hints, &ptr);
2223 if (retval != 0 ||
2224 (ptr->ai_family == AF_INET6 && isc_net_probeipv6() != ISC_R_SUCCESS)) {
2225 if (complain)
2226 msyslog(LOG_ERR,
(gdb)

What's happening?

Uli
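To make the puzzle in this report concrete, here is a small sketch that parses the nsswitch.conf quoted above and lists which databases actually name ldap as a source. It shows only what the configuration says: hosts does not reference ldap, yet the backtrace reaches dlopen() from nsdispatch() under getaddrinfo(). The parser is a simplified illustration (it ignores criteria such as [NOTFOUND=return]) and its function name is made up.

```python
# The nsswitch.conf contents as quoted in the report.
NSSWITCH = """\
group: files ldap
hosts: files dns
networks: files
passwd: files ldap
shells: files
"""

def parse_nsswitch(text):
    """Return {database: [sources]} for a simple nsswitch.conf.
    Simplified: comments are stripped, status criteria are not handled."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        database, sources = line.split(":", 1)
        table[database.strip()] = sources.split()
    return table

table = parse_nsswitch(NSSWITCH)
uses_ldap = sorted(db for db, sources in table.items() if "ldap" in sources)

print("databases using ldap:", uses_ldap)
print("hosts sources:", table["hosts"])
```

Frames #5 to #9 of the backtrace suggest the crash happens while the NSS configuration is being parsed and a source module is being loaded, not during a hosts lookup as such, which would fit with hosts itself not listing ldap.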
dump(8): how many bytes written to tape?
Hi,

I'm trying to figure out how many bytes were written to a tape by dump(8). I'm using a blocksize of 64kB to maximize throughput to the tape drive. Initially, I thought I could just add up the number of tape blocks written by dump and multiply by 64kB. But it looks like dump is still reporting those values as 1kB blocks. Here's some sample output:

DUMP: Date of this level 1 dump: Wed Nov 15 09:46:37 2006
DUMP: Date of last level 0 dump: the epoch
DUMP: Cache 256 MB, blocksize = 65536
DUMP: DUMP: 30676 tape blocks on 1 volume
DUMP: finished in 1 seconds, throughput 30676 KBytes/sec

DUMP: Date of this level 1 dump: Wed Nov 15 10:25:38 2006
DUMP: Date of last level 0 dump: the epoch
DUMP: DUMP: 4650864 tape blocks on 1 volume
DUMP: finished in 132 seconds, throughput 35233 KBytes/sec

DUMP: Date of this level 1 dump: Wed Nov 15 10:50:36 2006
DUMP: Date of last level 0 dump: the epoch
DUMP: DUMP: 328548 tape blocks on 1 volume
DUMP: finished in 14 seconds, throughput 23467 KBytes/sec

DUMP: Date of this level 1 dump: Wed Nov 15 11:00:14 2006
DUMP: Date of last level 0 dump: the epoch
DUMP: DUMP: 36925423 tape blocks on 1 volume
DUMP: finished in 973 seconds, throughput 37950 KBytes/sec

If I add up time*throughput, I get 41GB. If I add up the number of tape blocks and assume a block size of 1kB, I get 41GB, too. So, how exactly is the '-b64' parameter to dump(8) affecting the block size on tape?

Uli
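The comparison described in this message can be reproduced directly from the four dump summaries. The sketch below sums the reported tape blocks and the seconds multiplied by throughput; the figures are copied from the output above, and the only assumption is that "KBytes" means the same unit in both places.

```python
# (tape blocks, elapsed seconds, throughput in KBytes/sec) per dump run,
# copied from the DUMP summaries quoted above.
runs = [
    (30676, 1, 30676),
    (4650864, 132, 35233),
    (328548, 14, 23467),
    (36925423, 973, 37950),
]

total_blocks = sum(blocks for blocks, _, _ in runs)
total_from_rate = sum(secs * rate for _, secs, rate in runs)

print("sum of tape blocks:        ", total_blocks)
print("sum of seconds*throughput: ", total_from_rate)
print("difference:                ", total_blocks - total_from_rate)

# If each reported block were 64 kB instead of 1 kB, the total would be
# 64 times larger than what the throughput figures allow.
print("ratio if blocks were 64 kB:", (total_blocks * 64) / total_from_rate)
```

Both sums come out at roughly 41.9 million kilobytes and differ only by rounding in the reported throughput, which supports the observation that the "tape blocks" counter is in 1 kB units regardless of the -b setting.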
Re: problems with shutdown after dump on a large partition
Anatoliy Dmytriyev wrote:

I got problems with shutdown after dump with '-L' (with snapshots) on a large partition: We have a large partition with 872G according to the 'df -H' report. Right before the shutdown, 'dump -Lau' finished without any problems. After dump finished I ran the command 'shutdown -h now', and as a result the shutdown was unclean, because the disk sync was terminated by a timeout and fsck was run on the next boot.

I'm not entirely sure, but this looks like the snapshot generated by dump -L was not yet cleaned up. You should wait a couple of minutes (depending on the snapshot size and I/O turnover) before shutting down the system or unmounting the partition. I don't know of a way to tell whether the snapshot has been fully cleaned up.

Ulrich Spoerlein
-- 
A: Yes. Q: Are you sure? A: Because it reverses the logical flow of conversation. Q: Why is top posting frowned upon?
Re: RELENG_6: I/O deadlock under load
On 10/28/06, Christian S.J. Peron [EMAIL PROTECTED] wrote:

It almost looks as if a user frequently runs gmirror(8) to query the status of their array. Under a high load situation, the worker is busy, so at one unlucky moment, gmirror(8) is run:

(1) gmirror(8) waits for sc->sc_lock owned by the worker
(2) The worker then drops the lock
(3) gmirror(8) proceeds
(4) Worker wakes up and waits for sc->sc_lock
(5) Only gmirror never will release it, because it's waiting on a resource (presumably owned by the worker thread)?

I am not certain this is correct, so I have included pjd in the CC loop, hoping he can help shed some light on the subject :)

This is just a followup to report that the problem seems unreproducible on an identical kernel if I leave out option PREEMPTION. Performance sucks that way, but at least it's stable now. Pawel seems to be rather busy with his GJOURNAL work and his ZFS port; is someone else able to reproduce the problem?

Uli
Re: panic: vfs_getopt: caller passed 'opts' as NULL
On 10/31/06, Kris Kennaway [EMAIL PROTECTED] wrote:

Note that they'll be demand-loaded if requested (e.g. if you try to mount_nullfs). Maybe you or something else tried to mount such a filesystem by accident?

But the point is moot anyway, since I could not reproduce the problem. I tried again after rebooting the machine and everything went just fine ... I have to use the nullfs mounts on another machine shortly, let's see what happens there.

It reliably panicked in single user mode, with no other modules loaded at the time. But I see now that nullfs.ko is loaded as a module, which might explain everything. I assumed it was built in. I rebooted to a kernel without DEBUG_VFS_LOCKS and it's happily using nullfs. I'll try once more with a debugging kernel that has nullfs built in, but I guess the panic will vanish.

Ok, with the attached kernel config, which includes nullfs, I get a duplicate lock instead of a panic:

Trying to mount root from ufs:/dev/da0s1a
acquiring duplicate lock of same type: vnode interlock
1st vnode interlock @ /usr/src/sys/kern/vfs_vnops.c:806
2nd vnode interlock @ /usr/src/sys/kern/vfs_subr.c:2036
KDB: stack backtrace:
kdb_backtrace(3,c894fa80,c0a47110,c0a47110,c09cb524,...) at kdb_backtrace+0x29
witness_checkorder(c8622d04,9,c0951b38,7f4) at witness_checkorder+0x578
_mtx_lock_flags(c8622d04,0,c0951b38,7f4,c840b590,...) at _mtx_lock_flags+0x78
vrefcnt(c8622c3c) at vrefcnt+0x20
null_checkvp(c8a7ed98,c093f5ae,215) at null_checkvp+0x56
null_lock(eb0bba80) at null_lock+0x66
VOP_LOCK_APV(c09c40a0,eb0bba80) at VOP_LOCK_APV+0x87
vn_lock(c8a7ed98,1002,c894fa80,c8a7ed98,c8a89224,...) at vn_lock+0xac
nullfs_root(c88052e4,2,eb0bbaf8,c894fa80,0,8,0,c0a84040,0,c09513da,3dd) at nullfs_root+0x26
vfs_domount(c894fa80,c83e64c0,c8493490,0,c83fdad0,c0a38380,0,c09513da,2a3) at vfs_domount+0x975
vfs_donmount(c894fa80,0,c87f4e00,c87f4e00,0,...) at vfs_donmount+0x2ef
nmount(c894fa80,eb0bbd04) at nmount+0x8b
syscall(3b,3b,3b,bfbfe435,bfbfecc8,...) at syscall+0x25b
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (378, FreeBSD ELF32, nmount), eip = 0x280ba4d7, esp = 0xbfbfe3fc, ebp = 0xbfbfec78 ---

I grepped /sys for DEBUG_VFS_LOCKS and it seems to only add some additional KASSERTs, but not the one which triggered in the original panic. Nullfs seems more fragile than I initially thought ...

Uli

DEBUG Description: Binary data
Re: panic: vfs_getopt: caller passed 'opts' as NULL
Kris Kennaway wrote:

Nullfs seems more fragile than I initially thought ...

It's just that compiling in the extra debugging (it might be DEBUG_LOCKS or DEBUG_VFS_LOCKS, I forget which), causes the sizes of structures to change, so when the module tries to fondle the structure at a certain offset thinking it's accessing a certain field, it's really fondling something else entirely and the kernel gets a nasty surprise and panics.

It is DEBUG_LOCKS. The DEBUG_VFS_LOCKS macro only enables additional code at runtime, it does not alter the ABI. Ironically, it is even documented in conf/NOTES. For the future, I have to remember that nullfs is a module.

Ulrich Spoerlein
-- 
A: Yes. Q: Are you sure? A: Because it reverses the logical flow of conversation. Q: Why is top posting frowned upon?
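The offset mismatch Kris describes can be illustrated with a toy example. The sketch below uses Python's ctypes to define two layouts of a made-up structure, one with an extra debugging member in the middle, and then reads an object built with the "debug" layout through the "plain" layout, the way a module compiled without the option would. The structure and field names are hypothetical and have nothing to do with the real kernel definitions.

```python
import ctypes

# Hypothetical layout as seen by a module built WITHOUT the debug option.
class NodePlain(ctypes.Structure):
    _fields_ = [
        ("v_type", ctypes.c_int32),
        ("v_data", ctypes.c_int32),
    ]

# Hypothetical layout as seen by a kernel built WITH the debug option:
# an extra member is inserted before v_data, shifting its offset.
class NodeDebug(ctypes.Structure):
    _fields_ = [
        ("v_type", ctypes.c_int32),
        ("debug_info", ctypes.c_int32 * 4),
        ("v_data", ctypes.c_int32),
    ]

plain_offset = NodePlain.v_data.offset
debug_offset = NodeDebug.v_data.offset
print("offset of v_data, plain layout:", plain_offset)
print("offset of v_data, debug layout:", debug_offset)

# The "kernel" creates the object with the debug layout ...
kernel_obj = NodeDebug(v_type=1, v_data=42)
kernel_obj.debug_info[0] = 7

# ... and the "module" interprets the same bytes with the plain layout.
module_view = NodePlain.from_buffer_copy(bytes(kernel_obj))
print("v_data as the module sees it:", module_view.v_data)
```

The module reads the first word of the debugging member instead of v_data. In a real kernel the stray value would typically be used as a pointer or a lock, which is how a mismatch like this ends in a panic.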
panic: vfs_getopt: caller passed 'opts' as NULL
RELENG_6 from 30th October, trying to do two nullfs mounts from two amd-mounted directories (i.e., NFS mounts). Funny thing is, this amd/nfs/mount_nullfs is working on several other machines from a RELENG_6 checkout of 25th October.

panic: vfs_getopt: caller passed 'opts' as NULL
cpuid = 1
KDB: stack backtrace:
kdb_backtrace(100,c8506780,c852c870,c8df3450,e8d0ca5c,...) at kdb_backtrace+0x29
panic(c089c395,c852c870,c8721b90,e8d0ca80,e8d0cadc,...) at panic+0x114
vfs_getopt(0,c8df3450,e8d0ca58,e8d0ca5c,0,...) at vfs_getopt+0x1d
nullfs_mount(c8721b90,c8506780,0,c8df46c0,c8cd1c3c,...) at nullfs_mount+0x70
vfs_domount(c8506780,c852c870,c8433a40,0,c851cc50,c0971700,0,c089be7a,2a3) at vfs_domount+0x687
vfs_donmount(c8506780,0,c86ffd00,c86ffd00,0,...) at vfs_donmount+0x2ef
nmount(c8506780,e8d0cd04) at nmount+0x8b
syscall(3b,3b,3b,bfbfe3b4,bfbfec0c,...) at syscall+0x25b
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (378, FreeBSD ELF32, nmount), eip = 0x280ba4d7, esp = 0xbfbfe33c, ebp = 0xbfbfebb8 ---
KDB: enter: panic
[thread pid 60225 tid 100085 ]
Stopped at kdb_enter+0x2b: nop
Fwd: panic: vfs_getopt: caller passed 'opts' as NULL
On 10/30/06, Kris Kennaway [EMAIL PROTECTED] wrote:

panic: vfs_getopt: caller passed 'opts' as NULL

This can happen if you are using filesystem modules but your kernel is built with nonstandard options (DEBUG_*_LOCKS is a culprit, I think).

Interesting, but no filesystem modules were involved. In fact, even geom_mirror and geom_label were statically built into the kernel.

But the point is moot anyway, since I could not reproduce the problem. I tried again after rebooting the machine and everything went just fine ... I have to use the nullfs mounts on another machine shortly, let's see what happens there.

Uli
Re: RELENG_6: I/O deadlock under load
Ulrich Spoerlein wrote:

Our fileserver deadlocked, again. It is running RELENG_6 checked out yesterday. I have enabled DDB, WITNESS and INVARIANTS and have it hooked up via serial console.

Happened again, now I have DEBUG_LOCKS and DEBUG_VFS_LOCKS included. There are hundreds of cron processes waiting on wmesg 'sysctl' (they seem to have piled up prior to me entering the debugger).

db> show pcpu
cpuid= 0
curthread= 0xc8326780: pid 11 idle: cpu0
curpcb = 0xe6f1fd90
fpcurthread = none
idlethread = 0xc8326780: pid 11 idle: cpu0
APIC ID = 0
currentldt = 0x50
spin locks held:

db> show allpcpu
Current CPU: 0
cpuid= 0
curthread= 0xc8326780: pid 11 idle: cpu0
curpcb = 0xe6f1fd90
fpcurthread = none
idlethread = 0xc8326780: pid 11 idle: cpu0
APIC ID = 0
currentldt = 0x50
spin locks held:
cpuid= 1
curthread= 0xc8326600: pid 10 idle: cpu1
curpcb = 0xe6f1cd90
fpcurthread = none
idlethread = 0xc8326600: pid 10 idle: cpu1
APIC ID = 6
currentldt = 0x50
spin locks held:

db> show alllocks
Process 60935 (gmirror) thread 0xc88ce780 (100122)
exclusive sx sysctl lock r = 0 (0xc0971dc0) locked @ /usr/src/sys/kern/kern_sysctl.c:1375
Process 50 (g_mirror gm0) thread 0xc86b7600 (100062)
exclusive sx gmirror:lock r = 0 (0xc84b282c) locked @ /usr/src/sys/geom/mirror/g_mirror.c:1809

'gm0' is the mirror the OS resides on. It is 8GB in size and spans da0s1 and da1s1, which are RAID5 volumes attached through two twa(4) controllers.
db show lockedvnods Locked vnodes 0xcb4a4984: tag ufs, type VREG usecount 1, writecount 0, refcount 3 mountedhere 0 flags () v_object 0xcc804e70 ref 0 pages 1 lock type ufs: SHARED (count 1)#0 0xc0667314 at lockmgr+0x160 #1 0xc0783fea at ffs_lock+0x76 #2 0xc083688f at VOP_LOCK_APV+0x87 #3 0xc06d50b8 at vn_lock+0xac #4 0xc06d478e at vn_read+0x132 #5 0xc0697a89 at dofileread+0x85 #6 0xc0697922 at kern_readv+0x36 #7 0xc069784d at read+0x45 #8 0xc0824037 at syscall+0x25b #9 0xc08106af at Xint0x80_syscall+0x1f ino 8315, on dev ufs/root 0xc87682b8: tag ufs, type VDIR usecount 1, writecount 0, refcount 4 mountedhere 0 flags () v_object 0xcb4b6630 ref 0 pages 1 lock type ufs: EXCL (count 1) by thread 0xc850b000 (pid 43987)#0 0xc06676a1 at lockmgr+0x4ed #1 0xc0783fea at ffs_lock+0x76 #2 0xc083688f at VOP_LOCK_APV+0x87 #3 0xc06d50b8 at vn_lock+0xac #4 0xc06c8f46 at vget+0xc2 #5 0xc06bd9be at cache_lookup+0x34a #6 0xc06bdef2 at vfs_cache_lookup+0x92 #7 0xc083494f at VOP_LOOKUP_APV+0x87 #8 0xc06c20a2 at lookup+0x46e #9 0xc06c19b6 at namei+0x39a #10 0xc06d3e9f at vn_open_cred+0x5b #11 0xc06d3e42 at vn_open+0x1e #12 0xc06cd342 at kern_open+0xb6 #13 0xc06cd256 at open+0x1a #14 0xc0824037 at syscall+0x25b #15 0xc08106af at Xint0x80_syscall+0x1f ino 94210, on dev ufs/var 0xc87746cc: tag ufs, type VREG usecount 1, writecount 1, refcount 3 mountedhere 0 flags () v_object 0xc876a210 ref 0 pages 3 lock type ufs: EXCL (count 1) by thread 0xc86b7000 (pid 14753)#0 0xc06676a1 at lockmgr+0x4ed #1 0xc0783fea at ffs_lock+0x76 #2 0xc083688f at VOP_LOCK_APV+0x87 #3 0xc06d50b8 at vn_lock+0xac #4 0xc06d4a54 at vn_write+0x138 #5 0xc0697d5f at dofilewrite+0x77 #6 0xc0697c03 at kern_writev+0x3b #7 0xc0697bac at writev+0x30 #8 0xc0824037 at syscall+0x25b #9 0xc08106af at Xint0x80_syscall+0x1f ino 94280, on dev ufs/var 0xca357414: tag ufs, type VDIR usecount 1, writecount 0, refcount 2 mountedhere 0 flags () lock type ufs: EXCL (count 1) by thread 0xc8cdf480 (pid 20101)#0 0xc06676a1 at lockmgr+0x4ed 
#1 0xc0783fea at ffs_lock+0x76 #2 0xc083688f at VOP_LOCK_APV+0x87 #3 0xc06d50b8 at vn_lock+0xac #4 0xc06c8f46 at vget+0xc2 #5 0xc06bd9be at cache_lookup+0x34a #6 0xc06bdef2 at vfs_cache_lookup+0x92 #7 0xc083494f at VOP_LOOKUP_APV+0x87 #8 0xc06c20a2 at lookup+0x46e #9 0xc06c19b6 at namei+0x39a #10 0xc06cf3f1 at kern_stat+0x35 #11 0xc06cf39f at stat+0x1b #12 0xc0824037 at syscall+0x25b #13 0xc08106af at Xint0x80_syscall+0x1f ino 94211, on dev ufs/var 0xc875c15c: tag syncer, type VNON usecount 1, writecount 0, refcount 2 mountedhere 0 flags () lock type syncer: EXCL (count 1) by thread 0xc84ce480 (pid 46)#0 0xc06676a1 at lockmgr+0x4ed #1 0xc06c00e1 at vop_stdlock+0x21 #2 0xc083688f at VOP_LOCK_APV+0x87 #3 0xc06d50b8 at vn_lock+0xac #4 0xc06c8703 at sync_vnode+0xe3 #5 0xc06c89a1 at sched_sync+0x1ed #6 0xc065e864 at fork_exit+0xa0 #7 0xc08106bc at fork_trampoline+0x8 0xc8771d98: tag ufs, type VREG usecount 3, writecount 0, refcount 4 mountedhere 0 flags (VV_TEXT) v_object 0xc88ddbdc ref 1 pages 7 lock type ufs: EXCL (count 1) by thread 0xc84ce480 (pid 46)#0 0xc06676a1 at lockmgr+0x4ed #1 0xc0783fea at ffs_lock+0x76 #2 0xc083688f at VOP_LOCK_APV+0x87 #3 0xc06d50b8 at vn_lock+0xac #4 0xc06c8f46 at vget+0xc2 #5 0xc0782ab5 at ffs_sync+0x1c1 #6 0xc06caaa0 at sync_fsync+0x164 #7 0xc0835c1f at VOP_FSYNC_APV+0x9b #8
RELENG_6: I/O deadlock under load
Hi all, Our fileserver deadlocked, again. It is running RELENG_6 checked out yesterday. I have enabled DDB, WITNESS and INVARIANTS and have it hooked up via serial console. I can not give out shell access, but I can run any command you might consider useful, here's more details: The system has two 3Ware controllers with a big RAID5 volume each: 3ware device driver for 9000 series storage controllers, version: 3.60.02.012 twa0: 3ware 9000 series Storage Controller port 0x3000-0x303f mem 0xdc00-0xddff,0xd830-0xd8300fff irq 48 at device 1.0 on pci3 twa0: [FAST] twa0: INFO: (0x15: 0x1300): Controller details:: Model 9550SXU-8LP, 8 ports, Firmware FE9X 3.04.00.005, BIOS BE9X 3.04.00.002 em0: Intel(R) PRO/1000 Network Connection Version - 6.1.4 port 0x3040-0x307f mem 0xd832-0xd833 irq 54 at device 2.0 on pci3 em0: Ethernet address: 00:30:48:30:11:a2 em0: [FAST] em1: Intel(R) PRO/1000 Network Connection Version - 6.1.4 port 0x3080-0x30bf mem 0xd834-0xd835 irq 55 at device 2.1 on pci3 em1: Ethernet address: 00:30:48:30:11:a3 em1: [FAST] pci1: base peripheral, interrupt controller at device 0.3 (no driver attached) pcib4: ACPI PCI-PCI bridge irq 16 at device 4.0 on pci0 pci4: ACPI PCI bus on pcib4 pcib5: ACPI PCI-PCI bridge irq 16 at device 6.0 on pci0 pci5: ACPI PCI bus on pcib5 pcib6: ACPI PCI-PCI bridge at device 0.0 on pci5 pci6: ACPI PCI bus on pcib6 pci5: base peripheral, interrupt controller at device 0.1 (no driver attached) pcib7: ACPI PCI-PCI bridge at device 0.2 on pci5 pci7: ACPI PCI bus on pcib7 twa1: 3ware 9000 series Storage Controller port 0x4000-0x403f mem 0xde00-0xdfff,0xd850-0xd8500fff irq 96 at device 1.0 on pci7 twa1: [FAST] twa1: INFO: (0x15: 0x1300): Controller details:: Model 9550SXU-8LP, 8 ports, Firmware FE9X 3.04.00.005, BIOS BE9X 3.04.00.002 da0 at twa0 bus 0 target 0 lun 0 da0: AMCC 9550SXU-8L DISK 3.04 Fixed Direct Access SCSI-3 device da0: 100.000MB/s transfers da0: 1430448MB (2929557504 512 byte sectors: 255H 63S/T 182356C) da1 at twa1 bus 0 
target 0 lun 0
da1: AMCC 9550SXU-8L DISK 3.04 Fixed Direct Access SCSI-3 device
da1: 100.000MB/s transfers
da1: 1430448MB (2929557504 512 byte sectors: 255H 63S/T 182356C)
SMP: AP CPU #1 Launched!
GEOM_MIRROR: Device gm0 created (id=3977032851).
GEOM_MIRROR: Device gm0: provider da0s1 detected.
GEOM_MIRROR: Device gm0: provider da1s1 detected.
GEOM_MIRROR: Device gm0: provider da1s1 activated.
GEOM_MIRROR: Device gm0: provider da0s1 activated.
GEOM_MIRROR: Device gm0: provider mirror/gm0 launched.

The base OS is sitting on an 8GB GMIRROR device across those two volumes.

There were multiple processes running at the time of the deadlock: Two dd if=/dev/urandom were writing to the filesystems on each volume. An rsync was pumping data to a different server. This server also exposed a part of the volume via GEOM_GATE to the deadlocked host. This ggate device and a local device formed another gmirror, which was just rebuilding.

I started a dump of this gmirrored filesystem, but had to abort because the tape drive was not recognized. I aborted the dump and ran camcontrol rescan to get my /dev/sa0 device. mksnap_ffs was still running, and as I was impatient, I restarted my dump script. dump(8) was blocking, because another mksnap_ffs was running. It looks like the system deadlocked as soon as the first mksnap_ffs finished.

Yeah, this is quite a lot, but the system has deadlocked before with *only* mksnap_ffs running, so I suspect this is the only culprit.

I could still enter the debugger via serial break (pinging the host still works, switching virtual consoles works, BUT pressing enter on any console produces nothing). It also continues to push out syslog messages to the console ...
db ps pid ppid pgrp uid state wmesg wchancmd 74669 82674 8267425 N sendmail 35897 80497 80497 0 N sendmail 13932 81866 9485 0 SL vnread 0xdc38a690 grep 81866 64561 9485 0 S wait 0xc89a5000 sh 54507 32103 32103 0 SL+ pfault 0xc096db18 sleep 64561 9485 9485 0 S piperd 0xc9cce4c8 perl5.8.8 9485 24955 9485 0 Ss wait 0xca6ad000 sh 24955 3564 3564 0 S piperd 0xc9c85b28 cron 24201 10966 75715 0 SL+ physrd 0xdc38f600 dump 72560 10966 75715 0 SL+ physrd 0xdc389c50 dump 31224 10966 75715 0 SL+ physrd 0xdc38ae40 dump 10966 5349 75715 0 S+ sbwait 0xc86a7370 dump 5349 43148 75715 0 S+ wait 0xca690430 dump 43148 75715 75715 0 S+ wait 0xcc284c90 sh 95955 59838 59838 0 S nfslockd 0xc0967f08 rpc.lockd 59838 1 59838 0 Ss select 0xc0964224 rpc.lockd 11779 1 11779 0 Ss select 0xc0964224 rpc.statd 53756 59946 59946 0 S -0xc84fbc00 nfsd 50902 59946 59946 0 S -0xcc812200 nfsd 97900 59946 59946 0 S -0xca9e3000 nfsd
panic: softdep_deallocate_dependancies
Hi,

The setup is as follows: Two identical fileservers connected directly via their em1 interfaces. Both are running RELENG_6 from early October. fs2 exports a 924GB volume via ggated, which is imported by fs1. fs1 spans a gmirror across its da1s2d and this ggate0 (->fs2) device. It was just rebuilding the gmirror when I figured I'd try to back up all (rather empty) volumes to tape. So dump(8) was running on the gmirrored filesystem and was in the process of snapshotting the device.

It spat out several of these lines

2006-10-24T15:25:34+0200 kern.crit fs1 kernel: GEOM_MIRROR: Request failed (error=5). ggate0[WRITE(offset=1017607733248, length=16384)]
2006-10-24T15:25:34+0200 kern.crit fs1 kernel: g_vfs_done():mirror/share[WRITE(offset=1017607733248, length=16384)]error = 5
2006-10-24T15:25:34+0200 kern.crit fs1 kernel: g_vfs_done():mirror/share[WRITE(offset=1017800425472, length=16384)]error = 5
2006-10-24T15:25:34+0200 kern.crit fs1 kernel: g_vfs_done():mirror/share[WRITE(offset=1017993117696, length=16384)]error = 5
2006-10-24T15:25:34+0200 kern.crit fs1 kernel: g_vfs_done():mirror/share[WRITE(offset=1018185809920, length=16384)]error = 5

and panicked several seconds later:

panic: softdep_deallocate_dependancies
cpuid = 1

It is an SMP machine, running a rather generic kernel, but with options QUOTA (quotas are active on a different volume, though). Sadly no DDB was configured; I'll try to reproduce this.

Is snapshotting/dumping supposed to work on ggate/gmirrored devices?

Uli
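For readers decoding the g_vfs_done() lines in this report, the sketch below pulls the operation, offset, length and error number out of one such line and maps the number to its symbolic errno name. The regular expression is written against the exact format quoted above; the function name is made up, and the mapping of 5 to EIO reflects the usual POSIX numbering rather than anything stated in the message.

```python
import errno
import os
import re

LINE = ("2006-10-24T15:25:34+0200 kern.crit fs1 kernel: "
        "g_vfs_done():mirror/share[WRITE(offset=1017607733248, length=16384)]error = 5")

# Matches "provider[OP(offset=N, length=N)]error = N" as seen in the log.
PATTERN = re.compile(
    r"g_vfs_done\(\):(?P<provider>[^\[]+)\["
    r"(?P<op>\w+)\(offset=(?P<offset>\d+), length=(?P<length>\d+)\)\]"
    r"\s*error = (?P<error>\d+)"
)

def decode(line):
    """Return a dict describing one g_vfs_done() failure, or None."""
    m = PATTERN.search(line)
    if m is None:
        return None
    code = int(m.group("error"))
    return {
        "provider": m.group("provider"),
        "op": m.group("op"),
        "offset": int(m.group("offset")),
        "length": int(m.group("length")),
        "errno_name": errno.errorcode.get(code, "unknown"),
    }

info = decode(LINE)
print(info)
print("strerror:", os.strerror(5))
```

So each failed request is a 16 KiB write to mirror/share that came back with an I/O error, here reported first for the ggate0 component and then for the mirror as a whole.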
Re: ppp redial unsuccessful
Nick Gustas wrote:

Oct 4 19:03:09 xxx ppp[55]: tun0: Phase: bundle: Authenticate
Oct 4 19:03:09 xxx ppp[55]: tun0: Phase: deflink: his = PAP, mine = none
Oct 4 19:03:09 xxx ppp[55]: tun0: Phase: Pap Output: [EMAIL PROTECTED]
Oct 4 19:03:09 xxx ppp[55]: tun0: LCP: deflink: RecvCodeRej(127) state = Opened
Oct 4 19:03:11 xxx ppp[55]: tun0: Phase: Pap Input: SUCCESS ()

The real question is, is there a way to work around your provider's brokenness without killing the ppp process?

Hi Nick,

I cranked up the debug logging and compared my ppp login attempts with your logfile. I get multiple

Oct 6 18:29:43 coyote ppp[67945]: tun0: IPCP: deflink: RecvConfigReq(12) state = Initial
Oct 6 18:29:43 coyote ppp[67945]: tun0: IPCP: IPADDR[6] 213.191.89.20
Oct 6 18:29:43 coyote ppp[67945]: tun0: IPCP: deflink: Oops, RCR in Initial.
Oct 6 18:29:46 coyote ppp[67945]: tun0: IPCP: deflink: RecvConfigReq(13) state = Initial
Oct 6 18:29:46 coyote ppp[67945]: tun0: IPCP: IPADDR[6] 213.191.89.20
Oct 6 18:29:46 coyote ppp[67945]: tun0: IPCP: deflink: Oops, RCR in Initial.

A Google search then led me to the following posts [1], which describe the problem in more detail. 'disable ipv6cp' should do the trick, I'll check this ASAP.

Thanks for your pointer!

[1] http://www.freebsd.de/archive/de-bsd-questions/de-bsd-questions.200506/0029.html
http://tech.barwick.de/openbsd/deflink-oops-rcr-in-initial.html

Ulrich Spoerlein
-- 
A: Yes. Q: Are you sure? A: Because it reverses the logical flow of conversation. Q: Why is top posting frowned upon?
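When comparing a long ppp.log against someone else's, it can help to count the symptom rather than read every line. The following is a rough sketch that scans log lines for the "Oops, RCR in Initial" message and collects the request ids of the IPCP configure requests that arrived in the Initial state. The sample lines are the ones quoted above; the function name and the idea of summarising the log this way are only an illustration.

```python
import re

SAMPLE = """\
Oct 6 18:29:43 coyote ppp[67945]: tun0: IPCP: deflink: RecvConfigReq(12) state = Initial
Oct 6 18:29:43 coyote ppp[67945]: tun0: IPCP: IPADDR[6] 213.191.89.20
Oct 6 18:29:43 coyote ppp[67945]: tun0: IPCP: deflink: Oops, RCR in Initial.
Oct 6 18:29:46 coyote ppp[67945]: tun0: IPCP: deflink: RecvConfigReq(13) state = Initial
Oct 6 18:29:46 coyote ppp[67945]: tun0: IPCP: IPADDR[6] 213.191.89.20
Oct 6 18:29:46 coyote ppp[67945]: tun0: IPCP: deflink: Oops, RCR in Initial.
"""

REQ = re.compile(r"IPCP: deflink: RecvConfigReq\((\d+)\) state = (\w+)")

def summarise(log_text):
    """Count 'Oops, RCR in Initial' lines and list the ids of IPCP
    configure requests that were received while in state Initial."""
    oops = 0
    early_ids = []
    for line in log_text.splitlines():
        if "Oops, RCR in Initial" in line:
            oops += 1
        m = REQ.search(line)
        if m and m.group(2) == "Initial":
            early_ids.append(int(m.group(1)))
    return oops, early_ids

oops_count, early_request_ids = summarise(SAMPLE)
print("Oops lines:", oops_count)
print("requests received in Initial:", early_request_ids)
```

To run it against a real log, read the file contents and pass them to summarise() in place of SAMPLE.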
Re: ppp redial unsuccessful
cpghost wrote:

On Fri, Oct 06, 2006 at 08:02:02PM +0200, Ulrich Spoerlein wrote:

I cranked up the debug logging and compared my ppp login attempts with your logfile. I get multiple

Oct 6 18:29:43 coyote ppp[67945]: tun0: IPCP: deflink: RecvConfigReq(12) state = Initial
Oct 6 18:29:43 coyote ppp[67945]: tun0: IPCP: IPADDR[6] 213.191.89.20
Oct 6 18:29:43 coyote ppp[67945]: tun0: IPCP: deflink: Oops, RCR in Initial.
Oct 6 18:29:46 coyote ppp[67945]: tun0: IPCP: deflink: RecvConfigReq(13) state = Initial
Oct 6 18:29:46 coyote ppp[67945]: tun0: IPCP: IPADDR[6] 213.191.89.20
Oct 6 18:29:46 coyote ppp[67945]: tun0: IPCP: deflink: Oops, RCR in Initial.

A Google search then led me to the following posts [1], which describe the problem in more detail. 'disable ipv6cp' should do the trick, I'll check this ASAP.

Yesterday, I had a brand new 6.2-PRERELEASE (Oct 4th) box installed on T-Com ADSL, using the same ppp.conf from my previous post. I've just logged into this box and seen a successful disconnect/reconnect, as always after 24hrs. Everything seems all right with ppp and T-Com ADSL.

I guess it depends on the actual hardware on the other side. Different POPs have different hardware (versions) and software (configuration). Let's wait for another 24h to see if I found the solution.

Ulrich Spoerlein
-- 
A: Yes. Q: Are you sure? A: Because it reverses the logical flow of conversation. Q: Why is top posting frowned upon?
Start system with 'downed' carp interfaces
Hello,

I'm looking for a generic way to create and configure carp interfaces upon boot (so daemons can bind against the IP address), but keep the carp interfaces 'down'. This is to allow the administrator to first check every service after the failure, and if deemed ready, put the system back into production by simply issuing:

ifconfig carp0 up

But there are several problems:

ifconfig_carp0="foo bar" will always up the interface first via /etc/rc.d/netif.

ifconfig carp0 foo bar down will ignore the 'down' and up the interface. This is especially annoying. I wish ifconfig would honour the down statement, even though the manpage says the interface will always be brought up when assigned its first address.

Using a start_if.carp0 with the following contents

ifconfig carp0 vhid 1 1.2.3.4/24
ifconfig carp0 down

and ifconfig_carp0="down" in rc.conf will result in an 'up' interface.

I also disabled devd, as it seems to be running pccard_ether carp0 start as a result of the interface creation, although that is started sometime after the interface has been created.

How are other people handling the startup of carp interfaces?

Uli
ppp redial unsuccessful
set device PPPoE:dc0
set dial
#set redial 40+10-10.90 0
set redial 90.91 0
set crtscts off
set speed sync
set mru 1492
set mtu 1492
set authname XX
set authkey XX
add default HISADDR

How are other people circumventing this? I know that I could just forcefully restart ppp at 3 o'clock in the morning, but I'm more interested in a permanent fix. And why is it that ppp *completely* ignores the redial timeout? It should wait either 90 or 91 seconds, but instead goes on flooding my /var/log/ppp.log.

Any help or hints would be appreciated.

Ulrich Spoerlein
-- 
A: Yes. Q: Are you sure? A: Because it reverses the logical flow of conversation. Q: Why is top posting frowned upon?
Re: ppp redial unsuccessful
cpghost wrote:

On Wed, Oct 04, 2006 at 08:51:48PM +0200, Ulrich Spoerlein wrote:

Hello all, with my ADSL provider (a reseller of the German Telekom), I'm unable to make ppp redial after the link has been lost. With Telekom, you usually get disconnected every 24 hours, but you could simply reconnect if our ppp supported it.

Have you added this to /etc/rc.conf?

ppp_mode=ddial

Yes, of course. As you can see, ppp(8) is not exiting, but entering an endless redial loop ...

Ulrich Spoerlein
-- 
A: Yes. Q: Are you sure? A: Because it reverses the logical flow of conversation. Q: Why is top posting frowned upon?
Re: ppp redial unsuccessful
cpghost wrote:

On Wed, Oct 04, 2006 at 03:37:37PM -0400, Nick Gustas wrote:

Not that it helps you much, but I do see working pppoe redial behavior with Yahoo/ATT dsl at a client site in the US. I can unhook the dsl line and it will autoreconnect as soon as it's plugged in again. In the event of a provider outage it comes back up on its own. The current ppp session has been running for 59 days, longest session was 353 days, but the server had to be moved for remodeling.

Same here. I've got some 6.1-STABLE boxes that have been running uninterrupted for 70 days on German T-Com ADSL (PPPoE). ppp redials automatically without any problems there.

I maintain three FreeBSD boxes from 4.11 to 6.1-RELEASE and 6-STABLE. They have been showing this for at least 1 or 2 years, so it is/was also present in the 5.x line. I usually work around this by having a cron job that restarts ppp every day at 04:00 or somewhere around that.

So either I'm just unlucky or I'm doing something fundamentally wrong. Could someone paste me the snippet from ppp.log of a successful 24h disconnect + redial? Thanks.

Ulrich Spoerlein
-- 
A: Yes. Q: Are you sure? A: Because it reverses the logical flow of conversation. Q: Why is top posting frowned upon?
altq on tun0: queueing works, prioritization not?
Hello all,

I tried to set up TCP ACK prioritization with pf/altq as described in various places on the internet. It doesn't work as expected. I have a 16Mb/1Mb DSL link, the modem is connected to a dc(4) device, and I'm using the tun0 device for my firewall rules. Here they are:

  ext_if=tun0
  scrub in all
  altq on tun0 priq bandwidth 400Kb queue { std, http, ssh, dns, tcp_ack }
  queue std priority 1 priq(default)
  queue tcp_ack priority 6
  pass out on $ext_if proto tcp from any to any queue(std, tcp_ack)

Please note that I tried various bandwidth settings; for testing purposes I set it to a very low 400Kb. When downloading from ftp.de.freebsd.org, I'm able to achieve roughly 950kB/s. If I then start an FTP upload (which reaches some 42kB/s, so the 400Kb bandwidth is in effect), the interface throughput drops to a mere 120kB/s. The 400Kb limit should also be low enough, as I'm able to upload to that same FTP server at up to 100kB/s if I turn off queueing.

This is definitely not what I would expect. Where is my error?

Ulrich Spoerlein
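The numbers in this message mix two units: altq bandwidths are given in kilobits per second (Kb), while the observed transfer rates are in kilobytes per second (kB/s). A minimal sketch of the conversion follows; the helper name kbit_to_kbyte is hypothetical, and 1 Mb is assumed to be 1000 Kb.

```shell
#!/bin/sh
# Convert a bandwidth in kilobits/s to kilobytes/s (integer division).
# kbit_to_kbyte is a hypothetical helper, not part of pf or altq.
kbit_to_kbyte() {
    echo $(( $1 / 8 ))
}

kbit_to_kbyte 400    # the 400Kb test limit from the message  -> 50
kbit_to_kbyte 1000   # the nominal 1Mb uplink (assumed 1000Kb) -> 125
```

A 400Kb queue therefore caps the upload at about 50kB/s, which is consistent with the 42kB/s upload reported above.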
Re: altq on tun0: queueing works, prioritization not?
Ulrich Spoerlein wrote:
> This is definitely not what I would expect. Where is my error?

Oh well, I should have tried 'cbq' earlier. With the following settings (queues renamed)

  altq on $ext_if cbq bandwidth 800Kb queue { q_pri, q_std }
  queue q_pri priority 6 cbq(borrow)
  queue q_std priority 1 cbq(default borrow)

I'm actually able to achieve some effect. The upload is capped at 70-80kB/s and the download fluctuates between 580 and 750kB/s. Much better than plain priority queueing. As soon as I cut the upload, the download jumps back to 950-1000kB/s.

Is this discrepancy (priq vs. cbq) known?

Ulrich Spoerlein
Fwd: Loadable SMBus modules regression in 6-STABLE - 6-BETA
On 9/27/06, Dmitry Pryanishnikov [EMAIL PROTECTED] wrote:
> On Tue, 26 Sep 2006, John Baldwin wrote:
> > I've just found it and fixed it if you upgrade to the newest smbus.c.
>
> Thanks, the problem has indeed been fixed.

I'm sorry to hijack this thread, but what's the recommended way to read out temperature values via SMB?

  [EMAIL PROTECTED]:31:3: class=0x0c0500 card=0x618015d9 chip=0x24d38086 rev=0x02 hdr=0x00
      vendor   = 'Intel Corporation'
      device   = '82801EB/ER (ICH5/ICH5R) SMBus Controller'
      class    = serial bus
      subclass = SMBus

  ichsmb0: Intel 82801EB (ICH5) SMBus controller port 0x1100-0x111f irq 17 at device 31.3 on pci0
  ichsmb0: [GIANT-LOCKED]
  smbus0: System Management Bus on ichsmb0

  # mbmon -d
  ioctl(smb0:open): No such file or directory
  SMBus[Intel8XX(ICH/ICH2/ICH3/ICH4/ICH5/ICH6)] found, but No HWM available on it!!
  No Hardware Monitor found!!
  InitMBInfo: Bad file descriptor

  # ls /dev/smb*
  ls: No match.

  # sysctl -a|grep smb
  dev.ichsmb.0.%desc: Intel 82801EB (ICH5) SMBus controller
  dev.ichsmb.0.%driver: ichsmb
  dev.ichsmb.0.%location: slot=31 function=3 handle=\_SB_.PCI0.SMBS
  dev.ichsmb.0.%pnpinfo: vendor=0x8086 device=0x24d3 subvendor=0x15d9 subdevice=0x6180 class=0x0c0500
  dev.ichsmb.0.%parent: pci0
  dev.smbus.0.%desc: System Management Bus
  dev.smbus.0.%driver: smbus
  dev.smbus.0.%parent: ichsmb0

Uli
Re: Loadable SMBus modules regression in 6-STABLE - 6-BETA
On 9/27/06, Ulrich Spoerlein [EMAIL PROTECTED] wrote:
> I'm sorry to hijack this thread, but what's the recommended way to read out temperature values via SMB?
> [...]
> # ls /dev/smb*
> ls: No match.
> [...]

OK, forget about the 'ls /dev/smb'. The /dev/smb0 device is actually there; it's just devfs that set me up again.

I also found that sysutils/lmmon gives me some sane values. However, the temperature values are way off:

  MB temp: 254C / 489F / 527K
  Fans:
    1 : 0 rpm
    2 : 0 rpm
    3 : 0 rpm
  Voltages:
    Vcore1 : +2.703V
    Vcore2 : +2.750V
    + 3.3V : +2.750V
    + 5.0V : +4.906V
    +12.0V : +11.812V
    -12.0V : -11.938V
    - 5.0V : -5.114V
Re: 6.2 SHOWSTOPPER - em completely unusable on 6.2
On 9/27/06, Martin Nilsson [EMAIL PROTECTED] wrote:
> mailbox# uname -a
> FreeBSD mailbox 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #0: Fri Sep 22 00:31:29 CEST 2006 [EMAIL PROTECTED]:/usr/obj-local/usr/src/sys/SMP amd64
>
> I get tons of these:
>
> em0: watchdog timeout -- resetting
> em0: link state changed to DOWN
> em0: link state changed to UP
>
> mailbox# pciconf -lv
> [EMAIL PROTECTED]:0:0: class=0x02 card=0x108c15d9 chip=0x108c8086 rev=0x03 hdr=0x00
>     vendor   = 'Intel Corporation'
>     device   = 'PRO/1000 PM'
>     class    = network
>     subclass = ethernet
> [EMAIL PROTECTED]:0:0: class=0x02 card=0x109a15d9 chip=0x109a8086 rev=0x00 hdr=0x00
>     vendor   = 'Intel Corporation'
>     class    = network
>     subclass = ethernet
>
> em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>     options=b<RXCSUM,TXCSUM,VLAN_MTU>
>     inet6 fe80::230:48ff:fe89:c958%em0 prefixlen 64 scopeid 0x1
>     inet 192.168.10.2 netmask 0xff00 broadcast 192.168.10.255
>     ether 00:30:48:89:c9:58
>     media: Ethernet autoselect (1000baseTX full-duplex)
>     status: active

We have several SMP systems with onboard em0/em1 interfaces running a RELENG_6 snapshot taken at 2006-09-20 00:00+0. They are not in production yet, so the load is light. However, I haven't seen any watchdog timeouts on them.

The only annoyance is that the em(4) interfaces take too long to come up, i.e., the system boots and runs ifconfig, but the interface still has no link, so syslogd/ntpdate/ntpd complain about 'no route to host'. A 'sleep 5' fixes that problem, though I'd like to avoid such hacks.

Anyway, here's the data:

  [EMAIL PROTECTED]:2:0: class=0x02 card=0x117a8086 chip=0x10798086 rev=0x03 hdr=0x00
      vendor   = 'Intel Corporation'
      device   = '82546EB Dual Port Gigabit Ethernet Controller'
      class    = network
      subclass = ethernet
  [EMAIL PROTECTED]:2:1: class=0x02 card=0x117a8086 chip=0x10798086 rev=0x03 hdr=0x00
      vendor   = 'Intel Corporation'
      device   = '82546EB Dual Port Gigabit Ethernet Controller'
      class    = network
      subclass = ethernet

  em0: Intel(R) PRO/1000 Network Connection Version - 6.1.4 port 0x3040-0x307f mem 0xd832-0xd833 irq 54 at device 2.0 on pci3
  em0: Ethernet address: XX
  em0: [FAST]
  em1: Intel(R) PRO/1000 Network Connection Version - 6.1.4 port 0x3080-0x30bf mem 0xd834-0xd835 irq 55 at device 2.1 on pci3
  em1: Ethernet address: XX
  em1: [FAST]
  em0: link state changed to UP

  em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
      options=b<RXCSUM,TXCSUM,VLAN_MTU>
      inet 1.2.3.4 netmask 0xff00 broadcast 1.2.3.4
      ether X
      media: Ethernet autoselect (100baseTX full-duplex)
      status: active

Hope this helps to narrow down the problem.

Uli
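One possible alternative to the fixed 'sleep 5' mentioned above is to poll the interface until ifconfig reports "status: active", giving up after a timeout. This is only a sketch under that assumption; link_is_up and wait_for_link are hypothetical helper names, not part of the FreeBSD rc system.

```shell
#!/bin/sh
# Sketch: poll for link instead of sleeping for a fixed time.
# link_is_up and wait_for_link are hypothetical helper names.

# Succeeds when ifconfig reports "status: active" for the interface.
link_is_up() {
    ifconfig "$1" 2>/dev/null | grep -q 'status: active'
}

# wait_for_link IFACE [TIMEOUT_SECONDS]
# Returns 0 as soon as the link is up, 1 if the timeout expires first.
wait_for_link() {
    iface=$1
    remaining=${2:-10}
    while ! link_is_up "$iface"; do
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

It could be called as `wait_for_link em0 10 || echo "em0: no link after 10s"` before the daemons that need the network are started. Unlike a fixed sleep, it returns immediately when the link is already up.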
make release vs. installworld
Hi all,

I am building my own releases for FreeBSD. When upgrading a server to the new release, I'd like to use the 'make installworld' procedure. Therefore I'm mounting /usr/src and /usr/obj from the release build via NFS onto the server in question.

However, installworld fails, as it looks like some binaries are not built inside the chrooted make release build. The first missing binary is cat(1). After building it manually, the installworld stops at chmod(1):

  === bin/chio (install)
  install -s -o root -g wheel -m 555 chio /bin
  install -o root -g wheel -m 444 chio.1.gz /usr/share/man/man1
  === bin/chmod (install)
  install -s -o root -g wheel -m 555 chmod /bin
  install: chmod: No such file or directory
  *** Error code 71

So, what's the recommended way to a) build your own releases and b) update your servers with them?

Uli

PS: And why is the FreeBSD release build process so complex?
Re: wine: ld-elf.so.1 not found
Andresen, Jason wrote:
> I'm having a very strange problem with Wine. It apparently refuses to see ld when starting:
>
> escaflowne/p7 (72 ~): wine
> ELF interpreter /libexec/ld-elf.so.1 not found
> [...]
> I'm really stumped as to what the problem is.

Search the archives; I had that problem too. I traced it back to kern.maxdsiz 1GB. Please check your local data size limit.

Ulrich Spoerlein

PS: This is not a bug in Wine itself, but in our ELF handling. Running ldd(1) on the wine binary will result in an ELF interpreter error too.
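To check the local data size limit mentioned above, something like the following sketch could be used. It assumes that sh's `ulimit -d` reports the limit in kilobytes or as the word "unlimited"; describe_limit is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: show the per-process data size limit in megabytes.
# describe_limit is a hypothetical helper; its argument is the value
# printed by `ulimit -d` (kilobytes, or the word "unlimited").
describe_limit() {
    case $1 in
        unlimited) echo "unlimited" ;;
        *)         echo "$(( $1 / 1024 )) MB" ;;
    esac
}

echo "data size limit: $(describe_limit "$(ulimit -d)")"
```

On FreeBSD, the system-wide value named in the message should also be readable with `sysctl kern.maxdsiz`; that command is not run in the sketch.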
Re: ural(4) deassociates if no activity (possible wpa_supplicant problem)
Niki Denev wrote:
> Well, after a few more moments investigating the problem, it seems that dhclient is to blame. If I don't start it I don't get disconnected, and I also noticed that the five-minute interval matches the dhclient renewal period of 300 seconds. So the logical question is: why does dhclient make my ural(4) adapter deassociate, and what can I do to prevent this? :)

Hmm, interesting. I'm using ural(4) as an AP and connect to it via ipw(4) and simple WEP. It is very unstable and will wedge the AP (running 6.1) after several minutes. I can't give you more details, as it is a rather complex setup and I would have to isolate the problem first (is it WEP, is it bridge(4), etc.).

Ulrich Spoerlein
--
PGP Key ID: 20FEE9DD Encrypted mail welcome!
Fingerprint: AEC9 AF5E 01AC 4EE1 8F70 6CBD E76E 2227 20FE E9DD
Which is worse: ignorance or apathy?
Don't know. Don't care.
Re: unmounting a filesystem safely that doesn't exist anymore
Björn König wrote:
> Hello, I made a mistake: I accidentally unplugged my digital camera before I unmounted the filesystem. *doh* This happens very often, because I'm very scatterbrained. =) The kernel will panic and all filesystems remain unclean in any case now. I know that this is a well-known issue, and in past discussions you stated that this behaviour is intended and won't be changed ad hoc. I just want to know if somebody knows a workaround or small trick that prevents the other filesystems from being unclean on next boot-up.

You might give the automounter (am-utils) a whirl. It is very confusing to set up, but you can set the unmount-if-unused timeout to something like 5 seconds. This could narrow the window enough to not panic your system frequently :)

Ulrich Spoerlein
Re: How can I know which files a process is accessing?
Dan Nelson wrote:
> In the last episode (Jun 09), Ulrich Spoerlein said:
> > Sadly, ktrace(1) seems to be rather useless in RELENG_6 right now. Every medium-sized app will result in an 'out of ktrace objects' error. I remember that some improvements to ktrace(1) went into -CURRENT. Time for an MFC?
>
> Just raise the kern.ktrace.request_pool sysctl; 4096 works for me.

Heh, I didn't know that sysctl existed. Why is the default value (100) so low? I set it to 4096, but it only survives three seconds when running 'ktrace find ~'.

Anyway, next time I need ktrace, I'll remember to bump the pool size. Thanks!

Ulrich Spoerlein
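For the original question of the thread (which files a process accesses), the pathnames can be filtered out of the kdump(1) output once a trace exists. The sketch below assumes that kdump prints name lookups as NAMI records with the path in double quotes; nami_paths is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: extract the pathnames from kdump(1) output read on stdin.
# nami_paths is a hypothetical helper; it assumes NAMI records look like
#   1234 find     NAMI  "/some/path"
nami_paths() {
    sed -n 's/^.*[[:space:]]NAMI[[:space:]]*"\(.*\)"$/\1/p'
}

# Possible use on a FreeBSD box (not run here):
#   sysctl kern.ktrace.request_pool=4096   # value suggested in the thread
#   ktrace -f find.ktrace find ~
#   kdump -f find.ktrace | nami_paths | sort -u
```

As noted above, a pool size of 4096 was still not enough for 'ktrace find ~' on this system, so the value may need further tuning.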
Re: How can I know which files a process is accessing?
Robert Watson wrote:
> A lot of people have answered and told you about lsof, which is a great tool, and can give you a momentary snapshot of the files a process has open. You might also be interested in getting a log of accesses, which you can do using ktrace(1). This tracks system calls and you can see what paths are being accessed at time of open. As of 7.x (and hopefully 6.2 once the MFC happens) you'll also be able to use audit(4) to track access of files by processes.

Sadly, ktrace(1) seems to be rather useless in RELENG_6 right now. Every medium-sized app will result in an 'out of ktrace objects' error. I remember that some improvements to ktrace(1) went into -CURRENT. Time for an MFC?

Ulrich Spoerlein