Re: fail to boot snapshot 5.5 on MBPro8,2
> Hi,
>
> On 05.02.2014 at 15:08, Sven-Volker Nowarra peb.nowa...@bluewin.ch wrote:
>
> > I tried to install 5.5 snapshots from 2 Feb and 3 Feb onto my laptop,
> > a MacBook Pro 8,2. Both failed. I then used an older snapshot from
> > spacehopper.org/mirrmon, which claimed to be 12 days old. It failed as
> > well. The MacBook Pro runs perfectly well with 5.4, and the 5.5 bsd.rd
> > lets me go into the installation. After a reboot (bsd.mp), the system
> > hangs after the line where the disk should be mounted, at the point
> > where the nvram/clock message normally appears (so before going into
> > userland). The screen then quickly but steadily turns white/grey, so
> > the blue kernel lines can't be read anymore. I then tried to boot with
> > bsd -c, with an external USB keyboard, but it hangs as well (so no
> > difference from the internal keyboard). It looks like I have two
> > flickering prompts (underscores) on the screen.
>
> I see the same issue with my MacBook Air 4,2.
>
> > With this status I cannot provide any dmesg :-(
>
> You may at least provide the dmesg from the working 5.4 bsd and the
> snapshot bsd.rd.
>
> Regards,
> Joerg

The first line after boot -c would show the

booting hd0a:bsd: 7635180+1660460+1097336...

line, and then "entry point at". The kernel's blue lines would say:

kbc: cmd word write error
[ using 926384 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University ...
OpenBSD 5.5-beta (GENERIC.MP) #284: Mon Feb 3 07:57:32 MST 2014
    t...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
RTC BIOS diagnostic error ff<clock_battery,ROM_cksum,config_unit,memory_size,fixed_disk,invalid_time>
real mem = 4185079808 (3991MB)
avail mem = 4065452032 (3877MB)
User Kernel Config
UKC>

The errors on the RTC BIOS line also appear in the fully functional 5.4
version. But I couldn't find the line "kbc: cmd word write error" in
other dmesgs on OpenBSD-tech.

As mentioned, no dmesg is possible. Does anyone else have this issue on
a MacBook Pro? Any idea where to get old snapshots? I could try to build
the kernel from scratch: I successfully built 5.4, and a kernel built
from -current shows the same behaviour as the snapshots.

Any ideas where to go from here? (No serial port, obviously.)

Volker
Re: quick fix for uvm deadlocks
> Date: Wed, 05 Feb 2014 23:03:09 -0500
> From: Ted Unangst t...@tedunangst.com
>
> On Wed, Feb 05, 2014 at 17:53, Bob Beck wrote:
> > On Wed, Feb 5, 2014 at 3:17 PM, Ted Unangst t...@tedunangst.com wrote:
> > > We are missing back pressure channels from uvm to the buf cache. The
> > > buf cache will happily sit on 9000 free pages while uvm churns around
> > > trying to scavenge up one more page.
> >
> > Or are you in a situation here where the cache has *not* backed off?
>
> Talked to Bob and hashed out better ideas of the problem. The page
> daemon does tell the buffer cache to make some room, but...
>
> If you have a huge mmap file, the pdaemon will try to flush it out via
> VOP_WRITE, which circles back via ffs into buf_get, which eats those
> previously freed pages, and then some, as the pagedaemon continues
> pushing more and more of the mmap file out.
>
> We discussed some other changes and fixes that this situation has
> clearly highlighted, but here's a slightly revised diff. It now uses
> the correct bufbackoff() function to communicate uvm's needs. Any
> other fix is rather precarious for this release, but as stated before,
> this keeps the change to the deadlock paths. You were already dead,
> but now you have a second chance.
>
> (We don't currently use the pmemrange argument; we'll have to adjust
> accordingly when the bufcache becomes range aware.)

The approach taken here makes sense to me. And I don't really want to
hold up a fix for this. But there are some things I'd like you to
consider before committing this.

I believe the scenario you sketched should only land you in
uvm_wait_pla(), but not in uvm_wait(). Perhaps with the current code
we can end up in uvm_wait(), but I think those would be bugs where the
driver I/O paths are doing memory allocations when they really
shouldn't. Therefore, I'm not sure we should add the bufbackoff()
call in uvm_wait(). At the very least I'd like to have a printf
*before* the bufbackoff() call there.

I'd also like to see a comment on the bufbackoff() call in
uvm_wait_pla() explaining that even though we pushed back the buffer
cache in the page daemon, we might still consume the freed pages when
paging out dirty pages while scanning.

> Index: uvm_pdaemon.c
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_pdaemon.c,v
> retrieving revision 1.64
> diff -u -p -r1.64 uvm_pdaemon.c
> --- uvm_pdaemon.c	30 May 2013 16:29:46 -0000	1.64
> +++ uvm_pdaemon.c	6 Feb 2014 03:09:53 -0000
> @@ -117,6 +117,8 @@ uvm_wait(const char *wmsg)
>  	 */
>  
>  	if (curproc == uvm.pagedaemon_proc) {
> +		if (bufbackoff(NULL, 4) == 0)
> +			return;
>  		/*
>  		 * now we have a problem: the pagedaemon wants to go to
>  		 * sleep until it frees more memory. but how can it
> Index: uvm_pmemrange.c
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_pmemrange.c,v
> retrieving revision 1.36
> diff -u -p -r1.36 uvm_pmemrange.c
> --- uvm_pmemrange.c	29 Jan 2013 19:55:48 -0000	1.36
> +++ uvm_pmemrange.c	6 Feb 2014 03:10:32 -0000
> @@ -22,6 +22,7 @@
>  #include <sys/malloc.h>
>  #include <sys/proc.h>		/* XXX for atomic */
>  #include <sys/kernel.h>
> +#include <sys/mount.h>
>  
>  /*
>   * 2 trees: addr tree and size tree.
> @@ -1883,6 +1884,13 @@ uvm_wait_pla(paddr_t low, paddr_t high,
>  	const char *wmsg = "pmrwait";
>  
>  	if (curproc == uvm.pagedaemon_proc) {
> +		uvm_unlock_fpageq();
> +		if (bufbackoff(NULL, atop(size)) == 0) {
> +			uvm_lock_fpageq();
> +			return 0;
> +		}
> +		uvm_lock_fpageq();
> +
>  		/*
>  		 * XXX detect pagedaemon deadlock - see comment in
>  		 * uvm_wait(), as this is exactly the same issue.
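
For readers following the control flow of the uvm_wait_pla() hunk above,
here is a user-space toy model of the unlock / back off / relock
sequence. It is only a sketch: every name in it (sketch_state,
sketch_bufbackoff, sketch_pagedaemon_wait) is hypothetical and none of
it is kernel code. It assumes, as the diff suggests, that a return value
of 0 from the back-off call means pages were released, and it models the
lock as a plain flag that must be dropped around the call.

```c
#include <stddef.h>

/* Toy state; all names here are hypothetical, not kernel symbols. */
struct sketch_state {
	long cache_pages;	/* pages held by the simulated buffer cache */
	long free_pages;	/* pages on the simulated free list */
	int fpageq_locked;	/* 1 while the simulated free page queue lock is held */
};

/*
 * Stand-in for bufbackoff(): returns 0 if the cache gave up npages,
 * -1 otherwise. Model assumption: it must be called without the lock.
 */
int
sketch_bufbackoff(struct sketch_state *st, long npages)
{
	if (st->fpageq_locked)
		return -1;
	if (st->cache_pages < npages)
		return -1;
	st->cache_pages -= npages;
	st->free_pages += npages;
	return 0;
}

/*
 * Stand-in for the page daemon branch of the diff. Called with the
 * lock held and returns with it held. Returns 0 if the caller may
 * retry its allocation, -1 if it would have to fall through to the
 * existing deadlock handling.
 */
int
sketch_pagedaemon_wait(struct sketch_state *st, long npages)
{
	int error;

	st->fpageq_locked = 0;		/* uvm_unlock_fpageq() in the diff */
	error = sketch_bufbackoff(st, npages);
	st->fpageq_locked = 1;		/* uvm_lock_fpageq() in the diff */

	return error == 0 ? 0 : -1;
}
```

With 9000 cached pages and a request for 4, sketch_pagedaemon_wait()
returns 0 and moves 4 pages to the free list. Once the simulated cache
holds fewer pages than requested it returns -1, which corresponds to
falling through to the old path.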
Re: quick fix for uvm deadlocks
On Thu, Feb 06, 2014 at 12:34, Mark Kettenis wrote:

> I believe the scenario you sketched should only land you in
> uvm_wait_pla(), but not in uvm_wait(). Perhaps with the current code
> we can end up in uvm_wait(), but I think those would be bugs where the
> driver I/O paths are doing memory allocations when they really
> shouldn't. Therefore, I'm not sure we should add the bufbackoff()
> call in uvm_wait(). At the very least I'd like to have a printf
> *before* the bufbackoff() call there.

Sure. I didn't observe that path, but since the current fix of printf
and msleep is used in both places, I added bufbackoff in both places.

> I'd also like to see a comment on the bufbackoff() call in
> uvm_wait_pla() explaining that even though we pushed back the buffer
> cache in the page daemon, we might still consume the freed pages when
> paging out dirty pages while scanning.

Sure.
Re: exp() / expl() on Linux and OpenBSD (expl() bug?)
David Coppa wrote:

> Take the following reduced test-case, adapted from what R's code does:
>
> ---8<---
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <math.h>
>
> int main(void) {
> 	double theta = 1;
> 	long double lambda, pr, pr2;
>
> 	lambda = (0.5*theta);
> 	pr = exp(-lambda);
> 	pr2 = expl(-lambda);
>
> 	printf("theta == %g, pr == %Lg, pr2 == %Lg\n", theta, pr, pr2);
> 	exit(0);
> }
>
> ---8<---
>
> This produces the following output on Linux (x86_64):
>
> theta == 1, pr == 0.606531, pr2 == 0.606531
>
> While on OpenBSD -current amd64:
>
> theta == 1, pr == 0.606531, pr2 == nan

FWIW, it looks even stranger on loongson:

$ cc -o expl expl.c -O2 -pipe -lm
$ ./expl
theta == 1, pr == -9.15569e-2474, pr2 == 6.10667e-4944
$ ./expl
theta == 1, pr == 0.606531, pr2 == 0.606531
$ ./expl
theta == 1, pr == -9.15569e-2474, pr2 == 6.10667e-4944
$ sysctl kern.version
kern.version=OpenBSD 5.5-beta (GENERIC) #106: Mon Feb 3 01:47:15 MST 2014
    t...@loongson.openbsd.org:/usr/src/sys/arch/loongson/compile/GENERIC
Re: ip6opt.c
On Tue, Feb 4, 2014 at 8:54 PM, Alexander Bluhm alexander.bl...@gmx.net wrote:
> On Tue, Feb 04, 2014 at 08:35:02PM -0500, Eitan Adler wrote:
> > Hi all,
> >
> > The following bug was recently fixed in DragonFlyBSD and FreeBSD:
> >
> > libc/net: Fix issue in inet6_opt_init() (from RFC 3542):
> >
> > * The RFC says (in section 10.1) that only when extbuf is not NULL,
> >   extlen shall be checked, so don't perform this check when NULL is
> >   passed.
>
> I understand the RFC and the man page the same way, OK bluhm@

Is anyone willing to commit this?

--
Eitan Adler
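
To make the rule under discussion concrete, here is a minimal sketch of
the behaviour RFC 3542 section 10.1 describes: the length is validated
only when a buffer is supplied. This is not the libc source or the
committed diff. The names sketch_inet6_opt_init and sketch_ip6_ext are
hypothetical, and size_t stands in for socklen_t so that the fragment
needs no socket headers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the first two bytes of an IPv6 extension
 * header: next header, then length in 8-byte units minus one.
 * Not the system's struct ip6_ext.
 */
struct sketch_ip6_ext {
	uint8_t next;
	uint8_t len;
};

/*
 * Sketch of inet6_opt_init() as described in RFC 3542, section 10.1.
 * Hypothetical name; an illustration, not the libc implementation.
 */
int
sketch_inet6_opt_init(void *extbuf, size_t extlen)
{
	struct sketch_ip6_ext *ext = extbuf;

	if (ext != NULL) {
		/* extlen is only validated when a buffer is supplied */
		if (extlen == 0 || extlen % 8 != 0)
			return -1;
		ext->len = (uint8_t)((extlen >> 3) - 1);
	}

	/* size of the next header and length fields */
	return 2;
}
```

With this shape, sketch_inet6_opt_init(NULL, 0) returns 2, which is the
case the bug report is about, while a non-NULL buffer with a length that
is not a positive multiple of 8 still returns -1.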
Re: exp() / expl() on Linux and OpenBSD (expl() bug?)
I think I recently ran into a similar issue, and I suspect the root
cause might be the same. I think the floorl function is wrong for
numbers from slightly larger than -1 to slightly below 0. In this range
floorl returns -0 instead of -1.

On Feb 5, 2014, at 3:57 AM, David Coppa dco...@openbsd.org wrote:

> Hi!
>
> I hit this problem while working on updating math/R from version
> 2.15.3 to the latest version (3.0.2).
>
> It started happening when upstream switched from double functions
> to C99 long double functions (expl, fabsl, ...), during the R-3
> development cycle.
>
> Take the following reduced test-case, adapted from what R's code does:
>
> ---8<---
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <math.h>
>
> int main(void) {
> 	double theta = 1;
> 	long double lambda, pr, pr2;
>
> 	lambda = (0.5*theta);
> 	pr = exp(-lambda);
> 	pr2 = expl(-lambda);
>
> 	printf("theta == %g, pr == %Lg, pr2 == %Lg\n", theta, pr, pr2);
> 	exit(0);
> }
>
> ---8<---
>
> This produces the following output on Linux (x86_64):
>
> theta == 1, pr == 0.606531, pr2 == 0.606531
>
> While on OpenBSD -current amd64:
>
> theta == 1, pr == 0.606531, pr2 == nan
>
> And indeed R-3's testsuite fails with the error message
> "NaNs produced":
>
> Warning in pchisq(1e-300, df = 0, ncp = lam) : NaNs produced
> > stopifnot(all.equal(p00, exp(-lam/2)),
> +           all.equal(p.0, exp(-lam/2)))
> Error: all.equal(p.0, exp(-lam/2)) is not TRUE
> Execution halted
>
> Is this a bug in our expl()?
>
> Ciao,
> David
Re: exp() / expl() on Linux and OpenBSD (expl() bug?)
Yup. Does this diff fix it for you?

On 2/6/14, Daniel Dickman didick...@gmail.com wrote:
> I think I recently ran into a similar issue but I suspect the root cause
> might be the same. I think the floorl function is wrong for numbers
> slightly larger than -1 to numbers slightly below 0. In this range floorl
> returns -0 instead of -1.
>
> On Feb 5, 2014, at 3:57 AM, David Coppa dco...@openbsd.org wrote:
>
> > Hi!
> >
> > I hit this problem while working on updating math/R from version
> > 2.15.3 to the latest version (3.0.2).
> >
> > It started happening since upstream switched from double functions
> > to C99 long double functions (expl, fabsl, ...), during the R-3
> > development cycle.
> >
> > Take the following reduced test-case, adapted from what R's code does:
> >
> > ---8<---
> >
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <math.h>
> >
> > int main(void) {
> > 	double theta = 1;
> > 	long double lambda, pr, pr2;
> >
> > 	lambda = (0.5*theta);
> > 	pr = exp(-lambda);
> > 	pr2 = expl(-lambda);
> >
> > 	printf("theta == %g, pr == %Lg, pr2 == %Lg\n", theta, pr, pr2);
> > 	exit(0);
> > }
> >
> > ---8<---
> >
> > This produces the following output on Linux (x86_64):
> >
> > theta == 1, pr == 0.606531, pr2 == 0.606531
> >
> > While on OpenBSD -current amd64:
> >
> > theta == 1, pr == 0.606531, pr2 == nan
> >
> > And indeed R-3's testsuite fails with the error message
> > "NaNs produced":
> >
> > Warning in pchisq(1e-300, df = 0, ncp = lam) : NaNs produced
> > > stopifnot(all.equal(p00, exp(-lam/2)),
> > +           all.equal(p.0, exp(-lam/2)))
> > Error: all.equal(p.0, exp(-lam/2)) is not TRUE
> > Execution halted
> >
> > Is this a bug in our expl()?
> >
> > Ciao,
> > David

Index: s_floorl.c
===================================================================
RCS file: /cvs/src/lib/libm/src/ld80/s_floorl.c,v
retrieving revision 1.2
diff -u -r1.2 s_floorl.c
--- s_floorl.c	25 Jul 2011 16:20:09 -0000	1.2
+++ s_floorl.c	7 Feb 2014 07:01:59 -0000
@@ -35,10 +35,12 @@
 	jj0 = (se&0x7fff)-0x3fff;
 	if(jj0<31) {
 	    if(jj0<0) { 	/* raise inexact if x != 0 */
-		if(huge+x>0.0) {/* return 0*sign(x) if |x|<1 */
-		    if(sx==0) {se=0;i0=i1=0;}
-		    else if(((se&0x7fff)|i0|i1)!=0)
-			{ se=0xbfff;i0=i1=0;}
+		if(huge+x>0.0) {
+			if(sx==0) {
+				return 0.0L;
+			} else if(((se&0x7fff)|i0|i1)!=0) {
+				return -1.0L;
+			}
 		}
 	    } else {
 		i = (0x7fff)>>jj0;
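
As a cross-check of what the patched branch is meant to return, here is
a small reference sketch for inputs with |x| < 1, written with ordinary
comparisons instead of bit manipulation. The helper name
sketch_floor_small is hypothetical and not part of libm. It only covers
the |x| < 1 case that the diff touches and assumes the caller has
already established that range.

```c
/*
 * Reference result of floorl(x) for |x| < 1, the branch touched by the
 * diff. Hypothetical helper for illustration; not part of libm.
 */
long double
sketch_floor_small(long double x)
{
	if (x > 0.0L)
		return 0.0L;	/* (0, 1) rounds down to +0 */
	if (x < 0.0L)
		return -1.0L;	/* (-1, 0) rounds down to -1 */
	return x;		/* +0 and -0 are returned unchanged */
}
```

Comparing floorl(x) from the installed libm against
sketch_floor_small(x) for a value such as -0.5L is one way to see
whether a given system shows the behaviour reported in this thread.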