Re: pipe read returning EAGAIN
On Mon, Feb 08, 2016 at 11:47:44AM +0100, Manuel Bouyer wrote: > > Now the question is why is the POLLIN flag set when there's no data to read ? > zeroing out revents before calling poll(2) doesn't help. > > The man page says: > This implementation differs from the historical one in that a given file > descriptor may not cause poll() to return with an error. In cases where > this would have happened in the historical implementation (e.g. trying to > poll a revoke(2)d descriptor), this implementation instead copies the > events bitmask to the revents bitmask. Attempting to perform I/O on this > descriptor will then return an error. This behaviour is believed to be > more useful. That sounds broken. I think POLLERR should be set after revoke(). However, nothing in the pollfd[] array should cause the poll() call itself to fail. > Does it do so if the file descriptor's error is EAGAIN ? > If so that's not very useful ... You are confused, that'll be for errors looking up the relevant driver. Any error from a previous system call is not remembered (or better not be). It looks as though the poll support for pipes is somehow returning 'readable' when no data is available. Of course there might be some uninitialised memory lurking. David -- David Laight: da...@l8s.co.uk
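Until the cause is found, a caller can defend against a spurious POLLIN by treating EAGAIN from read(2) as "not ready after all" and polling again. The following is only a sketch under that assumption; the helper names set_nonblocking and read_when_ready are made up for illustration and are not part of any NetBSD API.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode. 0 on success. */
int
set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags == -1)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Hypothetical helper: wait for fd to become readable, then read.
 * A POLLIN that turns out to be spurious (read fails with EAGAIN)
 * is not reported as an error; we just poll again.
 * Returns the byte count, 0 on EOF, or -1 with errno set
 * (ETIMEDOUT if poll timed out).
 */
ssize_t
read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd;

	for (;;) {
		pfd.fd = fd;
		pfd.events = POLLIN;
		pfd.revents = 0;

		int n = poll(&pfd, 1, timeout_ms);
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0) {
			errno = ETIMEDOUT;
			return -1;
		}

		ssize_t r = read(fd, buf, len);
		if (r == -1 && (errno == EAGAIN || errno == EINTR))
			continue;	/* spurious wakeup: poll again */
		return r;
	}
}
```

This only hides the symptom. If poll keeps reporting 'readable' with nothing to read, the loop will spin, so it is no substitute for fixing the pipe code.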
Re: Bad sleep time resolution of nanosleep(2)
On Tue, Nov 24, 2015 at 01:58:15AM +0100, Rhialto wrote: > > > > Well, it is rounded up first to whole ticks, that's the easy part. Next > > the callout is scheduled at the tick boundary and then the LWP is > > unblocked and scheduled again. It will run in the next scheduling cycle > > unless nothing else is running? > > I tried it on some fairly idle machines, and the result was quite > consistent. It really looks like there is something in there that > inadvertently always causes an extra tick delay. The extra tick is added to ensure that the minimum sleep time is met. Otherwise the sleep will be too short if called just before a tick. David -- David Laight: da...@l8s.co.uk
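The arithmetic can be modelled in a few lines. This is an illustrative model only, not the kernel's actual conversion code; the function name sleep_ticks is invented here, and hz is assumed to be non-zero.

```c
#include <stdint.h>

/*
 * Illustrative model (not kernel source): convert a requested sleep
 * in nanoseconds to clock ticks at `hz` ticks per second.
 * The request is rounded up to whole ticks, then one extra tick is
 * added because the current tick is already partly used up.
 */
uint64_t
sleep_ticks(uint64_t ns, unsigned int hz)
{
	const uint64_t ns_per_tick = UINT64_C(1000000000) / hz;
	uint64_t ticks = (ns + ns_per_tick - 1) / ns_per_tick;	/* round up */

	return ticks + 1;	/* guarantee the minimum sleep time */
}
```

In this model, with hz = 100, a 1 ms request becomes 2 ticks, so the wakeup lands somewhere between 10 ms and 20 ms after the call depending on where in the current tick the call was made.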
Re: schizophrenic GCC versions in -current?
On Wed, Sep 16, 2015 at 03:27:32PM -0500, John D. Baker wrote: > On Wed, 16 Sep 2015, John D. Baker wrote: > > > Just in case there was a snafu due to my preference for update builds, > > I'm rebuilding in non-update mode to see if the two strings can be made > > to agree. > > This seems to have been the case. Following a non-update build, the > version string reported by the "--version" option matches the internal > symbol. For some other projects I add a make dependency for any object files that contain the version against all the other object files. That ensures the version (and build date) is always correct. David -- David Laight: da...@l8s.co.uk
Re: Help needed with a stubborn Gateway box!
On Fri, Mar 27, 2015 at 08:44:07PM +0800, Paul Goyette wrote: So, any suggestions on how to proceed? 1) Is my plan to use the ubuntu-installed copy of grub to boot the NetBSD boot.iso media successful? 2) Is my plan to leave the ubuntu-installed copy of grub on the disk (rather than writing new boot blocks) going to work? ubuntu will have installed grub2. AFAICT grub2 is only of any use if you want to do exactly what 'they' expect you to do with it. Which basically assumes you are running linux. 3) Is there some other way of getting this beast booted into NetBSD? Boot from USB? you might find the bios will boot a cd image written to a usb memory stick. David -- David Laight: da...@l8s.co.uk
Re: firefox eats threads
On Wed, Mar 18, 2015 at 05:17:29PM +, Eric Haszlakiewicz wrote: On March 18, 2015 11:01:15 AM EDT, Tobias Nygren t...@netbsd.org wrote: Firefox names all its threads by type with pthread_setname_np(3). The following command is useful to find out what kind of threads are in use: $ ps -sp 12501 -O lname Firefox after startup pools 45 threads so that's one third of your available LWPs. My opinion is that the default ulimits on amd64 have not caught up with the times. 1024 would be a more reasonable figure than 128 and 160 for open files and lwps. Fwiw, chrome/chromium fires up 69 threads, although I've only examined it running on a linux box. Those limits are clearly inadequate. They should probably be calculated based on the machine resources, such as the amount of memory available. The 'hard' ulimit values also need reducing. But yes, most of the 'system wide' limits (even processes for root) could be usefully replaced by checks against free kva, swap and physical memory. The problem is picking the values. Look at what MAXUSERS does :-) David -- David Laight: da...@l8s.co.uk
Re: gpt booting status?
On Wed, Dec 24, 2014 at 09:30:31PM -0500, Greg Troxel wrote: and a further question: I know /boot (with MBR) can skip the raidframe header. So given a disk with a single MBR partition of type RAID, and an inside-the-RAID1 disklabel with raid0a starting at 0, booting works. (I'm sure because I do this all the time.) But, if I have gpt, with a RAID partition, and in the RAID1 have another gpt label, and in that a partition, is there any way to boot from that? Basically I'm thinking sd0 A=gpt partition 1, type raid, starts at 1024, big raid0 (so starts at 1024*64) B=gpt partition 1, starts at say 1024+64+64 and would like the bootxx_ffsv2 code written to the beginning of A to see type 'raid', skip 64, and then interpret gpt vs mbr and find the active inner gpt partition. The code that reads /boot (last time I looked) doesn't inspect any inner labels (of any type). It just had a nasty hack to look for a filesystem a further 64 sectors down the disk if it doesn't find /boot in the expected place. Maybe a gpt disk has space for a larger boot image? In which case it might be possible to have more code to find /boot. David -- David Laight: da...@l8s.co.uk
Re: 4k sector disks
On Wed, Sep 03, 2014 at 11:07:32AM +0100, Robert Swindells wrote: Is there any special configuration needed to use 4k sector disks efficiently ? I have a couple of SATA drives with 4k sectors, the disklabels for them give a sector size of 512 bytes but 'atactl identify' shows the true sector size. I used 'newfs -S 4096' on one of them, a new SSD, but am wondering whether to copy stuff off the other one and repartition. (as stated elsewhere, label it with 512 byte sectors) If it is an SSD the actual sector size is likely to be much higher than 4k. I can't actually imagine an SSD emulating 512 byte sectors in 4k ones and then doing the required RMW cycles (with wear leveling) that the actual memory requires. OTOH doing larger (aligned) transfers will help. I'd certainly ensure that everything is aligned and that the fragment and block sizes are large. But remember the boot code doesn't have enough memory for very large blocks. David -- David Laight: da...@l8s.co.uk
Re: cpuctl panic(!)
On Wed, Jun 18, 2014 at 12:43:57PM +0100, Patrick Welche wrote:

Surprise (-current/amd64):

  # cpuctl identify 0
  cpu0: highest basic info 000d
  cpu0: highest extended info 8008
  cpu0: Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
  cpu0: Intel Xeon E3-12xx, 2nd gen i7, i5, i3 2xxx (686-class), 2492.10 MHz
  cpu0: family 0x6 model 0x2a stepping 0x7 (id 0x206a7)
  ...
  cpu0: xsave features 0x7<x87,SSE,AVX>
  cpu0: xsave instructions 0x1<XSAVEOPT>
  cpu0: xsave area size: current 832, maximum 832, xgetbv enabled
  [1] Segmentation fault (core dumped) cpuctl identify 0

  Program terminated with signal 11, Segmentation fault.
  #0 0x004053d0 in x86_xgetbv ()
  (gdb) bt
  #0 0x004053d0 in x86_xgetbv ()
  #1 0x0040467d in identifycpu (fd=3, cpuname=0x7f7fdb30 "cpu0") at /usr/src/usr.sbin/cpuctl/arch/i386.c:1824
  #2 0x00401cd0 in cpu_identify (argv=0x7f7fdbd8) at /usr/src/usr.sbin/cpuctl/cpuctl.c:277
  #3 0x00401644 in main (argc=2, argv=0x7f7fdbd0) at /usr/src/usr.sbin/cpuctl/cpuctl.c:116

The cpu features indicate that xgetbv is available, but when it is executed the cpu faults. Clearly that shouldn't happen. IIRC qemu is buggy - is that bare metal?

David

-- David Laight: da...@l8s.co.uk
Re: USB 3.0 status in NetBSD-current?
On Sun, Jun 15, 2014 at 06:30:22PM -0500, Jonathan A. Kollasch wrote: On Wed, Jun 11, 2014 at 05:47:28AM +, Thomas Mueller wrote: Is there, or is there supposed to be, USB 3.0 support in the current kernel? I see xhci in kernel config, but have not yet been able to access anything on a USB 3.0 port. Use a USB 2.0 cable in between to force USB 2.0 speeds. That may not help. A USB2 cable should still leave you using the xhci driver - just at the lower speed. There is some 'magic' needed to hand over the port from ohci? to xhci (which probably requires correct parsing of ACPI data to work out which usb2 port the xhci port is linked to). If the port isn't handed over (ie no xhci support in the kernel) the USB port should still run at USB2 speeds. There are also significant differences between the xhci hardware. Some of which are definitely bugs, some are probably documented bugs, others are just the hardware engineers making life extremely difficult for the software engineers. For example: The xhci controller supports arbitrary scatter gather except: 1) The maximum fragment size is 64k. 2) Fragments can't cross 64k address boundaries. 3) The end of a ring segment must happen at the end of a USB packet. David -- David Laight: da...@l8s.co.uk
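To make the first two constraints concrete, here is a small sketch of how a driver might chop a transfer into fragments. It is not taken from any xhci driver; struct frag and split_64k are hypothetical names, and constraint 3 (ring segment ends) is not modelled at all.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAG_BOUNDARY	UINT64_C(0x10000)	/* 64k */

/* Hypothetical fragment descriptor: bus address plus length. */
struct frag {
	uint64_t addr;
	uint32_t len;
};

/*
 * Split [addr, addr + len) into fragments that never cross a 64k
 * address boundary. A fragment that does not cross a boundary is
 * automatically at most 64k long, so both limits are met.
 * Returns the number of fragments needed; at most `max` are stored.
 */
size_t
split_64k(uint64_t addr, uint64_t len, struct frag *out, size_t max)
{
	size_t n = 0;

	while (len > 0) {
		/* Bytes left before the next 64k boundary. */
		uint64_t room = FRAG_BOUNDARY - (addr & (FRAG_BOUNDARY - 1));
		uint64_t chunk = len < room ? len : room;

		if (n < max) {
			out[n].addr = addr;
			out[n].len = (uint32_t)chunk;
		}
		n++;
		addr += chunk;
		len -= chunk;
	}
	return n;
}
```

A 32-byte buffer starting at 0xFFF0 therefore needs two fragments of 16 bytes each, even though it is tiny, which is the sort of thing that makes 'arbitrary' scatter gather less arbitrary than it sounds.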
Re: gcc48, drmkms issues with i386
On Mon, Apr 14, 2014 at 11:45:27PM +0900, Masao Uebayashi wrote: On Thu, Apr 10, 2014 at 3:40 AM, David Laight da...@l8s.co.uk wrote: On Wed, Apr 09, 2014 at 09:10:42AM -0500, John D. Baker wrote: On Wed, 9 Apr 2014, John D. Baker wrote: disk, the last part of the display actually looked like: prot_to_real: can't return to 0001296DFn: Diskn ... All the calls to 'prot_to_real' have to reside in the first 64k of the code area. The code then bombs out back to the outer loader. s/prot_to_real/real_to_prot/ Doesn't matter, they always appear as a pair. http://nxr.netbsd.org/xref/src/sys/arch/i386/stand/boot/Makefile.boot#131 This is quite a hack... And one I'm proud of :-) An alternative would be to put all the functions that call prot_to_real into a separate code section, and then arrange for that to get placed before the normal .code section. Trouble is, that probably requires a linker script. David -- David Laight: da...@l8s.co.uk
Re: gcc48, drmkms issues with i386
On Wed, Apr 09, 2014 at 09:50:02PM +, Christos Zoulas wrote: Plausibly prot_to_real could set the real mode $cs value to one appropriate for the return address. The calls are all from assembler and are followed by a bios call and then a call to real_to_prot. If that were done the /boot code itself could probably be linked with a virtual base address of 1MB and run with virtual == physical removing the confusing offset. Do you want to take a stab at fixing it? It would take me an order of magnitude longer to do the same. Not for at least a couple of weeks. David -- David Laight: da...@l8s.co.uk
Re: gcc48, drmkms issues with i386
On Wed, Apr 09, 2014 at 09:10:42AM -0500, John D. Baker wrote: On Wed, 9 Apr 2014, John D. Baker wrote: disk, the last part of the display actually looked like: prot_to_real: can't return to 0001296DFn: Diskn Should have been: prot_to_real: can't return to 000129CD Fn: Diskn The amd64-built version behaves the same. The only difference was the address reported in the message above: 00012D19 All the calls to 'prot_to_real' have to reside in the first 64k of the code area. The code then bombs out back to the outer loader. The linker used to manage that, but it might have been relying on the linker putting object files into a section in the order they were specified on the command line. Plausibly prot_to_real could set the real mode $cs value to one appropriate for the return address. The calls are all from assembler and are followed by a bios call and then a call to real_to_prot. If that were done the /boot code itself could probably be linked with a virtual base address of 1MB and run with virtual == physical removing the confusing offset. David -- David Laight: da...@l8s.co.uk
Re: fontconfig/freetype2 breaks amd64 build on netbsd-5/i386 host
On Tue, Mar 25, 2014 at 07:53:12PM -0500, John D. Baker wrote:

... Here's the result of running objdump -dr against the object file as it exists on my filesystem (not extracted from library):

  /d0/build/current/obj/amd64/external/mit/xorg/lib/freetype/ftxf86.o: file format elf64-x86-64

  Disassembly of section .text:

  FT_Get_X11_Font_Format:
     0: 48 85 ff                test   %rdi,%rdi
     3: 74 1a                   je     1f FT_Get_X11_Font_Format+0x1f
     5: 48 8b bf b0 00 00 00    mov    0xb0(%rdi),%rdi
     c: 48 8b 07                mov    (%rdi),%rax
     f: 48 8b 40 40             mov    0x40(%rax),%rax
    13: 48 85 c0                test   %rax,%rax
    16: 74 07                   je     1f FT_Get_X11_Font_Format+0x1f
    18: be 00 00 00 00          mov    $0x0,%esi
                        19: R_X86_64_32 .rodata.str1.1
    1d: ff e0                   jmpq   *%rax
    1f: 31 c0                   xor    %eax,%eax
    21: c3                      retq

I extracted the module from the library and ran 'objdump -dr' on it. It's the same.

That isn't PIC code, the PIC version is in ftxf86.pico

I've not checked the .a from a working build, but since a .so is being generated it ought to contain the .pico versions. I wonder how that is supposed to happen? Maybe a parallel make happened to leave the wrong file lurking?

David

-- David Laight: da...@l8s.co.uk
Re: Recent new atf test failures
On Wed, Mar 26, 2014 at 12:25:53PM -0700, Paul Goyette wrote: Some time in the last two weeks, we've had a few new test cases failing in my amd64 test-bed. Tests that used to pass, but currently failing lib/csu/t_crt0/initfini3 atf/atf-c/macros_test/detect_unused_tests atf/atf-c++/macros_test/detect_unused_tests Tests that currently fail, but don't seem to exist in older builds lib/libm/t_exp/exp2_powers lib/libm/t_exp/exp2_values Those are some more extensive tests for exp2(). The exp2_powers tests are failing to generate an 'overflow' result. They work for me on a real system - so it might be a qemu issue? Maybe qemu is using the x87 fpu (with 80 bit precision) to emulate the 64bit (and 32bit) SSE2 double (and float) maths - so the large multiplies used to generate overflow fail. Actually, I wonder, have you rebuilt qemu since joerg changed the default x87 precision to 80bits? That might be the difference between your tests and gson's tests (which only show some minor errors for exp2f(7.7) and exp2f(8.8)). The exp2_values test is showing up something odd in FP maths. I've not changed the exp2f() code, but I'm seeing different errors in my own testing (native on amd64) from earlier tests. However it might just be that the allowed error is too small. I've a local version of exp2() that uses the x87 'f2xm1' and 'fscale' instructions on both i386 and amd64. I do need to do a clock-count comparison for 'f2xm1', but I expect it to be faster than the table lookup and 5th degree polynomial. Intel claim these functions are monotonic, I bet the polynomial version isn't. David -- David Laight: da...@l8s.co.uk
Re: Recent new atf test failures
On Wed, Mar 26, 2014 at 02:57:15PM -0700, Paul Goyette wrote: On Wed, 26 Mar 2014, David Laight wrote: Actually, I wonder, have you rebuilt qemu since joerg changed the default x87 precision to 80bits? That might be the difference between your tests and gson's tests (which only show some minor errors for exp2f(7.7) and exp2f(8.8)). No, I have not updated my qemu recently (several months). That change was somewhere near the end of last year. The behaviour depends on the binutils version at the time the program was linked. There isn't a sysctl to force 64 or 80 bit modes. David -- David Laight: da...@l8s.co.uk
Re: fontconfig/freetype2 breaks amd64 build on netbsd-5/i386 host
On Tue, Mar 25, 2014 at 04:25:36PM -0500, John D. Baker wrote: Following the updates/fixes to fontconfig/freetype2 in -current, building for amd64 target on my netbsd-5/i386 host consistently fails as follows: [...] --- libfontconfig.so.2.2 --- # build src/libfontconfig.so.2.2 rm -f libfontconfig.so.2.2 /d0/build/current/tools/i386/bin/x86_64--netbsd-gcc -Wl,-x -shared -Wl,-soname,libfontconfig.so.2 -Wl,--warn-shared-textrel -Wl,-Map=libfontconfig.so.2.map --sysroot=/d0/build/current/DEST/amd64 -Wl,-rpath,/usr/X11R7/lib -L=/usr/X11R7/lib -o libfontconfig.so.2.2 -Wl,-rpath-link,/d0/build/current/DEST/amd64/lib -L=/lib -Wl,--whole-archive libfontconfig_pic.a -Wl,--no-whole-archive -L/d0/build/current/obj/amd64/external/mit/expat/lib/libexpat -lexpat -L/d0/build/current/obj/amd64/external/mit/xorg/lib/freetype -lfreetype /d0/build/current/tools/i386/lib/gcc/x86_64--netbsd/4.8.3/../../../../x86_64--netbsd/bin/ld: /d0/build/current/DEST/amd64/usr/X11R7/lib/libfreetype.a(ftxf86.o): relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC /d0/build/current/DEST/amd64/usr/X11R7/lib/libfreetype.a: could not read symbols: Bad value collect2: error: ld returned 1 exit status *** [libfontconfig.so.2.2] Error code 1 nbmake[9]: stopped in /x/current/src/external/mit/xorg/lib/fontconfig/src 1 error Can you find the command line used to compile ftxf86.o ? and/or extract the object file from the library and feed it through 'objdump -dr' to find the relocation (and to see if it looks like PIC code at all). David -- David Laight: da...@l8s.co.uk
Re: i386 and amd64 AVX support
On Tue, Mar 11, 2014 at 08:51:06PM +, Alexander Nasonov wrote: David Laight wrote: I've committed code to the amd64 and i386 kernels that enables AVX for userspace. In particular the high ymm registers should be saved on context switches. Any additional testing would be welcome. Thanks for working on it. I resumed playing with avx instructions and I haven't found any problem so far. Still some stuff to tidy up. Mostly: - make avx registers available to signal handlers. - and to process core dumps - add to ptrace for gdb. The process core dump code is particularly problematical (especially for cpus that support avx512) since it relies on several on-stack copies of the fpu state (over 2k with avx512). Might require major rework of the core dump code (made more complicated by the requirement to be able to write core dumps to pipes). David -- David Laight: da...@l8s.co.uk
Re: posix_memalign conflict between /usr/include files
On Sat, Mar 08, 2014 at 05:17:33PM +0100, Martin Husemann wrote: On Sat, Mar 08, 2014 at 10:35:03PM +0900, Ryo ONODERA wrote: How to handle this issue? The throw() needs to be removed. I remember a discussion about this before. But I can't remember what the throw() is about - especially on a function with a C interface. David -- David Laight: da...@l8s.co.uk
Re: Porting DTrace to ARM
On Thu, Mar 06, 2014 at 03:34:18PM +0900, Ryota Ozaki wrote: On Thu, Mar 6, 2014 at 2:21 PM, Masao Uebayashi uebay...@gmail.com wrote: Ah. I misread that schedstate_percpu uses percpu(9)'s fast path, which doesn't exist... Anyway if it's assumed that cpu is not attached at run-time, assigning struct cpu_data::void *cpu_dtraceinfo at module attachment would be just fine. void * is probably good, otherwise we have to pull out structure definitions (ok, there are two: solaris_cpu_t and cpu_core_t) from external/cddl. opensolaris_init in external/cddl/osnet/sys/kern/opensolaris.c is a good place to assign, I think. Using 'void *' causes problems with knowing which pointer is valid for a given call. There is no problem using 'struct foo *' without the contents of 'struct foo' being visible. David -- David Laight: da...@l8s.co.uk
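The point about incomplete types can be shown in a few lines of C. All the names below (dtrace_cpu_info, cpu_data_sketch and the two accessors) are invented for the illustration and do not match the real structures in sys/ or external/cddl.

```c
#include <stddef.h>

/* Forward declaration only: the layout is not visible here. */
struct dtrace_cpu_info;

/* Stand-in for a per-CPU structure; the names are hypothetical. */
struct cpu_data_sketch {
	struct dtrace_cpu_info *cpu_dtraceinfo;	/* typed, yet opaque */
};

/* Storing and fetching the pointer needs no knowledge of the layout. */
void
cpu_set_dtraceinfo(struct cpu_data_sketch *cd, struct dtrace_cpu_info *info)
{
	cd->cpu_dtraceinfo = info;
}

struct dtrace_cpu_info *
cpu_get_dtraceinfo(const struct cpu_data_sketch *cd)
{
	return cd->cpu_dtraceinfo;
}

/*
 * Only the code that owns the structure needs the full definition.
 * In a real tree this would live in a different header or source file.
 */
struct dtrace_cpu_info {
	int cpu_id;
};
```

Compared with 'void *', the compiler will now complain if some other pointer type is stored in the field by mistake, while generic code still never sees the contents.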
i386 and amd64 AVX support
I've committed code to the amd64 and i386 kernels that enables AVX for userspace. In particular the high ymm registers should be saved on context switches. Any additional testing would be welcome. At the moment there is no support for gdb and the ymm registers are not written to core dumps, nor available to signal handlers. Note that the ymm registers are caller-saved so should be don't care on all system calls, so context switches from interrupt routines are needed to actually test whether they are saved properly. The code should also support the upcoming AVX-512, although stealing another 2k from the kernel stack might cause problems! David -- David Laight: da...@l8s.co.uk
Re: Build break for port-hppa
On Wed, Feb 26, 2014 at 01:13:34PM -0800, Paul Goyette wrote: Ooops - hit send too soon... With sources updated on 2014-02-26 at 15:29:31 UTC #create ramdisk/ramdisk.fs Calculated size of `ramdisk.fs.tmp': 256 bytes, 1436 inodes Extent size set to 4096 ramdisk.fs.tmp: 2.4MB (5000 sectors) block size 4096, fragment size 512 using 1 cylinder groups of 2.44MB, 625 blks, 1664 inodes. super-block backups (for fsck -b #) at: 32,nbmakefs: Writing inode 1415 (work/./usr/mdec/boot), bytes 36864 + 4096: Nospace left on device Populating `ramdisk.fs.tmp' Which architecture ? David -- David Laight: da...@l8s.co.uk
Re: 6.99.32: panic when starting X
On Sun, Feb 23, 2014 at 09:56:55PM +0100, Thomas Klausner wrote: On Sun, Feb 23, 2014 at 10:34:32AM +, Nick Hudson wrote: On 02/23/14 09:41, Thomas Klausner wrote: Also, x/i in ddb/crash that address and show registers (gdb) x/i usb_allocmem_flags+0x6c 0x808dbe2c usb_allocmem_flags+108: cmp %rbx,(%rcx) I assume usb_allocmem_flags+0x6c is 0x808dbe2c Correct! Does this help? I have the kernel (without symbols) and the crash dump if you want to know more or look at it. The kernels I've built don't have a 'cmp' instruction anywhere near that offset in usb_allocmem_flags. The function isn't that big, so if you run 'objdump -d /netbsd > netbsd.dis' and search for the function body you'll only have about 120 lines. I can usually work out the source lines from that. (gdb's 'disas usb_allocmem_flags' probably gives the same lines.) David -- David Laight: da...@l8s.co.uk
Re: 6.99.32: panic when starting X
On Sun, Feb 23, 2014 at 10:26:21PM +, David Laight wrote: On Sun, Feb 23, 2014 at 09:56:55PM +0100, Thomas Klausner wrote: On Sun, Feb 23, 2014 at 10:34:32AM +, Nick Hudson wrote: On 02/23/14 09:41, Thomas Klausner wrote: Also, x/i in ddb/crash that address and show registers (gdb) x/i usb_allocmem_flags+0x6c 0x808dbe2c usb_allocmem_flags+108: cmp %rbx,(%rcx) I assume usb_allocmem_flags+0x6c is 0x808dbe2c Correct! Does this help? I have the kernel (without symbols) and the crash dump if you want to know more or look at it. The kernels I've built don't have a 'cmp' instruction anywhere near that offset in usb_allocmem_flags. The function isn't that big, so if you run 'objdump -d /netbsd > netbsd.dis' and search for the function body you'll only have about 120 lines. I can usually work out the source lines from that. (gdb's 'disas usb_allocmem_flags' probably gives the same lines.)

Thomas sent me the disassembly. It 'blew up' dereferencing block->tag in the loop:

	mutex_enter(&usb_blk_lock);
	/* Check for free fragments. */
	LIST_FOREACH(f, &usb_frag_freelist, next) {
		KDASSERTMSG(usb_valid_block_p(f->block, &usb_blk_fraglist),
		    "%s: usb frag %p: unknown block pointer %p",
		    __func__, f, f->block);
		if (f->block->tag == tag)
			break;
	}

I'd guess a 'use after free' or 'allocate too short a buffer'.

David

-- David Laight: da...@l8s.co.uk
Re: updates to ls(1), output, and Emacs dired mode
On Sat, Feb 22, 2014 at 09:40:38PM +, Patrick Welche wrote: On Sat, Feb 22, 2014 at 09:55:48PM +0900, Ryo ONODERA wrote: From: chris...@astron.com (Christos Zoulas), Date: Fri, 21 Feb 2014 02:11:36 + (UTC) In article cabfrot8bczo+czrp-tffrc3j-qjdcp1grdkcjnujpq_jojt...@mail.gmail.com, B Harder brad.har...@gmail.com wrote: I suspect that the recent changes to ls have affected its output, which affects Emacs dired mode (it parses ls output). 1) Am I correct output has changed? 2) if yes, is this expected behaviour? No, output should not have changed unless the new options are used. With ls.c 1.71, output of ls -w is broken. /usr/src/bin/ls% LANG=C ./ls -w . . . . . . . . . . . . . . . . . I noticed that as ls | more giving a different result to ls. ls | more implies ls -1 | more. I'm not sure you can actually get the terminal output into a file (without using something like script). David -- David Laight: da...@l8s.co.uk
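The reason 'ls | more' behaves like 'ls -1 | more' is that the default layout depends on whether standard output is a terminal. A minimal sketch of that decision follows; it is written for illustration and is not the code from ls.c, and the enum and function names are made up.

```c
#include <unistd.h>

/* Hypothetical names for the two default layouts. */
enum layout { LAYOUT_SINGLE_COLUMN, LAYOUT_MULTI_COLUMN };

/*
 * Pick the default layout for output going to `fd`:
 * columns on a terminal, one name per line for pipes and files.
 */
enum layout
default_layout(int fd)
{
	return isatty(fd) ? LAYOUT_MULTI_COLUMN : LAYOUT_SINGLE_COLUMN;
}
```

So anything that captures the output through a pipe or a file sees the single-column form, which is why something like script(1), which provides a pty, is needed to capture what the terminal shows.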
Re: amd64 build broken - npx.h not marked obsolete
On Thu, Feb 13, 2014 at 07:28:07AM -0800, Paul Goyette wrote: With up-to-date sources I'm getting == 1 missing files in DESTDIR Files in flist but missing from DESTDIR. File wasn't installed ? -- ./usr/include/i386/npx.h end of 1 missing files == Should this file be marked obsolete in src/distrib/sets/lists/comp/md.i386 and /md.amd64 ? I'd obsoleted it for i386, I'd not realised it was released for amd64. Marked obsolete now. It must be possible to avoid having to edit so many files... David -- David Laight: da...@l8s.co.uk
Re: Dozens of new test failures on amd64!
On Wed, Feb 12, 2014 at 09:59:12AM -0800, Paul Goyette wrote: It seems to correspond with the recent changes/commits to atf ... We used to have 11 test failures for amd64, now we have 65! Please see [1] for details... [1] http://whooppee.com/amd64-results/6327_1_atf.html#failed-tcs-summary Something strange happens with this on my system as well. *** Check failed: /test-bed/src/tests/lib/libm/t_fmod.c:53: fabs(fmod(1.0, 0.1) - 0.1) = 55 * DBL_EPSILON not met It might be my fault! I've been fiddling with the fpu code. Except that it works on a bare-metal kernel I built 8pm on Sunday just before committing the code but fails under qemu. (Running on the same kernel.) Mind you the generated code is very strange! Ah that is because it uses the x87's 'partial remainder' instruction in a loop, and under some other conditions falls back on the fmod() library function. David -- David Laight: da...@l8s.co.uk
Re: Another 6.99.31 amd64 panic
On Tue, Feb 11, 2014 at 05:28:11PM +, Christos Zoulas wrote: In article CAG0OUxizzaDgjffmfKU1tSiPwYLi-+AUS+98mNgv=e6oqkc...@mail.gmail.com, Chavdar Ivanov ci4...@gmail.com wrote: Same with a kernel from today. Chavdar On 10 February 2014 16:38, Chavdar Ivanov ci4...@gmail.com wrote: From a build at 2014/02/09 14:29 I get: ... boot device: raid0 root on raid0a dumps on raid0b root file system type: ffs uvm_fault(0xfe8006d1ce60, 0x0, 4) - e uvm_fault(0xfe8006d1ce60, 0x0, 4) - e fatal page fault in supervisor mode trap type 6 code 0 rip 807d428e cs 8 rflags 10246 cr2 0 ilevel 0 rsp fe8006d09560 curlwp 0xfe8006d2fa00 pid 1.1 lowest kstack 0xfe8006d06000 kernel: page fault trap, code=0 Stopped in pid 1.1 (init) at netbsd:trap+0x99b: movzwl 0(%rax),%eax db{1} bt trap() at netbsdL:trap+0x99b --- trap (number 6) --- ?() at 0 execve_loadvm() at netbsd:execve_loadvm+0x1d6 execve1() at netbsd:execve1+0x2d start_init() at netbsd:start_init+0x2a7 db{1} movq 256(%rbx), %rdx movq %rbx, %rsi movq -88(%rbp), %rdi call check_exec - movl %eax, %r13d testl %eax, %eax That does not look correct, can you use objdump --disassemble on kern_exec.o then compile kern_exec.c changing on the compile line s/-c/-S -gstabs/ and see which source line corresponds to your failing instruction by matching the offset from kern_exec.o to the instruction in kern_exec.s and then finding the source line to kern_exec.c? objdump -r -d kern_exec.o will lookup the relocations for you. But I'd guess that the backtrace has missed a function and the fault is somewhere inside check_exec(). The address 807d428e printed by the fault code is probably correct. Try 'objdump -d /netbsd' and sort out which function it is in. David -- David Laight: da...@l8s.co.uk
Re: Automated report: NetBSD-current/i386 build failure
On Tue, Feb 11, 2014 at 10:37:03PM +, NetBSD Test Fixture wrote: This is an automatically generated notice of a NetBSD-current/i386 build failure. The failure occurred on babylon5.NetBSD.org, a NetBSD/amd64 host, using sources from CVS date 2014.02.11.20.17.16. An extract from the build.sh output follows: File is obsolete or flist is out of date ? -- ./usr/include/x86/fpu.h = end of 1 extra files === Gah, I forgot that would creep into the i386 build already. I'll add it. David -- David Laight: da...@l8s.co.uk
Re: Re: compat linux exec arguments weirdness
On Sun, Feb 09, 2014 at 08:41:56PM +0100, Onno van der Linden wrote: On Sun, Feb 09, 2014 at 08:45:40AM -0800, Chuck Silvers wrote: Looks like the implementation of AT_RANDOM messes up the argument stack (at least for the elf32 case, can't test the amd64 case myself). this should be fixed now, please update and give it a try. Works! Thanks very much, you can close the PR as far as I'm concerned. And now on to that firefox 27 compile error . :-) If that is the one to do with fxsave64, I've committed a fix. David -- David Laight: da...@l8s.co.uk
Re: kernel crashes because crypto unloading?
On Sun, Jan 19, 2014 at 09:49:42AM -0800, Paul Goyette wrote: On Sun, 19 Jan 2014, Paul Goyette wrote: I would have expected config_cfdata_detach() to fail (with EBUSY) if the device was still open by someone. So I'm not sure who/what still owns allocations from the module's memory pool. Hmmm, I guess I misunderstood something. It seems that there is no protection against detaching a device even when it is currently open. A quick-and-dirty program that simply opens /dev/crypto and sleeps shows that the module gets unloaded. I'm not sure at this point if the crypto(4) driver should implement a ref-count, or if a more generic solution should be created within the autoconf(9) framework. The module can't do its own refcounting. Think about what happens in the 'close' code on the last close. The driver will decrement the ref count to zero. The process gets pre-empted. The driver gets unloaded. The process resumes in code that is no longer there. So open/close (well, probably the vnode) needs to hold a reference count against the device. There is another race as well. If a loadable kernel module creates a kernel thread, then it has to request a module reference for that thread. When the thread exits it must do so by calling into the kernel requesting that the thread exit AND that the module reference count be reduced. Most of the time you should be able to assume that the code calling into the module holds a reference (possibly indirectly) that ensures the module won't go away. David -- David Laight: da...@l8s.co.uk
Re: bootxx_ffsv1 compilation failure on amd64
On Wed, Jan 15, 2014 at 02:37:17PM -0800, crazzybouy wrote: Hi All I stumbled upon this post while looking for ways to recompile bootxx_ffsv1. I need to put some prints to the boot loader to debug an issue for loading netbsd kernel with ramdisk size bigger than 16mb that does not work for me on an AMD64 board. Does it work for ramdisk + kernel 15MB at all? IIRC /boot is loaded at 64k (with a limit of 640k) and the kernel is loaded at 1M. The BIOS calls used to read the disk use a 16bit real mode seg:off address so can only generate 20bit addresses - so a 16MB limit. Loading any higher would require a low memory 'bounce' buffer. David -- David Laight: da...@l8s.co.uk
Re: evtchn_do_event: handler...didn't lower ipl (Was: Re: xl or xm for xen)
On Tue, Dec 03, 2013 at 08:55:06AM +0700, Robert Elz wrote: When statclock() - and hardclock() before it - is (or are) called, the cpu (apparently) already holds a (spin) mutex (the ci_mtx_count field of the cpu_info struct is -1). Given that, and the way spin mutexes work, statclock() (and then hardclock()) must return with the ipl higher. I'd have thought that acquiring a mutex would increase the count. So a count of -1 would indicate an extra release. Or does this counter have silly values? David -- David Laight: da...@l8s.co.uk
Re: ld.elf_so i386 memcpy corruption - calligrawords hangs
On Thu, Oct 17, 2013 at 10:45:08AM +0200, Martin Husemann wrote: You could uncomment the following lines in the src/libexec/ld.elf_so/Makefile #CPPFLAGS+= -DDEBUG #CPPFLAGS+= -DRTLD_DEBUG (re-)build and install ld.elf_so, and set LD_DEBUG=1 when starting the program. Better is to link your program with the alternate elf interpreter name. Then you don't affect anything else. If the filenames are the same length you should be able to find the string in the elf program and patch it (technically it is a shared string - but it is unlikely to be used twice). David -- David Laight: da...@l8s.co.uk
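The same-length patching trick can be sketched as below. This is an illustration only: it works on a buffer already read into memory, does not parse the ELF headers, and the name patch_same_length is invented. /usr/libexec/ld.elf_so is the usual interpreter path; any replacement name of the same length (for example a debug copy called ld.elf_db) is purely hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Overwrite the first occurrence of `old_path` (including its
 * terminating NUL) in `image` with `new_path`.
 * Refuses to do anything unless both strings have the same length,
 * so no offsets inside the file change.
 * Returns 0 on success, -1 if the lengths differ or the string
 * is not found.
 */
int
patch_same_length(unsigned char *image, size_t image_len,
    const char *old_path, const char *new_path)
{
	size_t n = strlen(old_path) + 1;	/* include the NUL */

	if (strlen(new_path) + 1 != n || n > image_len)
		return -1;

	for (size_t i = 0; i + n <= image_len; i++) {
		if (memcmp(image + i, old_path, n) == 0) {
			memcpy(image + i, new_path, n);
			return 0;
		}
	}
	return -1;
}
```

Patching the first match is only safe on the assumption stated in the message, namely that the string is not shared with some other use in the binary; working on a copy of the program is the obvious precaution.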
Re: link problems
On Fri, Oct 11, 2013 at 10:14:55AM +0200, Martin Husemann wrote: On Thu, Oct 10, 2013 at 06:42:54PM +0200, Martin Husemann wrote: You are right, but I can't find the initialization ;-) It is a bit hidden, but I think the patch below should do it - modulo the open question what defaults exactly we want changed. Joerg, do you mean to enable add_DT_NEEDED_for_regular as well by default? Do we have some simple test case for the whole issue? Martin ... + input_flags.add_DT_NEEDED_for_dynamic = TRUE; ... What does that change do? If you link a program with -lcurses you don't want a DT_NEEDED entry for libtermcap.so whether or not the program directly references anything in libtermcap.so. David -- David Laight: da...@l8s.co.uk