Re: FIFO fd not marked readable after EOF
On Mon, Oct 6, 2014 at 5:08 AM, Dimitris Papastamos s...@2f30.org wrote: I am using select(2) on a FIFO fd and monitoring for readability. select(2) doesn't return after the writer exits. The same piece of code marks the fd as readable on Linux. Not sure which behaviour is correct though. The Linux behavior is correct: http://pubs.opengroup.org/onlinepubs/9699919799/functions/pselect.html: A descriptor shall be considered ready for reading when a call to an input function with O_NONBLOCK clear would not block, whether or not the function would transfer data successfully. (The function might return data, an end-of-file indication, or an error other than one indicating that it is blocked, and in each of these cases the descriptor shall be considered ready for reading.) http://pubs.opengroup.org/onlinepubs/9699919799/functions/read.html: When attempting to read from an empty pipe or FIFO: [...] If no process has the pipe open for writing, read() shall return 0 to indicate end-of-file.
[patch] dhclient: support for RFC 3442 Local Subnet Routes
From RFC 3442: Local Subnet Routes In some cases more than one IP subnet may be configured on a link. In such cases, a host whose IP address is in one IP subnet in the link could communicate directly with a host whose IP address is in a different IP subnet on the same link. In cases where a client is being assigned an IP address on an IP subnet on such a link, for each IP subnet in the link other than the IP subnet on which the client has been assigned the DHCP server MAY be configured to specify a router IP address of 0.0.0.0. For example, consider the case where there are three IP subnets configured on a link: 10.0.0/24, 192.168.0/24, 10.0.21/24. If the client is assigned an IP address of 10.0.21.17, then the server could include a route with a destination of 10.0.0/24 and a router address of 0.0.0.0, and also a route with a destination of 192.168.0/24 and a router address of 0.0.0.0. A DHCP client whose underlying TCP/IP stack does not provide this capability MUST ignore routes in the Classless Static Routes option whose router IP address is 0.0.0.0. Please note that the behavior described here only applies to the Classless Static Routes option, not to the Static Routes option nor the Router option. The patch below adds support for these routes to dhclient. It refactors the existing magic direct-route incantation I added in r1.272, and reuses it in add_classless_static-routes(). I've experimentally verified that this makes OpenBSD work on Google Compute Engine again. E.g., a DHCP lease with: fixed-address 10.240.142.165; option subnet-mask 255.255.255.255; option classless-static-routes 10.240.0.1/32 0.0.0.0, 0.0.0.0/0 10.240.0.1; now results in a routing table like: Internet: DestinationGatewayFlags Refs Use Mtu Prio Iface default10.240.0.1 UGS00 - 8 vio0 10.240.0.1 link#1 UHLc 10 - 8 vio0 10.240.0.1/32 link#1 UCS10 - 8 vio0 10.240.142.165 42:01:0a:f0:8e:a5 UHLl 00 - 1 lo0 10.240.142.165/32 link#1 UC 00 - 4 vio0 127/8 127.0.0.1 UGRS 00 32768 8 lo0 127.0.0.1 127.0.0.1 UH 10 32768 4 lo0 224/4 127.0.0.1 URS00 32768 8 lo0 ok? Index: dhclient.c === RCS file: /cvs/src/sbin/dhclient/dhclient.c,v retrieving revision 1.319 diff -u -p -r1.319 dhclient.c --- dhclient.c 11 Aug 2014 18:41:13 - 1.319 +++ dhclient.c 1 Oct 2014 00:47:22 - @@ -108,9 +108,10 @@ struct client_lease *clone_lease(struct voidsocket_nonblockmode(int); voidapply_ignore_list(char *); +void add_direct_route(int, struct in_addr, struct in_addr, struct in_addr); void add_default_route(int, struct in_addr, struct in_addr); void add_static_routes(int, struct option_data *); -void add_classless_static_routes(int, struct option_data *); +void add_classless_static_routes(int, struct option_data *, struct in_addr); int compare_lease(struct client_lease *, struct client_lease *); void set_lease_times(struct client_lease *); @@ -872,10 +873,12 @@ bind_lease(void) add_address(ifi-name, ifi-rdomain, client-active-address, mask); if (options[DHO_CLASSLESS_STATIC_ROUTES].len) { add_classless_static_routes(ifi-rdomain, - options[DHO_CLASSLESS_STATIC_ROUTES]); + options[DHO_CLASSLESS_STATIC_ROUTES], + client-active-address); } else if (options[DHO_CLASSLESS_MS_STATIC_ROUTES].len) { add_classless_static_routes(ifi-rdomain, - options[DHO_CLASSLESS_MS_STATIC_ROUTES]); + options[DHO_CLASSLESS_MS_STATIC_ROUTES], + client-active-address); } else { opt = options[DHO_ROUTERS]; if (opt-len = sizeof(gateway)) { @@ -883,17 +886,13 @@ bind_lease(void) gateway.s_addr = ((struct in_addr *)opt-data)-s_addr; /* -* If we were given a /32 IP assignment, then make sure -* the gateway address is routable with equivalent of -* -* route add -net $gw -netmask 255.255.255.255 \ -* -cloning -iface $addr +* To be compatible with ISC DHCP behavior on Linux, if +* we were given a /32 IP assignment, then add a /32 +* direct route for the gateway to make it routable.
Enable DWARF line decoding in ddb
Patch below enables DWARF line decoding in ddb; e.g., ddb{0} trace Debugger() at Debugger+0x9 [../../../../arch/amd64/amd64/db_interface.c:405] ddb_sysctl() at ddb_sysctl+0x1b4 [../../../../ddb/db_usrreq.c:104] sys___sysctl() at sys___sysctl+0x216 [../../../../kern/kern_sysctl.c:229] syscall() at syscall+0x297 [../../../../sys/syscall_mi.h:84] --- syscall (number 202) --- end of kernel end trace frame: 0x7f7cf1d7, count: -4 acpi_pdirpa+0x4117aa: Full set of instructions to apply and test (assuming amd64 MP): # Apply patch cd /usr/src/sys patch /path/to/diff # Enable DEBUG=-g $EDITOR conf/GENERIC # uncomment ``makeoptions DEBUG=-g'' line and save # Build new kernel cd arch/amd64/conf config GENERIC.MP cd ../compile/GENERIC.MP make clean make # Build new boot(8) cd ../../stand/boot make install # Install new boot(8) cp /usr/mdec/boot /boot installboot -v /boot /usr/mdec/biosboot sd0 Now reboot and boot the nbsd kernel. Stack traces in ddb should include file and line numbers. For file/line numbers to appear, all of the following are necessary: 1. New boot(8) 2. New kernel 3. Kernel built with DEBUG=-g 4. Boot bsd.gdb instead of bsd Under any other conditions, ddb should gracefully fall back to bare stack traces. I've tested most combinations, but additional testing is appreciated. Implementation details: - boot(8) now loads the ELF header string table so it can identify section names. It also now loads the .debug_line section into memory, and marks loaded sections with SHF_ALLOC. - ddb when generating stack traces now checks if there's a section named .debug_line that's marked SHF_ALLOC; and if so, ddb parses the section contents to translate program counter values into file/line pairs. - New boot(8) should be backwards compatible with old kernels, because they won't care if .debug_line is loaded; new bsd.gdb should be compatible with old boot(8) because .debug_line isn't normally marked SHF_ALLOC. Index: conf/files === RCS file: /home/matthew/anoncvs/cvs/src/sys/conf/files,v retrieving revision 1.577 diff -u -p -r1.577 files --- conf/files 8 Sep 2014 04:40:30 - 1.577 +++ conf/files 29 Sep 2014 02:46:16 - @@ -611,6 +611,7 @@ filenet/if_pppoe.c pppoe needs-flag file ddb/db_access.c ddb | kgdb file ddb/db_break.cddb file ddb/db_command.c ddb +file ddb/db_dwarf.cddb file ddb/db_elf.c ddb file ddb/db_examine.c ddb file ddb/db_expr.c ddb Index: ddb/db_elf.c === RCS file: /home/matthew/anoncvs/cvs/src/sys/ddb/db_elf.c,v retrieving revision 1.11 diff -u -p -r1.11 db_elf.c --- ddb/db_elf.c14 Sep 2014 14:17:24 - 1.11 +++ ddb/db_elf.c29 Sep 2014 02:46:16 - @@ -32,12 +32,14 @@ */ #include sys/types.h +#include sys/stdint.h #include sys/param.h #include sys/systm.h #include sys/exec.h #include machine/db_machdep.h +#include ddb/db_dwarf.h #include ddb/db_sym.h #include ddb/db_output.h #include ddb/db_extern.h @@ -45,6 +47,7 @@ #include sys/exec_elf.h static char *db_elf_find_strtab(db_symtab_t *); +static char *db_elf_find_linetab(db_symtab_t *, size_t *); #defineSTAB_TO_SYMSTART(stab) ((Elf_Sym *)((stab)-start)) #defineSTAB_TO_SYMEND(stab)((Elf_Sym *)((stab)-end)) @@ -110,10 +113,10 @@ db_elf_sym_init(int symsize, void *symta * . . . * . . . * last section header -* first symbol or string table section +* first symbol, string, or line table section * . . . * . . . -* last symbol or string table section +* last symbol, string, or line table section */ /* @@ -232,6 +235,30 @@ db_elf_find_strtab(db_symtab_t *stab) } /* + * Internal helper function - return a pointer to the line table + * for the current symbol table. + */ +static char * +db_elf_find_linetab(db_symtab_t *stab, size_t *size) +{ + Elf_Ehdr *elf = STAB_TO_EHDR(stab); + Elf_Shdr *shp = STAB_TO_SHDR(stab, elf); + char *shstrtab; + int i; + + shstrtab = (char *)elf + shp[elf-e_shstrndx].sh_offset; + for (i = 0; i elf-e_shnum; i++) { + if ((shp[i].sh_flags SHF_ALLOC) != 0 + strcmp(.debug_line, shstrtab+shp[i].sh_name) == 0) { + *size = shp[i].sh_size; + return ((char *)elf + shp[i].sh_offset); + } + } + + return (NULL); +} + +/* * Lookup the symbol with the given name. */ db_sym_t @@ -350,11 +377,25 @@ boolean_t db_elf_line_at_pc(db_symtab_t *symtab, db_sym_t cursym, char
Re: lsearch.c
Hm, is there a practical consequence of this? Seems like it would only affect applications trying to store a reference to lsearch in a function pointer of type void (*)(const void *, void *, size_t *, size_t, int (*)(const void *, const void *)); do those exist? On Thu, Jul 17, 2014 at 5:38 PM, enh e...@google.com wrote: POSIX says lsearch's 'base' is void*, not const void*. http://pubs.opengroup.org/onlinepubs/9699919799/functions/lsearch.html openbsd gets this wrong in search.h and lsearch.c. (but lfind is correct --- that should be const void*.) -e
Re: lsearch.c
Hrm. It seems silly to me to change it to require a non-const pointer argument, but POSIX, Linux, Solaris, FreeBSD, and NetBSD all use void *base so it doesn't seem like we should have any ports tree issues from changing it to be compatible either. On Thu, Jul 17, 2014 at 6:04 PM, enh e...@google.com wrote: For me (Android) the practical consequence is compiler warnings unless I change search.h to be wrong. I'm going through the non-Open BSD stuff I have working out why. In this case, this seems to be the reason we have the NetBSD search.h implementation. The majority of our BSD code is OpenBSD so my preference would be to switch over as much as possible. On Jul 17, 2014 5:59 PM, Matthew Dempsky matt...@dempsky.org wrote: Hm, is there a practical consequence of this? Seems like it would only affect applications trying to store a reference to lsearch in a function pointer of type void (*)(const void *, void *, size_t *, size_t, int (*)(const void *, const void *)); do those exist? On Thu, Jul 17, 2014 at 5:38 PM, enh e...@google.com wrote: POSIX says lsearch's 'base' is void*, not const void*. http://pubs.opengroup.org/onlinepubs/9699919799/functions/lsearch.html openbsd gets this wrong in search.h and lsearch.c. (but lfind is correct --- that should be const void*.) -e
Re: lsearch.c
On Thu, Jul 17, 2014 at 8:05 PM, patrick keshishian pkesh...@gmail.com wrote: Silly even though, the description of lsearch says it will modify (it shall be added at the end of) the table, for which base argument points to the first element? Ah, I didn't pay close attention to the definition and assumed it was simply a search function like memchr() or strstr(). If it might modify memory, then yes, making it non-const absolutely makes sense.
Possible bug in cpu_chooseproc?
As the name suggests, remrunqueue(p) removes p from its run queue, and I believe that makes TAILQ_FOREACH() here unsafe. Instead of actually removing all threads from the processor, we'll only remove the first from each of its run queues. Diff below replaces TAILQ_FOREACH with the safe/idiomatic pattern for draining a queue. ok? (This hasn't bit me and I don't know any practical consequences of it. Just spotted it while tracking down a bug in a kern_synch.c diff I'm working on.) Index: kern_sched.c === RCS file: /cvs/src/sys/kern/kern_sched.c,v retrieving revision 1.32 diff -u -p -r1.32 kern_sched.c --- kern_sched.c4 May 2014 05:03:26 - 1.32 +++ kern_sched.c13 Jul 2014 20:18:38 - @@ -272,7 +272,7 @@ sched_chooseproc(void) if (spc-spc_schedflags SPCF_SHOULDHALT) { if (spc-spc_whichqs) { for (queue = 0; queue SCHED_NQS; queue++) { - TAILQ_FOREACH(p, spc-spc_qs[queue], p_runq) { + while ((p = TAILQ_FIRST(spc-spc_qs[queue]))) { remrunqueue(p); p-p_cpu = sched_choosecpu(p); setrunqueue(p);
Re: CVS: cvs.openbsd.org: src
On Fri, Jul 11, 2014 at 3:41 PM, Bob Beck b...@obtuse.com wrote: The OPENSSL_VERSION number is a guarantee for a certain version of the ABI. As we dont' provide that (in fact much of the ABI in LIbreSSL is beyond 1.0.1g, it is not accurate to use the old OPENSSL_VERSION. Essnentially this OPENSSL_VERSION is bigger than 1.0.1g's. By that argument, we won't be ABI compatible with OpenSSL 2.0 either, so we shouldn't provide OPENSSL_VERSION at all. My 2c is for keeping OPENSSL_VERSION_NUMBER as the most recent OpenSSL version that we're *mostly* API/feature compatible with, and using LIBRESSL_VERSION_NUMBER to identify the exact LibreSSL version. By polluting the OPENSSL_VERSION_NUMBER namespace we just make things more difficult for downstream users that want to be compatible with both OpenSSL and LibreSSL. E.g., to check for a feature that was added in OpenSSL 1.2 but isn't present in LibreSSL, that code now needs to be #if OPENSSL_VERSION_NUMBER = 1.2 !defined(LIBRESSL_VERSION_NUMBER) rather than simply #if OPENSSL_VERSION_NUMBER = 1.2 Breaking the latter just seems like making it more difficult to get people to port their software from OpenSSL to LibreSSL.
Re: CVS: cvs.openbsd.org: src
On Fri, Jul 11, 2014 at 4:37 PM, Bob Beck b...@obtuse.com wrote: The fundamental probelm with this Matthew - is that next time, if we do this, by the next release we will be chasing what features we have imported from 1.0.2g and 10.2.z, and 1.0.2.qq - where does it end? It ends whenever it stops helping portability for apps that are currently written for OpenSSL. We've expressly decided to ignore any API/ABI compatibility guarantees with OpenSSL, so an OpenSSL version number is inherently just a best effort to make things easier on applications to transition from OpenSSL to LibreSSL. Clang went through this same process with code that did GCC version checks. Today Clang still claims it's GCC 4.2, but in a separate version it reveals it's Clang 3.5. Existing code that only knows to check for older versions of GCC (e.g., OpenBSD's sys/cdefs.h) continues work just fine with Clang, because it picks up all of the definitions targeted towards GCC 4.2. New code that wants to make use of features in GCC 4.7 and Clang 3.5 though needs to check for both; but even if it doesn't, if it includes fallback for older versions of GCC it should still work okay with Clang. Concrete analogy: suppose LibreSSL 2.1 and OpenSSL 1.1 both add some new feature, and an application that wants to be compatible with both wants to make use of that feature. How do they version check for its availability? Naively, it would be #if LibreSSL = 2.1 || OpenSSL = 1.1 but that's going to cause the application to break when compiled with older versions of LibreSSL. It would actually needs to be #if LibreSSL = 2.1 || (!defined(LibreSSL) OpenSSL = 1.1) We don't gain anything by making people need to write the latter, IMO.
Re: ddb: ELF line decoding in stack traces
On Mon, Jun 23, 2014 at 06:46:27AM -0700, Matthew Dempsky wrote: I did this mostly for fun / personal enlightenment, so I haven't put too much effort into polishing it yet. I got some positive feedback about the direction of this diff, so tonight I finally got around to polishing it some more. In particular, db_dwarf.c now includes a stand-alone addr2line clone that can be compiled separately with cc -o ddb-addr2line db_dwarf.c, and the output should match addr2line(1)'s. I've extensively tested it against amd64's bsd.gdb, but I'd love testing feedback on big-endian and LP32 platforms. Unless anyone objects, I plan to commit the db_dwarf.{c,h} additions tomorrow (i.e., not wired into the build) so they can be developed/tested further in tree. The other bits I'll split out and post for separate review as appropriate. (I at least need to first ensure new bsd.gdb kernels will gracefully handle being loaded by old boot loaders.) Index: ddb/db_dwarf.h === RCS file: ddb/db_dwarf.h diff -N ddb/db_dwarf.h --- /dev/null 1 Jan 1970 00:00:00 - +++ ddb/db_dwarf.h 10 Jul 2014 07:14:26 - @@ -0,0 +1,24 @@ +/* $OpenBSD$*/ +/* + * Copyright (c) 2014 Matthew Dempsky matt...@dempsky.org + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED AS IS AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#ifndef _DDB_DB_DWARF_H_ +#define _DDB_DB_DWARF_H_ + +bool db_dwarf_line_at_pc(const char *, size_t, uintptr_t, +const char **, const char **, int *); + +#endif /* _DDB_DB_DWARF_H_ */ Index: ddb/db_dwarf.c === RCS file: ddb/db_dwarf.c diff -N ddb/db_dwarf.c --- /dev/null 1 Jan 1970 00:00:00 - +++ ddb/db_dwarf.c 10 Jul 2014 07:49:07 - @@ -0,0 +1,550 @@ +/* $OpenBSD$*/ +/* + * Copyright (c) 2014 Matthew Dempsky matt...@dempsky.org + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED AS IS AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#ifdef _KERNEL +#include sys/types.h +#include sys/stdint.h +#include sys/param.h +#include sys/systm.h +#include ddb/db_dwarf.h +#ifdef DIAGNOSTIC +#define DWARN(fmt, ...) printf(ddb: fmt \n, __VA_ARGS__) +#else +#define DWARN(fmt, ...) ((void)0) +#endif +#else /* _KERNEL */ +#include err.h +#include stdbool.h +#include stdint.h +#include string.h +#define DWARN warnx +#endif /* _KERNEL */ + +enum { + DW_LNS_copy = 1, + DW_LNS_advance_pc = 2, + DW_LNS_advance_line = 3, + DW_LNS_set_file = 4, + DW_LNS_set_column = 5, + DW_LNS_negate_stmt = 6, + DW_LNS_set_basic_block = 7, + DW_LNS_const_add_pc = 8, + DW_LNS_fixed_advance_pc = 9, + DW_LNS_set_prologue_end = 10, + DW_LNS_set_epilogue_begin = 11, +}; + +enum { + DW_LNE_end_sequence = 1, + DW_LNE_set_address = 2, + DW_LNE_define_file = 3, +}; + +struct dwbuf { + const char *buf; + size_t len; +}; + +static bool +read_bytes(struct dwbuf *d, void *v, size_t n) +{ + if (d-len n) + return (false); + memcpy(v, d-buf, n); + d-buf += n; + d-len -= n; + return (true); +} + +static bool +read_s8(struct dwbuf *d, int8_t *v) +{ + return (read_bytes(d, v, sizeof(*v))); +} + +static bool +read_u8(struct dwbuf *d, uint8_t *v) +{ + return (read_bytes(d, v, sizeof(*v))); +} + +static bool +read_u16(struct dwbuf *d, uint16_t *v) +{ + return (read_bytes(d, v, sizeof(*v))); +} + +static bool +read_u32
mallocarray(9)
We've found a bunch of uses for reallocarray() in userland, and I think the idiom is worth reusing in the kernel. There are enough places where we do malloc(x * y) that I think it makes sense to add mallocarray(x, y). ok? Index: share/man/man9/Makefile === RCS file: /home/matthew/cvs-mirror/cvs/src/share/man/man9/Makefile,v retrieving revision 1.210 diff -u -p -r1.210 Makefile --- share/man/man9/Makefile 30 Jun 2014 21:48:09 - 1.210 +++ share/man/man9/Makefile 10 Jul 2014 17:15:45 - @@ -232,7 +232,7 @@ MLINKS+=ktrace.9 ktrcsw.9 ktrace.9 ktrem ktrace.9 ktrsysret.9 ktrace.9 KTRPOINT.9 MLINKS+=lock.9 lockinit.9 lock.9 lockmgr.9 lock.9 lockstatus.9 MLINKS+=log.9 addlog.9 -MLINKS+=malloc.9 free.9 +MLINKS+=malloc.9 mallocarray.9 malloc.9 free.9 MLINKS+=membar_sync.9 membar_enter.9 membar_sync.9 membar_exit.9 \ membar_sync.9 membar_producer.9 membar_sync.9 membar_consumer.9 MLINKS+=mbuf.9 m_copym2.9 mbuf.9 m_copym.9 mbuf.9 m_free.9 mbuf.9 MFREE.9 \ Index: share/man/man9/malloc.9 === RCS file: /home/matthew/cvs-mirror/cvs/src/share/man/man9/malloc.9,v retrieving revision 1.52 diff -u -p -r1.52 malloc.9 --- share/man/man9/malloc.9 3 Apr 2014 04:10:34 - 1.52 +++ share/man/man9/malloc.9 10 Jul 2014 17:13:39 - @@ -33,6 +33,7 @@ .Os .Sh NAME .Nm malloc , +.Nm mallocarray , .Nm free .Nd kernel memory allocator .Sh SYNOPSIS @@ -40,6 +41,13 @@ .In sys/malloc.h .Ft void * .Fn malloc unsigned long size int type int flags +.Ft void * +.Fo mallocarray +.Fa unsigned long nmemb +.Fa unsigned long size +.Fa int type +.Fa int flags +.Fc .Ft void .Fn free void *addr int type .Sh DESCRIPTION @@ -48,8 +56,17 @@ The function allocates uninitialized memory in kernel address space for an object whose size is specified by .Fa size . +The +.Fn mallocarray +function is the same as +.Fn malloc , +except it allocates space for an array of +.Fa nmemb +objects and checks for arithmetic overflow. +.Pp +The .Fn free -releases memory at address +function releases memory at address .Fa addr that was previously allocated by .Fn malloc @@ -60,7 +77,8 @@ is a null pointer, no action occurs. .Pp The .Fa flags -argument further qualifies malloc's +argument further qualifies +.Fn malloc Ns 's operational characteristics as follows: .Bl -tag -width xxx -offset indent .It Dv M_WAITOK Index: sys/conf/files === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/conf/files,v retrieving revision 1.572 diff -u -p -r1.572 files --- sys/conf/files 19 Apr 2014 12:27:06 - 1.572 +++ sys/conf/files 10 Jul 2014 17:09:03 - @@ -679,6 +679,7 @@ file kern/kern_ktrace.c ktrace file kern/kern_lock.c file kern/kern_lkm.c lkm file kern/kern_malloc.c +file kern/kern_mallocarray.c file kern/kern_malloc_debug.c malloc_debug file kern/kern_rwlock.c file kern/kern_physio.c Index: sys/sys/malloc.h === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/sys/malloc.h,v retrieving revision 1.108 diff -u -p -r1.108 malloc.h --- sys/sys/malloc.h19 May 2014 14:30:03 - 1.108 +++ sys/sys/malloc.h10 Jul 2014 17:08:05 - @@ -391,10 +391,12 @@ extern struct kmemusage *kmemusage; extern char *kmembase; extern struct kmembuckets bucket[]; -extern void *malloc(unsigned long size, int type, int flags); -extern void free(void *addr, int type); -extern int sysctl_malloc(int *, u_int, void *, size_t *, void *, size_t, - struct proc *); +void *malloc(unsigned long size, int type, int flags); +void *mallocarray(unsigned long nmemb, unsigned long size, + int type, int flags); +void free(void *addr, int type); +intsysctl_malloc(int *, u_int, void *, size_t *, void *, size_t, + struct proc *); size_t malloc_roundup(size_t); void malloc_printit(int (*)(const char *, ...)); Index: sys/kern/kern_mallocarray.c === RCS file: sys/kern/kern_mallocarray.c diff -N sys/kern/kern_mallocarray.c --- /dev/null 1 Jan 1970 00:00:00 - +++ sys/kern/kern_mallocarray.c 10 Jul 2014 17:10:06 - @@ -0,0 +1,38 @@ +/* $OpenBSD: reallocarray.c,v 1.1 2014/05/08 21:43:49 deraadt Exp $ */ +/* + * Copyright (c) 2008 Otto Moerbeek o...@drijf.net + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED AS IS AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT
Re: mallocarray(9)
On Thu, Jul 10, 2014 at 10:28:35AM -0700, Matthew Dempsky wrote: +/* + * This is sqrt(SIZE_MAX+1), as s1*s2 = SIZE_MAX + * if both s1 MUL_NO_OVERFLOW and s2 MUL_NO_OVERFLOW + */ +#define MUL_NO_OVERFLOW (1UL (sizeof(size_t) * 4)) + +void * +mallocarray(unsigned long nmemb, unsigned long size, int type, int flags) +{ + if ((nmemb = MUL_NO_OVERFLOW || size = MUL_NO_OVERFLOW) + nmemb 0 SIZE_MAX / nmemb size) { + if (flags M_CANFAIL) + return (NULL); + panic(overflow); + } + return (malloc(size * nmemb, type, flags)); +} Oops, didn't fully convert size_t - unsigned long to match malloc(9). (size_t *is* unsigned long currently on all of our architectures, but technically that's supposed to be opaque.) Fixed kern_mallocarray.c below. /* $OpenBSD$ */ /* * Copyright (c) 2008 Otto Moerbeek o...@drijf.net * * Permission to use, copy, modify, and distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED AS IS AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ #include sys/param.h #include sys/limits.h #include sys/malloc.h #include sys/systm.h /* * This is sqrt(ULONG_MAX+1), as s1*s2 = ULONG_MAX * if both s1 MUL_NO_OVERFLOW and s2 MUL_NO_OVERFLOW */ #define MUL_NO_OVERFLOW (1UL (sizeof(unsigned long) * 4)) void * mallocarray(unsigned long nmemb, unsigned long size, int type, int flags) { if ((nmemb = MUL_NO_OVERFLOW || size = MUL_NO_OVERFLOW) nmemb 0 ULONG_MAX / nmemb size) { if (flags M_CANFAIL) return (NULL); panic(overflow); } return (malloc(size * nmemb, type, flags)); }
Add MAP_ANONYMOUS as a synonym for MAP_ANON
The Austin Group this morning accepted proposed wording to standardize both MAP_ANON and MAP_ANONYMOUS, recognizing that neither was clearly more popular than the other among applications, and that there's precedent in POSIX for simply standardizing multiple spellings for a feature when both are common and trivial for implementations to provide (e.g., PAGE_SIZE and PAGESIZE). Diff below defines MAP_ANONYMOUS as a synonym for MAP_ANON, keeping MAP_ANON as canonical in following BSD heritage. ok? Index: lib/libc/sys/mmap.2 === RCS file: /cvs/src/lib/libc/sys/mmap.2,v retrieving revision 1.50 diff -u -p -r1.50 mmap.2 --- lib/libc/sys/mmap.2 2 Jul 2014 22:22:35 - 1.50 +++ lib/libc/sys/mmap.2 10 Jul 2014 18:32:59 - @@ -121,7 +121,7 @@ Sharing, mapping type, and options are s .Fa flags argument by OR'ing the following values. Exactly one of the first two values must be specified: -.Bl -tag -width MAP_PRIVATE -offset indent +.Bl -tag -width MAP_ANONYMOUS -offset indent .It Dv MAP_PRIVATE Modifications are private. .It Dv MAP_SHARED @@ -129,13 +129,16 @@ Modifications are shared. .El .Pp Any combination of the following flags may additionally be used: -.Bl -tag -width MAP_PRIVATE -offset indent +.Bl -tag -width MAP_ANONYMOUS -offset indent .It Dv MAP_ANON Map anonymous memory not associated with any specific file. The file descriptor used for creating .Dv MAP_ANON must currently be \-1 indicating no name is associated with the region. +.It Dv MAP_ANONYMOUS +Synonym for +.Dv MAP_ANON . .It Dv MAP_FIXED Demand that the mapping is placed at .Fa addr , @@ -157,7 +160,7 @@ source compatibility with code written f but are not recommended for use in new .Ox code: -.Bl -tag -width MAP_PRIVATE -offset indent +.Bl -tag -width MAP_ANONYMOUS -offset indent .It Dv MAP_COPY Modifications are private and, unlike .Dv MAP_PRIVATE , Index: sys/sys/mman.h === RCS file: /cvs/src/sys/sys/mman.h,v retrieving revision 1.25 diff -u -p -r1.25 mman.h --- sys/sys/mman.h 27 Jun 2014 20:50:43 - 1.25 +++ sys/sys/mman.h 10 Jul 2014 18:32:59 - @@ -57,6 +57,7 @@ #defineMAP_FIXED 0x0010 /* map addr must be exactly as requested */ #define__MAP_NOREPLACE 0x0800 /* fail if address not available */ #defineMAP_ANON0x1000 /* allocated from memory, swap space */ +#defineMAP_ANONYMOUS MAP_ANON/* alternate POSIX spelling */ #defineMAP_FLAGMASK0x1ff7
Re: mallocarray(9)
On Thu, Jul 10, 2014 at 02:42:34PM -0400, Ted Unangst wrote: On Thu, Jul 10, 2014 at 10:28, Matthew Dempsky wrote: We've found a bunch of uses for reallocarray() in userland, and I think the idiom is worth reusing in the kernel. There are enough places where we do malloc(x * y) that I think it makes sense to add mallocarray(x, y). +void *malloc(unsigned long size, int type, int flags); +void *mallocarray(unsigned long nmemb, unsigned long size, + int type, int flags); The unsigned long here is ancient history. It should be size_t I think. Index: sys/kern/kern_mallocarray.c Please just add it to kern_malloc.c without creating a new file. ok? Index: share/man/man9/Makefile === RCS file: /home/matthew/cvs-mirror/cvs/src/share/man/man9/Makefile,v retrieving revision 1.210 diff -u -p -r1.210 Makefile --- share/man/man9/Makefile 30 Jun 2014 21:48:09 - 1.210 +++ share/man/man9/Makefile 10 Jul 2014 17:15:45 - @@ -232,7 +232,7 @@ MLINKS+=ktrace.9 ktrcsw.9 ktrace.9 ktrem ktrace.9 ktrsysret.9 ktrace.9 KTRPOINT.9 MLINKS+=lock.9 lockinit.9 lock.9 lockmgr.9 lock.9 lockstatus.9 MLINKS+=log.9 addlog.9 -MLINKS+=malloc.9 free.9 +MLINKS+=malloc.9 mallocarray.9 malloc.9 free.9 MLINKS+=membar_sync.9 membar_enter.9 membar_sync.9 membar_exit.9 \ membar_sync.9 membar_producer.9 membar_sync.9 membar_consumer.9 MLINKS+=mbuf.9 m_copym2.9 mbuf.9 m_copym.9 mbuf.9 m_free.9 mbuf.9 MFREE.9 \ Index: share/man/man9/malloc.9 === RCS file: /home/matthew/cvs-mirror/cvs/src/share/man/man9/malloc.9,v retrieving revision 1.52 diff -u -p -r1.52 malloc.9 --- share/man/man9/malloc.9 3 Apr 2014 04:10:34 - 1.52 +++ share/man/man9/malloc.9 10 Jul 2014 18:52:54 - @@ -33,13 +33,16 @@ .Os .Sh NAME .Nm malloc , +.Nm mallocarray , .Nm free .Nd kernel memory allocator .Sh SYNOPSIS .In sys/types.h .In sys/malloc.h .Ft void * -.Fn malloc unsigned long size int type int flags +.Fn malloc size_t size int type int flags +.Ft void * +.Fn malloc size_t nmemb size_t size int type int flags .Ft void .Fn free void *addr int type .Sh DESCRIPTION @@ -48,8 +51,17 @@ The function allocates uninitialized memory in kernel address space for an object whose size is specified by .Fa size . +The +.Fn mallocarray +function is the same as +.Fn malloc , +except it allocates space for an array of +.Fa nmemb +objects and checks for arithmetic overflow. +.Pp +The .Fn free -releases memory at address +function releases memory at address .Fa addr that was previously allocated by .Fn malloc @@ -60,7 +72,8 @@ is a null pointer, no action occurs. .Pp The .Fa flags -argument further qualifies malloc's +argument further qualifies +.Fn malloc Ns 's operational characteristics as follows: .Bl -tag -width xxx -offset indent .It Dv M_WAITOK Index: sys/sys/malloc.h === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/sys/malloc.h,v retrieving revision 1.108 diff -u -p -r1.108 malloc.h --- sys/sys/malloc.h19 May 2014 14:30:03 - 1.108 +++ sys/sys/malloc.h10 Jul 2014 18:49:44 - @@ -391,10 +391,11 @@ extern struct kmemusage *kmemusage; extern char *kmembase; extern struct kmembuckets bucket[]; -extern void *malloc(unsigned long size, int type, int flags); -extern void free(void *addr, int type); -extern int sysctl_malloc(int *, u_int, void *, size_t *, void *, size_t, - struct proc *); +void *malloc(size_t, int, int); +void *mallocarray(size_t, size_t, int, int); +void free(void *, int); +intsysctl_malloc(int *, u_int, void *, size_t *, void *, size_t, + struct proc *); size_t malloc_roundup(size_t); void malloc_printit(int (*)(const char *, ...)); Index: sys/kern/kern_malloc.c === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/kern/kern_malloc.c,v retrieving revision 1.107 diff -u -p -r1.107 kern_malloc.c --- sys/kern/kern_malloc.c 19 May 2014 14:30:03 - 1.107 +++ sys/kern/kern_malloc.c 10 Jul 2014 18:50:15 - @@ -159,7 +159,7 @@ struct timeval malloc_lasterr; * Allocate a block of memory */ void * -malloc(unsigned long size, int type, int flags) +malloc(size_t size, int type, int flags) { struct kmembuckets *kbp; struct kmemusage *kup; @@ -701,3 +701,37 @@ malloc_printit( #endif } #endif /* DDB */ + +/* + * Copyright (c) 2008 Otto Moerbeek o...@drijf.net + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED AS IS AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE
Re: mallocarray(9)
On Thu, Jul 10, 2014 at 12:34 PM, Jean-Philippe Ouellet jean-phili...@ouellet.biz wrote: 2nd one should be mallocarray. Doh, fixed. Thanks!
Re: Add MAP_ANONYMOUS as a synonym for MAP_ANON
On Thu, Jul 10, 2014 at 1:17 PM, Kent R. Spillner kspill...@acm.org wrote: If MAP_ANON is going to be the canonical value then shouldn't you change MAP_PRIVATE to MAP_ANON instead of MAP_ANONYMOUS in mmap.2? No, Bl's -width option is used to specify how much padding should be used to align the list entry text when mandoc, and MAP_ANONYMOUS is the widest map flag (other than MAP_HASSEMAPHORE, which I'm hoping goes away soon anyway).
Re: tun TUNDOIOVEC ioctl
On Thu, Jul 10, 2014 at 1:20 PM, Ted Unangst t...@tedunangst.com wrote: Thoughts? Seems kind of hacky to me, but if it results in significant performance improvements in real world uses, then I could be swayed since it's not very intrusive either.
Re: Use atomics for mutating p_sigmask
On Tue, Jul 8, 2014 at 1:21 AM, Martin Pieuchot mpieuc...@nolizard.org wrote: There was a previous attempt to mark sigpromask(2) as nolock but if I recall properly setting p_sigmask atomically is not enough and there's still a possible race in ptsignal(). Have you looked into this function, is it safe now? I haven't looked very closely yet, but I'll be sure to look again before proposing NOLOCK for sys_sigprocmask(). However, this diff only replaces the SPLs with atomic operations; it leaves sys_sigprocmask() under kernel lock. As long as sys_sigprocmask() is still kernel locked, ptsignal() can only execute concurrently with sys_sigprocmask() if its an interrupt on the same CPU. Without this diff, that interrupt can only occur before or after the SPL-protected code section; while after this diff, the interrupt can only occur before or after the atomic operation (since by definition atomic operations are atomic).
Re: Use atomics for mutating p_sigmask
On Tue, Jul 8, 2014 at 1:29 AM, Mark Kettenis mark.kette...@xs4all.nl wrote: Date: Mon, 7 Jul 2014 13:46:03 -0700 From: Matthew Dempsky matt...@dempsky.org p_sigmask is only modified by the owning thread from process context (e.g., sys_sigprocmask(), sys_sigreturn(), userret(), or postsig()), but it can be accessed anywhere (e.g., interrupts or threads on other CPUs). Currently sys_sigprocmask() protects p_sigmask with splhigh(), but that's not MP-safe. The simpler solution is to take advantage of our atomics APIs. Unfortunately for implementing SIG_SETMASK, we don't have an atomic_store_int() function, and I can't bring myself to abuse volatile for this purpose, so I've resorted to atomic_swap_uint(). Sory, but I think that's wrong. You need volatile to make sure the mask is read from memory when you're checking bits. Or you need to insert the proper memory barriers I think. To be clear: I meant I don't want to abuse volatile as though it's a magic make-these-operations-**atomic** flag, as that's not what it really means (even if in practice it often has that effect). Also, as long as the (always current thread) mutators and cross-thread accessors are still serialized by the kernel lock, marking p_sigmask as volatile shouldn't be necessary. However, I do agree that once we start unlocking any of them (which is a future goal of this work), we need some sort of atomic guarantee on the read side too, and marking p_sigmask as volatile seems like a reasonable first step. (I'd probably go further and also decorate the accesses with C11-style atomic_load()s.) So I'm happy to mark p_sigmask as volatile with this diff if you'd prefer. I just don't think it's strictly necessary yet, but I'm not opposed to it either.
Allow tsleep() with ident==NULL
It's rare, but there *are* cases where a thread wants to sleep and doesn't expect any wakeup() calls at all; e.g., nanosleep() and sigsuspend(). In these cases, there's no point in requiring a valid wait channel identifier or adding the thread to the sleep queue. Diff below explicitly allows tsleep() with ident==NULL, but skips adding the thread to a sleep queue in that case. As an implementation detail, we currently treat p_wchan == NULL as an indication that the thread has been woken up. To avoid breaking that assumption this diff uses p_wchan == NOSLPQUE to identify threads that are still sleeping, but aren't associated with a sleep queue. This also lets us tighten the the kernel lock KASSERT in tsleep(): rather than weakly only checking non-timed sleeps, we can precisely check all sleeps that use a wait channel. ok? Index: share/man/man9/tsleep.9 === RCS file: /home/matthew/cvs-mirror/cvs/src/share/man/man9/tsleep.9,v retrieving revision 1.9 diff -u -p -r1.9 tsleep.9 --- share/man/man9/tsleep.9 22 Jan 2014 07:32:47 - 1.9 +++ share/man/man9/tsleep.9 8 Jul 2014 16:25:53 - @@ -96,8 +96,9 @@ The same identifier must be used in a ca .Fn wakeup to get the process going again. .Fa ident -should not be -.Dv NULL . +may be +.Dv NULL +if the process is not waiting on a wait channel. .It Fa priority The process priority to be used when the process is awakened and put on the queue of runnable processes. Index: sys/kern/kern_synch.c === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/kern/kern_synch.c,v retrieving revision 1.115 diff -u -p -r1.115 kern_synch.c --- sys/kern/kern_synch.c 22 Mar 2014 06:05:45 - 1.115 +++ sys/kern/kern_synch.c 8 Jul 2014 16:32:50 - @@ -78,6 +78,12 @@ sleep_queue_init(void) TAILQ_INIT(slpque[i]); } +/* + * Dummy wait channel identifier for threads that don't expect a wakeup(). + * Threads sleeping on this wait channel aren't queued on slpque. + */ +#defineNOSLPQUE((volatile void *)(noslpque)) +static char noslpque; /* * During autoconfiguration or after a panic, a sleep will simply @@ -109,8 +115,17 @@ tsleep(const volatile void *ident, int p KASSERT((priority ~(PRIMASK | PCATCH)) == 0); + /* Make sure caller gave us at least one reason to stop sleeping. */ + KASSERT(ident != NULL || (priority PCATCH) != 0 || timo != 0); + #ifdef MULTIPROCESSOR - KASSERT(timo || __mp_lock_held(kernel_lock)); + if (ident != NULL) { + /* +* Waiting on a condition variable requires a lock, otherwise +* we might miss our wakeup notification. +*/ + KASSERT(__mp_lock_held(kernel_lock)); + } #endif if (cold || panicstr) { @@ -191,12 +206,11 @@ sleep_setup(struct sleep_state *sls, con { struct proc *p = curproc; -#ifdef DIAGNOSTIC + KASSERT(ident != NOSLPQUE); + KASSERT(p-p_stat == SONPROC); + if (ident == NULL) - panic(tsleep: no ident); - if (p-p_stat != SONPROC) - panic(tsleep: not SONPROC); -#endif + ident = NOSLPQUE; #ifdef KTRACE if (KTRPOINT(p, KTR_CSW)) @@ -213,7 +227,8 @@ sleep_setup(struct sleep_state *sls, con p-p_wmesg = wmesg; p-p_slptime = 0; p-p_priority = prio PRIMASK; - TAILQ_INSERT_TAIL(slpque[LOOKUP(ident)], p, p_runq); + if (ident != NOSLPQUE) + TAILQ_INSERT_TAIL(slpque[LOOKUP(ident)], p, p_runq); } void @@ -350,7 +365,8 @@ void unsleep(struct proc *p) { if (p-p_wchan) { - TAILQ_REMOVE(slpque[LOOKUP(p-p_wchan)], p, p_runq); + if (p-p_wchan != NOSLPQUE) + TAILQ_REMOVE(slpque[LOOKUP(p-p_wchan)], p, p_runq); p-p_wchan = NULL; } } @@ -365,6 +381,9 @@ wakeup_n(const volatile void *ident, int struct proc *p; struct proc *pnext; int s; + + KASSERT(ident != NULL); + KASSERT(ident != NOSLPQUE); SCHED_LOCK(s); qp = slpque[LOOKUP(ident)];
Compile kernel with -std=gnu99
Diff below converts the kernel to build with -std=gnu99. (For simplicity, I've only included amd64 for now, but I'll make the same change to all kernel Makefiles if this is ok.) The only incompatibility (that I'm aware of) is that ISO C99's inline semantics differ slightly from GNU C89's historical (but non-standard) inline semantics, but I believe the diff below keeps us consistent with the semantics the kernel currently assumes. (More details below.) I've tested on amd64 and I get the exact same .o files with or without this change (except vers.o, but only because of timestamping). It's probably worth conducting the same test on one of our GCC 3 architectures. ok? Boring standards details: GNU89 and C99 define static inline to have the same semantics, but they differ for non-static definitions. In particular, C99 introduces the concept of inline definitions so that it can allow non-static inline functions to be defined in headers and used across multiple translation units without causing the function to be receive an external definition in each object file. GNU89 requires static inline for inline functions in header files, but then you can end up with multiple static definitions of the function if it's not inlined everywhere. That might seem scary for silently introducing incompatibilities, but fortunately GCC 3.3 and 4.2 don't support C99 inline semantics anyway (they always use GNU89 inline), and GCC 4.2 unconditionally emits a warning if it sees code that would be compiled differently under C99 semantics. For newer compilers that do support C99 inline, -fgnu89-inline forces the legacy GNU89 semantics while also silencing the GCC 4.2 warning. Index: Makefile.amd64 === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/arch/amd64/conf/Makefile.amd64,v retrieving revision 1.59 diff -u -p -r1.59 Makefile.amd64 --- Makefile.amd64 8 May 2014 17:59:28 - 1.59 +++ Makefile.amd64 8 Jul 2014 17:41:49 - @@ -39,8 +39,13 @@ CMACHFLAGS+= -fno-stack-protector CMACHFLAGS+= -Wa,-n .endif +CSTD= -std=gnu99 +.if ${COMPILER_VERSION} != gcc3 +CSTD+= -fgnu89-inline +.endif + COPTS?=-O2 -CFLAGS=${DEBUG} ${CWARNFLAGS} ${CMACHFLAGS} ${COPTS} ${PIPE} +CFLAGS=${CSTD} ${DEBUG} ${CWARNFLAGS} ${CMACHFLAGS} ${COPTS} ${PIPE} AFLAGS=-D_LOCORE -x assembler-with-cpp ${CWARNFLAGS} ${CMACHFLAGS} LINKFLAGS= -Ttext 0x810001e0 -e start --warn-common -nopie
Re: Compile kernel with -std=gnu99
On Tue, Jul 8, 2014 at 12:03 PM, Mark Kettenis mark.kette...@xs4all.nl wrote: I disagree with this diff. We should discourage the use of GNU extensions in our kernel. Therefore I think std=gnu99 would give the wrong signal. Can you clarify your concern? Currently we're (implicitly) compiling with -std=gnu89, which is ISO C90 + GNU extensions. This diff changes us to (explicitly) compile with -std=gnu99 -fgnu89-inline, which is ISO C99 + GNU extensions + GNU C89 inline. I think moving from C90 to C99 is a good idea. I'm indifferent to GNU extensions. If we could make do without them, then great; but technically inline asm is a GNU extension (even if it's available in C99 mode) and I doubt we'll benefit from eliminating that. Using GNU89 inline is an intermediary step, because the kernel isn't ready for C99 inline. See below. As for the inline issue. IMHO, given the incompatbility problems between ISO C and GNU C, only static inline is usable. The semantics of the other inline forms is just too confusing. The occasional extra copy of the code is as far as I'm concerned acceptable. The functions should be small enough for it not to matter. Converting the existing inline functions to be C99 compatible (either by adding static or removing inline) is a next step I have planned, but I want to allow other C99 features first. Also, there are inline functions in MD code, so I'd rather have all kernels on -std=gnu99 -fgnu89-inline. Then we can start cleaning up GNU89 inlines and remove -fgnu89-inline on arches once they're clean. E.g., first clean up all MI and x86 inlines, then the x86 kernels can start compiling without -fgnu89-inline, which will make sure we don't regress in MI code while we start tackling other MD files.
Re: Compile kernel with -std=gnu99
On Tue, Jul 8, 2014 at 12:28 PM, Mark Kettenis mark.kette...@xs4all.nl wrote: With that explanation, this sounds a lot more reasonable. Ah, good. :) Sorry, I should have mentioned up front the followup steps I had planned and the rationale for taking this path.
Re: Use atomics for mutating p_sigmask
Updated diff below with the following changes: - p_sigmask is now volatile, per kettenis's request - kern_fork.c's memcpy() for p_startcopy needs to cast away the volatile now - kettenis pointed out atomic_swap_uint() isn't safe to use in MI code yet, so for now proc_sigsetmask() just relies on volatile assignment being atomic ok? Index: sys/proc.h === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/sys/proc.h,v retrieving revision 1.186 diff -u -p -r1.186 proc.h --- sys/proc.h 15 May 2014 03:52:25 - 1.186 +++ sys/proc.h 8 Jul 2014 19:19:17 - @@ -310,7 +310,7 @@ struct proc { /* The following fields are all copied upon creation in fork. */ #definep_startcopy p_sigmask - sigset_t p_sigmask; /* Current signal mask. */ + volatile sigset_t p_sigmask;/* Current signal mask. */ u_char p_priority; /* Process priority. */ u_char p_usrpri; /* User-priority based on p_cpu and ps_nice. */ @@ -520,6 +520,8 @@ int proc_cansugid(struct proc *); void proc_finish_wait(struct proc *, struct proc *); void process_zap(struct process *); void proc_free(struct proc *); +void proc_sigsetmask(struct proc *, sigset_t); +void proc_sigsuspend(struct proc *, sigset_t); struct sleep_state { int sls_s; @@ -559,4 +561,3 @@ struct cpu_info *cpuset_first(struct cpu #endif /* _KERNEL */ #endif /* !_SYS_PROC_H_ */ - Index: kern/kern_sig.c === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/kern/kern_sig.c,v retrieving revision 1.166 diff -u -p -r1.166 kern_sig.c --- kern/kern_sig.c 4 May 2014 05:03:26 - 1.166 +++ kern/kern_sig.c 8 Jul 2014 20:29:50 - @@ -62,6 +62,7 @@ #include sys/ptrace.h #include sys/sched.h #include sys/user.h +#include sys/atomic.h #include sys/mount.h #include sys/syscallargs.h @@ -439,37 +440,55 @@ sys_sigprocmask(struct proc *p, void *v, syscallarg(int) how; syscallarg(sigset_t) mask; } */ *uap = v; - int error = 0; - int s; - sigset_t mask; + sigset_t mask = SCARG(uap, mask); *retval = p-p_sigmask; - mask = SCARG(uap, mask); - s = splhigh(); switch (SCARG(uap, how)) { case SIG_BLOCK: - p-p_sigmask |= mask ~ sigcantmask; + atomic_setbits_int(p-p_sigmask, mask ~ sigcantmask); break; case SIG_UNBLOCK: - p-p_sigmask = ~mask; + atomic_clearbits_int(p-p_sigmask, mask); break; case SIG_SETMASK: - p-p_sigmask = mask ~ sigcantmask; + proc_sigsetmask(p, mask ~ sigcantmask); break; default: - error = EINVAL; - break; + return (EINVAL); } - splx(s); - return (error); + + return (0); +} + +void +proc_sigsetmask(struct proc *p, sigset_t mask) +{ + KASSERT(p == curproc); + + /* XXX: Abusing volatile assignment as atomic store operation. */ + p-p_sigmask = mask; +} + +/* + * Temporarily replace calling proc's signal mask for the duration of a + * system call. Original signal mask will be restored by userret(). + */ +void +proc_sigsuspend(struct proc *p, sigset_t mask) +{ + KASSERT(p == curproc); + KASSERT(!(p-p_flag P_SIGSUSPEND)); + + p-p_oldmask = p-p_sigmask; + atomic_setbits_int(p-p_flag, P_SIGSUSPEND); + proc_sigsetmask(p, mask); } /* ARGSUSED */ int sys_sigpending(struct proc *p, void *v, register_t *retval) { - *retval = p-p_siglist; return (0); } @@ -489,16 +508,7 @@ sys_sigsuspend(struct proc *p, void *v, struct process *pr = p-p_p; struct sigacts *ps = pr-ps_sigacts; - /* -* When returning from sigpause, we want -* the old mask to be restored after the -* signal handler has finished. Thus, we -* save it here and mark the sigacts structure -* to indicate this. -*/ - p-p_oldmask = p-p_sigmask; - atomic_setbits_int(p-p_flag, P_SIGSUSPEND); - p-p_sigmask = SCARG(uap, mask) ~ sigcantmask; + proc_sigsuspend(p, SCARG(uap, mask) ~ sigcantmask); while (tsleep(ps, PPAUSE|PCATCH, pause, 0) == 0) /* void */; /* always return EINTR rather than ERESTART... */ @@ -749,7 +759,7 @@ trapsignal(struct proc *p, int signum, u p-p_ru.ru_nsignals++; (*pr-ps_emul-e_sendsig)(ps-ps_sigact[signum], signum, p-p_sigmask, trapno, code, sigval); - p-p_sigmask |= ps-ps_catchmask[signum]; + atomic_setbits_int(p-p_sigmask, ps-ps_catchmask[signum]); if ((ps-ps_sigreset mask) != 0) { ps-ps_sigcatch = ~mask; if (signum !=
Re: Use atomics for mutating p_sigmask
On Tue, Jul 8, 2014 at 3:51 PM, Philip Guenther guent...@gmail.com wrote: * using atomics when we don't even have correct process-pending signals risks a solution that doesn't work when more than one pending set has to be examined. I'll hold off on this until we have process-pending signals then.
Mark get*[ug]id() as NOLOCK
Recently guenther changed user credentials to be a per-process resource, but kept a per-thread cache of credentials that get refreshed at each system call entry. All of the get*[ug]id() system calls simply access the thread cached credentials, and possibly copyout() them if necessary, so they're safe to mark as NOLOCK. ok? Index: syscalls.master === RCS file: /cvs/src/sys/kern/syscalls.master,v retrieving revision 1.141 diff -u -p -r1.141 syscalls.master --- syscalls.master 6 Jul 2014 20:55:58 - 1.141 +++ syscalls.master 7 Jul 2014 18:06:34 - @@ -80,8 +80,8 @@ int flags, void *data); } 22 STD { int sys_unmount(const char *path, int flags); } 23 STD { int sys_setuid(uid_t uid); } -24 STD { uid_t sys_getuid(void); } -25 STD { uid_t sys_geteuid(void); } +24 STD NOLOCK { uid_t sys_getuid(void); } +25 STD NOLOCK { uid_t sys_geteuid(void); } #ifdef PTRACE 26 STD { int sys_ptrace(int req, pid_t pid, caddr_t addr, \ int data); } @@ -112,7 +112,7 @@ 41 STD { int sys_dup(int fd); } 42 STD { int sys_fstatat(int fd, const char *path, \ struct stat *buf, int flag); } -43 STD { gid_t sys_getegid(void); } +43 STD NOLOCK { gid_t sys_getegid(void); } 44 STD { int sys_profil(caddr_t samples, size_t size, \ u_long offset, u_int scale); } #ifdef KTRACE @@ -124,7 +124,7 @@ 46 STD { int sys_sigaction(int signum, \ const struct sigaction *nsa, \ struct sigaction *osa); } -47 STD { gid_t sys_getgid(void); } +47 STD NOLOCK { gid_t sys_getgid(void); } 48 STD { int sys_sigprocmask(int how, sigset_t mask); } 49 STD { int sys_getlogin(char *namebuf, u_int namelen); } 50 STD { int sys_setlogin(const char *namebuf); } @@ -181,7 +181,7 @@ const struct timeval *tptr); } 78 STD { int sys_mincore(void *addr, size_t len, \ char *vec); } -79 STD { int sys_getgroups(int gidsetsize, \ +79 STD NOLOCK { int sys_getgroups(int gidsetsize, \ gid_t *gidset); } 80 STD { int sys_setgroups(int gidsetsize, \ const gid_t *gidset); } @@ -476,11 +476,11 @@ 278UNIMPL sys_extattr_set_fd 279UNIMPL sys_extattr_get_fd 280UNIMPL sys_extattr_delete_fd -281STD { int sys_getresuid(uid_t *ruid, uid_t *euid, \ +281STD NOLOCK { int sys_getresuid(uid_t *ruid, uid_t *euid, \ uid_t *suid); } 282STD { int sys_setresuid(uid_t ruid, uid_t euid, \ uid_t suid); } -283STD { int sys_getresgid(gid_t *rgid, gid_t *egid, \ +283STD NOLOCK { int sys_getresgid(gid_t *rgid, gid_t *egid, \ gid_t *sgid); } 284STD { int sys_setresgid(gid_t rgid, gid_t egid, \ gid_t sgid); }
Use atomics for mutating p_sigmask
p_sigmask is only modified by the owning thread from process context (e.g., sys_sigprocmask(), sys_sigreturn(), userret(), or postsig()), but it can be accessed anywhere (e.g., interrupts or threads on other CPUs). Currently sys_sigprocmask() protects p_sigmask with splhigh(), but that's not MP-safe. The simpler solution is to take advantage of our atomics APIs. Unfortunately for implementing SIG_SETMASK, we don't have an atomic_store_int() function, and I can't bring myself to abuse volatile for this purpose, so I've resorted to atomic_swap_uint(). While here, I also refactored the P_SIGSUSPEND code a bit so there's less code duplication. I've included just amd64 and i386's machdep.c for now, because those were the only ones I'm readily able to test. The others should be easy to similarly fix though, and I can send a followup diff for them. ok? Index: sys/proc.h === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/sys/proc.h,v retrieving revision 1.186 diff -u -p -r1.186 proc.h --- sys/proc.h 15 May 2014 03:52:25 - 1.186 +++ sys/proc.h 7 Jul 2014 19:23:06 - @@ -520,6 +520,8 @@ int proc_cansugid(struct proc *); void proc_finish_wait(struct proc *, struct proc *); void process_zap(struct process *); void proc_free(struct proc *); +void proc_sigsetmask(struct proc *, sigset_t); +void proc_sigsuspend(struct proc *, sigset_t); struct sleep_state { int sls_s; @@ -559,4 +561,3 @@ struct cpu_info *cpuset_first(struct cpu #endif /* _KERNEL */ #endif /* !_SYS_PROC_H_ */ - Index: kern/kern_sig.c === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/kern/kern_sig.c,v retrieving revision 1.166 diff -u -p -r1.166 kern_sig.c --- kern/kern_sig.c 4 May 2014 05:03:26 - 1.166 +++ kern/kern_sig.c 7 Jul 2014 19:24:07 - @@ -62,6 +62,7 @@ #include sys/ptrace.h #include sys/sched.h #include sys/user.h +#include sys/atomic.h #include sys/mount.h #include sys/syscallargs.h @@ -439,37 +440,55 @@ sys_sigprocmask(struct proc *p, void *v, syscallarg(int) how; syscallarg(sigset_t) mask; } */ *uap = v; - int error = 0; - int s; - sigset_t mask; + sigset_t mask = SCARG(uap, mask); *retval = p-p_sigmask; - mask = SCARG(uap, mask); - s = splhigh(); switch (SCARG(uap, how)) { case SIG_BLOCK: - p-p_sigmask |= mask ~ sigcantmask; + atomic_setbits_int(p-p_sigmask, mask ~ sigcantmask); break; case SIG_UNBLOCK: - p-p_sigmask = ~mask; + atomic_clearbits_int(p-p_sigmask, mask); break; case SIG_SETMASK: - p-p_sigmask = mask ~ sigcantmask; + proc_sigsetmask(p, mask ~ sigcantmask); break; default: - error = EINVAL; - break; + return (EINVAL); } - splx(s); - return (error); + + return (0); +} + +void +proc_sigsetmask(struct proc *p, sigset_t mask) +{ + KASSERT(p == curproc); + + /* XXX: An atomic store would suffice here. */ + (void)atomic_swap_uint(p-p_sigmask, mask); +} + +/* + * Temporarily replace calling proc's signal mask for the duration of a + * system call. Original signal mask will be restored by userret(). + */ +void +proc_sigsuspend(struct proc *p, sigset_t mask) +{ + KASSERT(p == curproc); + KASSERT(!(p-p_flag P_SIGSUSPEND)); + + p-p_oldmask = p-p_sigmask; + atomic_setbits_int(p-p_flag, P_SIGSUSPEND); + proc_sigsetmask(p, mask); } /* ARGSUSED */ int sys_sigpending(struct proc *p, void *v, register_t *retval) { - *retval = p-p_siglist; return (0); } @@ -489,16 +508,7 @@ sys_sigsuspend(struct proc *p, void *v, struct process *pr = p-p_p; struct sigacts *ps = pr-ps_sigacts; - /* -* When returning from sigpause, we want -* the old mask to be restored after the -* signal handler has finished. Thus, we -* save it here and mark the sigacts structure -* to indicate this. -*/ - p-p_oldmask = p-p_sigmask; - atomic_setbits_int(p-p_flag, P_SIGSUSPEND); - p-p_sigmask = SCARG(uap, mask) ~ sigcantmask; + proc_sigsuspend(p, SCARG(uap, mask) ~ sigcantmask); while (tsleep(ps, PPAUSE|PCATCH, pause, 0) == 0) /* void */; /* always return EINTR rather than ERESTART... */ Index: kern/sys_generic.c === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/kern/sys_generic.c,v retrieving revision 1.86 diff -u -p -r1.86 sys_generic.c --- kern/sys_generic.c 30 Mar 2014 21:54:48 - 1.86 +++ kern/sys_generic.c 7 Jul 2014 19:24:07 - @@ -658,9 +620,7 @@ dopselect(struct proc
Reclaim some wasted pages on amd64
I spent yesterday trying to really grok early pmap initialization on amd64, and I noticed what I believe to be wasted physical pages of memory: 1. In locore.S, we setup both an identity and an actual mapping for the kernel, to help us bootstrap to executing in high memory. Also, we reserve pages as though these mappings are independent; but actually the mappings cleverly (accidentally??) reuse the same L2 and L3 tables, so we can safely reserve just one set of pages. (Just the same, I've added checks to make sure this is safe, and also to sanity check that we don't overflow the page tables.) 2. In pmap_bootstrap(), we setup the rest of the direct physical memory mapping. locore.S has already mapped the first 4GB of memory, and we don't remove or replace those mappings; so we can save another couple pages by only allocating page directories for the rest of physical memory. In total, this saves a whopping 5 pages (20kB) of physical memory on my amd64 laptop, as measured by dmesg's avail mem. So not terribly important, but I have a patch to enable 512GB of physical address space that somewhat builds on top of this, and I think it will be easier to review if these changes are already in place. This could probably use some testing under memory-intensive workloads, especially on systems with 4GB of RAM. Index: amd64/locore.S === RCS file: /home/matthew/anoncvs/cvs/src/sys/arch/amd64/amd64/locore.S,v retrieving revision 1.54 diff -u -p -r1.54 locore.S --- amd64/locore.S 10 Nov 2012 09:45:05 - 1.54 +++ amd64/locore.S 3 Jul 2014 23:33:25 - @@ -358,29 +358,48 @@ bi_size_ok: * 0 1 2 3 */ -#if L2_SLOT_KERNBASE 0 -#define TABLE_L2_ENTRIES (2 * (NKL2_KIMG_ENTRIES + 1)) -#else -#define TABLE_L2_ENTRIES (NKL2_KIMG_ENTRIES + 1) -#endif +/* + * We now want to enable paging, move the kernel to high virtual memory, and + * jump there; but unfortunately, we can't do that atomically. Instead, we + * setup a page table that maps the kernel pages to their eventual virtual + * address ranges, but also includes an identity map that keeps those pages + * available via their physical addresses even once paging is enabled. Later, + * once we're executing out of the the eventual VA range, we remove the + * identity map. + * + * As an additional trick to conserve a few pages, the actual map and the + * identity map share L2 and L3 tables. This works as long as they don't + * require inexactly overlapping table entries, which we check for below. + * (It also means the kernel may end up mapped multiple times in virtual + * memory, but only until we tear down the identity map.) + */ +/* XXX(matthew): Why do we always add 1 to NKL2_KIMG_ENTRIES? */ -#if L3_SLOT_KERNBASE 0 -#define TABLE_L3_ENTRIES (2 * NKL3_KIMG_ENTRIES) -#else -#define TABLE_L3_ENTRIES NKL3_KIMG_ENTRIES +#if L2_SLOT_KERNBASE 0 L2_SLOT_KERNBASE NKL2_KIMG_ENTRIES + 1 +#error L2 table collision between identity and actual map entries +#endif +#if L3_SLOT_KERNBASE 0 L3_SLOT_KERNBASE NKL3_KIMG_ENTRIES +#error L3 table collision between identity and actual map entries #endif +/* Sanity check. */ +#if L2_SLOT_KERNBASE + NKL2_KIMG_ENTRIES + 1 NPDPG +#error L2 table overflow +#endif +#if L3_SLOT_KERNBASE + NKL3_KIMG_ENTRIES NPDPG +#error L3 table overflow +#endif #define PROC0_PML4_OFF 0 #define PROC0_STK_OFF (PROC0_PML4_OFF + NBPG) #define PROC0_PTP3_OFF (PROC0_STK_OFF + UPAGES * NBPG) #define PROC0_PTP2_OFF (PROC0_PTP3_OFF + NKL4_KIMG_ENTRIES * NBPG) -#define PROC0_PTP1_OFF (PROC0_PTP2_OFF + TABLE_L3_ENTRIES * NBPG) -#definePROC0_DMP3_OFF (PROC0_PTP1_OFF + TABLE_L2_ENTRIES * NBPG) +#define PROC0_PTP1_OFF (PROC0_PTP2_OFF + NKL3_KIMG_ENTRIES * NBPG) +#definePROC0_DMP3_OFF (PROC0_PTP1_OFF + (NKL2_KIMG_ENTRIES + 1) * NBPG) #define PROC0_DMP2_OFF (PROC0_DMP3_OFF + NDML3_ENTRIES * NBPG) #define TABLESIZE \ -((NKL4_KIMG_ENTRIES + TABLE_L3_ENTRIES + TABLE_L2_ENTRIES + 1 + UPAGES + \ - NDML3_ENTRIES + NDML2_ENTRIES) * NBPG) +((NKL4_KIMG_ENTRIES + NKL3_KIMG_ENTRIES + (NKL2_KIMG_ENTRIES + 1) + 1 + \ + UPAGES + NDML3_ENTRIES + NDML2_ENTRIES) * NBPG) #define fillkpt\ 1: movl%eax,(%ebx) ; /* store phys addr */ \ Index: amd64/pmap.c === RCS file: /home/matthew/anoncvs/cvs/src/sys/arch/amd64/amd64/pmap.c,v retrieving revision 1.70 diff -u -p -r1.70 pmap.c --- amd64/pmap.c15 Jun 2014 11:43:24 - 1.70 +++ amd64/pmap.c3 Jul 2014 23:41:34 - @@ -510,10 +510,9 @@ pmap_bootstrap(paddr_t first_avail, padd { vaddr_t kva, kva_end, kva_start = VM_MIN_KERNEL_ADDRESS; struct pmap *kpm; - int i; + size_t i, ndmpdp; unsigned long p1i; pt_entry_t pg_nx = (cpu_feature CPUID_NXE? PG_NX : 0); - long ndmpdp; paddr_t dmpd,
clang compat for Makefile.amd64
Diff below allows building an amd64 kernel with CC=clang make. Some random notes: 1. A bunch of the -Wno-foo warnings are probably worth fixing, but it's not a high priority for me at the moment. 2. -Wframe-larger-than= is the name modern GCC and Clang have adopted, and it probably wouldn't hurt to add it as an alias for base GCC's -Wstack-larger-than- flag. 3. I think -ffreestanding instead of -fno-builtin-{foo,bar,quux,...} may work with GCC too, but I haven't tested. At least according to the GCC 3.3.6 and 4.2.4 manuals, the only functions GCC will assume exist are mem{cmp,cpy,move,set}(). 4. The HOSTCC and HOSTED_XXX variables are present in all of our Makefile.foo files, but they don't seem to be used for anything. Is there any point to keeping them around? (If not, I can remove them from all Makefile.foo's.) 5. amd64/conf/kern.ldscript is confusing. It's not used, and I spent a while the other day (while working on my ddb line numbers diff) trying to figure out why changing it wasn't affecting my build output. I'd like to remove it until someone proposes a real use for it. Index: Makefile.amd64 === RCS file: /home/matthew/anoncvs/cvs/src/sys/arch/amd64/conf/Makefile.amd64,v retrieving revision 1.59 diff -u -p -r1.59 Makefile.amd64 --- Makefile.amd64 8 May 2014 17:59:28 - 1.59 +++ Makefile.amd64 2 Jul 2014 23:53:07 - @@ -24,14 +24,24 @@ _archdir?= $S/arch/${_arch} INCLUDES= -nostdinc -I$S -I. -I$S/arch CPPFLAGS= ${INCLUDES} ${IDENT} ${PARAM} -D_KERNEL -MD -MP CWARNFLAGS=-Werror -Wall -Wstrict-prototypes -Wmissing-prototypes \ - -Wno-main -Wno-uninitialized \ - -Wstack-larger-than-2047 + -Wno-main -Wno-uninitialized +.if ${CC} == clang +CWARNFLAGS+= -Wframe-larger-than=2047 -Wno-enum-conversion -Wno-unused \ + -Wno-tautological-compare -Wno-pointer-sign \ + -Wno-shift-overflow +.else +CWARNFLAGS+= -Wstack-larger-than-2047 +.endif CMACHFLAGS=-mcmodel=kernel -mno-red-zone -mno-sse2 -mno-sse -mno-3dnow \ - -mno-mmx -msoft-float -fno-omit-frame-pointer + -mno-mmx -msoft-float -fno-omit-frame-pointer ${NOPIE_FLAGS} +.if ${CC} == clang +CMACHFLAGS+= -ffreestanding +.else CMACHFLAGS+= -fno-builtin-printf -fno-builtin-snprintf \ -fno-builtin-vsnprintf -fno-builtin-log \ - -fno-builtin-log2 -fno-builtin-malloc ${NOPIE_FLAGS} + -fno-builtin-log2 -fno-builtin-malloc +.endif .if ${IDENT:M-DNO_PROPOLICE} CMACHFLAGS+= -fno-stack-protector .endif @@ -49,11 +59,6 @@ DB_STRUCTINFO= db_structinfo.h .else DB_STRUCTINFO= .endif - -HOSTCC?= ${CC} -HOSTED_CPPFLAGS=${CPPFLAGS:S/^-nostdinc$//} -HOSTED_CFLAGS= ${CFLAGS} -HOSTED_C= ${HOSTCC} ${HOSTED_CFLAGS} ${HOSTED_CPPFLAGS} -c $ NORMAL_C_NOP= ${CC} ${CFLAGS} ${CPPFLAGS} -c $ NORMAL_C= ${CC} ${CFLAGS} ${CPPFLAGS} ${PROF} -c $
Re: clang compat for Makefile.amd64
On Wed, Jul 2, 2014 at 5:16 PM, Matthew Dempsky matt...@dempsky.org wrote: +.if ${CC} == clang +CMACHFLAGS+= -ffreestanding Sorry, Clang actually needs -fno-integrated-as too, otherwise the integrated assembler complains about the bogus XYZZY instructions in kern/genassym.sh.
identcpu for 1-GByte pages
According to the Intel 64 and IA-32 Architectures Software Developer's Manual, CPUID.8001H:EDX.Page1GB [bit 26] indicates whether 1-GByte pages are supported with IA-32e paging. I think the diff below adds support for identifying this feature in dmesg, but my X201s is seemingly to old to support it. ok? Index: amd64/include/specialreg.h === RCS file: /home/matthew/anoncvs/cvs/src/sys/arch/amd64/include/specialreg.h,v retrieving revision 1.27 diff -w -u -p -r1.27 specialreg.h --- amd64/include/specialreg.h 24 Aug 2013 04:26:16 - 1.27 +++ amd64/include/specialreg.h 3 Jul 2014 05:17:18 - @@ -207,6 +207,7 @@ #defineCPUID_NXE 0x0010 /* No-Execute Extension */ #defineCPUID_MMXX 0x0040 /* AMD MMX Extensions */ #defineCPUID_FFXSR 0x0200 /* fast FP/MMX save/restore */ +#defineCPUID_PAGE1GB 0x0400 /* 1-GByte pages */ #defineCPUID_LONG 0x2000 /* long mode */ #defineCPUID_3DNOW20x4000 /* 3DNow! Instruction Extension */ #defineCPUID_3DNOW 0x8000 /* 3DNow! Instructions */ Index: amd64/amd64/identcpu.c === RCS file: /home/matthew/anoncvs/cvs/src/sys/arch/amd64/amd64/identcpu.c,v retrieving revision 1.52 diff -w -u -p -r1.52 identcpu.c --- amd64/amd64/identcpu.c 19 Nov 2013 04:12:17 - 1.52 +++ amd64/amd64/identcpu.c 3 Jul 2014 05:19:54 - @@ -96,6 +96,7 @@ const struct { { CPUID_NXE,NXE }, { CPUID_MMXX, MMXX }, { CPUID_FFXSR, FFXSR }, + { CPUID_PAGE1GB,PAGE1GB }, { CPUID_LONG, LONG }, { CPUID_3DNOW2, 3DNOW2 }, { CPUID_3DNOW, 3DNOW } Index: i386/include/specialreg.h === RCS file: /home/matthew/anoncvs/cvs/src/sys/arch/i386/include/specialreg.h,v retrieving revision 1.46 diff -w -u -p -r1.46 specialreg.h --- i386/include/specialreg.h 24 Aug 2013 04:26:16 - 1.46 +++ i386/include/specialreg.h 3 Jul 2014 05:20:23 - @@ -206,6 +206,7 @@ #defineCPUID_NXE 0x0010 /* No-Execute Extension */ #defineCPUID_MMXX 0x0040 /* AMD MMX Extensions */ #defineCPUID_FFXSR 0x0200 /* fast FP/MMX save/restore */ +#defineCPUID_PAGE1GB 0x0400 /* 1-GByte pages */ #defineCPUID_LONG 0x2000 /* long mode */ #defineCPUID_3DNOW20x4000 /* 3DNow! Instruction Extension */ #defineCPUID_3DNOW 0x8000 /* 3DNow! Instructions */ Index: i386/i386/machdep.c === RCS file: /home/matthew/anoncvs/cvs/src/sys/arch/i386/i386/machdep.c,v retrieving revision 1.539 diff -w -u -p -r1.539 machdep.c --- i386/i386/machdep.c 15 Jun 2014 11:43:24 - 1.539 +++ i386/i386/machdep.c 3 Jul 2014 05:20:56 - @@ -1001,6 +1001,7 @@ const struct cpu_cpuid_feature i386_ecpu { CPUID_NXE,NXE }, { CPUID_MMXX, MMXX }, { CPUID_FFXSR, FFXSR }, + { CPUID_PAGE1GB,PAGE1GB }, { CPUID_LONG, LONG }, { CPUID_3DNOW2, 3DNOW2 }, { CPUID_3DNOW, 3DNOW }
Replace void * in u{dv,vn}_attach()
udv_attach() and uvn_attach() are called directly, not via any generic dispatch table mechanism, so there's no point in specifying that they accept void * instead of the underlying types they actually expect. ok? Index: share/man/man9/uvm.9 === RCS file: /home/matthew/cvs-mirror/cvs/src/share/man/man9/uvm.9,v retrieving revision 1.55 diff -u -p -r1.55 uvm.9 --- share/man/man9/uvm.930 Jun 2014 21:48:09 - 1.55 +++ share/man/man9/uvm.91 Jul 2014 19:57:54 - @@ -439,7 +439,7 @@ returns a standard errno. .Sh MEMORY MAPPING FILES AND DEVICES .nr nS 1 .Ft struct uvm_object * -.Fn uvn_attach void *arg vm_prot_t accessprot +.Fn uvn_attach struct vnode *vp vm_prot_t accessprot .Ft void .Fn uvm_vnp_setsize struct vnode *vp voff_t newsize .Ft void @@ -453,7 +453,7 @@ returns a standard errno. The .Fn uvn_attach function attaches a UVM object to vnode -.Fa arg , +.Fa vp , creating the object if necessary. The object is returned. .Pp Index: sys/kern/exec_subr.c === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/kern/exec_subr.c,v retrieving revision 1.33 diff -u -p -r1.33 exec_subr.c --- sys/kern/exec_subr.c29 May 2014 05:05:34 - 1.33 +++ sys/kern/exec_subr.c1 Jul 2014 19:54:15 - @@ -187,7 +187,7 @@ vmcmd_map_pagedvn(struct proc *p, struct * first, attach to the object */ - uobj = uvn_attach((void *)cmd-ev_vp, VM_PROT_READ|VM_PROT_EXECUTE); + uobj = uvn_attach(cmd-ev_vp, VM_PROT_READ|VM_PROT_EXECUTE); if (uobj == NULL) return (ENOMEM); Index: sys/uvm/uvm_vnode.c === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/uvm/uvm_vnode.c,v retrieving revision 1.82 diff -u -p -r1.82 uvm_vnode.c --- sys/uvm/uvm_vnode.c 8 May 2014 20:08:50 - 1.82 +++ sys/uvm/uvm_vnode.c 1 Jul 2014 19:53:41 - @@ -139,9 +139,8 @@ uvn_init(void) *pointers are equiv. */ struct uvm_object * -uvn_attach(void *arg, vm_prot_t accessprot) +uvn_attach(struct vnode *vp, vm_prot_t accessprot) { - struct vnode *vp = arg; struct uvm_vnode *uvn = vp-v_uvm; struct vattr vattr; int oldflags, result; Index: sys/uvm/uvm_vnode.h === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/uvm/uvm_vnode.h,v retrieving revision 1.12 diff -u -p -r1.12 uvm_vnode.h --- sys/uvm/uvm_vnode.h 14 Mar 2002 01:27:19 - 1.12 +++ sys/uvm/uvm_vnode.h 1 Jul 2014 19:53:54 - @@ -102,7 +102,7 @@ struct uvm_vnode { * include sys/vnode.h, and files that include sys/vnode.h don't know * what a vm_prot_t is. */ -struct uvm_object *uvn_attach(void *, vm_prot_t); +struct uvm_object *uvn_attach(struct vnode *, vm_prot_t); #endif #endif /* _KERNEL */ Index: sys/uvm/uvm_device.c === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/uvm/uvm_device.c,v retrieving revision 1.45 diff -u -p -r1.45 uvm_device.c --- sys/uvm/uvm_device.c13 Apr 2014 23:14:15 - 1.45 +++ sys/uvm/uvm_device.c1 Jul 2014 19:56:09 - @@ -97,9 +97,8 @@ struct uvm_pagerops uvm_deviceops = { * The last two arguments (off and size) are only used for access checking. */ struct uvm_object * -udv_attach(void *arg, vm_prot_t accessprot, voff_t off, vsize_t size) +udv_attach(dev_t device, vm_prot_t accessprot, voff_t off, vsize_t size) { - dev_t device = *((dev_t *)arg); struct uvm_device *udv, *lcv; paddr_t (*mapfn)(dev_t, off_t, int); #if NDRM 0 @@ -118,7 +117,7 @@ udv_attach(void *arg, vm_prot_t accesspr return(NULL); #if NDRM 0 - obj = udv_attach_drm(arg, accessprot, off, size); + obj = udv_attach_drm(device, accessprot, off, size); if (obj) return(obj); #endif Index: sys/uvm/uvm_device.h === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/uvm/uvm_device.h,v retrieving revision 1.9 diff -u -p -r1.9 uvm_device.h --- sys/uvm/uvm_device.h7 Jun 2013 20:46:14 - 1.9 +++ sys/uvm/uvm_device.h1 Jul 2014 19:55:42 - @@ -70,8 +70,8 @@ struct uvm_device { * prototypes */ -struct uvm_object *udv_attach(void *, vm_prot_t, voff_t, vsize_t); -struct uvm_object *udv_attach_drm(void *, vm_prot_t, voff_t, vsize_t); +struct uvm_object *udv_attach(dev_t, vm_prot_t, voff_t, vsize_t); +struct uvm_object *udv_attach_drm(dev_t, vm_prot_t, voff_t, vsize_t); #endif /* _KERNEL */ Index: sys/uvm/uvm_extern.h === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/uvm/uvm_extern.h,v retrieving revision 1.116 diff -u -p -r1.116 uvm_extern.h --- sys/uvm/uvm_extern.h13 Jun 2014
Re: Rename MAP_ANON to MAP_ANONYMOUS
On Mon, Jun 30, 2014 at 2:47 PM, Ingo Schwarze schwa...@usta.de wrote: so in a nutshell, mmap(2) was originally a BSD idea and first implemented in SunOS? And there is no doubt that *BSD always had MAP_ANON and never MAP_ANONYMOUS and that SunOS primarily defines MAP_ANON and MAP_ANONYMOUS only for /* (source compatibility) */, right? And that the earliest occurence of MAP_ANONYMOUS we found so far is Linux (1994)? And that 4.4BSD was released with MAP_ANON before that (1993)... I've reaffirmed that OpenBSD's stance is that we'd prefer that MAP_ANON be standardized, explained the commit history of MAP_ANON and MAP_ANONYMOUS (thanks Ingo!), and also pointed out that Advanced Programming in the UNIX Environment recommends MAP_ANON: http://austingroupbugs.net/view.php?id=850#c2299 I'll update the thread if anything new develops. Next Austin Group teleconference is July 10.
Rename MAP_ANON to MAP_ANONYMOUS
I filed an enhancement request with the Austin Group to standardize an mmap() flag for mapping anonymous memory. I proposed standardizing MAP_ANON, but the current proposal is to standardize MAP_ANONYMOUS instead, as that seems to be the more common definition and usage: http://austingroupbugs.net/view.php?id=850 All else equal, I prefer MAP_ANON as it's shorter and it's also the original BSD name; but I prefer having a standardized name even more. If anyone objects and has convincing arguments for why POSIX should standardize MAP_ANON instead, feel free to either comment on the above bug, or just express them here and I'll proxy them appropriately. Otherwise, diff below renames MAP_ANON to MAP_ANONYMOUS, and redefines MAP_ANON as an alias for MAP_ANONYMOUS. Index: sys/sys/mman.h === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/sys/mman.h,v retrieving revision 1.25 diff -u -p -r1.25 mman.h --- sys/sys/mman.h 27 Jun 2014 20:50:43 - 1.25 +++ sys/sys/mman.h 27 Jun 2014 21:35:47 - @@ -56,7 +56,8 @@ */ #defineMAP_FIXED 0x0010 /* map addr must be exactly as requested */ #define__MAP_NOREPLACE 0x0800 /* fail if address not available */ -#defineMAP_ANON0x1000 /* allocated from memory, swap space */ +#defineMAP_ANONYMOUS 0x1000 /* allocated from memory, swap space */ +#defineMAP_ANONMAP_ANONYMOUS /* original BSD spelling */ #defineMAP_FLAGMASK0x1ff7 Index: lib/libc/sys/mmap.2 === RCS file: /home/matthew/cvs-mirror/cvs/src/lib/libc/sys/mmap.2,v retrieving revision 1.48 diff -u -p -r1.48 mmap.2 --- lib/libc/sys/mmap.2 27 Jun 2014 22:51:20 - 1.48 +++ lib/libc/sys/mmap.2 28 Jun 2014 00:01:25 - @@ -120,7 +120,7 @@ Sharing, mapping type, and options are s .Fa flags argument by OR'ing the following values. Exactly one of the first two values must be specified: -.Bl -tag -width MAP_PRIVATE -offset indent +.Bl -tag -width MAP_ANONYMOUS -offset indent .It Dv MAP_PRIVATE Modifications are private. .It Dv MAP_SHARED @@ -128,11 +128,11 @@ Modifications are shared. .El .Pp Any combination of the following flags may additionally be used: -.Bl -tag -width MAP_PRIVATE -offset indent -.It Dv MAP_ANON +.Bl -tag -width MAP_ANONYMOUS -offset indent +.It Dv MAP_ANONYMOUS Map anonymous memory not associated with any specific file. The file descriptor used for creating -.Dv MAP_ANON +.Dv MAP_ANONYMOUS must currently be \-1 indicating no name is associated with the region. .It Dv MAP_FIXED @@ -156,7 +156,16 @@ source compatibility with code written f but are not recommended for use in new .Ox code: -.Bl -tag -width MAP_PRIVATE -offset indent +.Bl -tag -width MAP_ANONYMOUS -offset indent +.It Dv MAP_ANON +Original name for +.Dv MAP_ANONYMOUS , +as introduced by +.Bx . +On +.Ox +this is an alias for +.Dv MAP_ANONYMOUS . .It Dv MAP_COPY Modifications are private and, unlike .Dv MAP_PRIVATE ,
Re: Rename MAP_ANON to MAP_ANONYMOUS
On Mon, Jun 30, 2014 at 10:42 AM, Mark Kettenis mark.kette...@xs4all.nl wrote: Solaris documents MAP_ANON in its man page, and defines MAP_ANONYMOUS as MAP_ANON for source compatibility. Yep, but what about it? Are you suggesting that should affect POSIX's standardization, or that we should do the same thing? I suspect if POSIX standardizes MAP_ANONYMOUS, that Solaris would switch to documenting MAP_ANONYMOUS and providing MAP_ANON for source compat, no?
Re: Rename MAP_ANON to MAP_ANONYMOUS
On Mon, Jun 30, 2014 at 11:31 AM, Mark Kettenis mark.kette...@xs4all.nl wrote: Yes, I'm saying that this should affect POSIX's standardization. Solaris is where mmap(2) came from. The full history is a bit more complicated though. From what I've managed to uncover over the past few days so far: mmap() first appeared in 4.1cBSD [mmap.2] and was scheduled for inclusion in 4.2BSD (1983) [UVM thesis, p36], but didn't actually appear until the 1993 4.4BSD release [UVM thesis, p36]. In the mean time, it was independently reimplemented for SunOS 4 [UVM thesis, p36], and released in 1988 [Wikipedia]. However, SunOS didn't implement MAP_ANON, and instead required users to pass a file descriptor for /dev/zero. In 1994 (prior to Linux's 1.0 release), Linus added support for MAP_ANONYMOUS [https://kernel.googlesource.com/pub/scm/linux/kernel/git/nico/archive/+/0b5e8609bf7e6899c1fea30aa467812d488b6c11%5E%21/#F3]. SunOS 5.8 (Feb 2000) added support for MAP_ANON [http://www.freebsd.org/cgi/man.cgi?query=mmapapropos=0sektion=0manpath=SunOS+5.8arch=defaultformat=html]. I'm not sure when it added MAP_ANONYMOUS support. Also according to Wikipedia, HP-UX is an SVR2 derivative and AIX is an SVR3 derivative. SVR2 and SVR3 were released in 1984 and 1986, respectively, so neither should have inherited MAP_ANON/MAP_ANONYMOUS from SVR/SunOS. I'm not sure when they picked them up. Also, look at: https://github.com/sgminer-dev/sgminer/blob/master/m4/mmap-anon.m4 Yeah, I've seen that, though despite the comment pointing out MAP_ANON is more common, it instead adds a #define MAP_ANONYMOUS MAP_ANON if it notices MAP_ANONYMOUS isn't provided. So ironically, that macro package's widespread use is indicative of code that uses MAP_ANONYMOUS instead of MAP_ANON. :-/
Re: Rename MAP_ANON to MAP_ANONYMOUS
On Mon, Jun 30, 2014 at 2:47 PM, Ingo Schwarze schwa...@usta.de wrote: mmap() first appeared in 4.1cBSD [mmap.2] and was scheduled for inclusion in 4.2BSD (1983) [UVM thesis, p36], but didn't actually appear until the 1993 4.4BSD release [UVM thesis, p36]. At least MAP_ANON is definitely a lot older than 1993. The original CSRG commit introducing it looks like this: sys/sys/SCCS/s.mman.h: As 00038/00013/00010 Ad D 7.2 90/12/05 15:28:56 mckusick 6 5 Ac update for new VM And that commit did make it into the 4.3BSD Net/2 release, published on August 20, 1991 (as seen on Kirk's CSRG archive CD). Thanks. I think that's generally consistent with the years I quoted above, though I suppose the UVM thesis's actually appear wording is arguable. Maybe CDC was referring to how 4.3BSD was a development release, whereas 4.4BSD was a production release? (This release represents an intermediate point in the development of 4.4BSD; [...] This distribution is NOT intended to be used on production systems http://ftp.netbsd.org/pub/NetBSD/misc/release/BSD/BSD-Net1) So i'd challenge anybody arguing for MAP_ANONYMOUS to show a published sys/mman.h header file containing that spelling where it was introduced before December 5, 1990. I don't think MAP_ANONYMOUS is being proposed for standardization because it's perceived to be of older origin than MAP_ANON or anything. I'm pretty sure the focus is instead because it's perceived to have greater 'market share' among present day systems and applications.
Re: Unnecessary mmap flags?
On Fri, Jun 27, 2014 at 02:46:01PM +0200, Mark Kettenis wrote: Losing the descriptions of the no-op flags is a bit unfortunate. Can you add those back? Okay, restored them below. Also tested that kdump can handle this change gracefully. In this diff I've also moved MAP_FILE down to the legacy flags section. I don't expect we'll want to remove it, but strictly speaking it *is* just a legacy flag: POSIX doesn't define MAP_FILE and instead just mandates the map from file semantics even without any flag. ok? Index: sys/sys/mman.h === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/sys/mman.h,v retrieving revision 1.24 diff -u -p -r1.24 mman.h --- sys/sys/mman.h 13 Jun 2014 01:48:52 - 1.24 +++ sys/sys/mman.h 27 Jun 2014 17:26:32 - @@ -50,32 +50,49 @@ */ #defineMAP_SHARED 0x0001 /* share changes */ #defineMAP_PRIVATE 0x0002 /* changes are private */ -#defineMAP_COPY0x0004 /* copy region at mmap time */ /* * Other flags */ -#defineMAP_FIXED0x0010 /* map addr must be exactly as requested */ -#defineMAP_RENAME 0x0020 /* Sun: rename private pages to file */ -#defineMAP_NORESERVE0x0040 /* Sun: don't reserve needed swap area */ -#defineMAP_INHERIT 0x0080 /* region is retained after exec */ -#defineMAP_NOEXTEND 0x0100 /* for MAP_FILE, don't change file size */ -#defineMAP_HASSEMAPHORE 0x0200 /* region may contain semaphores */ -#defineMAP_TRYFIXED 0x0400 /* attempt hint address, even within heap */ +#defineMAP_FIXED 0x0010 /* map addr must be exactly as requested */ +#define__MAP_NOREPLACE 0x0800 /* fail if address not available */ +#defineMAP_ANON0x1000 /* allocated from memory, swap space */ -#define__MAP_NOREPLACE 0x0800 /* fail if address not available */ +#defineMAP_FLAGMASK0x1ff7 +#ifdef _KERNEL /* - * Error return from mmap() + * Backwards compat for OpenBSD 5.5. + * TODO: Remove after OpenBSD 5.7 release. */ -#define MAP_FAILED ((void *)-1) +#defineMAP_OLDCOPY 0x0004 /* alias for MAP_PRIVATE */ +#defineMAP_OLDRENAME 0x0020 +#defineMAP_OLDNORESERVE0x0040 +#defineMAP_OLDINHERIT 0x0080 +#defineMAP_OLDNOEXTEND 0x0100 +#defineMAP_OLDHASSEMAPHORE 0x0200 +#defineMAP_OLDTRYFIXED 0x0400 +#endif +#ifndef _KERNEL /* - * Mapping type + * Legacy defines for userland source compatibility. + * Can be removed once no longer needed in base and ports. */ -#defineMAP_FILE0x /* map from file (default) */ -#defineMAP_ANON0x1000 /* allocated from memory, swap space */ -#defineMAP_FLAGMASK0x1ff7 +#defineMAP_COPYMAP_PRIVATE /* copy region at mmap time */ +#defineMAP_FILE0 /* map from file (default) */ +#defineMAP_HASSEMAPHORE0 /* region may contain semaphores */ +#defineMAP_INHERIT 0 /* region is retained after exec */ +#defineMAP_NOEXTEND0 /* for MAP_FILE, don't change file size */ +#defineMAP_NORESERVE 0 /* Sun: don't reserve needed swap area */ +#defineMAP_RENAME 0 /* Sun: rename private pages to file */ +#defineMAP_TRYFIXED0 /* attempt hint address, even within heap */ +#endif + +/* + * Error return from mmap() + */ +#define MAP_FAILED ((void *)-1) /* * POSIX memory advisory values. Index: usr.bin/kdump/mksubr === RCS file: /home/matthew/cvs-mirror/cvs/src/usr.bin/kdump/mksubr,v retrieving revision 1.18 diff -u -p -r1.18 mksubr --- usr.bin/kdump/mksubr21 Dec 2013 07:32:35 - 1.18 +++ usr.bin/kdump/mksubr27 Jun 2014 17:08:46 - @@ -241,7 +241,9 @@ cat _EOF_ #include sys/fcntl.h #include sys/stat.h #include sys/unistd.h +#define _KERNEL #include sys/mman.h +#undef _KERNEL #include sys/wait.h #include sys/proc.h #define _KERNEL
Unnecessary mmap flags?
I just reviewed our mmap(2) flags to compare them against Linux, FreeBSD, Solaris, and Darwin's flags. Of the flags listed below, none of them are specified by POSIX, and none of them do anything interesting on OpenBSD: MAP_COPY just gets rewritten to MAP_PRIVATE, and the rest are silently ignored by UVM. Linux FreeBSD Solaris Darwin MAP_COPYno YES*no YES* MAP_RENAME no YES*YES*YES* MAP_NORESERVE YES YES*YES YES* MAP_INHERIT no YES** no no MAP_NOEXTENDno no no YES* MAP_HASSEMAPHOREno YES*** no YES*** MAP_TRYFIXEDno no no no * These are defined in the OS's sys/mman.h, but are undocumented in their mmap(2) manual, and behave the same as on OpenBSD (i.e., MAP_COPY is an alias for MAP_PRIVATE; the others have no effect). ** MAP_INHERIT is documented in FreeBSD's mmap(2) manual (as This flag never operated as advertised, which is true on OpenBSD too), but not defined in their sys/mman.h. *** MAP_HASSEMAPHORE is documented in FreeBSD and Darwin's mmap(2) manuals and defined in their sys/mman.h, but has no effects on either's VM behavior. So MAP_NORESERVE is perhaps necessary/worth keeping around, but the others seem like candidates for removal if nothing in ports needs them. MAP_HASSEMAPHORE is used in rthread_sem.c, but it doesn't do anything, so I suspect it's just cargo culting based on man page misinformation? Are there architectures that actually have restrictions on semaphore memory?
Re: Unnecessary mmap flags?
On Thu, Jun 26, 2014 at 12:28:18PM -0700, Matthew Dempsky wrote: I just reviewed our mmap(2) flags to compare them against Linux, FreeBSD, Solaris, and Darwin's flags. Of the flags listed below, none of them are specified by POSIX, and none of them do anything interesting on OpenBSD: MAP_COPY just gets rewritten to MAP_PRIVATE, and the rest are silently ignored by UVM. Feedback so far is the useless flags should go away. Diff below is a first step towards this: 1. MAP_COPY is redefined as an alias for MAP_PRIVATE, and the other useless MAP_* flags are redefined to 0. They're also hidden from the kernel to make sure no kernel code accidentally depends on them still. 2. Adds COMPAT_O55_MAP_COPY so we can stay binary compatible with any OpenBSD 5.5 programs that still use MAP_COPY (probably none, but it's trivial to do), and COMPAT_O55_MAP_NOOPMASK just to keep track of which bits were previously reserved for do-nothing flags. 3. Reshuffles the defines a little bit so the order makes more sense. Followup patch will add a deprecation warning for the MAP_* flags, but I think that'll need some ports testing, whereas this should be safe to commit now. ok? Index: sys/mman.h === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/sys/mman.h,v retrieving revision 1.24 diff -u -p -r1.24 mman.h --- sys/mman.h 13 Jun 2014 01:48:52 - 1.24 +++ sys/mman.h 26 Jun 2014 23:54:28 - @@ -50,32 +50,43 @@ */ #defineMAP_SHARED 0x0001 /* share changes */ #defineMAP_PRIVATE 0x0002 /* changes are private */ -#defineMAP_COPY0x0004 /* copy region at mmap time */ + +/* + * Mapping type + */ +#defineMAP_FILE0x /* map from file (default) */ +#defineMAP_ANON0x1000 /* allocated from memory, swap space */ /* * Other flags */ -#defineMAP_FIXED0x0010 /* map addr must be exactly as requested */ -#defineMAP_RENAME 0x0020 /* Sun: rename private pages to file */ -#defineMAP_NORESERVE0x0040 /* Sun: don't reserve needed swap area */ -#defineMAP_INHERIT 0x0080 /* region is retained after exec */ -#defineMAP_NOEXTEND 0x0100 /* for MAP_FILE, don't change file size */ -#defineMAP_HASSEMAPHORE 0x0200 /* region may contain semaphores */ -#defineMAP_TRYFIXED 0x0400 /* attempt hint address, even within heap */ +#defineMAP_FIXED 0x0010 /* map addr must be exactly as requested */ +#define__MAP_NOREPLACE 0x0800 /* fail if address not available */ -#define__MAP_NOREPLACE 0x0800 /* fail if address not available */ +#ifdef _KERNEL +#define COMPAT_O55_MAP_COPY0x0004 /* alias for MAP_PRIVATE */ +#define COMPAT_O55_MAP_NOOPMASK0x07e0 /* formerly reserved flag bits */ +#endif + +#defineMAP_FLAGMASK0x1ff7 +#ifndef _KERNEL /* - * Error return from mmap() + * Deprecated flags with no significant meaning on OpenBSD. */ -#define MAP_FAILED ((void *)-1) +#defineMAP_COPYMAP_PRIVATE +#defineMAP_HASSEMAPHORE0 +#defineMAP_INHERIT 0 +#defineMAP_NOEXTEND0 +#defineMAP_NORESERVE 0 +#defineMAP_RENAME 0 +#defineMAP_TRYFIXED0 +#endif /* - * Mapping type + * Error return from mmap() */ -#defineMAP_FILE0x /* map from file (default) */ -#defineMAP_ANON0x1000 /* allocated from memory, swap space */ -#defineMAP_FLAGMASK0x1ff7 +#define MAP_FAILED ((void *)-1) /* * POSIX memory advisory values. Index: uvm/uvm_mmap.c === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/uvm/uvm_mmap.c,v retrieving revision 1.94 diff -u -p -r1.94 uvm_mmap.c --- uvm/uvm_mmap.c 13 Apr 2014 23:14:15 - 1.94 +++ uvm/uvm_mmap.c 26 Jun 2014 23:49:39 - @@ -345,8 +345,8 @@ sys_mmap(struct proc *p, void *v, regist return (EINVAL); if ((flags MAP_FLAGMASK) != flags) return (EINVAL); - if (flags MAP_COPY) - flags = (flags ~MAP_COPY) | MAP_PRIVATE; + if (flags COMPAT_O55_MAP_COPY) + flags = (flags ~COMPAT_O55_MAP_COPY) | MAP_PRIVATE; if ((flags (MAP_SHARED|MAP_PRIVATE)) == (MAP_SHARED|MAP_PRIVATE)) return (EINVAL); if ((flags (MAP_FIXED|__MAP_NOREPLACE)) == __MAP_NOREPLACE)
POSIX-compliant page fault error codes
POSIX specifies these error cases for memory faults: SIGSEGV/SEGV_MAPERR: Accessing an unmapped page. SIGSEGV/SEGV_ACCERR: Reading from a non-readable or writing to a non-writable page. SIGBUS/BUS_ADRERR: Accessing a mapped page that exceeds the end of the underlying mapped file. I added a regress test at regress/sys/kern/siginfo-fault to cover these cases, but unfortunately we're non-compliant in a few ways, and fixing it is somewhat MD. With the diff below, the tests pass on amd64, but other platforms will need similar changes. Currently VM_PAGER_BAD is only returned by pgo_get() in the case of uvn_get() trying to access a page beyond the end of the file, so this diff changes uvm_fault() to recognize this and return ENOSPC (arbitrary unused error code) and then the MD trap() code needs to know to map this error to BUS_ADRERR. Additionally, some MD trap()s already know to map EACCES to SEGV_ACCERR instead of SEGV_MAPERR, but amd64 wasn't one of them. So this diff fixes that too. Index: uvm/uvm_fault.c === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/uvm/uvm_fault.c,v retrieving revision 1.73 diff -u -p -r1.73 uvm_fault.c --- uvm/uvm_fault.c 8 May 2014 20:08:50 - 1.73 +++ uvm/uvm_fault.c 23 Jun 2014 21:29:24 - @@ -1125,7 +1125,8 @@ Case2: goto ReFault; } - return (EACCES); /* XXX i/o error */ + /* XXX i/o error */ + return (result == VM_PAGER_BAD ? ENOSPC : EACCES); } /* re-verify the state of the world. */ Index: arch/amd64/amd64/trap.c === RCS file: /home/matthew/cvs-mirror/cvs/src/sys/arch/amd64/amd64/trap.c,v retrieving revision 1.40 diff -u -p -r1.40 trap.c --- arch/amd64/amd64/trap.c 15 Jun 2014 11:43:24 - 1.40 +++ arch/amd64/amd64/trap.c 23 Jun 2014 21:38:31 - @@ -387,9 +387,6 @@ faultcommon: KERNEL_UNLOCK(); goto out; } - if (error == EACCES) { - error = EFAULT; - } if (type == T_PAGEFLT) { if (pcb-pcb_onfault != 0) { @@ -407,13 +404,23 @@ faultcommon: sv.sival_ptr = (void *)fa; trapsignal(p, SIGKILL, T_PAGEFLT, SEGV_MAPERR, sv); } else { + int signo, code; + if (error == ENOSPC) { + signo = SIGBUS; + code = BUS_ADRERR; + } else { + signo = SIGSEGV; + code = (error == EACCES) ? SEGV_ACCERR : + SEGV_MAPERR; + } #ifdef TRAP_SIGDEBUG - printf(pid %d (%s): SEGV at rip %lx addr %lx\n, - p-p_pid, p-p_comm, frame-tf_rip, fa); + printf(pid %d (%s): %s at rip %lx addr %lx\n, + p-p_pid, p-p_comm, (signo == SIGBUS) ? + BUS : SEGV, frame-tf_rip, fa); frame_dump(frame); #endif sv.sival_ptr = (void *)fa; - trapsignal(p, SIGSEGV, T_PAGEFLT, SEGV_MAPERR, sv); + trapsignal(p, signo, T_PAGEFLT, code, sv); } KERNEL_UNLOCK(); break;
Re: POSIX-compliant page fault error codes
On Tue, Jun 24, 2014 at 11:04:10AM -0700, Matthew Dempsky wrote: SIGBUS/BUS_ADRERR: Accessing a mapped page that exceeds the end of the underlying mapped file. Generating SIGBUS for this case has proven controversial due to concern that this is Linux invented behavior and not compatible with Solaris, so I decided to collect some more background information on the subject. - SunOS 4.1.3's mmap() manual specifies: Any reference to addresses beyond the end of the object, however, will result in the delivery of a SIGBUS signal. This wording was relaxed to SIGBUS or SIGSEGV in SunOS 5.6 and remains in current manuals. (I'm not sure, but I suspect this may be to simply reflect that memory protection violations take priority over bounds checking.) SunOS 4.1.3: http://www.freebsd.org/cgi/man.cgi?query=mmapsektion=2manpath=SunOS+4.1.3 SunOS 5.6: http://www.freebsd.org/cgi/man.cgi?query=mmapsektion=2manpath=SunOS+5.6 Solaris 11: http://docs.oracle.com/cd/E23824_01/html/821-1463/mmap-2.html - Many other SVR-derived OSes similarly document SIGBUS in their mmap() manuals too: AIX: http://www-01.ibm.com/support/knowledgecenter/ssw_aix_53/com.ibm.aix.basetechref/doc/basetrf1/mmap.htm?lang=en HPUX: http://h20566.www2.hp.com/portal/site/hpsc/template.BINARYPORTLET/public/kb/docDisplay/resource.process/?spf_p.tpst=kbDocDisplay_ws_BIspf_p.rid_kbDocDisplay=docDisplayResURLjavax.portlet.begCacheTok=com.vignette.cachetokenspf_p.rst_kbDocDisplay=wsrp-resourceState%3DdocId%253Demr_na-c02261243-2%257CdocLocale%253Djavax.portlet.endCacheTok=com.vignette.cachetoken UnixWare: http://uw714doc.sco.com/en/man/html.2/mmap.2.html - This behavior has been (awkwardly) specified for mmap() since SUSv2: References within the address range starting at pa and continuing for len bytes to whole pages following the end of an object shall result in delivery of a SIGBUS signal. Later versions of POSIX have the same wording. SUSv2: http://pubs.opengroup.org/onlinepubs/007908799/xsh/mmap.html POSIX.2001: http://pubs.opengroup.org/onlinepubs/009695399/functions/mmap.html POSIX.2008: http://pubs.opengroup.org/onlinepubs/9699919799/functions/mmap.html - More generally, POSIX explains the SIGBUS/SIGSEGV distinction thusly: When an object is mapped, various application accesses to the mapped region may result in signals. In this context, SIGBUS is used to indicate an error using the mapped object, and SIGSEGV is used to indicate a protection violation or misuse of an address. Specific examples are provided too: Memory Protection: http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_08_03_03
ddb: ELF line decoding in stack traces
Diff below implements file/line number decoding in ddb for ELF; e.g., ddb{2} trace Debugger() at Debugger+0x9 [../../../../arch/amd64/amd64/db_interface.c:406] ddb_sysctl() at ddb_sysctl+0x1b4 [../../../../ddb/db_usrreq.c:104] sys___sysctl() at sys___sysctl+0x214 [../../../../kern/kern_sysctl.c:229] syscall() at syscall+0x297 [../../../../sys/syscall_mi.h:84] --- syscall (number 202) --- end of kernel end trace frame: 0x7f7e1def, count: -4 acpi_pdirpa+0x41195a: ddb{2} It works by modifying boot(8) to copy the .debug_line DWARF section into memory when present (same as it does currently for symbol and string table sections), and then ddb's ELF symbol format handler code makes use of it to translate program counters into file/line pairs just like addr2line would. For line decoding to work, the kernel must have been compiled with -g. Normal and/or stripped kernels continue to work as they do now, just won't produce file/line information. Only lightly tested on amd64 so far, but might work on other platforms too. If you're brave, you can try it out thusly: 1. Apply patch. 2. Build boot(8), copy to /boot, and run installboot(8). 3. Configure kernel with 'makeoptions DEBUG=-g'. 4. Build kernel and install bsd.gdb (not bsd!). 5. Reboot. I did this mostly for fun / personal enlightenment, so I haven't put too much effort into polishing it yet. Index: ddb/db_elf.c === RCS file: /home/matthew/anoncvs/cvs/src/sys/ddb/db_elf.c,v retrieving revision 1.10 diff -u -p -r1.10 db_elf.c --- ddb/db_elf.c16 Mar 2014 20:31:46 - 1.10 +++ ddb/db_elf.c23 Jun 2014 12:27:26 - @@ -46,6 +46,7 @@ #include sys/exec_elf.h static char *db_elf_find_strtab(db_symtab_t *); +static char *db_elf_find_linetab(db_symtab_t *, size_t *); #defineSTAB_TO_SYMSTART(stab) ((Elf_Sym *)((stab)-start)) #defineSTAB_TO_SYMEND(stab)((Elf_Sym *)((stab)-end)) @@ -111,10 +112,10 @@ db_elf_sym_init(int symsize, void *symta * . . . * . . . * last section header -* first symbol or string table section +* first symbol, string, or line table section * . . . * . . . -* last symbol or string table section +* last symbol, string, or line table section */ /* @@ -233,6 +234,29 @@ db_elf_find_strtab(db_symtab_t *stab) } /* + * Internal helper function - return a pointer to the line table + * for the current symbol table. + */ +static char * +db_elf_find_linetab(db_symtab_t *stab, size_t *size) +{ + Elf_Ehdr *elf = STAB_TO_EHDR(stab); + Elf_Shdr *shp = STAB_TO_SHDR(stab, elf); + char *shstrtab; + int i; + + shstrtab = (char *)elf + shp[elf-e_shstrndx].sh_offset; + for (i = 0; i elf-e_shnum; i++) { + if (strcmp(.debug_line, shstrtab+shp[i].sh_name) == 0) { + *size = shp[i].sh_size; + return ((char *)elf + shp[i].sh_offset); + } + } + + return (NULL); +} + +/* * Lookup the symbol with the given name. */ db_sym_t @@ -343,6 +367,8 @@ db_elf_symbol_values(db_symtab_t *symtab *valuep = symp-st_value; } +#include db_elf_dwarf.c + /* * Return the file and line number of the current program counter * if we can find the appropriate debugging symbol. @@ -351,11 +377,15 @@ boolean_t db_elf_line_at_pc(db_symtab_t *symtab, db_sym_t cursym, char **filename, int *linenum, db_expr_t off) { + const char *linetab; + size_t linetab_size; - /* -* XXX We don't support this (yet). -*/ - return (FALSE); + linetab = db_elf_find_linetab(symtab, linetab_size); + if (linetab == NULL) + return (FALSE); + + return (db_elf_dwarf_line_at_pc(linetab, linetab_size, off, + filename, linenum)); } /* Index: ddb/db_elf_dwarf.c === RCS file: ddb/db_elf_dwarf.c diff -N ddb/db_elf_dwarf.c --- /dev/null 1 Jan 1970 00:00:00 - +++ ddb/db_elf_dwarf.c 23 Jun 2014 13:37:34 - @@ -0,0 +1,404 @@ +/* $OpenBSD$*/ +/* + * Copyright (c) 2014 Matthew Dempsky matt...@dempsky.org + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED AS IS AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
Re: gcc: warn about overly aligned stack variables
On Mon, Jun 23, 2014 at 9:59 AM, Miod Vallat m...@online.fr wrote: I love this. I have verified that kernels for all our gcc4 platforms still build with this diff. Awesome! I also heard back from guenther@ that he completed an i386 base build without warnings too. Do we want to test this any further (e.g., ports), or should I go ahead and commit? If the latter, anyone want to step up to ok it? :)
Re: procfs to die [was: CVS: cvs.openbsd.org: src]
I suspect procfs is only enabled on i386 because that's the only arch with compat_linux support? If so, anyone who relies on compat_linux support should be sure to test and report back if they have problems. On Sun, Jun 22, 2014 at 2:22 PM, Philip Guenther guent...@gmail.com wrote: If you're currently using procfs, please respond with exactly what parts of it are actually needed by the programs you run. Without a good reason to keep it, it'll be deleted, as I broke listing of /proc back in January and no one has noticed, suggesting that no one is actually using it... Philip Guenther -- Forwarded message -- From: Philip Guenther guent...@cvs.openbsd.org Date: Sun, Jun 22, 2014 at 2:15 PM Subject: CVS: cvs.openbsd.org: src To: source-chan...@cvs.openbsd.org CVSROOT:/cvs Module name:src Changes by: guent...@cvs.openbsd.org2014/06/22 15:15:51 Modified files: sys/arch/i386/conf: GENERIC Log message: PROCFS has been broken for months without complaints, so stop building it suggested by sthen@
gcc: warn about overly aligned stack variables
GCC supports an aligned attribute to specify a minimum alignment for types/objects. However, if an object is allocated on the stack and its alignment exceeds the preferred stack boundary, then GCC 4.2 silently ignores the alignment. This bit us 4 years ago when the SCSI stack started allocating mutexes on the stack: HPPA mutexes need to be 16-byte aligned, but HPPA's stack is naturally only 8-byte aligned. Since newer GCC properly support overly aligned stack allocations, it seems prudent to at least make base GCC emit a warning if its going to ignore an alignment request. With the diff below, compiling a source file like this: typedef int __attribute__((aligned(512))) aligned_int; aligned_int good; void foo() { aligned_int bad; } now produces a warning like this: $ cc -c test.c test.c: In function 'foo': test.c:6: warning: ignoring alignment for stack allocated 'bad' I verified this doesn't break an amd64 kernel build, but I haven't had time to look beyond that. Sharing in case anyone's interested and/or wants to test further themselves. Index: gcc/cfgexpand.c === RCS file: /home/matthew/cvs-mirror/cvs/src/gnu/gcc/gcc/cfgexpand.c,v retrieving revision 1.4 diff -u -p -r1.4 cfgexpand.c --- gcc/cfgexpand.c 6 May 2014 23:32:34 - 1.4 +++ gcc/cfgexpand.c 20 Jun 2014 22:55:53 - @@ -159,8 +159,10 @@ get_decl_align_unit (tree decl) align = DECL_ALIGN (decl); align = LOCAL_ALIGNMENT (TREE_TYPE (decl), align); - if (align PREFERRED_STACK_BOUNDARY) + if (align PREFERRED_STACK_BOUNDARY) { +warning (0, ignoring alignment for stack allocated %q+D, decl); align = PREFERRED_STACK_BOUNDARY; + } if (cfun-stack_alignment_needed align) cfun-stack_alignment_needed = align;
Re: exec_elf.c: mistake ?
Reading through exec_elf.c, I just noticed the uninitialized bdiff variable myself, and remembered this thread. Tangentially, the code for worrying about zero-filling the last page is overzealous. We only need to zero-fill the page if memsz filesz (accounting for alignment and page boundaries). In the common case (at least on amd64), we always have either memsz==filesz (most segments) or filesz==0 (segment for .bss), so we shouldn't need zero-filling. And actually, I think the logic for only doing zero-fill for writable segments is wrong. It looks to me like the ELF gABI specifies zero-fill semantics for memsz filesz even for read-only segments. On Sat, Oct 5, 2013 at 4:27 PM, Philip Guenther guent...@gmail.com wrote: On Thu, 15 Aug 2013, Maxime Villard wrote: Hum hum, after investigating a bit, and from what I understood - if I'm not mistaken, I think it would make sense if bdiff was set to bdiff = ph-p_vaddr - trunc_page(ph-p_vaddr); Yep. Since we are supposed to align only on 2^n boundaries, if we get 0 or 1 we do not align on p_align. But p_vaddr still has to be aligned on PAGE_MASK (with trunc_page()); I read somewhere that ELF loaders are smart enough to adjust the address when it does not exactly fit a page boundary. So bdiff should be the difference with the original p_vaddr. Actually, bdiff is already set to this value in the other conditions. There's another problem, 'base' should also be initialized here. I would say that is should be set to the truncated p_vaddr plus the address at which we want to load: base = *addr + trunc_page(ph-p_vaddr); but I'm not sure at all. By the logic of the ph-p_align 1 case, it should be base = *addr + trunc_page(ph-p_vaddr) - ph-p_vaddr; (I think the only reason the 'if' is necessary is that p_align==0 must be treated the same as p_align==1. The ELF_TRUNC() macro handles an alignment of 1 correctly, but it'll barf on 0.) Philip Guenther
Re: wrong semantic type in kern_descrip.c
On Sun, Jun 15, 2014 at 1:20 PM, Jean-Philippe Ouellet jean-phili...@ouellet.biz wrote: Those 4 memcpy()s are copying the things referenced by the old filedesc to the new one. The things being copied are file*s, not file**s. They're the same size anyway, but still... Technically, C99 grants implementations leeway to use different representations for pointers to structure type and pointers to other types (C99 6.2.5 P26), though I can't imagine implementations where this would actually come up, and OpenBSD certainly doesn't support any. Committed, thanks!
Re: 9p
On Wed, Jun 4, 2014 at 2:00 PM, M Farkas-Dyck strake...@gmail.com wrote: Would a rwlock do? The sender and recver operate asynchronously, so the sender needs to hold a lock while sending and release it when asleep, but it can't be a mutex as the send operation may sleep, so I used requ.ready as the lock, but a rwlock seems appropriate. msleep() unlocks the mutex after the process is placed on the sleep queue, and then relocks the mutex after being woken up. I would expect that should be sufficient for your use case, but I haven't looked close enough. Style nits: Please keep source files ASCII; i.e., Copyright (C) instead of the Unicode copyright character. Also, look at man 9 style; in particular, OpenBSD doesn't put spaces around - and we keep lines under 80 characters long. Oh, and no static functions in the kernel. They cause problems with ddb.
Re: recvmsg, fd passing and soreceive
On Wed, May 14, 2014 at 4:58 AM, Jérémie Courrèges-Anglas j...@wxcvbn.org wrote: if (cmsg-cmsg_len == CMSG_LEN(sizeof(int)) cmsg-cmsg_level == SOL_SOCKET cmsg-cmsg_type == SCM_RIGHTS) { passed_fd = *(int *)CMSG_DATA(cmsg); In your test program you only ever send one FD, so this is fine; but in general multiple FDs can be sent at a time, and you'll receive a single SOL_SOCKET/SCM_RIGHTS control message with an array of N ints. E.g., see https://code.google.com/p/chromium/codesearch#chromium/src/base/posix/unix_domain_socket_linux.ccl=130. So the two issues show up when you want to call sendmsg and recvmsg with no data at all, but only control messages. sendmsg allocates an empty mbuf for data (m_len == 0), this mbuf is not discarded and freed by soreceive for SOCK_STREAM sockets, and stays in the socket buffer. I understand that this API was probably not designed to pass only control messages, and no data at all. But it is easily fixable, and the manpage could provide portability information. Blah, I'm not really a fan of allowing sending zero-length data messages. It makes it tricky for receivers to tell if they actually received an EOF or not. But if we're going to allow it (which we do currently it seems), then your diff seems like a reasonable fix to me. mbufs aren't my specialty though. =/
Re: malloc in libssl/src/apps
On Mon, May 5, 2014 at 3:56 PM, Alexander Hall alexan...@beard.se wrote: I believe a similar situation could appear with not explicitly initialized global or static declarations, e.g. in sbin/fsirand/fsirand.c: fsirand(char *device) { ... static char *inodebuf; This is safe too: C requires that pointer variables with static storage duration like this be initialized to a null pointer (C99 section 6.7.8 paragraph 10), not to a sequence of 0 bytes.
Re: Question and regression test for strftime adn wcsftime
On Tue, Apr 22, 2014 at 2:43 PM, Stefan Sperling s...@openbsd.org wrote: Your regression test has at least one bug ('bad' is never initialised). It's perhaps bad *style* to not explicitly initialize it, but C99 6.7.8p10 says If an object that has static storage duration is not initialized explicitly, then: [...] if it has arithmetic type, it is initialized to (positive or unsigned) zero. So technically not a bug. :)
Re: dhclient support for /32 assignments
On Tue, Dec 3, 2013 at 4:15 PM, Matthew Dempsky matt...@dempsky.org wrote: The patch below extends dhclient to mimic this logic from ISC DHCP's linux script: if [ x$new_subnet_mask = x255.255.255.255 ] ; then route add -host $router dev $interface fi route add default gw $router $metric_arg dev $interface Just for the archive's benefit, I was able to track this down to this RELNOTES entry in ISC DHCP: - The dhclient-script was updated to create a host route for the default gateway if the supplied subnet mask for an IPv4 address was a /32. This allows the client to work in 'captive' network environments, where the operator does not want clients to crosstalk directly. listed under Changes since 4.0.0 (new features), which gives a bit more context for the change. I haven't found anything else interesting though, and I've given up hope of finding anything.
Re: dhclient support for /32 assignments
On Tue, Dec 03, 2013 at 11:48:05PM -0500, Kenneth Westerback wrote: Rfc 3442 is what I referred to. I don't think RFC 3442 discusses what to do with /32 IP address assignments though? Anyway, below is a revised diff that does the same direct-route magic for all gateway IPs, not just the default gateway IP. I'll try this out later today to make sure it still works with Compute Engine. Assuming it works as intended, ok? Index: dhclient.c === RCS file: /cvs/src/sbin/dhclient/dhclient.c,v retrieving revision 1.268 diff -u -p -r1.268 dhclient.c --- dhclient.c 20 Nov 2013 17:22:46 - 1.268 +++ dhclient.c 4 Dec 2013 18:45:21 - @@ -107,9 +107,11 @@ struct client_lease *clone_lease(struct voidsocket_nonblockmode(int); voidapply_ignore_list(char *); -void add_default_route(int, struct in_addr, struct in_addr); -void add_static_routes(int, struct option_data *); -void add_classless_static_routes(int, struct option_data *); +void add_default_route(int, struct in_addr, struct in_addr, struct in_addr); +void add_static_routes(int, struct in_addr, struct in_addr, +struct option_data *); +void add_classless_static_routes(int, struct in_addr, struct in_addr, +struct option_data *); int compare_lease(struct client_lease *, struct client_lease *); @@ -871,8 +873,8 @@ bind_lease(void) */ add_address(ifi-name, ifi-rdomain, client-new-address, mask); if (options[DHO_CLASSLESS_STATIC_ROUTES].len) { - add_classless_static_routes(ifi-rdomain, - options[DHO_CLASSLESS_STATIC_ROUTES]); + add_classless_static_routes(ifi-rdomain, client-new-address, + mask, options[DHO_CLASSLESS_STATIC_ROUTES]); } else { if (options[DHO_ROUTERS].len) { memset(gateway, 0, sizeof(gateway)); @@ -880,11 +882,11 @@ bind_lease(void) memcpy(gateway.s_addr, options[DHO_ROUTERS].data, options[DHO_ROUTERS].len); add_default_route(ifi-rdomain, client-new-address, - gateway); + mask, gateway); } if (options[DHO_STATIC_ROUTES].len) - add_static_routes(ifi-rdomain, - options[DHO_STATIC_ROUTES]); + add_static_routes(ifi-rdomain, client-new-address, + mask, options[DHO_STATIC_ROUTES]); } client-new-resolv_conf = resolv_conf_contents( @@ -2379,6 +2381,23 @@ priv_write_file(struct imsg_write_file * } /* + * If we were given a /32 IP address assignment, then make sure the + * gateway address is routable with the equivalent of + * + * route add -net $gw -netmask 255.255.255.255 -cloning -iface $addr + */ +void +ensure_direct_route(int rdomain, struct in_addr addr, struct in_addr addrmask, +struct in_addr gateway) +{ + if (addrmask.s_addr == INADDR_BROADCAST) { + add_route(rdomain, gateway, addrmask, addr, + RTA_DST | RTA_NETMASK | RTA_GATEWAY, + RTF_CLONING | RTF_STATIC); + } +} + +/* * add_default_route is the equivalent of * * route -q $rdomain add default -iface $router @@ -2388,11 +2407,14 @@ priv_write_file(struct imsg_write_file * * route -q $rdomain add default $router */ void -add_default_route(int rdomain, struct in_addr addr, struct in_addr gateway) +add_default_route(int rdomain, struct in_addr addr, struct in_addr addrmask, +struct in_addr gateway) { struct in_addr netmask, dest; int addrs, flags; + ensure_direct_route(rdomain, addr, addrmask, gateway); + memset(netmask, 0, sizeof(netmask)); memset(dest, 0, sizeof(dest)); addrs = RTA_DST | RTA_NETMASK; @@ -2412,23 +2434,26 @@ add_default_route(int rdomain, struct in } void -add_static_routes(int rdomain, struct option_data *static_routes) +add_static_routes(int rdomain, struct in_addr addr, struct in_addr addrmask, +struct option_data *static_routes) { struct in_addr dest, netmask, gateway; - u_int8_t *addr; + u_int8_t *data; int i; memset(netmask, 0, sizeof(netmask)); /* Not used for CLASSFULL! */ for (i = 0; (i + 7) static_routes-len; i += 8) { - addr = static_routes-data[i]; + data = static_routes-data[i]; memset(dest, 0, sizeof(dest)); memset(gateway, 0, sizeof(gateway)); - memcpy(dest.s_addr, addr, 4); + memcpy(dest.s_addr, data, 4); if (dest.s_addr == INADDR_ANY) continue; /* RFC 2132 says 0.0.0.0 is not allowed. */ - memcpy(gateway.s_addr, addr+4, 4); +
Re: dhclient support for /32 assignments
On Wed, Dec 04, 2013 at 02:10:21PM -0500, Kenneth R Westerback wrote: No, that was my point. i.e. don't avoid adding the route when given a /32 address just because class static routes are also present. I think there might be a misunderstanding, so let me back up and try to clarify. :) Compute Engine gives out DHCP leases like ip: 10.240.77.88, netmask: 255.255.255.255, gateway: 10.240.0.1. The problem is 10.240.0.1 isn't routable given a 10.240.77.88/32 IP address. ISC DHCP on Linux handles this by special casing netmask == 255.255.255.255, and adding an extra direct route for the gateway IP. E.g., for the above assignment, ISC DHCP would run route add 10.240.0.1 dev eth0 which tells Linux's network stack 10.240.0.1 is directly accessible via eth0, even though it's not part of the 10.240.77.88/32 subnet. (It would then run route add default 10.240.0.1 as per normal.) As far as I can tell, there's no authoritative/standard document that describes this special case, and ISC DHCP only applies this special case to the default gateway IP specified by DHO_ROUTERS. Notably, it does *not* apply this direct routing logic for gateway IPs specified by DHO_STATIC_ROUTES and/or DHO_CLASSLESS_STATIC_ROUTES, but it's unclear whether that's intentional or accidental. Not from me. This is way bigger that the last three line diff, and I have insufficient routing foo to comment on the usefulness of all these routes. The original three line diff was so short because it mimicked ISC DHCP and only special cased the default gateway (i.e., DHO_ROUTERS), and only did so when DHO_ROUTERS was actually used (i.e., when DHO_CLASSLESS_STATIC_ROUTES is not specified). You suggested we should unconditionally add the route, but it was unclear to me whether you meant: 1. We should always apply the special case logic for adding a direct route for the default gateway specified by DHO_ROUTERS, even if DHO_CLASSLESS_STATIC_ROUTES is specified; or 2. We should apply the special case logic to add direct routes for all gateways (i.e., those specified by DHO_CLASSLESS_STATIC_ROUTES and DHO_STATIC_ROUTES too), not just the default gateway. Interpretation #1 didn't seem right, because if DHO_CLASSLESS_STATIC_ROUTES is specified, then DHO_ROUTERS isn't used at all, so it might not be meaningful to add a direct route for DHO_ROUTERS's gateway. Interpretation #2 seems reasonable (though deviating from ISC DHCP's undocumented behavior), so that's what the second patch implemented. But that's necessarily more code than the original diff, because we need to repeat the special case logic for each gateway IP as they're extracted from the DHCP options. The change is somewhat bigger and more invasive, but I think not that much more complex: 1. Add a function ensure_direct_route() function that applies ISC DHCP's /32 logic for a given gateway IP address (same code as from first patch). 2. Call ensure_direct_route() in each of add_default_route(), add_static_routes(), and add_classless_static_routes() for each gateway IP address. 3. Add extra 'addr' and 'addrmask' parameters to each of the add_*_routes() functions to pass the extra data needed for ensure_direct_route(), and fix call sites appropriately. 4. In add_static_routes(), rename the local 'addr' variable that points into option data to 'data' to avoid conflicting with the new 'addr' parameter.
Re: select.2: update includes
On Tue, Dec 3, 2013 at 12:08 PM, Christian Weisgerber na...@mips.inka.de wrote: ok? ok matthew +.Fd #include sys/select.h Worth using .In instead while you're at it?
Re: select.2: update includes
On Tue, Dec 3, 2013 at 12:37 PM, Philip Guenther guent...@gmail.com wrote: string.h needs to stay until FD_ZERO() and FD_COPY() are changed to not use memset()/memcpy(). Good point. Would something like this work? #define FD_COPY(f, t) (*(fd_set *)(t) = *(const fd_set *)(f)) static const fd_set __fd_zero_set; #define FD_ZERO(p) FD_COPY(__fd_zero_set, p) Downside is we lose const warnings. We could get those back if we change FD_COPY into an inline function instead of a macro, or do some GNU block-expression trickery. Or we can just declare memset/memcpy in sys/select.h too?
Re: select.2: update includes
On Tue, Dec 3, 2013 at 1:55 PM, Christian Weisgerber na...@mips.inka.de wrote: +#if __BSD_VISIBLE +#defineFD_COPY(f, t) (void)(*(t) = *(f)) +#endif +#defineFD_ZERO(p) do \ + fd_set *_p = (p); \ + __size_t _n = __howmany(FD_SETSIZE, __NFDBITS); \ + \ + while (_n 0) \ + _p-fds_bits[--_n] = 0; \ +} while (0) I think you're missing a { after the do. :) But otherwise looks ok to me.
Re: select.2: update includes
On Tue, Dec 3, 2013 at 1:39 PM, Philip Guenther guent...@gmail.com wrote: What problem does the casts solve? I wasn't sure if people might be calling FD_COPY()/FD_ZERO() with void* or char* typed arguments (or other custom sized types). If we can assume they'll only pass fd_set* arguments, then they could be done away with. (And actually, as noted in naddy's cite from FreeBSD, FD_COPY() is a non-standard BSDism, so it can just continue to assume the caller pulled in string.h) Sounds good to me. That'll put a __fd_zero_set in each .o that references FD_ZERO(), ick. Oops, yeah, I was thinking they'd be collapsed into a single const value in the output object, but I forgot they'd need to be kept separate for object identity.
dhclient support for /32 assignments
The patch below extends dhclient to mimic this logic from ISC DHCP's linux script: if [ x$new_subnet_mask = x255.255.255.255 ] ; then route add -host $router dev $interface fi route add default gw $router $metric_arg dev $interface With this change, dhclient is able to successfully configure the network via DHCP on Compute Engine: $ ifconfig vio0 vio0: flags=8b43UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST mtu 1500 lladdr 42:01:0a:f0:8f:56 priority: 0 groups: egress media: Ethernet autoselect status: active inet6 fe80::4001:aff:fef0:8f56%vio0 prefixlen 64 scopeid 0x1 inet 10.240.143.86 netmask 0x $ netstat -r -n -f inet Routing tables Internet: DestinationGatewayFlags Refs Use Mtu Prio Iface default10.240.0.1 UGS1 183 - 8 vio0 10.240.0.1 42:01:0a:f0:00:01 UHLc 10 -56 vio0 10.240.0.1/32 link#1 UC 10 -56 vio0 10.240.143.86/32 link#1 UC 00 - 4 vio0 127/8 127.0.0.1 UGRS 00 33144 8 lo0 127.0.0.1 127.0.0.1 UH 10 33144 4 lo0 224/4 127.0.0.1 URS00 33144 8 lo0 ok? Index: dhclient.c === RCS file: /cvs/src/sbin/dhclient/dhclient.c,v retrieving revision 1.268 diff -u -p -r1.268 dhclient.c --- dhclient.c 20 Nov 2013 17:22:46 - 1.268 +++ dhclient.c 4 Dec 2013 00:06:08 - @@ -879,6 +879,21 @@ bind_lease(void) /* XXX Only use FIRST router address for now. */ memcpy(gateway.s_addr, options[DHO_ROUTERS].data, options[DHO_ROUTERS].len); + + /* +* If we were given a /32 IP assignment, then make sure +* the gateway address is routable with equivalent of +* +* route add -net $gw -netmask 255.255.255.255 \ +* -cloning -iface $addr +*/ + if (mask.s_addr == INADDR_BROADCAST) { + add_route(ifi-rdomain, gateway, mask, + client-new-address, + RTA_DST | RTA_NETMASK | RTA_GATEWAY, + RTF_CLONING | RTF_STATIC); + } + add_default_route(ifi-rdomain, client-new-address, gateway); }
Re: dhclient support for /32 assignments
On Tue, Dec 3, 2013 at 5:55 PM, Kenneth R Westerback kwesterb...@rogers.com wrote: Located here, the addition of the 255.255.255.255 route is not done in the presence of DHO_CLASSLESS_STATIC_ROUTES. As I recall only DHO_ROUTERS and DHO_STATIC_ROUTES are incompatible with DHO_CLASSLESS_STATIC_ROUTES. So we may want this chunk to occur unconditionally. I'm not opposed to that per se and I don't imagine it should cause any problems, but it would diverge somewhat from ISC DHCP's behavior. For reference, ISC DHCP 4.2.5-P1 doesn't support DHO_CLASSLESS_STATIC_ROUTES, the Linux client script doesn't support DHO_STATIC_ROUTES, and the FreeBSD script only does this routing magic for DHO_ROUTERS not DHO_STATIC_ROUTES. I'll try to find something authoritative on how we're 'supposed' to behave here.
Re: [PATCH] ELF: ensure PT_INTERP strings are NULL-terminated
On Wed, Aug 28, 2013 at 5:54 AM, Maxime Villard m...@m00nbsd.net wrote: + /* Ensure interp is a valid, NUL-terminated string */ + for (n = 0; n pp-p_filesz; n++) { + if (interp[n] == '\0') + break; + } + if (n != pp-p_filesz - 1) { + error = ENOEXEC; goto bad; } A more concise test would be: if (strnlen(interp, pp-p_filesz) != pp-p_filesz - 1) { error = ENOEXEC; goto bad; }
Re: Update elf(5)
Might be nice to mention that 5.4 'was' the first release that used ELF on all platforms? On Mon, Jul 8, 2013 at 11:16 PM, Simon Kuhnle si...@blarzwurst.de wrote: With VAX being an ELF platform, this is no longer true, is it? Regards Simon Index: elf.5 === RCS file: /cvs/src/share/man/man5/elf.5,v retrieving revision 1.20 diff -u -r1.20 elf.5 --- elf.5 17 Jan 2013 21:54:18 - 1.20 +++ elf.5 9 Jul 2013 06:14:54 - @@ -1355,9 +1355,7 @@ .Sh HISTORY .Ox ELF support first appeared in -.Ox 1.2 , -although not all supported platforms use it as the native -binary file format. +.Ox 1.2 . ELF in itself first appeared in .At V . The ELF format is an adopted standard.
Re: Update elf(5)
On Tue, Jul 9, 2013 at 12:22 AM, Marc Espie es...@nerim.net wrote: Starting .Ox 5.4 , all supported platforms use it as the native binary file format. I think Starting *with* .Ox 5.4, ... but otherwise ok matthew.
Re: Update elf(5)
Committed. Thanks Simon! On Tue, Jul 9, 2013 at 10:21 AM, Jason McIntyre j...@kerhand.co.uk wrote: On Tue, Jul 09, 2013 at 07:07:03PM +0200, Marc Espie wrote: On Tue, Jul 09, 2013 at 06:09:02PM +0200, Simon Kuhnle wrote: On Tue, Jul 09, 2013 at 12:29:10AM -0700, Matthew Dempsky wrote: On Tue, Jul 9, 2013 at 12:22 AM, Marc Espie es...@nerim.net wrote: Starting .Ox 5.4 , all supported platforms use it as the native binary file format. I think Starting *with* .Ox 5.4, ... but otherwise ok matthew. Updated with your suggestions. Regards Simon Index: elf.5 === RCS file: /cvs/src/share/man/man5/elf.5,v retrieving revision 1.20 diff -u -r1.20 elf.5 --- elf.5 17 Jan 2013 21:54:18 - 1.20 +++ elf.5 9 Jul 2013 16:07:27 - @@ -1355,9 +1355,10 @@ .Sh HISTORY .Ox ELF support first appeared in -.Ox 1.2 , -although not all supported platforms use it as the native -binary file format. +.Ox 1.2 . +Starting with +.Ox 5.4 , +all supported platforms use it as the native binary file format. ELF in itself first appeared in .At V . The ELF format is an adopted standard. Okay ok. jmc
Re: Static variables
On Mon, Jul 8, 2013 at 2:06 AM, Maxime Villard rusty...@gmx.fr wrote: Ah, yes. I didn't know. For what it's worth, this is specified in C99 §6.7.8 (Initializaton) paragraph 10: If an object that has static storage duration is not initialized explicitly, then: — if it has pointer type, it is initialized to a null pointer; — if it has arithmetic type, it is initialized to (positive or unsigned) zero; — if it is an aggregate, every member is initialized (recursively) according to these rules; — if it is a union, the first named member is initialized (recursively) according to these rules. On OpenBSD (and most, if not all, ELF platforms), this is implemented by assigning these objects into the .bss section, which is initialized to all zero bytes at program startup, taking advantage of the fact that all of our platforms represent null pointers and zero values as sequences of zero bytes.
Re: fsck vs fsck_ffs WHAT'S THE DIFFERENCE?
On Tue, Jun 18, 2013 at 11:04 AM, laborat...@cpnetserver.net wrote: Please, What is the difference between 'fsck' and 'fsck_ffs' command? fsck detects the filesystem on disk and runs the appropriate fsck_${FILESYSTEM} utility automatically.
Re: catopen(3) improvements
On Fri, May 31, 2013 at 5:19 PM, Stefan Sperling s...@openbsd.org wrote: Existing lib/libc/nls/*.msg files are renamed to the names shown in the libc/Makefile part of the diff, and new ones are added to support the UTF-8 locale (converted from the existing .msg files with iconv). I'm not including this change in the diff since that would mix four different character sets in a single email. Is it possible for us to use UTF-8 for all of the source files, but convert them to KOI8 or ISO-8859-1 as appropriate at build/install time? Or is that what you mean by we don't have iconv in base, so there's no functionality in base for converting between two charsets?
Re: catopen(3) improvements
On Fri, May 31, 2013 at 5:43 PM, Stefan Sperling s...@openbsd.org wrote: Yes. Conversion currently depends on the GNU iconv port. If iconv existed in base we could use just the UTF-8 source files. Hmm, I see. In that case, I guess committing converted files makes sense. In your man page diff, you should make use of .Dv and .Ev to identify the defines and environment variables. Instead of setting FD_CLOEXEC with fcntl(), just use O_CLOEXEC when opening. It looks like parse_lang() won't handle locales like C.UTF-8 because there's no '_'. I think you should be able to implement the %l/%t/%c stuff without needing to call strdup() for each one (e.g., by just doing a one time pass over the locale string up front and noting where the '_' and '.' chars are). Otherwise, you should report strdup() failures as ENOMEM to the catopen() caller, I think.
Re: check for device_lookup result in vscsi
On Fri, May 10, 2013 at 6:02 AM, Mike Belopuhov m...@belopuhov.com wrote: my intention here is very simple: there's a way you should call device_lookup and everyone has to fulfill it's part of the contract. all our devices do, vscsi doesn't. what's the reason for it to be one of a kind? I'm ok with adding a check, though I don't think it should really be necessary either. Just a thought: What about adding them as KASSERT()s? Why doesn't your diff include vscsiclose()?
Re: strcasecmp and multibyte encodings
On Tue, Apr 9, 2013 at 11:10 AM, Stefan Sperling s...@openbsd.org wrote: + size_t lus1 = strlen(us1); + size_t lus2 = strlen(us2); These strlen() calls are also wrong, because they could read past the n bytes allowed for strncasecmp(). It seems strncasecmp() cannot return errors according to POSIX. Only the _l variants have errors defined. The convention for methods like these to return errors is the caller is required to set errno = 0; before the call, and then check for errno != 0 after the call. :( While I recognize that POSIX mandates the current locale to be used for case conversion, I'm not sure this change is worth the extra complexity. As you point out, the standard seems to throw its hands up in the air when the question about conversion errors with strcasecmp() and strncasecmp() is raised, possibly because these functions are older than the locale concept. The definition of strncasecmp() has actually been a pretty contentious topic on the POSIX mailing list for the past few weeks, albeit mostly focused on whether LC_CTYPE=POSIX allows for UTF-8 or not. E.g., http://austingroupbugs.net/view.php?id=663 I'm inclined to leave the functions as is for now.
Re: rthreads are always enabled
On Fri, Apr 5, 2013 at 3:26 PM, Ted Unangst t...@tedunangst.com wrote: Index: sys/sysctl.h === RCS file: /cvs/src/sys/sys/sysctl.h,v retrieving revision 1.131 diff -u -p -r1.131 sysctl.h --- sys/sysctl.h24 Mar 2013 00:09:31 - 1.131 +++ sys/sysctl.h5 Apr 2013 22:18:44 - @@ -174,7 +174,6 @@ struct ctlname { #defineKERN_CPTIME271 /* array: cp_time2 */ #defineKERN_CACHEPCT 72 /* buffer cache % of physmem */ #defineKERN_FILE2 73 /* struct: file entries */ -#defineKERN_RTHREADS 74 /* kernel rthreads support enabled */ #defineKERN_CONSDEV75 /* dev_t: console terminal device */ #defineKERN_NETLIVELOCKS 76 /* int: number of network livelocks */ #defineKERN_POOL_DEBUG 77 /* int: enable pool_debug */ Maybe leave a comment that 74 was KERN_RTHREADS like for KERN_PROC? @@ -256,7 +255,7 @@ struct ctlname { { cp_time2, CTLTYPE_STRUCT }, \ { bufcachepercent, CTLTYPE_INT }, \ { file2, CTLTYPE_STRUCT }, \ - { rthreads, CTLTYPE_INT }, \ + { rthreads, 0 }, \ { consdev, CTLTYPE_STRUCT }, \ { netlivelocks, CTLTYPE_INT }, \ { pool_debug, CTLTYPE_INT }, \ Maybe change rthreads to gap like the other removed sysctl entries? Otherwise seems ok to me.
Re: ksh SIGFPE
On Wed, Mar 27, 2013 at 3:56 PM, Nicholas Marriott nicholas.marri...@gmail.com wrote: case O_DIV: case O_DIVASN: + if (vl-val.i == LONG_MIN vr-val.i == -1) + evalerr(es, ET_STR, can't represent result); res = vl-val.i / vr-val.i; break; I think LONG_MIN / -1 should still be LONG_MIN. We don't give errors for other overflow conditions.
Re: ksh SIGFPE
On Wed, Mar 27, 2013 at 4:15 PM, Nicholas Marriott nicholas.marri...@gmail.com wrote: Sure, that actually looks to be what other shells do anyhow. That looks ok to me. Which shells did you check, out of curiosity? On Goobuntu, both bash and dash give SIGFPE too actually. Checking POSIX, I notice that it requires that shell semantics match the C language semantics, but then C semantics for LONG_MIN / -1 and LONG_MIN % -1 appear to be undefined*... so there's nothing to match really. (* The glibc manual claims C requires LONG_MIN % -1 to evaluate to 0, but my reading of C11 6.5.5.6 says since LONG_MIN / -1 is not representable, that the behavior of LONG_MIN % -1 is undefined.)
Re: vfs_syscall: doreadlinkat()
Committed, thanks!
Re: vfs_syscall: doreadlinkat()
On Thu, Jan 24, 2013 at 9:57 AM, Maxime Villard rusty...@gmx.fr wrote: Hum here, if vp-v_type != VLNK, auio is untouched, but before returning we use auio.uio_resid, which is not initialized. Is it? Nice catch. We should probably move the *retval assignment up just after the VOP_READLINK() call, since this can technically result in undefined behavior if you try to readlink on a non-symlink file. I don't think it should leak any information moment though, since the EINVAL errno will take precedence instead of *retval when we return to userspace.
Re: tcp ping
On Thu, Sep 13, 2012 at 12:02 PM, Ted Unangst t...@tedunangst.com wrote: This adds a -T portnum option to ping. I haven't polished the output because I'm not sure if this is desirable or not, but I found it useful. If it's not a hell no, never in base I can finish it up some. Is there precedent for having TCP-based ping in any other OS's ping(1) utility? It seems like adding a separate tcpping(1) utility and leaving ping(1) for ICMP would be more appropriate. FWIW, nmap has a lot of TCP ping options; e.g., sending just SYN and ACK packets for pings instead of a complete TCP handshake.
Switch to -fstack-protector-all by default
The diff below changes GCC's default behavior to -fstack-protector-all (i.e., add stack protection code to every function instead of just some based on heuristics), but you can still revert to the heuristic behavior by passing -fstack-protector on the command line. Beware this diff has only received very light testing so far, so expect things to break and to need to rollback or reinstall. Please report back to the list what breaks so we can take a look and figure out how to address it. For most things that break, it should be possible to just add CFLAGS+=-fstack-protector to an appropriate Makefile to at least get the old behavior. Expect at least ld.so on sparc, sparc64, and powerpc to need -fstack-protector. The kernel is likely to need this too. To apply the patch: $ cd /usr/src/gnu/gcc/gcc $ patch /path/to/diff $ cd /usr/src/gnu/usr.bin/cc $ make depend $ make $ make install You can now try building things as usual. Thanks! Index: toplev.c === RCS file: /cvs/src/gnu/gcc/gcc/toplev.c,v retrieving revision 1.2 diff -u -p -r1.2 toplev.c --- toplev.c30 May 2010 22:04:10 - 1.2 +++ toplev.c12 Sep 2012 17:03:45 - @@ -1834,7 +1834,7 @@ process_options (void) /* Targets must be able to place spill slots at lower addresses. If the target already uses a soft frame pointer, the transition is trivial. */ if (flag_stack_protect == -1) -flag_stack_protect = FRAME_GROWS_DOWNWARD ? 1 : 0; +flag_stack_protect = FRAME_GROWS_DOWNWARD ? 2 : 0; if (!FRAME_GROWS_DOWNWARD flag_stack_protect) { warning (0, -fstack-protector not supported for this target);
ld.so pre-install sanity check
This diff is super hacky and just mailed out for proof-of-concept, but it's something that's been handy for me while hacking on ld.so. Basically before installing a new version of ld.so, it just runs a simple test program to double check that ld.so isn't completely hosed. There's plenty of issues with this diff as is (e.g., doesn't work for cross builds, assumes make obj, make clean doesn't work), but I'm curious what people think of the concept and whether anyone has suggestions how to improve it. Index: Makefile === RCS file: /cvs/src/libexec/ld.so/Makefile,v retrieving revision 1.37 diff -u -p -r1.37 Makefile --- Makefile3 Feb 2010 20:49:00 - 1.37 +++ Makefile12 Sep 2012 17:26:32 - @@ -35,4 +35,13 @@ ELF_LDFLAGS+=--shared -Bsymbolic --no-un $(PROG): $(LD) -x -e _dl_start $(ELF_LDFLAGS) -o $(PROG) $(OBJS) $(LDADD) +# XXX: Only for native builds +realinstall: sanitycheck + +.PHONY: sanitycheck +sanitycheck: sanitycheckprog $(PROG) + ./sanitycheckprog +sanitycheckprog: sanitycheckprog.o + $(CC) -o sanitycheckprog sanitycheckprog.o -Wl,--dynamic-linker,${.CURDIR}/obj/$(PROG) + .include bsd.prog.mk Index: sanitycheckprog.c === RCS file: sanitycheckprog.c diff -N sanitycheckprog.c --- /dev/null 1 Jan 1970 00:00:00 - +++ sanitycheckprog.c 12 Sep 2012 17:26:32 - @@ -0,0 +1,7 @@ +#include stdio.h + +int main() +{ + puts(ld.so works); + return 0; +}
Re: GCC diff needs testing on multiple arches
Just pinging this again since I've only received one test report so far, and this really needs more testing on architectures other than just amd64. alpha and hppa don't have -fstack-protector, so really that means arm, i386, mips, powerpc, sh, and sparc64. Please please please test if you have one of these machines! Making sure this works is an important step towards enabling -fstack-protector-all. It also makes -fstack-protector emit slightly smaller, slightly faster, and slightly more secure code for shared libraries, all of which are nice perks. On Wed, Aug 29, 2012 at 11:16:16AM -0700, Matthew Dempsky wrote: First, make sure you're using a reasonably up to date snapshot. You're up to date if nm /usr/lib/crtbegin.o mentions __guard_local. Next, apply the patch below to GCC, recompile, and install: $ cd /usr/src/gnu/gcc/gcc $ patch /path/to/diff $ cd /usr/src/gnu/usr.bin/cc $ make depend make make install Finally, test that stuff still works after you build it. :) In particular, make sure that you can still do a full make build by following release(8). Please email me back with test reports. I'm interested in getting a variety of (ELF) architectures before committing this. Thanks! Index: targhooks.c === RCS file: /cvs/src/gnu/gcc/gcc/targhooks.c,v retrieving revision 1.2 diff -u -p -r1.2 targhooks.c --- targhooks.c 9 May 2010 11:48:50 - 1.2 +++ targhooks.c 29 Aug 2012 18:05:48 - @@ -372,7 +372,7 @@ default_stack_protect_guard (void) if (t == NULL) { - t = build_decl (VAR_DECL, get_identifier (__guard), ptr_type_node); + t = build_decl (VAR_DECL, get_identifier (__guard_local), ptr_type_node); TREE_STATIC (t) = 1; TREE_PUBLIC (t) = 1; DECL_EXTERNAL (t) = 1; @@ -380,6 +380,8 @@ default_stack_protect_guard (void) TREE_THIS_VOLATILE (t) = 1; DECL_ARTIFICIAL (t) = 1; DECL_IGNORED_P (t) = 1; + DECL_VISIBILITY (t) = VISIBILITY_HIDDEN; + DECL_VISIBILITY_SPECIFIED (t) = 1; stack_chk_guard_decl = t; }
Re: GCC diff needs testing on multiple arches
On Wed, Aug 29, 2012 at 11:16 AM, Matthew Dempsky matt...@dempsky.org wrote: First, make sure you're using a reasonably up to date snapshot. You're up to date if nm /usr/lib/crtbegin.o mentions __guard_local. At least two people have commented that the snapshots they downloaded don't have __guard_local yet. Here's are some extra preliminary steps to help users with old snapshots: First, binutils needs to be up to date (i.e., less than 9 days old). If it's up to date, then ld --verbose | grep -c randomdata will print a non-zero value. If it prints zero, follow the steps at http://www.openbsd.org/faq/current.html#20120823 to update it. Next, your C runtime files need to be up to date. If it's up to date, then nm /usr/lib/crtbegin.o | grep -c __guard_local will print a non-zero value. If it prints zero, follow these steps to update them: $ cd /usr/src/lib/csu $ make depend make make install You can now follow my original steps for patching GCC.
GCC diff needs testing on multiple arches
First, make sure you're using a reasonably up to date snapshot. You're up to date if nm /usr/lib/crtbegin.o mentions __guard_local. Next, apply the patch below to GCC, recompile, and install: $ cd /usr/src/gnu/gcc/gcc $ patch /path/to/diff $ cd /usr/src/gnu/usr.bin/cc $ make depend make make install Finally, test that stuff still works after you build it. :) In particular, make sure that you can still do a full make build by following release(8). Please email me back with test reports. I'm interested in getting a variety of (ELF) architectures before committing this. Thanks! Index: targhooks.c === RCS file: /cvs/src/gnu/gcc/gcc/targhooks.c,v retrieving revision 1.2 diff -u -p -r1.2 targhooks.c --- targhooks.c 9 May 2010 11:48:50 - 1.2 +++ targhooks.c 29 Aug 2012 18:05:48 - @@ -372,7 +372,7 @@ default_stack_protect_guard (void) if (t == NULL) { - t = build_decl (VAR_DECL, get_identifier (__guard), ptr_type_node); + t = build_decl (VAR_DECL, get_identifier (__guard_local), ptr_type_node); TREE_STATIC (t) = 1; TREE_PUBLIC (t) = 1; DECL_EXTERNAL (t) = 1; @@ -380,6 +380,8 @@ default_stack_protect_guard (void) TREE_THIS_VOLATILE (t) = 1; DECL_ARTIFICIAL (t) = 1; DECL_IGNORED_P (t) = 1; + DECL_VISIBILITY (t) = VISIBILITY_HIDDEN; + DECL_VISIBILITY_SPECIFIED (t) = 1; stack_chk_guard_decl = t; }
Re: acpihpet quality
This reminds me... On Wed, Aug 15, 2012 at 8:02 AM, Ted Unangst t...@tedunangst.com wrote: 0x, /* counter_mask (24 bits) */ ... this comment has been bugging me for a while. I'm pretty sure it should be 32 bits. Seems like a bad copy/paste from acpitimer.c.
Re: add WARNINGS infrastructure to ftp
While I recognize there's some precedence in the tree for it already, this seems ugly to me. Is there a reason different programs/libraries need different diagnostic flags? Why doesn't this belong in bsd.own.mk or mk.conf?
Re: make acpiec _GLK aware
On Thu, Jul 12, 2012 at 9:48 AM, Paul Irofti p...@irofti.net wrote: + if (aml_evalname(sc-sc_acpi, sc-sc_devnode, _GLK, 0, NULL, res)) + sc-sc_glk = 0; + if (res.type != AML_OBJTYPE_INTEGER) + sc-sc_glk = 0; + else + sc-sc_glk = res.v_integer ? 1 : 0; The second if should be an else if, no? Or is res.type guaranteed to be initialized even if aml_evalname() fails?
Re: Virtio drivers for OpenBSD
On Wed, Jul 11, 2012 at 8:28 AM, Stefan Fritsch s...@sfritsch.de wrote: There is a virtio-scsi device, too, but this is only supported in very recent versions of qemu. To attach the simpler virtio-block device as scsi, the driver would have to emulate the scsi commands. Is there some generic infrastructure to do that? Or which driver would you recommend to take the emulation code from? sys/dev/ata/atascsi.c seems rather complex (it is twice the size of sys/dev/pci/viod.c). There are quite a few pretendy-SCSI drivers in the tree, but most of them deal with fairly complicated hardware so they're not good reference points. sparc64's vdsk(4) driver is probably the best reference point, since it's also for a virtual disk driver (see sys/arch/sparc64/dev/vdsk.c).
Re: Virtio drivers for OpenBSD
On Wed, Jul 11, 2012 at 8:20 AM, Kenneth R Westerback kwesterb...@rogers.com wrote: And we dream of the day when wd and vnd attach as scsi too! Stop peaking at my dev box's src tree!!
Re: Virtio drivers for OpenBSD
On Wed, Jul 11, 2012 at 6:01 AM, Jiri B ji...@devio.us wrote: Or even better, it would be much important to make OpenBSD to see added disks without reboot (it means rescanning the bus). I don't understand what you're referring to here. OpenBSD already handles hotplugged disks in quite a few cases; e.g., USB and some other SCSI drivers. (Notably ahci(4) does not yet though.)
Re: Virtio drivers for OpenBSD
On Wed, Jul 11, 2012 at 12:40 PM, Jiri B ji...@devio.us wrote: Like this one... mpi0 at pci0 dev 16 function 0 Symbios Logic 53c1030 rev 0x01: apic 2 int 17 scsibus1 at mpi0: 16 targets, initiator 7 sd0 at scsibus1 targ 0 lun 0: VMware, Virtual disk, 1.0 SCSI2 0/direct fixed sd0: 61440MB, 512 bytes/sector, 125829120 sectors For example on Linux you can add new disk and rescan, the disk is visible. I can test on one ESXi 5. I'm not terribly familiar with mpi(4), but I see it has code to call scsi_req_probe() if it receives a MPI_EVT_SASCH_REASON_ADDED or MPI_EVT_SASCH_REASON_NO_PERSIST_ADDED event from the adapter controller. However, it appears to only watch for events from Serial Attached SCSI and Fibre Channel controllers, whereas 53c1030 is a Parallel SCSI controller. Are you able to configure VMWare to emulate a different mpi(4) device? E.g., one that uses SAS or FC instead? Do you use the 53c1030 with Linux too?
Re: Typo st_mtim in stat manpage.
On Tue, Jul 10, 2012 at 11:49 AM, Han Boetes h...@boetes.org wrote: I found this mistake in the stat manpage: Not a mistake; see /usr/include/sys/stat.h.
Re: perl 5.16 update / macppc / signal dispatch test failure
It looks like arm and sh are broken too. I've added a regress test to /usr/src/regress/sys/kern/sigpending if anyone wants to try tackling one or more of these broken implementations. On Thu, Jun 28, 2012 at 5:10 PM, Matthew Dempsky matt...@dempsky.org wrote: On Thu, Jun 28, 2012 at 5:09 PM, Matthew Dempsky matt...@dempsky.org wrote: So the sigpending() system call returns the pending signals as an int value via register, but then the libc stub is supposed to take care of storing this as appropriate. If you look at libc/arch/amd64/sys/sigreturn.S, you'll notice there's some followup assembly code for this. It looks like the analogous code is missing from libc/arch/powerpc/sys/sigreturn.S. Er, sigpending.S of course. On Thu, Jun 28, 2012 at 4:12 PM, Stuart Henderson s...@spacehopper.org wrote: I'm doing some test runs with Perl 5.16 for Andrew Fresh and seeing a signal dispatch-related test failure on macppc kern.version=OpenBSD 5.2-beta (GENERIC.MP) #198: Sat Jun 23 11:30:52 MDT 2012 dera...@macppc.openbsd.org:/usr/src/sys/arch/macppc/compile/GENERIC.MP t/op/sigdispatch .. # Failed test 8 - sigpending at op/sigdispatch.t line 60 # got 536870912 # expected 0 but true # Failed test 9 - SIGUSR1 is pending at op/sigdispatch.t line 61 # got 0 # expected 1 FAILED at test 8 This is a standard (not threaded) build. I can give files to reproduce etc if wanted but wondered if anyone has ideas. It's not happening on amd64: kern.version=OpenBSD 5.2-beta (GENERIC.MP) #325: Thu Jun 21 10:08:05 MDT 2012 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP Tests are around http://perl5.git.perl.org/perl.git/blame/67288365cab33e76a48b697c001c11d4dc5b 1912:/t/op/sigdispatch.t#l60
Re: Support for UCD-SNMP-MIB in snmpd (part one)
On Tue, Jun 19, 2012 at 1:35 PM, Vincent Bernat ber...@luffy.cx wrote: Maybe it would be better to implement standard MIB like HOST-RESOURCES-MIB. snmpd already supports HOST-RESOURCES-MIB. Seth, does your diff expose any new information that's not already available via snmpd? If not, I think we'd prefer to stick with standard MIBs where possible. I implemented UCD-DISKIO-MIB only because I couldn't find any existing MIB that was a good fit for disk IO stats. (Arguably it might go under the interface statistics since you can think of SCSI as just a specialized network protocol, but I wasn't sure if that would fly and UCD-DISKIO-MIB was less work.)
Re: Support for UCD-SNMP-MIB in snmpd (part one)
[+joel] On Tue, Jun 19, 2012 at 2:11 PM, Seth Wright s...@crosse.org wrote: I kind of thought the UCD-SNMP-MIB was fairly standard and/or widespread, but perhaps not? HOST-RESOURCES-MIB is an RFC standard, which I assume gives it more stature than UCD-SNMP-MIB, but I honestly haven't used SNMP much in the last few years, so I'm not the best person to make a decision here. * HR-MIB includes hrProcessorLoad which is a percentage-based value for each CPU (if I'm reading the MIB right) for the one-minute timeframe. UCD-SNMP presents 1-, 5-, and 15-minute aggregate values, which may be more useful. I think you're referring to how UCD-SNMP-MIB exposes 1, 5, and 15 minute load averages, which are different from CPU utilization. In particular, it tends to be a common source of confusion among users because OpenBSD calculates load averages very differently than other operating systems (e.g., http://www.undeadly.org/cgi?action=articlesid=20090715034920mode=expanded). You should be able to calculate 5 and 15 minute rolling averages in your SNMP monitoring software too. * HR-MIB, under the hrStorage* attributes, includes free and used values for memory, but doesn't get any more granular than that. UCD-SNMP tries to break it down a bit further. That's valid, though I believe we could expose much more fine grained memory via HOST-RESOURCES-MIB too. E.g., buffer cache, kernel pool(9) and malloc(9) buckets, and so on. I'm not sure off hand if the memory groups in UCD-SNMP-MIB maps well to our VM system either, but I guess they all say they can be left out if the OS doesn't support the category. * UCD-SNMP includes a bunch of CPU stats that I'm only starting to look at (the third blank graph in my monitoring templates) that I can't find in a walk against base snmpd. Those were going to be in diff number next. These might be interesting, and it's data not exposed otherwise currently. The closest HOST-RESOURCES-MIB has is hrSWRunPerfCPU and hrSWRunPerfMem, but those don't track interrupt CPU usage. So yes, I think it does provide some value over what's already in there, but that's just the opinion of a guy scratching a proverbial itch. Understood, I'm just trying to make sure the diff makes sense to be committed.
Support for UCD-DISKIO-MIB in snmpd
I'd like to monitor disk IO on my workstation, but snmpd doesn't currently export hw.diskstats as far as I can tell. I found net-snmp supports UCD-DISKIO-MIB and it's not too complex, so I went ahead and coded up an implementation. It's been a while since I've touched SNMP and/or snmpd, but it seems to work correctly: $ snmpwalk -m UCD-DISKIO-MIB -v1 -c public 127.0.0.1 diskIOTable UCD-DISKIO-MIB::diskIOIndex.1 = INTEGER: 1 UCD-DISKIO-MIB::diskIOIndex.2 = INTEGER: 2 UCD-DISKIO-MIB::diskIOIndex.3 = INTEGER: 3 UCD-DISKIO-MIB::diskIODevice.1 = STRING: sd0 UCD-DISKIO-MIB::diskIODevice.2 = STRING: sd1 UCD-DISKIO-MIB::diskIODevice.3 = STRING: cd0 UCD-DISKIO-MIB::diskIONRead.1 = Counter32: 3584 UCD-DISKIO-MIB::diskIONRead.2 = Counter32: 3128056320 UCD-DISKIO-MIB::diskIONRead.3 = Counter32: 0 UCD-DISKIO-MIB::diskIONWritten.1 = Counter32: 0 UCD-DISKIO-MIB::diskIONWritten.2 = Counter32: 1418602496 UCD-DISKIO-MIB::diskIONWritten.3 = Counter32: 0 UCD-DISKIO-MIB::diskIOReads.1 = Counter32: 7 UCD-DISKIO-MIB::diskIOReads.2 = Counter32: 18550342 UCD-DISKIO-MIB::diskIOReads.3 = Counter32: 0 UCD-DISKIO-MIB::diskIOWrites.1 = Counter32: 0 UCD-DISKIO-MIB::diskIOWrites.2 = Counter32: 19485959 UCD-DISKIO-MIB::diskIOWrites.3 = Counter32: 0 UCD-DISKIO-MIB::diskIONReadX.1 = Counter64: 3584 UCD-DISKIO-MIB::diskIONReadX.2 = Counter64: 539998968320 UCD-DISKIO-MIB::diskIONReadX.3 = Counter64: 0 UCD-DISKIO-MIB::diskIONWrittenX.1 = Counter64: 0 UCD-DISKIO-MIB::diskIONWrittenX.2 = Counter64: 357900888064 UCD-DISKIO-MIB::diskIONWrittenX.3 = Counter64: 0 Test reports from regular SNMP users would be great. ok? Index: mib.h === RCS file: /home/mdempsky/anoncvs/cvs/src/usr.sbin/snmpd/mib.h,v retrieving revision 1.25 diff -u -p -r1.25 mib.h --- mib.h 20 Mar 2012 03:01:26 - 1.25 +++ mib.h 13 Jun 2012 19:19:23 - @@ -397,6 +397,22 @@ #define MIB_vantronix MIB_enterprises, 26766 #define MIB_openBSDMIB_enterprises, 30155 +/* UCD-DISKIO-MIB */ +#define MIB_ucdExperimentalMIB_ucDavis, 13 +#define MIB_ucdDiskIOMIB MIB_ucdExperimental, 15 +#define MIB_diskIOTableMIB_ucdDiskIOMIB, 1 +#define MIB_diskIOEntryMIB_diskIOTable, 1 +#define OIDIDX_diskIO 11 +#define OIDIDX_diskIOEntry 12 +#define MIB_diskIOIndexMIB_diskIOEntry, 1 +#define MIB_diskIODevice MIB_diskIOEntry, 2 +#define MIB_diskIONReadMIB_diskIOEntry, 3 +#define MIB_diskIONWritten MIB_diskIOEntry, 4 +#define MIB_diskIOReadsMIB_diskIOEntry, 5 +#define MIB_diskIOWrites MIB_diskIOEntry, 6 +#define MIB_diskIONReadX MIB_diskIOEntry, 12 +#define MIB_diskIONWrittenXMIB_diskIOEntry, 13 + /* OPENBSD-MIB */ #define MIB_pfMIBObjects MIB_openBSD, 1 #define MIB_pfInfo MIB_pfMIBObjects, 1 @@ -892,6 +908,19 @@ { MIBDECL(microSystems) }, \ { MIBDECL(vantronix) }, \ { MIBDECL(openBSD) }, \ + \ + { MIBDECL(ucdExperimental) }, \ + { MIBDECL(ucdDiskIOMIB) }, \ + { MIBDECL(diskIOTable) }, \ + { MIBDECL(diskIOEntry) }, \ + { MIBDECL(diskIOIndex) }, \ + { MIBDECL(diskIODevice) }, \ + { MIBDECL(diskIONRead) }, \ + { MIBDECL(diskIONWritten) },\ + { MIBDECL(diskIOReads) }, \ + { MIBDECL(diskIOWrites) }, \ + { MIBDECL(diskIONReadX) }, \ + { MIBDECL(diskIONWrittenX) }, \ \ { MIBDECL(pfMIBObjects) }, \ { MIBDECL(pfInfo) },\ Index: mib.c === RCS file: /home/mdempsky/anoncvs/cvs/src/usr.sbin/snmpd/mib.c,v retrieving revision 1.52 diff -u -p -r1.52 mib.c --- mib.c 20 Mar 2012 03:01:26 - 1.52 +++ mib.c 13 Jun 2012 19:19:09 - @@ -32,6 +32,7 @@ #include sys/socket.h #include sys/mount.h #include sys/ioctl.h +#include sys/disk.h #include net/if.h #include net/if_types.h @@ -3362,6 +3363,99 @@ mib_ipfroute(struct oid *oid, struct ber } /* + * Defined in UCD-DISKIO-MIB.txt. + */ + +intmib_diskio(struct oid *oid, struct ber_oid *o, struct ber_element **elm); + +static struct oid diskio_mib[] = { + { MIB(ucdDiskIOMIB),OID_MIB }, + { MIB(diskIOIndex), OID_TRD, mib_diskio }, + {
Re: add statfs64 to compat/linux
Are you missing the header bits for linux_statfs64? I don't see where it's defined. + /* +* Convert BSD filesystem names to Linux filesystem type numbers +* where possible. Linux statfs uses a value of -1 to indicate +* an unsupported field. +*/ + if (!strcmp(bsp-f_fstypename, MOUNT_FFS) || + !strcmp(bsp-f_fstypename, MOUNT_MFS)) + lsp-l_ftype = 0x11954; + else if (!strcmp(bsp-f_fstypename, MOUNT_NFS)) + lsp-l_ftype = 0x6969; + else if (!strcmp(bsp-f_fstypename, MOUNT_MSDOS)) + lsp-l_ftype = 0x4d44; + else if (!strcmp(bsp-f_fstypename, MOUNT_PROCFS)) + lsp-l_ftype = 0x9fa0; + else if (!strcmp(bsp-f_fstypename, MOUNT_EXT2FS)) + lsp-l_ftype = 0xef53; + else if (!strcmp(bsp-f_fstypename, MOUNT_CD9660)) + lsp-l_ftype = 0x9660; + else if (!strcmp(bsp-f_fstypename, MOUNT_NCPFS)) + lsp-l_ftype = 0x6969; + else + lsp-l_ftype = -1; Can this code go into a separate function so we don't have to duplicate it for both statfs and statfs64? +int +linux_sys_statfs64(struct proc *p, void *v, register_t *retval) +{ + struct linux_sys_statfs64_args /* { + syscallarg(char *) path; + syscallarg(struct linux_statfs *) sp; + } */ *uap = v; The comment is wrong: sp should be linux_statfs64. -268UNIMPL linux_sys_statfs64 +268STD { int linux_sys_statfs64(char *path, \ + struct linux_statfs64 *sp); } 269UNIMPL linux_sys_fstatfs64 It looks like linux_sys_fstatfs64 should be easy to implement too now if you wanted to do that in this diff as well.
Re: add statfs64 to compat/linux
On Tue, May 22, 2012 at 9:42 AM, Paul Irofti p...@irofti.net wrote: + /* +* Convert BSD filesystem names to Linux filesystem type numbers +* where possible. Linux statfs uses a value of -1 to indicate +* an unsupported field. +*/ + if (!strcmp(bsp-f_fstypename, MOUNT_FFS) || + !strcmp(bsp-f_fstypename, MOUNT_MFS)) + lsp-l_ftype = 0x11954; + else if (!strcmp(bsp-f_fstypename, MOUNT_NFS)) + lsp-l_ftype = 0x6969; + else if (!strcmp(bsp-f_fstypename, MOUNT_MSDOS)) + lsp-l_ftype = 0x4d44; + else if (!strcmp(bsp-f_fstypename, MOUNT_PROCFS)) + lsp-l_ftype = 0x9fa0; + else if (!strcmp(bsp-f_fstypename, MOUNT_EXT2FS)) + lsp-l_ftype = 0xef53; + else if (!strcmp(bsp-f_fstypename, MOUNT_CD9660)) + lsp-l_ftype = 0x9660; + else if (!strcmp(bsp-f_fstypename, MOUNT_NCPFS)) + lsp-l_ftype = 0x6969; + else + lsp-l_ftype = -1; Can this code go into a separate function so we don't have to duplicate it for both statfs and statfs64? I thought about it. Best would be a macro, but I hate C macro's. What about a function like this: long bsd_to_linux_filesystem_map(const char *fstypename) { if (!strcmp(fstypename, MOUNT_FFS) || !strcmp(fstypename, MOUNT_MFS)) return (0x11954); else if (!strcmp(fstypename, MOUNT_NFS)) return (0x6969); [...] else return (-1); } (I'm assuming linux_statfs64's l_ftype is also a long.)
Re: Question about wscons/wsksymdef.h
On Fri, May 18, 2012 at 4:27 PM, Juan Francisco Cantero Hurtado i...@juanfra.info wrote: In short. Can someone say me what are these values?. I don't want a master class, just a link or a bit of documentation is sufficient. Are you just trying to swap caps and ctrl at the console? If so, just put us.swapctrlcaps (or whatever) in /etc/kbdtype and reboot. (Or run kbd us.swapctrlcaps as root to do it instantly.)
Re: c++ headers w/ -pedantic, overflow in implicit constant conversion
On Thu, Mar 15, 2012 at 3:19 AM, Marc Espie es...@nerim.net wrote: #define __glibcxx_max(T) \ - (__glibcxx_signed (T) ? ((T)1 __glibcxx_digits (T)) - 1 : ~(T)0) + (__glibcxx_signed (T) ? \ + (T)1 (__glibcxx_digits (T) - 1)) - 1) 1) + 1) : ~(T)0) + How about (T)(((unsigned T)1 __glibc_digits(T)) - 1)? Also, we should use (T)-1 __glibc_digits(T) in __glibc_min to avoid relying on overflow behavior for signed types.
Re: c++ headers w/ -pedantic, overflow in implicit constant conversion
I'm pretty sure unsigned int is never a signed type.
Re: c++ headers w/ -pedantic, overflow in implicit constant conversion
Oh even if it's not signed that ternary branch will still be in code. I see. Hm. On May 10, 2012 9:23 AM, Matthew Dempsky matt...@dempsky.org wrote: