Re: pool related crashes, but "kernel did no panic"
On Tue, May 31, 2016 at 7:16 PM, Theo de Raadt wrote:
>> is exactly 80 characters long (such a long printf violates "80 chars"
>> rule, isn't it?).
>
> there is no hard and fast rule for that at all; printing extra newlines
> has other downsides such as the screen scrolling sooner.

Hi.

I finally have a trace with a pfsync-related panic. See here:

http://article.gmane.org/gmane.os.openbsd.bugs/23666
Re: pool related crashes, but "kernel did no panic"
On Mon, May 30, 2016 at 9:02 PM, Ted Unangst <t...@tedunangst.com> wrote:
> Alexey Suslikov wrote:
>> On Thu, May 12, 2016 at 4:14 PM, Bob Beck <b...@openbsd.org> wrote:
>> > Thank you! now that's a bug report..
>>
>> Hi.
>>
>> Moved to 6.0-beta some time ago to make crash dumps more up
>> to date. Also, removed some services to minimize their impact.
>>
>> Fresh build against today's cvs don't survived even half of the day.
>>
>> http://article.gmane.org/gmane.os.openbsd.bugs/23593
>>
>> For me, it looks like: 5.7-5.8 - rare crashes, 5.9-6.0 - more frequent
>> crashes.
>>
>> Backtrace differs from crash to crash, but this remains the same:
>>
>> Stopped at pool_put+0x1dd: xorq 0x8(%rax),%rcx
>>
>> Do you have any idea where should I look in a source code?
>
> sys/kern/subr_pool.c

Thanks for your replies. Especially Stefan, who noticed the "show pools" output being truncated for some reason.

Here, kernel output is redirected to com, which is redirected to a KVM; a browser with a Java applet is connected to the KVM. This is how I get it.

amappl1: pool(0x81974640:amappl1): page inconsistency: page 0xff01e0

is exactly 80 characters long (such a long printf violates the "80 chars" rule, doesn't it?). Maybe there's a bug in the KVM (Java applet?) and the output gets truncated.
Anyway, let's see, because now I run with the following:

Index: sys/kern/subr_pool.c
===
RCS file: /cvs/src/sys/kern/subr_pool.c,v
retrieving revision 1.194
diff -u -p -u -p -r1.194 subr_pool.c
--- sys/kern/subr_pool.c	15 Jan 2016 11:21:58 -	1.194
+++ sys/kern/subr_pool.c	31 May 2016 09:10:21 -
@@ -1160,7 +1160,8 @@ pool_chk_page(struct pool *pp, struct po
 	page = (caddr_t)((u_long)ph & pp->pr_pgmask);
 	if (page != ph->ph_page && POOL_INPGHDR(pp)) {
 		printf("%s: ", label);
-		printf("pool(%p:%s): page inconsistency: page %p; "
+		printf("pool(%p:%s):\n"
+		    "page inconsistency: page %p;\n"
 		    "at page head addr %p (p %p)\n",
 		    pp, pp->pr_wchan, ph->ph_page, ph, page);
 		return 1;
@@ -1172,9 +1173,10 @@ pool_chk_page(struct pool *pp, struct po
 		if ((caddr_t)pi < ph->ph_page ||
 		    (caddr_t)pi >= ph->ph_page + pp->pr_pgsize) {
 			printf("%s: ", label);
-			printf("pool(%p:%s): page inconsistency: page %p;"
-			    " item ordinal %d; addr %p\n", pp,
-			    pp->pr_wchan, ph->ph_page, n, pi);
+			printf("pool(%p:%s):\n"
+			    "page inconsistency: page %p;\n"
+			    "item ordinal %d; addr %p\n",
+			    pp, pp->pr_wchan, ph->ph_page, n, pi);
 			return (1);
 		}
 
@@ -1204,16 +1206,18 @@ pool_chk_page(struct pool *pp, struct po
 #endif /* DIAGNOSTIC */
 	}
 	if (n + ph->ph_nmissing != pp->pr_itemsperpage) {
-		printf("pool(%p:%s): page inconsistency: page %p;"
-		    " %d on list, %d missing, %d items per page\n", pp,
-		    pp->pr_wchan, ph->ph_page, n, ph->ph_nmissing,
+		printf("pool(%p:%s):\n"
+		    "page inconsistency: page %p;\n"
+		    "%d on list, %d missing, %d items per page\n",
+		    pp, pp->pr_wchan, ph->ph_page, n, ph->ph_nmissing,
 		    pp->pr_itemsperpage);
 		return 1;
 	}
 	if (expected >= 0 && n != expected) {
-		printf("pool(%p:%s): page inconsistency: page %p;"
-		    " %d on list, %d missing, %d expected\n", pp,
-		    pp->pr_wchan, ph->ph_page, n, ph->ph_nmissing,
+		printf("pool(%p:%s):\n"
+		    "page inconsistency: page %p;\n"
+		    "%d on list, %d missing, %d expected\n",
+		    pp, pp->pr_wchan, ph->ph_page, n, ph->ph_nmissing,
 		    expected);
 		return 1;
 	}
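
To show what splitting the message does to console line width, here is a minimal userland sketch. It is not kernel code: snprintf stands in for the kernel printf, the addresses are made-up 64-bit values printed with 0x%llx rather than %p, and longest_line() and report_width() are hypothetical helpers written only for this illustration.

```c
#include <stdio.h>

/* Length of the longest line in s, not counting the newline itself. */
static size_t
longest_line(const char *s)
{
	size_t best = 0, cur = 0;

	for (; *s != '\0'; s++) {
		if (*s == '\n') {
			if (cur > best)
				best = cur;
			cur = 0;
		} else
			cur++;
	}
	return (cur > best ? cur : best);
}

/*
 * Format a "page head" style report and return its widest line.
 * split != 0 uses the three-line layout, split == 0 the one-line layout.
 */
static size_t
report_width(int split)
{
	/* Made-up addresses, for illustration only. */
	unsigned long long pp = 0xffffffff81974640ULL;
	unsigned long long pg = 0xffffff0123456000ULL;
	char buf[512];

	if (split)
		snprintf(buf, sizeof(buf),
		    "pool(0x%llx:%s):\n"
		    "page inconsistency: page 0x%llx;\n"
		    "at page head addr 0x%llx (p 0x%llx)\n",
		    pp, "amappl1", pg, pg, pg);
	else
		snprintf(buf, sizeof(buf),
		    "pool(0x%llx:%s): page inconsistency: page 0x%llx; "
		    "at page head addr 0x%llx (p 0x%llx)\n",
		    pp, "amappl1", pg, pg, pg);
	return (longest_line(buf));
}
```

With these sample values the one-line layout is wider than 80 columns while every line of the split layout fits, at the cost of two extra lines of scrolling per report.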
Re: pool related crashes, but "kernel did no panic"
On Thu, May 12, 2016 at 4:14 PM, Bob Beck wrote:
> Thank you! now that's a bug report..

Hi.

Moved to 6.0-beta some time ago to make crash dumps more up to date. Also, removed some services to minimize their impact.

A fresh build against today's cvs didn't survive even half a day.

http://article.gmane.org/gmane.os.openbsd.bugs/23593

For me, it looks like: 5.7-5.8 - rare crashes, 5.9-6.0 - more frequent crashes.

Backtrace differs from crash to crash, but this remains the same:

Stopped at pool_put+0x1dd: xorq 0x8(%rax),%rcx

Do you have any idea where I should look in the source code?

Thanks.
Re: pool related crashes, but "kernel did no panic"
On Fri, May 13, 2016 at 3:59 AM, David Gwynne <da...@gwynne.id.au> wrote:
>
>> On 12 May 2016, at 20:28, Alexey Suslikov <alexey.susli...@gmail.com> wrote:
>>
>> On Wed, Apr 27, 2016 at 7:22 PM, Theo de Raadt <dera...@cvs.openbsd.org>
>> wrote:
>>>> On 27/04/16(Wed) 15:45, Alexey Suslikov wrote:
>>>>> Theo de Raadt cvs.openbsd.org> writes:
>>>>>
>>>>>>
>>>>>> Most of these bug reports completely stink.
>>>>>>
>>>>>> ALWAYS include *ALL* information in a report.
>>>>>
>>>>> In an idealistic world, yes.
>>>>
>>>> In an idealistic world their would be no bug.
>>>
>>> In an idealistic world, Alexey Suslikov wouldn't feel compelled to
>>> defend sloppiness.
>>
>> follow up is here
>>
>> http://marc.info/?l=openbsd-bugs=146304833425471=2
>> http://marc.info/?l=openbsd-bugs=146304864925575=2
>>
>
> this shoudl be fixed in stable. can you make sure you have the following:
>
> http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/kern/uipc_mbuf.c.diff?r1=1.219=1.219.2.1

What do you think about this (new) one?

http://marc.info/?l=openbsd-bugs=146312050712969=2

I really can do more to debug this, and I have been asking for advice since the beginning of this thread.
Re: pool related crashes, but "kernel did no panic"
On Wed, Apr 27, 2016 at 7:22 PM, Theo de Raadt <dera...@cvs.openbsd.org> wrote:
>> On 27/04/16(Wed) 15:45, Alexey Suslikov wrote:
>> > Theo de Raadt cvs.openbsd.org> writes:
>> >
>> > >
>> > > Most of these bug reports completely stink.
>> > >
>> > > ALWAYS include *ALL* information in a report.
>> >
>> > In an idealistic world, yes.
>>
>> In an idealistic world their would be no bug.
>
> In an idealistic world, Alexey Suslikov wouldn't feel compelled to
> defend sloppiness.

follow up is here

http://marc.info/?l=openbsd-bugs=146304833425471=2
http://marc.info/?l=openbsd-bugs=146304864925575=2
Re: pool related crashes, but "kernel did no panic"
Theo de Raadt cvs.openbsd.org> writes:

> Most of these bug reports completely stink.
>
> ALWAYS include *ALL* information in a report.

In an idealistic world, yes.

The reports above are not parts of a "chain", but different manifestations of the same bug.

To have both the blue screen and the ddb output, I need to keep the KVM console running in a browser for an undefined period of time (a crash can occur twice per day, or once per 2 months), which isn't as easy as it seems.

But sure, I'll try to file a more complete report.
Re: pool related crashes, but "kernel did no panic"
Stuart Henderson spacehopper.org> writes:

> There should be some lines printed before you get dumped into DDB
> (probably a uvm_fault), the information in them is important.

I either have a screenshot, or ddb. Not both at the same time.

Here is one of the screenshots from 5.9, transcribed:

uvm_fault(0x81940240, 0x10, 0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip 811a5c3e cs 8 rflags 10206 cr 2 10 cpl a rsp 800022171e20
panic: trap type 6, code=0, pc=811a5c3e
Starting stack trace...
panic() at panic+0x10b
trap() at trap+0x7b8
--- trap (number 6) ---
pool_p_free() at pool_p_free+0x7e
pool_gc_pages() at pool_gc_pages+0xe4
taskq_thread() at taskq_thread+0x6c
end trace frame: 0x0, count: 252
End of stack trace.
syncing disks... 5 done
Re: pool related crashes, but "kernel did no panic"
Another one from my collection.

Apr 16:

ddb{0}> show panic
the kernel did not panic
ddb{0}> trace
pool_do_get() at pool_do_get+0x90
pool_get() at pool_get+0xb5
m_get() at m_get+0x28
sbappendaddr() at sbappendaddr+0x9a
uipc_usrreq() at uipc_usrreq+0x3b8
sosend() at sosend+0x3d8
dosendsyslog() at dosendsyslog+0x110
sys_sendsyslog2() at sys_sendsyslog2+0xbd
syscall() at syscall+0x368
--- syscall (number 112) ---
end of kernel
end trace frame: 0x183f8dab6913, count: -9
0x1842755e571a:
ddb{0}> show registers
rdi     0x7
rsi     0x9ff5c49ed229ae92
rbp     0x8000222f5b00
rbx     0xff022d80d6d0
rdx     0x8000222f5b64
rcx     0x818c76e0 cpu_info_primary
rax     0x7293fa06e984af44
r8      0
r9      0x1
r10     0x811c7c00 uipc_usrreq
r11     0x81344be0 copy_fault
r12     0x8194c000 mbpool
r13     0xff40b152a900
r14     0x2
r15     0x818b4570 sun_noname
rip     0x811a5340 pool_do_get+0x90
cs      0x8
rflags  0x10282 __ALIGN_SIZE+0xf282
rsp     0x8000222f5ab0
ss      0x10
pool_do_get+0x90: movq 0(%r13),%rdi
pool related crashes, but "kernel did no panic"
Hi tech@.

(Maybe related to http://marc.info/?l=openbsd-bugs=146174654219490=2).

The crashing server acts as a carp backup (the master has the same hardware config but doesn't crash, in contrast to the backup).

Will post additional information if necessary. There's a collection of crashes (including pre-5.9), but see below for the most recent ones.

Any advice on tracking down the issue?

Thanks,
Alexey

OpenBSD 5.9-stable (GENERIC.MP) #0: Sun Mar 27 16:03:33 EEST 2016
    ***@***:/usr/src/sys/arch/amd64/compile/GENERIC.MP

Apr 15:

ddb{2}> show panic
the kernel did not panic
ddb{2}> trace
pool_do_get() at pool_do_get+0x90
pool_get() at pool_get+0xb5
ffs_vget() at ffs_vget+0xa7
ufs_lookup() at ufs_lookup+0x36f
VOP_LOOKUP() at VOP_LOOKUP+0x39
vfs_lookup() at vfs_lookup+0x277
namei() at namei+0x24c
dofstatat() at dofstatat+0x94
syscall() at syscall+0x368
--- syscall (number 40) ---
end of kernel
end trace frame: 0x45ea97030a0, count: -9
0x45e29dc70fa:
ddb{2}> show registers
rdi     0x
rsi     0x957581e21a424e5c
rbp     0x8000224a2a10
rbx     0xff02290e7810
rdx     0x8000224a2a74
rcx     0x80067000
rax     0x5abd427fd20d77f3
r8      0x30
r9      0
r10     0
r11     0x8000224a2a10
r12     0x819694c0 ffs_ino_pool
r13     0xff122c4b0968
r14     0x9
r15     0x407
rip     0x811a5340 pool_do_get+0x90
cs      0x8
rflags  0x10286 __ALIGN_SIZE+0xf286
rsp     0x8000224a29c0
ss      0x10
pool_do_get+0x90: movq 0(%r13),%rdi

Apr 23:

ddb{2}> show panic
the kernel did not panic
ddb{2}> trace
pool_p_free() at pool_p_free+0x7e
pool_gc_pages() at pool_gc_pages+0xe4
taskq_thread() at taskq_thread+0x6c
end trace frame: 0x0, count: -3
ddb{2}> show registers
rdi     0x8194c000 mbpool
rsi     0x60329ee8bc5a0776
rbp     0x800022171e70
rbx     0xff009e7b3300
rdx     0x9fcd61e822213476
rcx     0xddbc8af92f3ff41a
rax     0x10
r8      0x1
r9      0xff0108eeda00
r10     0x1
r11     0x811a3e70 pool_page_free
r12     0xff022d8b7a50
r13     0
r14     0x8194c000 mbpool
r15     0x800022171e30
rip     0x811a5c3e pool_p_free+0x7e
cs      0x8
rflags  0x10206 __ALIGN_SIZE+0xf206
rsp     0x800022171e20
ss      0x10
pool_p_free+0x7e: movq 0(%rax),%rsi
Re: Optimize pledge-related notes in 59.html
Theo de Raadt openbsd.org> writes: > so thanks for your suggestion. have you ever noticed how suggestions > are taken less seriously when they are not formatted as a diff? --- 59.html.origThu Feb 18 11:45:24 2016 +++ 59.html Thu Feb 18 12:03:29 2016 @@ -100,21 +100,21 @@ http://www.openbsd.org/cgi-bin/man.cgi? query=dhclientsektion=8">dhclient(8) no longer exits if a desired route cannot be added. It now just reports the fact. http://www.openbsd.org/cgi-bin/man.cgi? query=dhclientsektion=8">dhclient(8) now takes a much more careful approach to received packets to ensure only received data is used to process the packet. Packets with incorrect length information or lacking appropriate header information are now dropped. http://www.openbsd.org/cgi-bin/man.cgi? query=dhclientsektion=8">dhclient(8) again disables pending timeouts if the interface link is lost, preventing endless retries at obtaining a lease. -http://www.openbsd.org/cgi-bin/man.cgi? query=dhclientsektion=8">dhclient(8) was pledged. http://www.openbsd.org/cgi-bin/man.cgi? query=dhcpdsektion=8">dhcpd(8) again properly utilizes default- lease-time, max-lease-time and bootp-lease-time options. -http://www.openbsd.org/cgi-bin/man.cgi? query=dhcpdsektion=8">dhcpd(8) was pledged. ... Security improvements: -... +http://www.openbsd.org/cgi-bin/man.cgi/? query=pledge">pledge(2), a new subsystem for restricting operations in programs, was added. +More than 200 daemons and programs was pledged, among them: http://www.openbsd.org/cgi-bin/man.cgi? query=dhclient=8">dhclient(8), http://www.openbsd.org/cgi-bin/man.cgi? query=dhcpd=8">dhcpd(8), http://www.openbsd.org/cgi- bin/man.cgi?query=fdisk=8">fdisk(8), http://www.openbsd.org/cgi-bin/man.cgi/? query=pdisk=8=macppc">pdisk(8). Support for looking up hosts via YP has been removed from libc. The 'yp' lookup method in http://www.openbsd.org/cgi-bin/man.cgi? query=resolv.conf=5">resolv.conf is no longer available. 
Support for the HOSTALIASES environment variable has been removed from libc. + ... @@ -123,7 +123,7 @@ doas is a little friendlier to use Updated flex Updated and improved less -http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD- current/man8/macppc/pdisk.8?query=pdisk">pdisk(8) was largely rewritten and pledged. +http://www.openbsd.org/cgi-bin/man.cgi/? query=pdisk=8=macppc">pdisk(8) was largely rewritten. Renaming files in the root directory of a MSDOS filesystem was fixed. Many obsolete http://www.openbsd.org/cgi- bin/man.cgi/OpenBSD-current/man5/disktab.5?query=disktab">disktab(5) attributes and entries were removed. http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD- current/man4/softraid.4?query=softraid">softraid(4) volumes now correctly look for the disklabel in the first OpenBSD disk partition, not the last. @@ -132,7 +132,6 @@ http://www.openbsd.org/cgi-bin/man.cgi? query=fdisksektion=8">fdisk(8) now has a '-b' flag that specifies the size of the EFI System partition to create. http://www.openbsd.org/cgi-bin/man.cgi? query=fdisksektion=8">fdisk(8) now has a '-v' flag that causes a verbose display of both MBR and GPT information. http://www.openbsd.org/cgi-bin/man.cgi? query=fdisksektion=8">fdisk(8) now provides full interactive GPT editing. -http://www.openbsd.org/cgi-bin/man.cgi? query=fdisksektion=8">fdisk(8) was pledged. Disks with sector sizes other than 512 bytes can now be partitioned with a GPT. The GPT kernel option was removed and GPT support is part of all GENERIC and GENERIC derived kernels. Many improvements were made to the GPT kernel support to ensure safe and reliable operation of GPT and MBR processing.
Optimize pledge-related notes in 59.html
Hi tech@.

pledge itself is a security feature, so maybe it is better to put pledge under "Security improvements", like

"More than 200 daemons and programs were pledged, among them: dhclient, dhcpd, fdisk" etc.

I would be happy to learn both a) that pledge is a security feature and b) how widespread pledge usage is from a single block of text, without collecting pieces here and there.
Re: Make em(4) more mpsafe again
Juuso Lapinlampi partyvan.eu> writes:

> > - * These parameters control when the driver calls the routine to reclaim
> > - * transmit descriptors.
> > + * Thise parameter controls the minimum number of available transmit
> > + * descriptors needed before we attempt transmission of a packet.
> >   */
>
> There seems to be a typo in there. s/Thise/This/.

Hey, it's fun to copy-paste all around.

http://marc.info/?l=openbsd-tech=144362501114184=2
Re: pledge telnet
On Fri, Nov 13, 2015 at 8:40 PM, Theo de Raadt wrote:
>> > On 2015/11/13 09:59, Theo de Raadt wrote:
>> > > > > I really want to delete telnet entirely,
>> > > >
>> > > > I often use it for testing unencrypted SMTP and HTTP across the
>> > > > Internet. Which tool would you recommend for that purpose?
>> > >
>> > > nc(1).
>> > I use telnet fairly often for connecting to things like crappy switches,
>> > crappy routers, APs of varying crappiness, etc. nc -t isn't close to being
>> > good enough for this, also with nc it's difficult to send things like ^C
>> > (even worse, if you use it much you forget about this and end up killing
>> > your connection). I wouldn't mind having it removed from base, but would
>> > need to go in ports unless nc gets a lot of polishing.
>>
>> I always thought of telnet as a kind of discipline over the wire. There are
>> even extensions (like RFC 2217) well-fitting discipline model.
>
> Like a horse buggy in the inside lane of a 4-lane highway, there are going
> to fatalities.
>
> "discipline" applies to the user of this code -- it means "avoid any and all
> unnecessary use".
>
>> >From other hand, nc(1) is a "raw" tool with decent client-server model.
>>
>> Is there any possibility to run nc(1) as a privsep server, and a telnet(1) as
>> a client, talking to nc(1) server via IMSG (instead of doing network stuff
>> directly)?
>
> What's the goal. To continue the lifetime of telne? To make the nc code
> more complicated and fragile? Those are the only outcomes I see.

It is similar to (optional) XMODEM/ZMODEM disciplines over serial, IMO.

The goal is to delete classic telnet entirely and make it an (optional) discipline frontend for nc(1). In "telnet mode" nc(1) will only attach the discipline and let the user use flow control features (like ^C).

It is not about extending the lifetime of telnet; it is about making telnet truly optional by making it a discipline (or flow control protocol), not a separate tool.
Re: pledge telnet
On Fri, Nov 13, 2015 at 9:00 PM, Theo de Raadt wrote:
>> It is similar to (optional) XMODEM/ZMODEM disciplines over serial, IMO.
>
> No, it is similar to over the INTERNET, because the INTERNET
> is nothing at all like a serial line, the later generally being nicely
> contained to a single room.
>
>> The goal is to delete classic telnet entirely and make it an
>> (optional) discipline frontend for nc(1). In "telnet mode" nc(1)
>> will only attach discipline and let user use flow control features (like
>> ^C).
>
> You have a diff?
>
>> It is not about extending a lifetime of telnet, it is about making telnet
>> truly
>> optional by making it a discipline (or flow control protocol), not a separate
>> tool.
>
> If you can do it without adding *any complexity* to nc, fine.
>
> Except I know you can't do that, it will add substantial complexity.
> So this seems like a pointless discussion. nc is already more than
> complex enough. Probably best to focus on making it more secure,
> before making it support the stone age.

Can telnet be extended to coexist with nc -F? The manual only mentions ssh.
Re: pledge telnet
Stuart Henderson wrote:
> On 2015/11/13 09:59, Theo de Raadt wrote:
> > > > I really want to delete telnet entirely,
> > >
> > > I often use it for testing unencrypted SMTP and HTTP across the
> > > Internet. Which tool would you recommend for that purpose?
> >
> > nc(1).
> I use telnet fairly often for connecting to things like crappy switches,
> crappy routers, APs of varying crappiness, etc. nc -t isn't close to being
> good enough for this, also with nc it's difficult to send things like ^C
> (even worse, if you use it much you forget about this and end up killing
> your connection). I wouldn't mind having it removed from base, but would
> need to go in ports unless nc gets a lot of polishing.

I always thought of telnet as a kind of discipline over the wire. There are even extensions (like RFC 2217) that fit the discipline model well.

On the other hand, nc(1) is a "raw" tool with a decent client-server model.

Is there any possibility to run nc(1) as a privsep server and telnet(1) as a client, talking to the nc(1) server via IMSG (instead of doing network stuff directly)?
Re: mpsafe gem(4)
Martin Pieuchot openbsd.org> writes:

> +	/*
> +	 * If we have enough room, clear IFF_OACTIVE to tell the stack
> +	 * that it iss OK to send packets.
> +	 */

There's a typo here: "that it iss" should be "that it is".
Re: ifdef DIAGNOSTIC in azalia.c
Alexey Suslikov gmail.com> writes:

>
> Alexey Suslikov gmail.com> writes:
>
> > If there is a need to debug something in azalia.c, defining DIAGNOSTIC
> > is overkill so replace two instances of DIAGNOSTIC with AZALIA_DEBUG
> > (DPRINTF->printf suggested by ratchov@).
> >
> > Also, entirely remove 3rd instance of DIAGNOSTIC. Normally it is not
> > compiled and, ratchov thinks, related to deleted code.
> >
> > Okays? Comments?
>
> Is there any interest in this? Alexandre, did I correctly understood an
> output of our discussion?

ping

Re-sending in case the diff got missed.

--- azalia.c.orig	Wed Sep 23 16:10:19 2015
+++ azalia.c	Wed Sep 23 16:11:47 2015
@@ -1170,9 +1170,9 @@
 	uint32_t verb;
 	uint16_t corbwp;
 
-#ifdef DIAGNOSTIC
+#ifdef AZALIA_DEBUG
 	if ((AZ_READ_1(az, CORBCTL) & HDA_CORBCTL_CORBRUN) == 0) {
-		DPRINTF(("%s: CORB is not running.\n", XNAME(az)));
+		printf("%s: CORB is not running.\n", XNAME(az));
 		return(-1);
 	}
 #endif
@@ -1196,9 +1196,9 @@
 	int i;
 	uint16_t wp;
 
-#ifdef DIAGNOSTIC
+#ifdef AZALIA_DEBUG
 	if ((AZ_READ_1(az, RIRBCTL) & HDA_RIRBCTL_RIRBDMAEN) == 0) {
-		DPRINTF(("%s: RIRB is not running.\n", XNAME(az)));
+		printf("%s: RIRB is not running.\n", XNAME(az));
 		return(-1);
 	}
 #endif
@@ -4054,12 +4054,6 @@
 	/* number of blocks must be <= HDA_BDL_MAX */
 	az = v;
 	size = az->pstream.buffer.size;
-#ifdef DIAGNOSTIC
-	if (size <= 0) {
-		printf("%s: size is 0", __func__);
-		return 256;
-	}
-#endif
 	if (size > HDA_BDL_MAX * blk) {
 		blk = size / HDA_BDL_MAX;
 		if (blk & 0x7f)
Re: ifdef DIAGNOSTIC in azalia.c
Alexey Suslikov gmail.com> writes:

> If there is a need to debug something in azalia.c, defining DIAGNOSTIC
> is overkill so replace two instances of DIAGNOSTIC with AZALIA_DEBUG
> (DPRINTF->printf suggested by ratchov@).
>
> Also, entirely remove 3rd instance of DIAGNOSTIC. Normally it is not
> compiled and, ratchov thinks, related to deleted code.
>
> Okays? Comments?

Is there any interest in this? Alexandre, did I correctly understand the outcome of our discussion?
Re: Unlocking ix(4) a bit further
Mark Kettenis xs4all.nl> writes:

> + * Thise parameter controls the minimum number of available transmit

"Thise" should be "This" here.
ifdef DIAGNOSTIC in azalia.c
Hi tech@.

If there is a need to debug something in azalia.c, defining DIAGNOSTIC is overkill, so replace two instances of DIAGNOSTIC with AZALIA_DEBUG (DPRINTF->printf suggested by ratchov@).

Also, entirely remove the 3rd instance of DIAGNOSTIC. Normally it is not compiled and, ratchov@ thinks, it is related to deleted code.

Okays? Comments?

--- azalia.c.orig	Wed Sep 23 16:10:19 2015
+++ azalia.c	Wed Sep 23 16:11:47 2015
@@ -1170,9 +1170,9 @@
 	uint32_t verb;
 	uint16_t corbwp;
 
-#ifdef DIAGNOSTIC
+#ifdef AZALIA_DEBUG
 	if ((AZ_READ_1(az, CORBCTL) & HDA_CORBCTL_CORBRUN) == 0) {
-		DPRINTF(("%s: CORB is not running.\n", XNAME(az)));
+		printf("%s: CORB is not running.\n", XNAME(az));
 		return(-1);
 	}
 #endif
@@ -1196,9 +1196,9 @@
 	int i;
 	uint16_t wp;
 
-#ifdef DIAGNOSTIC
+#ifdef AZALIA_DEBUG
 	if ((AZ_READ_1(az, RIRBCTL) & HDA_RIRBCTL_RIRBDMAEN) == 0) {
-		DPRINTF(("%s: RIRB is not running.\n", XNAME(az)));
+		printf("%s: RIRB is not running.\n", XNAME(az));
 		return(-1);
 	}
 #endif
@@ -4054,12 +4054,6 @@
 	/* number of blocks must be <= HDA_BDL_MAX */
 	az = v;
 	size = az->pstream.buffer.size;
-#ifdef DIAGNOSTIC
-	if (size <= 0) {
-		printf("%s: size is 0", __func__);
-		return 256;
-	}
-#endif
 	if (size > HDA_BDL_MAX * blk) {
 		blk = size / HDA_BDL_MAX;
 		if (blk & 0x7f)
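
One thing to watch when a DPRINTF((...)) call is turned into a plain printf: the inner parentheses have to go too. The sketch below assumes DPRINTF is the usual double-parenthesis debug macro (its definition is not part of the diff) and uses a hypothetical helper, comma_result(), to show what a doubly parenthesized argument list evaluates to. In C, (a, b) is a comma expression whose value is b, so printf((fmt, name)) would receive only name and use it as the format string.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns what printf would receive as its single
 * argument if it were called as printf((fmt, name)). The parenthesized
 * list is a comma expression, so fmt is evaluated and thrown away.
 */
static const char *
comma_result(const char *fmt, const char *name)
{
	return ((void)fmt, name);
}
```

So a converted call should read printf("%s: ...\n", XNAME(az)) with a single pair of parentheses; with two pairs it would print just the device name and no message.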
Re: Dropping needless globals (ksh)
Michael McConville sccs.swarthmore.edu> writes:

> RCS file: /cvs/src/bin/ksh/c_ksh.c,v
> -	shprintf(newline);
> +	shprintf("\n");

In terms of portability, are you sure newline is "\n" on all platforms?
Re: Avoid grabbing the kernel lock in pool backend allocator
Mark Kettenis xs4all.nl> writes:

> RCS file: /cvs/src/sys/kern/subr_pool.c,v
> 	kd.kd_waitok = ISSET(flags, PR_WAITOK);
> +	/*
> +	 * XXX Until we can call msleep(9) without holding the kernel
> +	 * lock.
> +	 */
> +	if (ISSET(flags, PR_WAITOK))

Is there a reason to re-evaluate ISSET when its result is already stored in kd.kd_waitok?
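
A minimal sketch of the suggestion, outside the kernel. ISSET is written in the same shape as the macro in <sys/param.h>; the flag values, struct dyn_mode and must_take_lock() are assumptions made up for this example and are not the real pool code.

```c
#include <stdbool.h>

/* Same shape as the ISSET() macro in <sys/param.h>. */
#define ISSET(t, f)	((t) & (f))

/* Illustrative flag values; the real ones live in <sys/pool.h>. */
#define PR_WAITOK	0x0001
#define PR_NOWAIT	0x0002

/* Hypothetical stand-in for the kd structure in the quoted diff. */
struct dyn_mode {
	bool	waitok;
};

static bool
must_take_lock(int flags)
{
	struct dyn_mode kd;

	kd.waitok = ISSET(flags, PR_WAITOK);	/* tested once ... */
	return (kd.waitok);			/* ... and reused here */
}
```

Either form gives the same answer; reusing the stored field only avoids spelling the test twice.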
Wrong man links on 58.html
In

wscons(4) works with even more odd trackpads.
Added pvbus(4) paravirtual device tree root on virtual machines that are running on hypervisors.

the man page links

http://www.openbsd.org/cgi-bin/man.cgi?query=wscons(4)sec=4
http://www.openbsd.org/cgi-bin/man.cgi?query=pvbus(4)sec=4

are wrong. They should be

http://www.openbsd.org/cgi-bin/man.cgi?query=wsconssec=4
http://www.openbsd.org/cgi-bin/man.cgi?query=pvbussec=4
size: cannot read a.out: No such file or directory
Hi tech@.

size(1) DESCRIPTION says:

... If no file is specified size attempts to report on the file a.out.

And, indeed, it warns:

$ size
size: cannot read a.out: No such file or directory

The above message looks misleading in an a.out-less world.

Cheers,
Alexey
Re: [Patch] pf refactoring
Martin Pieuchot mpi at openbsd.org writes:

> On 17/08/15(Mon) 17:39, Richard Procter wrote:
> > Hi,
> >
> > This series of 29 small diffs slims pf.o by 2640 bytes and pf.c by 113
> > non-comment lines.
>
> We generally discuss one diff per mail. It makes it easier for people
> to comment and as you can imagine deal with possible conflicts in
> their tree :)

As far as I understand, Richard provided step-by-step refactoring diffs. What you want to discuss is the cumulative diff he referred to.
sys/arch/{hppa,hppa64}/dev/apic.c cosmetics, Was:Re: Brainy: User-Triggerable Kernel Memory Leak in execve()
Christian Schulte cs at schulte.it writes:

> _14/ UNINITIALIZED VARIABLE: sys/arch/hppa64/dev/apic.c rev1.8
>
> At l.176, 'cnt' is not initialized.

I came up with the following.

--- sys/arch/hppa/dev/apic.c.orig	Sun Aug 9 14:16:56 2015
+++ sys/arch/hppa/dev/apic.c	Sun Aug 9 14:30:47 2015
@@ -171,12 +171,11 @@
 
 	aiv = malloc(sizeof(struct apic_iv), M_DEVBUF, M_NOWAIT);
 	if (aiv == NULL) {
-		free(cnt, M_DEVBUF, 0);
-		return NULL;
+		return (NULL);
 	}
 
 	cnt = malloc(sizeof(struct evcount), M_DEVBUF, M_NOWAIT);
-	if (!cnt) {
+	if (cnt == NULL) {
 		free(aiv, M_DEVBUF, 0);
 		return (NULL);
 	}
--- sys/arch/hppa64/dev/apic.c.orig	Sun Aug 9 14:16:47 2015
+++ sys/arch/hppa64/dev/apic.c	Sun Aug 9 14:31:14 2015
@@ -173,8 +173,7 @@
 
 	aiv = malloc(sizeof(struct apic_iv), M_DEVBUF, M_NOWAIT);
 	if (aiv == NULL) {
-		free(cnt, M_DEVBUF, 0);
-		return NULL;
+		return (NULL);
 	}
 
 	aiv->sc = sc;
@@ -185,7 +184,7 @@
 	aiv->cnt = NULL;
 	if (apic_intr_list[irq]) {
 		cnt = malloc(sizeof(struct evcount), M_DEVBUF, M_NOWAIT);
-		if (!cnt) {
+		if (cnt == NULL) {
 			free(aiv, M_DEVBUF, 0);
 			return (NULL);
 		}
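
The error-path rule behind this diff (on failure, free only what has already been allocated) can be sketched in userland C. This is an illustration with the standard malloc/free, not the kernel three-argument free; struct counter, struct iv, iv_create() and iv_destroy() are hypothetical names.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for struct evcount and struct apic_iv. */
struct counter {
	unsigned long	 count;
};

struct iv {
	int		 irq;
	struct counter	*cnt;
};

static struct iv *
iv_create(int irq, int want_counter)
{
	struct iv *aiv;
	struct counter *cnt;

	aiv = malloc(sizeof(*aiv));
	if (aiv == NULL)
		return (NULL);	/* nothing else allocated yet, nothing to free */

	aiv->irq = irq;
	aiv->cnt = NULL;

	if (want_counter) {
		cnt = malloc(sizeof(*cnt));
		if (cnt == NULL) {
			free(aiv);	/* undo the one earlier allocation */
			return (NULL);
		}
		cnt->count = 0;
		aiv->cnt = cnt;
	}
	return (aiv);
}

static void
iv_destroy(struct iv *aiv)
{
	if (aiv != NULL) {
		free(aiv->cnt);	/* free(NULL) is a no-op */
		free(aiv);
	}
}
```

Each failure branch releases exactly the objects allocated before it, so no uninitialized pointer is ever passed to free.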
Re: Brainy: User-Triggerable Kernel Memory Leak in execve()
Theo de Raadt deraadt at cvs.openbsd.org writes: I would like to point out the noise is coming from *users* -- not from actual developers in the project. http://www.imdb.com/title/tt1278449/ you'll get the idea.
Re: Brainy: User-Triggerable Kernel Memory Leak in execve()
On Sat, Aug 8, 2015 at 2:21 PM, Christian Schulte c...@schulte.it wrote:
> On 08/07/15 at 23:46, Alexey Suslikov wrote:
>> Christian Schulte cs at schulte.it writes:
>>
>>>> Now, I believe that this effort is too much for my spare time.
>>>
>>> Then why not release that scanner? That effort could be shared. What's so
>>> secret about it? You have been asked several times already. Start sharing
>>> right now.
>>
>> Brainy OpenBSD page contains info about lot of bugs already found. There is
>> no secret to start writing diffs and pushing them.
>
> I was thinking about automating that process. Scan-before-commit, for
> example. Need not be that particular scanner. Some pre-commit analysis
> beyond what the compiler can warn about. How can I be sure the issues
> found by that scanner are not issues with the scanner itself?

Looks like you haven't read carefully. Quote:

"Developing, improving and maintaining Brainy takes time and energy, as well as investigating and packaging the bugs and vulnerabilities it finds."

You already have the bugs found. The next step in the process is to write diffs.
Re: Brainy: User-Triggerable Kernel Memory Leak in execve()
Christian Schulte cs at schulte.it writes:

> > Now, I believe that this effort is too much for my spare time.
>
> Then why not release that scanner? That effort could be shared. What's so
> secret about it? You have been asked several times already. Start sharing
> right now.

The Brainy OpenBSD page contains info about a lot of bugs already found. There is no secret needed to start writing diffs and pushing them.
Re: Warning on assembly of RdRand instructions
Michael McConville mmcconv1 at sccs.swarthmore.edu writes: https://www.hyperelliptic.org/tanja/vortraege/random.pdf made my day: “The way RDRAND is being used in kernels = 3.12.3 allows it to cancel out the other entropy. See extract buf().” “if I make RDRAND return [EDX] ^ 0x41414141, /dev/urandom output will be all ’A’.”
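
The quoted claim can be checked with a few lines of arithmetic: if the final output is the pool word XORed with a hardware word, and the hardware word is chosen as "pool word XOR constant", the pool word cancels out. The sketch below is a userland illustration of that identity only; mix() and malicious_hw() are hypothetical names and this is not the kernel code the slides refer to.

```c
#include <stdint.h>

/* Final mixing step: XOR the pool output with the hardware word. */
static uint32_t
mix(uint32_t pool_word, uint32_t hw_word)
{
	return (pool_word ^ hw_word);
}

/* A hostile source that can see the pool word it will be mixed with. */
static uint32_t
malicious_hw(uint32_t pool_word)
{
	return (pool_word ^ 0x41414141u);	/* 0x41 is ASCII 'A' */
}
```

Since p ^ (p ^ c) == c for any p, mix(p, malicious_hw(p)) is 0x41414141 regardless of the pool contents, which is four bytes of 'A'.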
Re: audio: recover after missed interrupts
Alexandre Ratchov alex at caoua.org writes:

> DPRINTFN(1, "%s: rec ptr wrapped, moving %d blocs\n",
> DPRINTFN(1, "%s: play ptr wrapped, moving %d blocs\n",

"blocs" in the above DPRINTFNs should be "blocks", I think.
Re: Update to /etc/services
Denis Fondras openbsd at ledeuns.net writes:

> krb524 /tcp # Kerberos 5-4

I would tweak the krb524 comment to be

# Kerberos 5 to 4

because this is how krb524 reads.
Re: Brainy: User-Triggerable Kernel Memory Leak in execve()
Ville Valkonen weezelding at gmail.com writes: On Jul 21, 2015 9:32 AM, Maxime Villard max at m00nbsd.net wrote: It is not the last bug Brainy has found, but it is the last one I report. I don't have time for that. Maxime Why such a dramatic tone? Because that famous thank you small people sounds more and more ridiculous (some says Goebels'ish), no?
Re: Brainy: User-Triggerable Kernel Memory Leak in execve()
sam sam at cmpct.info writes: How about you release the Brainy Code Scanner then? I have so many bugs; in fact, there are so many, I don't even have the time to report them! My scanner is so good! Or perhaps you should report 'just' the relatively important ones? Made my day. Searching for bugs is for brainy. Victims of propaganda don't even search archives.
Re: Use m_defrag in intel wireless drivers
Mark Kettenis mark.kettenis at xs4all.nl writes:

> Index: if_ipw.c
> 	/* too many fragments, linearize */
> Index: if_iwi.c
> 	/* too many fragments, linearize */
> Index: if_iwn.c
> 	/* Too many DMA segments, linearize mbuf. */
> Index: if_wpi.c
> 	/* Too many DMA segments, linearize mbuf. */

These comments can be homogenized. I'd choose the iwn/wpi variant.
Tcl/Tk entry in www/57.html
Hi tech@.

The "Tcl/Tk 8.5.16 and 8.6.2" line (in the "Some highlights" section) appears twice:

...
Tcl/Tk 8.5.16 and 8.6.2
TeX Live 2013
Tcl/Tk 8.5.16 and 8.6.2
...
Re: ntpd:support adjusting initial time = y2k36 on 32-bit time_t platforms
Brent Cook busterb at gmail.com writes:

> +	T4 += (uint64_t)tv.tv_sec + JAN_1970 + 1.0e-6 * tv.tv_usec;
<snip>
> +	return ((uint64_t)tv.tv_sec + JAN_1970 + 1.0e-6 * tv.tv_usec);
<snip>

Could gettime_from_timeval be used here instead of repeating the same expression?

T4 += gettime_from_timeval(...
return gettime_from_timeval(...
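
A rough sketch of what routing both call sites through one helper could look like. The helper body below simply wraps the expression from the quoted diff, including its uint64_t cast; the JAN_1970 value is the usual offset in seconds between the NTP epoch (1900) and the Unix epoch (1970). Whether the real helper in ntpd has exactly this signature and body is an assumption here.

```c
#include <stdint.h>
#include <sys/time.h>

/* Seconds between 1900-01-01 (NTP epoch) and 1970-01-01 (Unix epoch). */
#define JAN_1970	2208988800UL

/*
 * Sketch of a shared conversion helper: seconds since the NTP epoch as a
 * double. The cast keeps the addition in 64 bits on platforms where
 * time_t is 32 bits wide.
 */
static double
gettime_from_timeval(const struct timeval *tv)
{
	return ((uint64_t)tv->tv_sec + JAN_1970 + 1.0e-6 * tv->tv_usec);
}
```

Both quoted lines would then reduce to T4 += gettime_from_timeval(&tv) and return (gettime_from_timeval(&tv)), keeping the cast in one place.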
Re: Questions about 802.11n support
T. Jameson Little beatgammit at gmail.com writes:

> Well, I'm much more capable of fixing existing drivers to make it work well
> than building something from scratch, and I imagine the same is true for
> many developers, because you work on whatever affects you.

IMO, fixing existing drivers should take popularity into account.

I asked sthen@ some time ago (in early 2013) about 802.11 driver usage (according to dmesg logs), and he replied:

we already have information about chips from dmesglog. since may 2009:

  2 an
  2 malo
  2 urtwn
  4 atu
  4 zyd
  7 acx
  7 otus
  7 ural
 13 rsu
 13 uath
 16 ipw
 33 wi
 43 iwi
 44 run
 50 rum
 67 bwi
105 urtw
107 ral
114 wpi
171 ath
199 athn
547 iwn

(end of quote).

So, IMO, fixing Intel's drivers may be the preferred way to go because of higher usage and better quality/documentation.
Re: Questions about 802.11n support
On Thu, Mar 5, 2015 at 11:45 PM, Stefan Sperling s...@stsp.name wrote:
> On Thu, Mar 05, 2015 at 09:22:51PM +, Alexey Suslikov wrote:
>> T. Jameson Little beatgammit at gmail.com writes:
>>
>>> Well, I'm much more capable of fixing existing drivers to make it work well
>>> than building something from scratch, and I imagine the same is true for
>>> many developers, because you work on whatever affects you.
>>
>> IMO, fixing existing drivers should take popularity into account.
>>
>> I asked sthen@ some time ago (in early 2013) about 802.11 drivers usage
>> (according to dmesg logs), and he replied:
>>
>> we already have information about chips from dmesglog. since may 2009:
>>
>>   2 an
>>   2 malo
>>   2 urtwn
>>   4 atu
>>   4 zyd
>>   7 acx
>>   7 otus
>>   7 ural
>>  13 rsu
>>  13 uath
>>  16 ipw
>>  33 wi
>>  43 iwi
>>  44 run
>>  50 rum
>>  67 bwi
>> 105 urtw
>> 107 ral
>> 114 wpi
>> 171 ath
>> 199 athn
>> 547 iwn
>>
>> (end of quote).
>>
>> So, IMO, fixing Intel's drivers maybe be kinda preferred way to go because
>> of higher usage and better quality/documentation.
>
> This list doesn't count unsupported devices.
> It is skewed towards built-in devices, e.g. urtwn is quite common
> but it is at the bottom of this list.
>
> I think these numbers just mean that most laptop installs happen
> on thinkpads.

Yes. I understand. I have urtwn too, because the built-in

Ralink RT3290 rev 0x00 at pci2 dev 0 function 0 not configured

is not supported (I tried to hack on top of the Linux driver with no success).

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=a89534edaaa7008992b878680490e9b02a665563

My point was that development should start around something widespread, so people can test easily. This may be urtwn, iwn and iwm, for instance.
Re: Questions about 802.11n support
T. Jameson Little beatgammit at gmail.com writes: Since USB 2.0 has a maximum throughput of 480Mbit/s, anything higher than 300Mbit/s is not particularly important, and many consumer devices only support 150Mbit/s anyway. 72Mbit/s is completely fine for an initial implementation. I slightly disagree here because newer PCIe connected chips can do more.
Re: Brainy: User-Triggerable Kernel Memory Leak
Maxime Villard max at M00nBSD.net writes: 'lsa' being user-controllable, it is easy for a local (un)privileged user to cause the kernel to run out of memory and become unresponsive. OpenBSD 5.6/i386 is affected, and perhaps previous releases. compat_linux(8) says: The Linux compatibility feature is active for kernels compiled with the COMPAT_LINUX option and kern.emul.linux sysctl(8) enabled.
Re: elantech-v4 clickpad support
Ulf Brosziewski ulf.brosziewski at t-online.de writes: I have written two patches that provide these options (I'm using them on an Acer V5-131 netbook with OpenBSD 5.6/amd64, the clickpad hardware and firmware is identified as Elantech Clickpad, version 4, firmware 0x461f02). There is, however, an open question concerning wsconscomm. Should I try on bios0: ASUSTeK COMPUTER INC. X200CA pms0: Elantech Clickpad, version 4, firmware 0x361f01 or it is somewhat another hardware/firmware?
Re: Kernel does not compile with option LOCKDEBUG
Philip Guenther guenther at gmail.com writes:
> It's dead, Jim, let's bury LOCKDEBUG.

There is a define AZALIA_LOG_MP and accompanying code in sys/dev/pci/azalia.c which looks like a debug left-over. azalia(4) has been considered MP-safe for over a year now.
PATCH: azalia(4) invalid index crash
Hi tech@.

See http://marc.info/?l=openbsd-bugs&m=141867088702648&w=2

Reported by t...@openmailbox.org and John M. Molloy moll...@acm.org, and confirmed that this diff fixes the issue.

--- azalia.c.orig	Mon Dec 15 23:23:14 2014
+++ azalia.c	Wed Dec 17 13:42:41 2014
@@ -2348,14 +2348,23 @@
 				if (ret >= 0)
 					return ret;
 			}
-		} else {
-			index = w->connections[w->selected];
-			if (VALID_WIDGET_NID(index, this)) {
-				ret = azalia_codec_find_defdac(this, index,
-				    depth);
-				if (ret >= 0)
-					return ret;
-			}
+		/* 7.3.3.2 Connection Select Control
+		 * If an attempt is made to Set an index value greater than
+		 * the number of list entries (index is equal to or greater
+		 * than the Connection List Length property for the widget)
+		 * the behavior is not predictable.
+		 */
+
+		/* negative index values are wrong too */
+		} else if (w->selected >= 0 &&
+		    w->selected < sizeof(w->connections)) {
+			index = w->connections[w->selected];
+			if (VALID_WIDGET_NID(index, this)) {
+				ret = azalia_codec_find_defdac(this,
+				    index, depth);
+				if (ret >= 0)
+					return ret;
+			}
 		}
 	}

@@ -2393,14 +2402,23 @@
 				if (ret >= 0)
 					return ret;
 			}
-		} else {
-			index = w->connections[w->selected];
-			if (VALID_WIDGET_NID(index, this)) {
-				ret = azalia_codec_find_defadc_sub(this, node,
-				    index, depth);
-				if (ret >= 0)
-					return ret;
-			}
+		/* 7.3.3.2 Connection Select Control
+		 * If an attempt is made to Set an index value greater than
+		 * the number of list entries (index is equal to or greater
+		 * than the Connection List Length property for the widget)
+		 * the behavior is not predictable.
+		 */
+
+		/* negative index values are wrong too */
+		} else if (w->selected >= 0 &&
+		    w->selected < sizeof(w->connections)) {
+			index = w->connections[w->selected];
+			if (VALID_WIDGET_NID(index, this)) {
+				ret = azalia_codec_find_defadc_sub(this,
+				    node, index, depth);
+				if (ret >= 0)
+					return ret;
+			}
 		}
 	}
 	return -1;
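The range check this diff is after can be sketched in isolation. This is only an illustration, not the azalia(4) code: `struct widget`, `nconnections` and `selected_connection()` are hypothetical names, and the sketch assumes the widget records its connection list length as an element count. Keep in mind that in C, `sizeof` applied to an array gives a size in bytes (and applied to a pointer, the size of the pointer), so comparing the index against an explicit element count is usually the safer bound.

```c
#include <stddef.h>

/* Hypothetical stand-in for a codec widget; not the real azalia(4) struct. */
struct widget {
	int selected;		/* connection index reported by the hardware */
	int nconnections;	/* number of valid entries in connections[] */
	int connections[4];	/* connection list (node ids) */
};

/*
 * Return the selected connection, or -1 if the index is negative or
 * not below the connection list length.
 */
static int
selected_connection(const struct widget *w)
{
	if (w == NULL)
		return -1;
	if (w->selected < 0 || w->selected >= w->nconnections)
		return -1;
	return w->connections[w->selected];
}
```

A caller would treat -1 the same way the diff treats an invalid index: skip the lookup instead of reading past the list.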
PATCH: more of airport
Hi tech@. Fixing existing names, plus some new. --- airport.origWed Dec 17 14:16:04 2014 +++ airport Wed Dec 17 14:25:51 2014 @@ -328,6 +328,7 @@ CJB:Peelamedu, Coimbatore, India CJS:International Abraham Gonzalez, Ciudad Juarez, Chihuahua, Mexico CJU:Cheju, Cheju, South Korea +CKC:Cherkasy International, Cherkasy, Ukraine CKB:North Central West Virginia, Bridgeport, West Virginia, USA CKS:International / Brasilia Brazil, Carajas, Para, Brazil CKY:Conakry, Conakry, Guinea @@ -389,7 +390,7 @@ CVN:Clovis, New Mexico, USA CWA:Central Wisconsin, Wausau, Wisconsin, USA CWB:Afonso Pena, Curitiba, Parana, Brazil -CWC:Chernovtsy, Ukraine +CWC:Chernivtsi International, Chernivtsi, Ukraine CWT:Cowra, Cowra, New South Wales, Australia CXH:Vancouver Harbour, British Columbia, Canada CYB:Cayman Brac Island, Cayman Islands @@ -433,11 +434,11 @@ DMK:Don Mueang International, Bangkok, Thailand DMR:Dhamar, Yemen DND:Dundee, Scotland, United Kingdom -DNK:Dnepropetrovsk, Ukraine +DNK:Dnipropetrovs'k International, Dnipropetrovs'k, Ukraine DNM:Denham, Western Australia, Australia DNV:Vermilion County, Danville, Illinois, USA DOH:Doha, Qatar -DOK:Donetsk, Ukraine +DOK:Donets'k Sergey Prokofiev International, Donets'k, Ukraine DOL:Saint Gatien, Deauville, France DOM:Melville Hall, Dominica DPL:Dipolog, Dipolog, Philippines @@ -694,7 +695,7 @@ HRB:Harbin, China HRE:Harare, Zimbabwe HRG:Hurghada, Egypt -HRK:Krarkov, Ukraine +HRK:Kharkiv International, Kharkiv, Ukraine HRL:Harlingen, Texas, USA HRO:Boone County, Harrison, Arkansas, USA HSI:Hastings, Nebraska, USA @@ -721,7 +722,8 @@ IBZ:Ibiza, Spain ICT:Wichita Mid-Continent, Kansas, USA IDA:Idaho Falls, Idaho, USA -IEV:Zhulhany, Kiev, Ukraine +IEV:Kyiv Zhulyany International, Kyiv, Ukraine +IFO:Ivano-Frankivs'k International, Ivano-Frankivs'k, Ukraine IFP:Bullhead City, Arizona, USA IGA:Inagua, Bahamas IGM:Mohave County, Kingman, Arizona, USA @@ -825,7 +827,7 @@ KAL:Kaltag, Alaska, USA KAN:Aminu Kano International, Nigeria KAT:Kaitaia, 
New Zealand -KBP:Borispol, Kiev, Ukraine +KBP:Kyiv Borispil International, Kyiv, Ukraine KBR:Sultan Ismail Petra, Kota Bharu, Malaysia KCG:Fisheries, Chignik, Alaska, USA KCH:Kuching, Sarawak, Malaysia @@ -842,6 +844,7 @@ KER:Kerman, Iran KGC:Kingscote, South Australia, Australia KGD:Kaliningrad, Russia +KHE:Kherson International, Kherson, Ukraine KHH:Kaohsiung, Taiwan KHI:Karachi, Pakistan KHV:Novy, Khabarovsk, Russia @@ -900,6 +903,7 @@ KVB:Skovde, Sweden KWA:Kwajalein, Marshall Islands KWI:Kuwait International +KWG:Kryvyi Rih International, Dnipropetrovs'k, Ukraine KWL:Guilin, China KZI:Kozani, Greece KZN:Kazan, Russia @@ -1000,7 +1004,7 @@ LVK:Livermore, California, USA LWB:Greenbrier Valley, West Virginia, USA LWK:Tingwall, Shetland Islands /Shetland Isd, Scotland, United Kingdom -LWO:Snilow, Lvov, Ukraine +LWO:Lviv Danylo Halytskyi International, Lviv, Ukraine LWT:Lewistown Municipal, Montana, USA LWY:Lawas, Sarawak, Malaysia LXR:Luxor, Egypt @@ -1105,6 +1109,7 @@ MPB:Miami Public Seaplane Base, Florida, USA MPL:Frejorgues, Montpellier, France MPM:Maputo International, Mozambique +MPW:Mariupol International, Donets'k, Ukraine MQL:Mildura, Victoria, Australia MQN:Rossvoll, Mo I Rana, Norway MQP:Kruger Mpumalanga International, Nelspruit, South Africa @@ -1167,6 +1172,7 @@ NKG:Nanjing, China NLA:Ndola, Zambia NLD:Nuevo Laredo, Tamaulipas, Mexico +NLV:Mykolayiv International, Mykolayiv, Ukraine NNG:Nanning, China NOC:Rep Of Ireland, Connaught, Ireland NOU:Tontouta, Noumea, New Caledonia @@ -1191,7 +1197,7 @@ OAX:Xoxocotlan, Oaxaca, Oaxaca, Mexico OBO:Obihiro, Japan ODE:Odense, Denmark -ODS:Odessa Central, Ukraine +ODS:Odessa International, Odessa, Ukraine ODW:Oak Harbor, Washington, USA OER:Ornskoldsvik, Sweden OFK:Norfolk Karl Stefan Memorial, Nebraska, USA @@ -1238,6 +1244,7 @@ OWB:Owensboro, Kentucky, USA OWD:Memorial Code: Owd, Norwood, Massachusetts, USA OXR:Oxnard / Ventura, California, USA +OZH:Zaporizhya International, Zaporizhya, Ukraine OZZ:Ouarzazate, 
Morocco PAD:Paderborn, Germany PAH:Paducah, Kentucky, USA @@ -1436,6 +1443,7 @@ RUN:Roland Garros Airport, Reunion Island, France RUT:Rutland, Vermont, USA RWI:Wilson, Rocky Mount, North Carolina, USA +RWN:Rivne International, Rivne, Ukraine SAB:Saba Island, Netherlands Antilles SAF:Santa Fe, New Mexico, USA SAH:Sanaa International, Yemen @@ -1492,7 +1500,7 @@ SHV:Shreveport, Louisiana, USA SID:Amilcar Cabral International, Sal, Cape Verde SIN:Changi International, Singapore -SIP:Simferopol, Ukraine +SIP:Simferopol International, Crimea, Ukraine SIR:Sion, Switzerland SIT:Sitka, Alaska, USA SJC:San Jose International, California, USA @@ -1650,6 +1658,7 @@ TMS:Sao Tome International, Sao Tome and Principe TMW:Tamworth, New South Wales, Australia TNG:Boukhalef Souahel, Tangier, Morocco +TNL:Ternopil International, Ternopil, Ukraine TNR:Ivato, Antananarivo, Madagascar TOL:Toledo
Re: Binary code patching and paravirtualization
CVSROOT: /cvs Module name: src Changes by: s...@cvs.openbsd.org 2014/12/16 14:02:58 Modified files: sys/arch/amd64/amd64: identcpu.c sys/arch/amd64/include: specialreg.h Log message: Define and print HV cpuid flag. This is set by many hypervisors, including kvm, vmware, hyper-v. do they set HV flag only for amd64 guests? how about i386 ones?
Re: Binary code patching and paravirtualization
Stefan Fritsch sf at sfritsch.de writes:
> --- a/sys/arch/amd64/include/specialreg.h
> +++ b/sys/arch/amd64/include/specialreg.h
> @@ -158,6 +158,7 @@
>  #define CPUIDECX_AVX	0x10000000	/* Advanced Vector Extensions */
>  #define CPUIDECX_F16C	0x20000000	/* 16bit fp conversion */
>  #define CPUIDECX_RDRAND	0x40000000	/* RDRAND instruction */
> +#define CPUIDECX_HYPERV	0x80000000	/* Hypervisor present */

Is this flag standardized? Last time I tried to push this, there was an objection based on the "reserved for future use" status of this flag. See http://marc.info/?l=openbsd-bugs&m=136907278229145&w=2

If it is a standard nowadays, could CPUIDECX_HYPERV be committed as a separate chunk?

Cheers, Alexey
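For readers unfamiliar with the flag, here is a minimal sketch of the test it enables. It assumes the flag is bit 31 of ECX as returned by CPUID leaf 1, which is where hypervisors conventionally announce themselves; `HV_PRESENT_BIT` and `cpu_has_hv()` are hypothetical names for illustration, and obtaining the ECX value (via the CPUID instruction) is left out so the sketch stays portable.

```c
#include <stdint.h>

/* Assumed position of the "hypervisor present" flag: CPUID.1:ECX bit 31. */
#define HV_PRESENT_BIT	(UINT32_C(1) << 31)

/* Return 1 if the hypervisor-present bit is set in a leaf-1 ECX value. */
static int
cpu_has_hv(uint32_t ecx)
{
	return (ecx & HV_PRESENT_BIT) != 0;
}
```

Because the bit is reported by the (virtual) CPU itself, the same test would apply regardless of whether the guest kernel is amd64 or i386; whether a given hypervisor actually sets it is a separate question.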
amd64 intro(4) refs
Hello tech@.

I noticed isapnp(4) and eisa(4) refs in amd64 intro(4), while the amd64 kernel config includes neither isapnp nor eisa. Looks like a leftover from the i386 intro(4).

Cheers, Alexey
Re: amd64 intro(4) refs
Jason McIntyre jmc at kerhand.co.uk writes: On Tue, Dec 09, 2014 at 10:27:45PM +0200, Alexey Suslikov wrote: I noticed isapnp(4) and eisa(4) refs in amd64 intro(4) while amd64 kernel config doesn't do neither isapnp, nor eisa. those pages are not MD, so they display for all archs. that's normal for stuff supported by more than one arch. From what I see, intro(4) is MD. In LIST OF DEVICES, armish, for instance, says: iic(4) Inter IC (I2C) bus onewire(4) 1-Wire bus pci(4) introduction to PCI bus support usb(4) introduction to Universal Serial Bus support while amd64 says: cardbus(4) introduction to CardBus support eisa(4) introduction to EISA bus support iic(4) Inter IC (I2C) bus isa(4) introduction to ISA bus support isapnp(4) introduction to ISA Plug-and-Play support onewire(4) 1-Wire bus pci(4) introduction to PCI bus support pcmcia(4) introduction to PCMCIA (PC Card) support usb(4) introduction to Universal Serial Bus support
Re: LibreSSL: GOST ciphers implementation
Chris Cappuccio chris at nmedia.net writes: So, you're saying, he's really dmitry at svr.gov.ru, the source of Russian backdoors into technology worldwide!!! I guess the open-source ecosystem has been thoroughly poisoned! Putin is going to take us over. OpenBSD and Linux are ruined! Fuck, I'm switching to Windows 8. Not enough played with RSA government backdoors, you just said you trust another GOST (which stands for 'GOvernment STandard').
Re: LibreSSL: GOST ciphers implementation
Bob Beck beck at openbsd.org writes: 1) It can't mess up the code base for everyone. 2) Everyone should not need to eat the dog food 3) I try to convince myself that our grant means a half of a cruise missile doesn't get built (c)
Re: armv7: banana pi, Allwinner A20 board
SASANO Takayoshi uaa at mx5.nisiq.net writes: Try this[1] kernel and have a look if it has the same issue or not. Kernel did not started... U-Boot says checksum is ok, so maybe .umg file is not corrupted. When using OpenBSD 5.6 (RAMDISK-SUNXI) #3: Sun Aug 31 18:46:49 EDT 2014 could you drop into config (pass -c to boot) and try to disable ehci?
Re: run(4) firmware update; please test
Stefan Sperling stsp at openbsd.org writes:
> Are you able to try your run(4) device with FreeBSD-current (10 isn't
> new enough)? They claim to support your device and use the updated
> firmware.

Please take a look at my (unfinished) attempt to bring MediaTek/Ralink RT5370/RT5372 support to run(4).

http://marc.info/?l=openbsd-tech&m=138903287819764&w=2
Re: vlan tagging surgery
Henning Brauer lists-openbsdtech at bsws.de writes: I must admit I am getting tired of all these good proposals/ideas. don't you think we've gone thru this before? Look, I haven't called them good or bad. what you propose would require a custom vlan_output function which does nothing but setting the flag and then calls ether_output. what exactly is won with that? except making things less obvious? preparing for the highly likely case that something but a vlan interface (as in, IFT_L2VLAN) needs to add a L2VLAN ethernet header? (I understand you want code - not theoretical speculations). I assume, there is an input and output of a stack. And lot of (possible) encapsulation subsystems in the middle: vlan, vlan-in-vlan, ipsec, you name it. And if I understood your cksum plan right, being in the middle, given packet doesn't know its destiny, but different subsystems may assign tags so on the output packet may assemble itself right (by calling necessary methods) Given a number of subsystems, delayed processing (promise pattern variation, actually) is way to go, imo, because stack will have homogeneous approach for entire packet assembly logic. In terms of above pattern, right: vlan_output will only set a flag and call ether_output - this is what you already did with cksums.
Re: vlan tagging surgery
Henning Brauer lists-openbsdtech at bsws.de writes: And lot of (possible) encapsulation subsystems in the middle: vlan, vlan-in-vlan, ipsec, you name it. VLAN IS NOT AN ENCAPSULATION. Well, vlan(4) says: vlan, svlan - IEEE 802.1Q/1AD encapsulation/decapsulation pseudo-device Given a number of subsystems, delayed processing (promise pattern variation, actually) is way to go, imo, because stack will have homogeneous approach for entire packet assembly logic. you cannot delay this reasonably, it IS far down the road, basically right before sending the frame out. In terms of above pattern, right: vlan_output will only set a flag and call ether_output - this is what you already did with cksums. no, not even remotely. sigh. Functionally, no, - I understand your point. But I'm talking about *pattern* you used. Looking at what Martin is doing, imo, you guys trying to achieve a) concentrate all packet (re)assembly in one place to minimize memory operations (so you need to delay some things); b) put one lock in and one lock out (you also need to delay to be able to put one single block of code somewhere in the output). What I see, old (spaghetti) approach and new (delayed) approach are trying to coexist.
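The "set a flag now, do the work in the output path" pattern argued about in this thread can be sketched as follows. This is an illustration of the pattern only, not OpenBSD's network stack: `struct pkt`, `PKT_VLANTAG`, `vlan_output_sketch()` and `ether_output_sketch()` are hypothetical names, and the actual header insertion is reduced to setting a field.

```c
#include <stdint.h>

#define PKT_VLANTAG	0x01U	/* hypothetical "insert 802.1Q header" request */

/* Hypothetical packet header; not the real mbuf layout. */
struct pkt {
	uint32_t flags;		/* pending work recorded by earlier stages */
	uint16_t vlan_tag;	/* VLAN ID to insert, valid if PKT_VLANTAG set */
	int tagged;		/* set once the 802.1Q header has been added */
};

/* Output stage: the single place where deferred work is carried out. */
static int
ether_output_sketch(struct pkt *p)
{
	if (p->flags & PKT_VLANTAG) {
		p->tagged = 1;	/* stand-in for prepending the 802.1Q header */
		p->flags &= ~PKT_VLANTAG;
	}
	return 0;
}

/* vlan stage: only records the request, then hands the packet on. */
static int
vlan_output_sketch(struct pkt *p, uint16_t vid)
{
	p->vlan_tag = vid;
	p->flags |= PKT_VLANTAG;
	return ether_output_sketch(p);
}
```

The sketch shows both sides of the disagreement: the vlan stage becomes trivial, but the deferral is short, since the tag is applied immediately afterwards in the output stage.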
WIP: MediaTek/Ralink RT5370/RT5372
This is based on http://svnweb.freebsd.org/base?view=revision&revision=257955

For now, my DWA-140 rev B3 is able to

* attach to run(4) and correctly identify MAC address;
* load firmware on ifconfig up;
* blink LED (so I think something is going thru a radio);
* ifconfig down (LED stops blinking).

run0 at uhub3 port 1 D-Link 11n Adapter rev 2.00/1.01 addr 6
run0: MAC/BBP RT5392 (rev 0x0222), RF RT5372 (MIMO 2T2R), address fc:75:16:85:ae:80

$ ifconfig run0
run0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	lladdr fc:75:16:85:ae:80
	priority: 4
	groups: wlan
	media: IEEE802.11 autoselect (DS1)
	status: no network
	ieee80211: nwid

But ifconfig run0 scan gives nothing and adapter is unable to associate with AP if directly specified by nwid/wpakey. Any clue is welcome.

(The following diff requires usbdevs diff I have sent previously).

Index: sys/dev/ic/rt2860reg.h
===
RCS file: /cvs/src/sys/dev/ic/rt2860reg.h,v
retrieving revision 1.31
diff -u -p -r1.31 rt2860reg.h
--- sys/dev/ic/rt2860reg.h	26 Nov 2013 20:33:16 -0000	1.31
+++ sys/dev/ic/rt2860reg.h	6 Jan 2014 13:45:14 -0000
@@ -696,6 +696,7 @@

 /* possible flags for RT3020 RF register 1 */
 #define RT3070_RF_BLOCK	(1 << 0)
+#define RT3070_PLL_PD	(1 << 1)
 #define RT3070_RX0_PD	(1 << 2)
 #define RT3070_TX0_PD	(1 << 3)
 #define RT3070_RX1_PD	(1 << 4)
@@ -747,6 +748,15 @@

 #define RT3090_DEF_LNA	10

+/* Possible flags for RT5390 RF register 3. */
+#define RT5390_VCOCAL	(1 << 7)
+
+/* Possible flags for RT5390 RF register 38. */
+#define RT5390_RX_LO1	(1 << 5)
+
+/* Possible flags for RT5390 RF register 39. */
+#define RT5390_RX_LO2	(1 << 7)
+
 /* RT2860 TX descriptor */
 struct rt2860_txd {
 	uint32_t	sdp0;		/* Segment Data Pointer 0 */
@@ -880,17 +890,19 @@ struct rt2860_rxwi {
 #define RT2860_RF3	1
 #define RT2860_RF4	3

-#define RT2860_RF_2820	1	/* 2T3R */
-#define RT2860_RF_2850	2	/* dual-band 2T3R */
-#define RT2860_RF_2720	3	/* 1T2R */
-#define RT2860_RF_2750	4	/* dual-band 1T2R */
-#define RT3070_RF_3020	5	/* 1T1R */
-#define RT3070_RF_2020	6	/* b/g */
-#define RT3070_RF_3021	7	/* 1T2R */
-#define RT3070_RF_3022	8	/* 2T2R */
-#define RT3070_RF_3052	9	/* dual-band 2T2R */
-#define RT3070_RF_3320	11	/* 1T1R */
-#define RT3070_RF_3053	13	/* dual-band 3T3R */
+#define RT2860_RF_2820	0x0001	/* 2T3R */
+#define RT2860_RF_2850	0x0002	/* dual-band 2T3R */
+#define RT2860_RF_2720	0x0003	/* 1T2R */
+#define RT2860_RF_2750	0x0004	/* dual-band 1T2R */
+#define RT3070_RF_3020	0x0005	/* 1T1R */
+#define RT3070_RF_2020	0x0006	/* b/g */
+#define RT3070_RF_3021	0x0007	/* 1T2R */
+#define RT3070_RF_3022	0x0008	/* 2T2R */
+#define RT3070_RF_3052	0x0009	/* dual-band 2T2R */
+#define RT3070_RF_3320	0x000b	/* 1T1R */
+#define RT3070_RF_3053	0x000d	/* dual-band 3T3R */
+#define RT5390_RF_5370	0x5370	/* 1T1R */
+#define RT5390_RF_5372	0x5372	/* 2T2R */

 /* USB commands for RT2870 only */
 #define RT2870_RESET	1
@@ -1084,63 +1096,94 @@ static const struct rt2860_rate {
 	{ 105, 0x05 },	\
 	{ 106, 0x35 }

+#define RT5390_DEF_BBP	\
+	{  31, 0x08 },	\
+	{  65, 0x2c },	\
+	{  66, 0x38 },	\
+	{  68, 0x0b },	\
+	{  69, 0x0d },	\
+	{  70, 0x06 },	\
+	{  73, 0x13 },	\
+	{  75, 0x46 },	\
+	{  76, 0x28 },	\
+	{  77, 0x59 },	\
+	{  81, 0x37 },	\
+	{  82, 0x62 },	\
+	{  83, 0x7a },	\
+	{  84, 0x9a },	\
+	{  86, 0x38 },	\
+	{  91, 0x04 },	\
+	{  92, 0x02 },	\
+	{ 103, 0xc0 },	\
+	{ 104, 0x92 },	\
+	{ 105, 0x3c },	\
+	{ 106, 0x03 },	\
+	{ 128, 0x12 }
+
 /*
  * Default settings for RF registers; values derived from the reference driver.
  */
-#define RT2860_RF2850	\
-	{   1, 0x100bb3, 0x1301e1, 0x05a014, 0x001402 },	\
-	{   2, 0x100bb3, 0x1301e1, 0x05a014, 0x001407 },	\
-	{   3, 0x100bb3, 0x1301e2, 0x05a014, 0x001402 },	\
-	{   4, 0x100bb3, 0x1301e2, 0x05a014, 0x001407 },	\
-	{   5, 0x100bb3, 0x1301e3, 0x05a014, 0x001402 },	\
-	{   6, 0x100bb3, 0x1301e3, 0x05a014, 0x001407 },	\
-	{   7, 0x100bb3, 0x1301e4, 0x05a014, 0x001402 },	\
-	{   8, 0x100bb3, 0x1301e4, 0x05a014, 0x001407 },	\
-	{   9, 0x100bb3, 0x1301e5, 0x05a014, 0x001402 },	\
-	{  10, 0x100bb3, 0x1301e5, 0x05a014, 0x001407 },	\
-	{  11, 0x100bb3, 0x1301e6, 0x05a014, 0x001402 },	\
-	{  12, 0x100bb3, 0x1301e6, 0x05a014, 0x001407 },	\
-	{  13, 0x100bb3, 0x1301e7, 0x05a014, 0x001402 },	\
-	{  14, 0x100bb3, 0x1301e8, 0x05a014, 0x001404 },	\
-	{  36, 0x100bb3, 0x130266, 0x056014, 0x001408
Randomization from the bootblocks
Theo de Raadt deraadt at cvs.openbsd.org writes:
> This requires an upgrade of the bootblocks and at least /etc/rc (which
> saves an entropy file for future use). Some bootblocks will be able
> to use machine-dependent features to improve the entropy even further
> (for instance using random instructions or fast-running counters or
> such). As a result, the kernel can start using arc4random() exceedingly
> early on, even before interrupt entropy is collected. The
> randomization subsystem can hopefully become simpler due to this early
> entropy.. there is more work do here.

I have a question. Having no interrupt (and similar) entropy means less entropy. On the other hand, there is a lot of speculation that some hardware entropy sources are suspected (proven?) to be bad (or intentionally hijacked?). So the question is: does moving random number generation closer to the hardware pave the way to more predictable numbers?

Cheers, Alexey
pfsync(4) mangles prio in master/slave setup
Hi. This is on 5.4-stable. Trivial master/slave carp(4) setup. vlan(4) is to make picture clear wrt prio. Test 1 (without using match). pf.conf (BOX1 and BOX2). ext_if=vlan101 dmz_if=vlan10 pf_sync=vlan50 block log all pass quick on $pf_sync proto pfsync keep state (no-sync) set prio 7 pass quick on { $ext_if, $dmz_if } proto carp keep state (no-sync) pass quick on $dmz_if inet proto icmp all icmp-type echoreq set prio 5 pass quick on $dmz_if pass out quick on $ext_if inet proto icmp all icmp-type echoreq set prio 5 pass out quick on $ext_if BOX1 is Master, BOX2 is Slave. BOX1: 00:07:36.108948 802.1Q vid 10 pri 3 X.X.185.145 X.X.36.14: icmp: echo request 00:07:36.109281 802.1Q vid 101 pri 5 X.X.185.145 X.X.36.14: icmp: echo request 00:07:36.110013 802.1Q vid 101 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply 00:07:36.110030 802.1Q vid 10 pri 5 X.X.36.14 X.X.185.145: icmp: echo reply BOX1 is Slave, BOX2 is Master. BOX2: 00:12:43.981979 802.1Q vid 10 pri 3 X.X.185.145 X.X.36.14: icmp: echo request 00:12:43.982013 802.1Q vid 101 pri 0 X.X.185.145 X.X.36.14: icmp: echo request 00:12:43.982693 802.1Q vid 101 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply 00:12:43.982713 802.1Q vid 10 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply Test 2 (using match). pf.conf (BOX1 and BOX2). ext_if=vlan101 dmz_if=vlan10 pf_sync=vlan50 block log all match quick on { $ext_if, $dmz_if } inet proto icmp all icmp-type echoreq set prio 5 pass quick on $pf_sync proto pfsync keep state (no-sync) set prio 7 pass quick on { $ext_if, $dmz_if } proto carp keep state (no-sync) pass quick on $dmz_if pass out quick on $ext_if BOX1 is Master, BOX2 is Slave. BOX1: 00:27:47.442820 802.1Q vid 10 pri 3 X.X.185.145 X.X.36.14: icmp: echo request 00:27:47.442839 802.1Q vid 101 pri 5 X.X.185.145 X.X.36.14: icmp: echo request 00:27:48.468709 802.1Q vid 101 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply 00:27:47.443523 802.1Q vid 10 pri 5 X.X.36.14 X.X.185.145: icmp: echo reply BOX1 is Slave, BOX2 is Master. 
BOX2: 00:30:35.317329 802.1Q vid 10 pri 3 X.X.185.145 X.X.36.14: icmp: echo request 00:30:35.317354 802.1Q vid 101 pri 0 X.X.185.145 X.X.36.14: icmp: echo request 00:30:35.318065 802.1Q vid 101 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply 00:30:35.318084 802.1Q vid 10 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply Maybe ICMP is not a sort of traffic which makes difference, but think about TCP ACKs are prioritized. Switching to Slave in production setup makes things *REALLY* bad. Should I configure something, or this is an issue? (Speaking of pfsync code, I'm unable to find where prio is set inside pfsync_state_import). Thanks, Alexey
Re: pfsync(4) mangles prio in master/slave setup
On Wed, Nov 20, 2013 at 1:32 PM, Mike Belopuhov m...@belopuhov.com wrote: could you please add more description to this report since it's very hard to follow and interpret your mail. basically, when setup switches to slave, packets (matching given state) have wrong prio set (wrong means they were right when state was created on master). I will be glad to provide more information/tests/etc - just say what is needed. On 20 November 2013 12:11, Alexey Suslikov alexey.susli...@gmail.com wrote: Hi. This is on 5.4-stable. Trivial master/slave carp(4) setup. vlan(4) is to make picture clear wrt prio. Test 1 (without using match). pf.conf (BOX1 and BOX2). ext_if=vlan101 dmz_if=vlan10 pf_sync=vlan50 block log all pass quick on $pf_sync proto pfsync keep state (no-sync) set prio 7 pass quick on { $ext_if, $dmz_if } proto carp keep state (no-sync) pass quick on $dmz_if inet proto icmp all icmp-type echoreq set prio 5 pass quick on $dmz_if pass out quick on $ext_if inet proto icmp all icmp-type echoreq set prio 5 pass out quick on $ext_if BOX1 is Master, BOX2 is Slave. BOX1: 00:07:36.108948 802.1Q vid 10 pri 3 X.X.185.145 X.X.36.14: icmp: echo request 00:07:36.109281 802.1Q vid 101 pri 5 X.X.185.145 X.X.36.14: icmp: echo request 00:07:36.110013 802.1Q vid 101 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply 00:07:36.110030 802.1Q vid 10 pri 5 X.X.36.14 X.X.185.145: icmp: echo reply BOX1 is Slave, BOX2 is Master. BOX2: 00:12:43.981979 802.1Q vid 10 pri 3 X.X.185.145 X.X.36.14: icmp: echo request 00:12:43.982013 802.1Q vid 101 pri 0 X.X.185.145 X.X.36.14: icmp: echo request 00:12:43.982693 802.1Q vid 101 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply 00:12:43.982713 802.1Q vid 10 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply Test 2 (using match). pf.conf (BOX1 and BOX2). 
ext_if=vlan101 dmz_if=vlan10 pf_sync=vlan50 block log all match quick on { $ext_if, $dmz_if } inet proto icmp all icmp-type echoreq set prio 5 pass quick on $pf_sync proto pfsync keep state (no-sync) set prio 7 pass quick on { $ext_if, $dmz_if } proto carp keep state (no-sync) pass quick on $dmz_if pass out quick on $ext_if BOX1 is Master, BOX2 is Slave. BOX1: 00:27:47.442820 802.1Q vid 10 pri 3 X.X.185.145 X.X.36.14: icmp: echo request 00:27:47.442839 802.1Q vid 101 pri 5 X.X.185.145 X.X.36.14: icmp: echo request 00:27:48.468709 802.1Q vid 101 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply 00:27:47.443523 802.1Q vid 10 pri 5 X.X.36.14 X.X.185.145: icmp: echo reply BOX1 is Slave, BOX2 is Master. BOX2: 00:30:35.317329 802.1Q vid 10 pri 3 X.X.185.145 X.X.36.14: icmp: echo request 00:30:35.317354 802.1Q vid 101 pri 0 X.X.185.145 X.X.36.14: icmp: echo request 00:30:35.318065 802.1Q vid 101 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply 00:30:35.318084 802.1Q vid 10 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply Maybe ICMP is not a sort of traffic which makes difference, but think about TCP ACKs are prioritized. Switching to Slave in production setup makes things *REALLY* bad. Should I configure something, or this is an issue? (Speaking of pfsync code, I'm unable to find where prio is set inside pfsync_state_import). Thanks, Alexey
Re: pfsync(4) mangles prio in master/slave setup
On Wed, Nov 20, 2013 at 1:38 PM, Alexey Suslikov alexey.susli...@gmail.com wrote: On Wed, Nov 20, 2013 at 1:32 PM, Mike Belopuhov m...@belopuhov.com wrote: could you please add more description to this report since it's very hard to follow and interpret your mail. basically, when setup switches to slave, packets (matching given state) have wrong prio set (wrong means they were right when state was created on master). I will be glad to provide more information/tests/etc - just say what is needed. On 20 November 2013 12:11, Alexey Suslikov alexey.susli...@gmail.com wrote: Hi. This is on 5.4-stable. Trivial master/slave carp(4) setup. vlan(4) is to make picture clear wrt prio. Test 1 (without using match). pf.conf (BOX1 and BOX2). ext_if=vlan101 dmz_if=vlan10 pf_sync=vlan50 block log all pass quick on $pf_sync proto pfsync keep state (no-sync) set prio 7 pass quick on { $ext_if, $dmz_if } proto carp keep state (no-sync) pass quick on $dmz_if inet proto icmp all icmp-type echoreq set prio 5 pass quick on $dmz_if pass out quick on $ext_if inet proto icmp all icmp-type echoreq set prio 5 pass out quick on $ext_if BOX1 is Master, BOX2 is Slave. BOX1: 00:07:36.108948 802.1Q vid 10 pri 3 X.X.185.145 X.X.36.14: icmp: echo request 00:07:36.109281 802.1Q vid 101 pri 5 X.X.185.145 X.X.36.14: icmp: echo request 00:07:36.110013 802.1Q vid 101 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply 00:07:36.110030 802.1Q vid 10 pri 5 X.X.36.14 X.X.185.145: icmp: echo reply BOX1 is Slave, BOX2 is Master. 
BOX2: 00:12:43.981979 802.1Q vid 10 pri 3 X.X.185.145 X.X.36.14: icmp: echo request 00:12:43.982013 802.1Q vid 101 pri 0 X.X.185.145 X.X.36.14: icmp: echo request 00:12:43.982693 802.1Q vid 101 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply 00:12:43.982713 802.1Q vid 10 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply While on Slave, having all zeroes prio in output path (echo request in vlan101 and reply in vlan10), imo, indicates a state being crafted by pfsync_state_import without a prio took in account. In contrast to set_tos, min_ttl and such, pf_state_export too isn't doing anything about prio, so I think prio neither exported to pfsync packet, nor imported from. Test 2 (using match). pf.conf (BOX1 and BOX2). ext_if=vlan101 dmz_if=vlan10 pf_sync=vlan50 block log all match quick on { $ext_if, $dmz_if } inet proto icmp all icmp-type echoreq set prio 5 pass quick on $pf_sync proto pfsync keep state (no-sync) set prio 7 pass quick on { $ext_if, $dmz_if } proto carp keep state (no-sync) pass quick on $dmz_if pass out quick on $ext_if BOX1 is Master, BOX2 is Slave. BOX1: 00:27:47.442820 802.1Q vid 10 pri 3 X.X.185.145 X.X.36.14: icmp: echo request 00:27:47.442839 802.1Q vid 101 pri 5 X.X.185.145 X.X.36.14: icmp: echo request 00:27:48.468709 802.1Q vid 101 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply 00:27:47.443523 802.1Q vid 10 pri 5 X.X.36.14 X.X.185.145: icmp: echo reply BOX1 is Slave, BOX2 is Master. BOX2: 00:30:35.317329 802.1Q vid 10 pri 3 X.X.185.145 X.X.36.14: icmp: echo request 00:30:35.317354 802.1Q vid 101 pri 0 X.X.185.145 X.X.36.14: icmp: echo request 00:30:35.318065 802.1Q vid 101 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply 00:30:35.318084 802.1Q vid 10 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply Maybe ICMP is not a sort of traffic which makes difference, but think about TCP ACKs are prioritized. Switching to Slave in production setup makes things *REALLY* bad. Should I configure something, or this is an issue? 
(Speaking of pfsync code, I'm unable to find where prio is set inside pfsync_state_import). Thanks, Alexey
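The suspicion voiced in this message, that prio is neither exported into the pfsync packet nor imported from it, can be illustrated with a small model. This sketch does not reproduce the pfsync(4) code or its wire format: `struct state`, `struct wire_state`, `state_export()` and `state_import()` are hypothetical, reduced to a few fields, and exist only to show how a field missing from the wire format ends up as 0 on the peer.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily reduced state; not the real struct pf_state. */
struct state {
	uint8_t set_tos;
	uint8_t min_ttl;
	uint8_t set_prio[2];
};

/* Hypothetical wire format that has no room for prio. */
struct wire_state {
	uint8_t set_tos;
	uint8_t min_ttl;
};

/* Master side: copy what the wire format can carry. */
static void
state_export(struct wire_state *ws, const struct state *st)
{
	ws->set_tos = st->set_tos;
	ws->min_ttl = st->min_ttl;
	/* set_prio is not copied: the wire format cannot carry it */
}

/* Slave side: rebuild a state from the wire format. */
static void
state_import(struct state *st, const struct wire_state *ws)
{
	memset(st, 0, sizeof(*st));
	st->set_tos = ws->set_tos;
	st->min_ttl = ws->min_ttl;
	/* set_prio stays 0 on the importing peer */
}
```

In this model a state created with prio 5 on the master arrives with prio 0 on the slave, which would match the "pri 0" frames seen on the output path after failover, if the real code behaves the same way.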
Re: pfsync(4) mangles prio in master/slave setup
On Wed, Nov 20, 2013 at 2:15 PM, Florian Obser flor...@openbsd.org wrote: On Wed, Nov 20, 2013 at 01:38:11PM +0200, Alexey Suslikov wrote: On Wed, Nov 20, 2013 at 1:32 PM, Mike Belopuhov m...@belopuhov.com wrote: could you please add more description to this report since it's very hard to follow and interpret your mail. basically, when setup switches to slave, packets (matching given state) have wrong prio set (wrong means they were right when state was created on master). I will be glad to provide more information/tests/etc - just say what is needed. Do you have the same ruleset checksum on both machines? check with pfctl -vs info | fgrep Checksum yes. checksums are same. On 20 November 2013 12:11, Alexey Suslikov alexey.susli...@gmail.com wrote: Hi. This is on 5.4-stable. Trivial master/slave carp(4) setup. vlan(4) is to make picture clear wrt prio. Test 1 (without using match). pf.conf (BOX1 and BOX2). ext_if=vlan101 dmz_if=vlan10 pf_sync=vlan50 block log all pass quick on $pf_sync proto pfsync keep state (no-sync) set prio 7 pass quick on { $ext_if, $dmz_if } proto carp keep state (no-sync) pass quick on $dmz_if inet proto icmp all icmp-type echoreq set prio 5 pass quick on $dmz_if pass out quick on $ext_if inet proto icmp all icmp-type echoreq set prio 5 pass out quick on $ext_if BOX1 is Master, BOX2 is Slave. BOX1: 00:07:36.108948 802.1Q vid 10 pri 3 X.X.185.145 X.X.36.14: icmp: echo request 00:07:36.109281 802.1Q vid 101 pri 5 X.X.185.145 X.X.36.14: icmp: echo request 00:07:36.110013 802.1Q vid 101 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply 00:07:36.110030 802.1Q vid 10 pri 5 X.X.36.14 X.X.185.145: icmp: echo reply BOX1 is Slave, BOX2 is Master. 
BOX2: 00:12:43.981979 802.1Q vid 10 pri 3 X.X.185.145 X.X.36.14: icmp: echo request 00:12:43.982013 802.1Q vid 101 pri 0 X.X.185.145 X.X.36.14: icmp: echo request 00:12:43.982693 802.1Q vid 101 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply 00:12:43.982713 802.1Q vid 10 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply Test 2 (using match). pf.conf (BOX1 and BOX2). ext_if=vlan101 dmz_if=vlan10 pf_sync=vlan50 block log all match quick on { $ext_if, $dmz_if } inet proto icmp all icmp-type echoreq set prio 5 pass quick on $pf_sync proto pfsync keep state (no-sync) set prio 7 pass quick on { $ext_if, $dmz_if } proto carp keep state (no-sync) pass quick on $dmz_if pass out quick on $ext_if BOX1 is Master, BOX2 is Slave. BOX1: 00:27:47.442820 802.1Q vid 10 pri 3 X.X.185.145 X.X.36.14: icmp: echo request 00:27:47.442839 802.1Q vid 101 pri 5 X.X.185.145 X.X.36.14: icmp: echo request 00:27:48.468709 802.1Q vid 101 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply 00:27:47.443523 802.1Q vid 10 pri 5 X.X.36.14 X.X.185.145: icmp: echo reply BOX1 is Slave, BOX2 is Master. BOX2: 00:30:35.317329 802.1Q vid 10 pri 3 X.X.185.145 X.X.36.14: icmp: echo request 00:30:35.317354 802.1Q vid 101 pri 0 X.X.185.145 X.X.36.14: icmp: echo request 00:30:35.318065 802.1Q vid 101 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply 00:30:35.318084 802.1Q vid 10 pri 0 X.X.36.14 X.X.185.145: icmp: echo reply Maybe ICMP is not a sort of traffic which makes difference, but think about TCP ACKs are prioritized. Switching to Slave in production setup makes things *REALLY* bad. Should I configure something, or this is an issue? (Speaking of pfsync code, I'm unable to find where prio is set inside pfsync_state_import). Thanks, Alexey -- I'm not entirely sure you are real.
Re: Unexpected match set prio behaviour
On Mon, Nov 18, 2013 at 3:03 AM, Alexander Bluhm <alexander.bl...@gmx.net> wrote:
> On Thu, Nov 14, 2013 at 12:03:21AM +0200, Alexey Suslikov wrote:
>> This is on 5.4-stable. vlan is only used to see what resulting prio is.
>>
>> #match on { $int_if } inet proto icmp all icmp-type echoreq set prio 5
>> pass quick on { $ext_if, $int_if }
>
> Can you test whether this diff matches your expected behaviour?
> Please try various combinations of pass and match rules.
>
> bluhm
>
> Index: net/pf.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/net/pf.c,v
> retrieving revision 1.861
> diff -u -p -r1.861 pf.c
> --- net/pf.c 16 Nov 2013 00:36:01 - 1.861
> +++ net/pf.c 18 Nov 2013 00:56:55 -
> @@ -3110,8 +3110,10 @@ pf_rule_to_actions(struct pf_rule *r, st
>  		a->max_mss = r->max_mss;
>  	a->flags |= (r->scrub_flags & (PFSTATE_NODF|PFSTATE_RANDOMID|
>  	    PFSTATE_SETTOS|PFSTATE_SCRUB_TCP|PFSTATE_SETPRIO));
> -	a->set_prio[0] = r->set_prio[0];
> -	a->set_prio[1] = r->set_prio[1];
> +	if (r->scrub_flags & PFSTATE_SETPRIO) {
> +		a->set_prio[0] = r->set_prio[0];
> +		a->set_prio[1] = r->set_prio[1];
> +	}
>  }
>
>  #define PF_TEST_ATTRIB(t, a) \

well, it seems like now I have the expected results, at least for the
following test cases. please tell if you need more.

for the record, the issue in question was discovered by Roman Kravchuk,
I just assisted with analysis and reporting.
Test 1 (default prio):

# cat /etc/pf.conf
ext_if=em0
int_if=vlan2525
set skip on { lo enc0 em1 }
block log all
#match on { $int_if } inet proto icmp all icmp-type echoreq set prio 6
#match on { $int_if } inet proto udp to port domain set prio 5
#match on { $int_if } inet proto tcp set prio (2, 4)
pass quick on { $ext_if, $int_if }

ICMP
12:45:57.293179 802.1Q vid 2525 pri 3 192.168.100.1 > 192.168.100.2: icmp: echo request
12:45:57.293491 802.1Q vid 2525 pri 3 192.168.100.2 > 192.168.100.1: icmp: echo reply

TCP
12:46:39.953468 802.1Q vid 2525 pri 3 192.168.100.1.17637 > 192.168.100.2.80: S 370622106:370622106(0) win 16384 mss 1460,nop,nop,sackOK,nop,wscale 3,nop,nop,timestamp 1183962946 0 (DF)
12:46:39.953944 802.1Q vid 2525 pri 3 192.168.100.2.80 > 192.168.100.1.17637: S 3464733189:3464733189(0) ack 370622107 win 16384 mss 1460,nop,nop,sackOK,nop,wscale 3,nop,nop,timestamp 448817884 1183962946 (DF)
12:46:39.954024 802.1Q vid 2525 pri 3 192.168.100.1.17637 > 192.168.100.2.80: . ack 1 win 2048 nop,nop,timestamp 1183962946 448817884 (DF)
12:46:39.963421 802.1Q vid 2525 pri 3 192.168.100.1.17637 > 192.168.100.2.80: P 1:230(229) ack 1 win 2048 nop,nop,timestamp 1183962946 448817884 (DF)
12:46:39.970068 802.1Q vid 2525 pri 3 192.168.100.2.80 > 192.168.100.1.17637: . 1:1449(1448) ack 230 win 2172 nop,nop,timestamp 448817884 1183962946 (DF)
12:46:39.970095 802.1Q vid 2525 pri 3 192.168.100.2.80 > 192.168.100.1.17637: P 1449:2516(1067) ack 230 win 2172 nop,nop,timestamp 448817884 1183962946 (DF)
12:46:39.970172 802.1Q vid 2525 pri 3 192.168.100.1.17637 > 192.168.100.2.80: . ack 2516 win 1733 nop,nop,timestamp 1183962946 448817884 (DF)
12:46:39.970214 802.1Q vid 2525 pri 3 192.168.100.2.80 > 192.168.100.1.17637: F 2516:2516(0) ack 230 win 2172 nop,nop,timestamp 448817884 1183962946 (DF)
12:46:39.970280 802.1Q vid 2525 pri 3 192.168.100.1.17637 > 192.168.100.2.80: . ack 2517 win 1733 nop,nop,timestamp 1183962946 448817884 (DF)
12:46:39.993600 802.1Q vid 2525 pri 3 192.168.100.1.17637 > 192.168.100.2.80: F 230:230(0) ack 2517 win 2048 nop,nop,timestamp 1183962946 448817884 (DF)
12:46:39.993927 802.1Q vid 2525 pri 3 192.168.100.2.80 > 192.168.100.1.17637: . ack 231 win 2172 nop,nop,timestamp 448817884 1183962946 (DF)

UDP
12:47:58.298665 802.1Q vid 2525 pri 3 192.168.100.1.39295 > 192.168.100.2.53: 36561+ A? i.ua. (22)
12:47:58.552804 802.1Q vid 2525 pri 3 192.168.100.2.53 > 192.168.100.1.39295: 36561 1/2/0 A 91.198.36.14 (74)

Test 2 (match takes care of prio):

# cat /etc/pf.conf
ext_if=em0
int_if=vlan2525
set skip on { lo enc0 em1 }
block log all
match on { $int_if } inet proto icmp all icmp-type echoreq set prio 6
match on { $int_if } inet proto udp to port domain set prio 5
match on { $int_if } inet proto tcp set prio (2, 4)
pass quick on { $ext_if, $int_if }

ICMP
12:52:44.783107 802.1Q vid 2525 pri 6 192.168.100.1 > 192.168.100.2: icmp: echo request
12:52:44.783516 802.1Q vid 2525 pri 6 192.168.100.2 > 192.168.100.1: icmp: echo reply

TCP
12:53:28.007629 802.1Q vid 2525 pri 2 192.168.100.1.49012 > 192.168.100.2.80: S 2694025614:2694025614(0) win 16384 mss 1460,nop,nop,sackOK,nop,wscale 3,nop,nop,timestamp 80976101 0 (DF)
12:53:28.007915 802.1Q vid 2525 pri 3 192.168.100.2.80 > 192.168.100.1.49012: S 704605823:704605823(0) ack 2694025615 win 16384 mss 1460,nop,nop,sackOK,nop,wscale 3,nop,nop,timestamp 281624921 80976101 (DF)
12:53:28.007990 802.1Q vid 2525 pri 4 192.168.100.1.49012 > 192.168.100.2.80: . ack 1 win 2048 nop,nop,timestamp 80976101
Unexpected match set prio behaviour
Hi tech@.

This is on 5.4-stable. vlan is only used to see what resulting prio is.

The ruleset:
---
ext_if=em0
int_if=vlan2525
set skip on { lo enc0 em1 }
block log all
#match on { $int_if } inet proto icmp all icmp-type echoreq set prio 5
pass quick on { $ext_if, $int_if }
---

The vlan:
---
vlan2525: flags=28843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,NOINET6> mtu 1500
        lladdr 00:1a:4a:a8:0a:8c
        description: LAN
        priority: 0
        vlan: 2525 parent interface: em1
        groups: vlan
        status: active
        inet 192.168.100.1 netmask 0xff00 broadcast 192.168.100.255
---

Pinging 192.168.100.2 (which is behind vlan2525) gives expected result:

23:51:02.154928 802.1Q vid 2525 pri 3 192.168.100.1 > 192.168.100.2: icmp: echo request
23:51:02.155313 802.1Q vid 2525 pri 3 192.168.100.2 > 192.168.100.1: icmp: echo reply

prio is set to 3 according to documentation.

Now, after I uncomment match rule and ping 192.168.100.2, the result is:

23:54:02.865267 802.1Q vid 2525 pri 0 192.168.100.1 > 192.168.100.2: icmp: echo request
23:54:02.865485 802.1Q vid 2525 pri 0 192.168.100.2 > 192.168.100.1: icmp: echo reply

prio 0 is somewhat unexpected. Am I doing something wrong?

Cheers,
Alexey
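
For readers following the thread, here is a minimal userland sketch of what the pf.c diff posted in reply to this report changes. It is not the kernel code: `struct rule`, `struct actions`, `SETPRIO` and both functions are simplified stand-ins (assumptions) for `struct pf_rule`, `struct pf_rule_actions`, `PFSTATE_SETPRIO` and `pf_rule_to_actions()`, and it assumes a rule without `set prio` carries zeroes in its prio fields. Under those assumptions it shows how a plain `pass` rule evaluated after a `match ... set prio 5` rule could leave the flag set but reset the value to 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SETPRIO 0x0200	/* stand-in for PFSTATE_SETPRIO; the value is arbitrary */

struct rule {			/* simplified stand-in for struct pf_rule */
	uint16_t scrub_flags;
	uint8_t  set_prio[2];
};

struct actions {		/* simplified stand-in for struct pf_rule_actions */
	uint16_t flags;
	uint8_t  set_prio[2];
};

/* Before the diff: prio values are copied from every evaluated rule. */
void
rule_to_actions_old(const struct rule *r, struct actions *a)
{
	a->flags |= (r->scrub_flags & SETPRIO);
	a->set_prio[0] = r->set_prio[0];
	a->set_prio[1] = r->set_prio[1];
}

/* After the diff: copy only if this rule itself carries "set prio". */
void
rule_to_actions_new(const struct rule *r, struct actions *a)
{
	a->flags |= (r->scrub_flags & SETPRIO);
	if (r->scrub_flags & SETPRIO) {
		a->set_prio[0] = r->set_prio[0];
		a->set_prio[1] = r->set_prio[1];
	}
}

/* Evaluate "match ... set prio 5" followed by a plain "pass quick". */
struct actions
eval_match_then_pass(void (*fn)(const struct rule *, struct actions *))
{
	const struct rule match = { SETPRIO, { 5, 5 } };
	const struct rule pass = { 0, { 0, 0 } };	/* assumed: unset prio is 0 */
	struct actions a = { 0, { 0, 0 } };

	assert(fn != NULL);
	fn(&match, &a);
	fn(&pass, &a);
	return a;
}
```

With the old function the accumulated actions end up with the flag set and prio 0, which would match the "pri 0" seen in the capture; with the new one the value 5 from the match rule survives.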
Re: 2 x em(4) and 2 x bnx(4) trunk(4)s behave differently
On Mon, Nov 11, 2013 at 11:42 AM, Stuart Henderson <s...@spacehopper.org> wrote:
> master on em0/em1/bnx0 is nothing to do with trunk, it is about
> the gigabit ethernet clocking source.

ok, but it is obvious: documentation is unclear (silent) about that.

> lacp hashing policy is the same as for loadbalance, see the manpage
> and confirm in trunk_hashmbuf().

I see different inbound packet distribution on trunk on-top of em(4)s
and on trunk on top of bnx(4)s - that's the real problem.
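
As a rough illustration of the hashing remark quoted above (a sketch only, not the code in trunk_hashmbuf()): trunk(4) describes loadbalance as hashing the Ethernet source and destination address and, if available, the VLAN tag and the IP source and destination address. The struct, the function names and the FNV-1a hash below are assumptions chosen for the example; the kernel uses its own hash and key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct flow {			/* header fields named in the manpage */
	uint8_t  src_mac[6];
	uint8_t  dst_mac[6];
	uint16_t vlan;		/* 0 if the frame is untagged */
	uint32_t src_ip;
	uint32_t dst_ip;
};

/* FNV-1a over a byte buffer; a stand-in, not the kernel's hash. */
uint32_t
fnv1a(uint32_t h, const void *buf, size_t len)
{
	const uint8_t *p = buf;

	while (len-- > 0) {
		h ^= *p++;
		h *= 16777619u;
	}
	return h;
}

/* Pick an outbound port index in [0, nports) for this flow. */
unsigned int
pick_port(const struct flow *f, unsigned int nports)
{
	uint32_t h = 2166136261u;	/* FNV offset basis */

	assert(f != NULL && nports > 0);
	/* Hash field by field so struct padding never enters the hash. */
	h = fnv1a(h, f->src_mac, sizeof(f->src_mac));
	h = fnv1a(h, f->dst_mac, sizeof(f->dst_mac));
	h = fnv1a(h, &f->vlan, sizeof(f->vlan));
	h = fnv1a(h, &f->src_ip, sizeof(f->src_ip));
	h = fnv1a(h, &f->dst_ip, sizeof(f->dst_ip));
	return h % nports;
}
```

The point of the sketch is that the choice is per flow: frames with the same addresses always map to the same port, so a test with a single peer puts all outbound traffic on one port. Inbound distribution is not covered by this at all, since the sender makes that choice.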
Re: 2 x em(4) and 2 x bnx(4) trunk(4)s behave differently
On Mon, Nov 11, 2013 at 12:19 PM, Janne Johansson <icepic...@gmail.com> wrote:
> I'm not sure if I am misunderstanding your direction of inbound, but
> that would be an effect of what the switch does, would it not?
> If the switch isn't configured for LACP correctly, then it would send
> the traffic to one of them, only.

again, consider the following output

IFACE   STATE  DESC  IPKTS  IBYTES  IERRS  OPKTS  OBYTES  OERRS  COLLS
bnx0    up:U          2873   2956K      0      2     977      0      0
bnx1    up:U             5     360      0   3119   2604K      0      0
trunk0  up:U          2878   2956K      0   3121   2605K      0      0

(inbound is distributed via single interface, outbound - via 2nd
interface in trunk)

IFACE   STATE  DESC  IPKTS  IBYTES  IERRS  OPKTS  OBYTES  OERRS  COLLS
em0     up:U          2711   2859K      0   5593   5222K      0      0
em1     up:U          2867   2343K      0     10    3226      0      0
trunk0  up:U          5578   5202K      0   5603   5225K      0      0

(inbound is distributed via both interfaces, outbound - via 1st
interface in trunk)

I'm less worried about outbound, however it is interesting why em(4)
setup uses first interface, but bnx(4) setup uses second.

by 1st and 2nd I mean an order of addition inside hostname.if

$ cat /etc/hostname.trunk0
trunkproto lacp trunkport bnx0 trunkport bnx1 up -inet6

$ cat /etc/hostname.trunk0
trunkproto lacp trunkport em0 trunkport em1 up -inet6

on switch itself, both trunks have no visible difference in
configuration.

> 2013/11/11 Alexey Suslikov <alexey.susli...@gmail.com>
>> On Mon, Nov 11, 2013 at 11:42 AM, Stuart Henderson <s...@spacehopper.org> wrote:
>>> master on em0/em1/bnx0 is nothing to do with trunk, it is about
>>> the gigabit ethernet clocking source.
>>
>> ok, but it is obvious: documentation is unclear (silent) about that.
>>
>>> lacp hashing policy is the same as for loadbalance, see the manpage
>>> and confirm in trunk_hashmbuf().
>>
>> I see different inbound packet distribution on trunk on-top of em(4)s
>> and on trunk on top of bnx(4)s - that's the real problem.
>
> --
> May the most significant bit of your life be positive.
Re: 2 x em(4) and 2 x bnx(4) trunk(4)s behave differently
On Mon, Nov 11, 2013 at 12:43 PM, Stuart Henderson <st...@openbsd.org> wrote:
> On 2013/11/11 12:15, Alexey Suslikov wrote:
>> On Mon, Nov 11, 2013 at 11:42 AM, Stuart Henderson <s...@spacehopper.org> wrote:
>>> master on em0/em1/bnx0 is nothing to do with trunk, it is about
>>> the gigabit ethernet clocking source.
>>
>> ok, but it is obvious: documentation is unclear (silent) about that.
>
> Why would something listed as a media characteristic of the physical
> interface have anything to do with trunk?

well, I just expected to see master media option documented somewhere,
to make it clear what is trunk master and what is clocking master.

>>> lacp hashing policy is the same as for loadbalance, see the manpage
>>> and confirm in trunk_hashmbuf().
>>
>> I see different inbound packet distribution on trunk on-top of em(4)s
>> and on trunk on top of bnx(4)s - that's the real problem.
>
> The trunk driver can't influence inbound packet distribution, that is
> down to the device sending packets e.g. your switch..

yes, I know. but bnx(4) interfaces have master set differently, in
contrast to em(4) interfaces. I'm really guessing, but maybe that
clocking source has some effect for a switch.
Re: 2 x em(4) and 2 x bnx(4) trunk(4)s behave differently
On Mon, Nov 11, 2013 at 1:00 PM, Stuart Henderson <st...@openbsd.org> wrote:
> On 2013/11/11 12:15, Alexey Suslikov wrote:
>> I see different inbound packet distribution on trunk on-top of em(4)s
>> and on trunk on top of bnx(4)s - that's the real problem.
>
> On 2013/11/11 10:43, I wrote:
>> The trunk driver can't influence inbound packet distribution, that is
>> down to the device sending packets e.g. your switch..
>
> ... for newer HP L3 switches, you might want to look at
> trunk-load-balance L4-based, for ciscos port-channel load-balance..

yes, I'm aware of above options, but I have SPS2024-G5 in this setup.

did some tests with more servers involved, and outbound is no worry:
this is how trunk(4) hashing works.

IFACE   STATE  DESC  IPKTS  IBYTES  IERRS  OPKTS  OBYTES  OERRS  COLLS
bnx0    up:U           487  237275      0    129   41107      0      0
bnx1    up:U             5     360      0    348   65383      0      0

IFACE   STATE  DESC  IPKTS  IBYTES  IERRS  OPKTS  OBYTES  OERRS  COLLS
em0     up:U           228   54112      0    136   51470      0      0
em1     up:U           218   65348      0    322   79837      0      0

but bnx1 inbound is always showing 4-5 packets no matter how traffic
is distributed :/
2 x em(4) and 2 x bnx(4) trunk(4)s behave differently
Hi tech@.

Two machines (A and B) running recent 5.4-stable plugged into same switch.

A has:

em0 at pci4 dev 0 function 0 Intel 82573E rev 0x03: msi, address 00:30:48:66:a0:ec
em1 at pci5 dev 0 function 0 Intel 82573L rev 0x00: msi, address 00:30:48:66:a0:ed

B has:

bnx0 at pci2 dev 0 function 0 Broadcom BCM5716 rev 0x20: apic 0 int 16
bnx1 at pci2 dev 0 function 1 Broadcom BCM5716 rev 0x20: apic 0 int 17
bnx0: address b8:ac:6f:91:48:da
brgphy0 at bnx0 phy 1: BCM5709 10/100/1000baseT PHY, rev. 8
bnx1: address b8:ac:6f:91:48:db
brgphy1 at bnx1 phy 1: BCM5709 10/100/1000baseT PHY, rev. 8

Both servers have LACP trunk(4)s built on top of the above mentioned
interfaces:

A has:

em0: flags=28b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST,NOINET6> mtu 1500
        lladdr 00:30:48:66:a0:ec
        priority: 0
        trunk: trunkdev trunk0
        media: Ethernet autoselect (1000baseT full-duplex,master)
        status: active
em1: flags=28b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST,NOINET6> mtu 1500
        lladdr 00:30:48:66:a0:ec
        priority: 0
        trunk: trunkdev trunk0
        media: Ethernet autoselect (1000baseT full-duplex,master)
        status: active
trunk0: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,NOINET6> mtu 1500
        lladdr 00:30:48:66:a0:ec
        priority: 0
        trunk: trunkproto lacp
        trunk id: [(8000,00:30:48:66:a0:ec,402C,,),
                 (0001,ec:30:91:25:c0:4f,03E8,,)]
                trunkport em1 active,collecting,distributing
                trunkport em0 active,collecting,distributing
        groups: trunk
        media: Ethernet autoselect
        status: active

B has:

bnx0: flags=28b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST,NOINET6> mtu 1500
        lladdr b8:ac:6f:91:48:da
        priority: 0
        trunk: trunkdev trunk0
        media: Ethernet autoselect (1000baseT full-duplex)
        status: active
bnx1: flags=28b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST,NOINET6> mtu 1500
        lladdr b8:ac:6f:91:48:da
        priority: 0
        trunk: trunkdev trunk0
        media: Ethernet autoselect (1000baseT full-duplex,master)
        status: active
trunk0: flags=28843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,NOINET6> mtu 1500
        lladdr b8:ac:6f:91:48:da
        priority: 0
        trunk: trunkproto lacp
        trunk id: [(8000,b8:ac:6f:91:48:da,402C,,),
                 (0001,ec:30:91:25:c0:4f,03EA,,)]
                trunkport bnx1 active,collecting,distributing
                trunkport bnx0 active,collecting,distributing
        groups: trunk
        media: Ethernet autoselect
        status: active

Now about the difference.

A receives on both em0 and em1 and transmits on em0:

IFACE   STATE  DESC  IPKTS  IBYTES  IERRS  OPKTS  OBYTES  OERRS  COLLS
em0     up:U          2711   2859K      0   5593   5222K      0      0
em1     up:U          2867   2343K      0     10    3226      0      0
trunk0  up:U          5578   5202K      0   5603   5225K      0      0

B receives *only* on bnx0 and transmits *only* on bnx1:

IFACE   STATE  DESC  IPKTS  IBYTES  IERRS  OPKTS  OBYTES  OERRS  COLLS
bnx0    up:U          2873   2956K      0      2     977      0      0
bnx1    up:U             5     360      0   3119   2604K      0      0
trunk0  up:U          2878   2956K      0   3121   2605K      0      0

The only difference ifconfig shows is that both em(4)s are master
interfaces on A, but only bnx1 is a master interface on B (I haven't
found any description of the master media option in the ifconfig man
page; the trunk man page mentions master, but only wrt failover mode).

The whole situation smells like trunk(4) receives on *all* master
interfaces, but transmits on the *first available* master.

The question here is, why are both em(4)s master interfaces on A, but
only bnx1 is a master interface on B?

Another question is, what is the transmit hash policy for a trunk in
LACP mode?

If it matters, while testing on B, different MACs and different VLANs
have been used, but the effect is the same: bnx0 only receives, bnx1
only transmits.

Anybody with trunking experience please speak up.

Thanks,
Alexey
Re: defer routing table updates on link state changes
Reyk Floeter wrote:
> Yes, in theory if_index should be fixed and return a consistent number
> between 1 and the number of interfaces. But this is obviously
> difficult and I'm not sure if it's worth the effort. So the hack that
> you're going to remove was a best effort. But putting another
> interface index abstraction layer in userland (via snmpd or some
> shared db) is just not the right way to do it. We either have a
> reliable if_index from the kernel or we don't. But inventing another
> thing in userland doesn't make sense to me.

If above theory doesn't dictate all interfaces must exist (it shouldn't
because of hot-plug interfaces), kernel can operate on fixed predefined
ifIndex table like this:

tun ifIndex (only have 256 of them because of unit_no):
        1 - 00:bd:xx:xx:xx:00 - tun0
        256 - 00:bd:xx:xx:xx:ff - tun255

vether ifIndex (only have 65536 of them?):
        257 - fe:e1:ba:d0:xx:xx - vether0
        65,792 - fe:e1:ba:d0:xx:xx - vether65535

physical ifIndex (claim to support ~1M of physical interfaces):
        65,793 - 00:25:90:xx:xx:aa - em0
        65,794 - 00:25:90:xx:xx:ab - em1
        1,179,906 - xx:xx:xx:xx:xx:xx - foo77

trunk ifIndex (claim to support ~17M of trunk interfaces, by unit_no):
        1,179,907 - xx:xx:xx:xx:xx:xx - trunk0
        19,005,699 - xx:xx:xx:xx:xx:xx - trunk1699

vlan ifIndex (claim to support ~280M of vlan interfaces, by unit_no):
        19,005,700 - xx:xx:xx:xx:xx:xx - vlan0
        304,218,372 - xx:xx:xx:xx:xx:xx - vlan27999

and so on, up to 2,147,483,647.

IMO, cloners aren't so problematic (because of algorithmically
controlled enumeration and unit number assignment) as physical
interfaces are.

I think, the best is to let ifIndexes be assigned to physical
interfaces via ifconfig, but let cloners to do their assignments
automatically. And do not let snmpd to operate on interface without
an ifIndex: having no ifIndex means no interface available.
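
The fixed table proposed above could be sketched as follows for the cloned interfaces. This is only an illustration of the proposal, not existing kernel code: `cloner_ifindex` and the table layout are hypothetical, the ranges are copied from the message, and physical interfaces are left out because the proposal assigns their ifIndex by hand via ifconfig.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct ifclass {		/* one fixed ifIndex range per cloner */
	const char *name;
	uint32_t first;		/* ifIndex of unit 0 */
	uint32_t last;		/* last ifIndex in the range */
};

/* Ranges as listed in the proposal. */
static const struct ifclass classes[] = {
	{ "tun",    1,        256 },
	{ "vether", 257,      65792 },
	{ "trunk",  1179907,  19005699 },
	{ "vlan",   19005700, 304218372 },
};

/*
 * Return the fixed ifIndex for cloner "name" and unit number "unit",
 * or 0 if the cloner is unknown or the unit falls outside its range.
 */
uint32_t
cloner_ifindex(const char *name, uint32_t unit)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(classes) / sizeof(classes[0]); i++) {
		if (strcmp(classes[i].name, name) != 0)
			continue;
		if (unit > classes[i].last - classes[i].first)
			return 0;
		return classes[i].first + unit;
	}
	return 0;
}
```

For example, `cloner_ifindex("vether", 65535)` gives 65792, the end of the vether range in the list, while `cloner_ifindex("tun", 256)` gives 0 because tun only has 256 units.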
octeon bits on 54.html
Hi tech@. 54.html says: Ubiquiti Networks EdgeRouter LITE (no local storage) How should I read it: an EdgeRouter LITE variant with no local storage or local storage is currently not supported? Cheers, Alexey
Re: drm bits on 54.html
On Sat, Aug 10, 2013 at 11:58 AM, Brad Smith <b...@comstyle.com> wrote:
> - Original message -
>> Hi tech@.
>>
>> 54.html says: Now mostly in sync with Linux 3.8.13
>>
>> But there's no such thing as Linux X.X.X, there's a Linux kernel X.X.X.
>
> But there is. The latter is redundant. Linux is a kernel.

In geek world, maybe, but not in Real World (tm)

http://en.wikipedia.org/wiki/Linux
Re: drm bits on 54.html
On Sat, Aug 10, 2013 at 12:09 PM, Brad Smith <b...@comstyle.com> wrote:
> - Original message -
>> On Sat, Aug 10, 2013 at 11:58 AM, Brad Smith <b...@comstyle.com> wrote:
>>> - Original message -
>>>> Hi tech@.
>>>>
>>>> 54.html says: Now mostly in sync with Linux 3.8.13
>>>>
>>>> But there's no such thing as Linux X.X.X, there's a Linux kernel X.X.X.
>>>
>>> But there is. The latter is redundant. Linux is a kernel.
>>
>> In geek world, maybe, but not in Real World (tm)
>>
>> http://en.wikipedia.org/wiki/Linux
>
> Yes, real world so often uses names and terms improperly. whats new.

http://www.gnu.org/gnu/linux-and-gnu.html says

  Linux is the kernel: the program in the system that allocates the
  machine's resources to the other programs that you run. The kernel
  is an essential part of an operating system, but useless by itself;
  it can only function in the context of a complete operating system.
  Linux is normally used in combination with the GNU operating system:
  the whole system is basically GNU with Linux added, or GNU/Linux.
  All the so-called “Linux” distributions are really distributions of
  GNU/Linux.

So I think you're right about using the term Linux. Sorry for the noise.
a.out in gcc-local(1)
Hi tech@

Just found a block in gcc-local(1) that is no longer relevant:

- On a.out platforms (i.e. vax), gcc uses a linker wrapper to write
  stubs that call global constructors and destructors. Those platforms
  use gcc 2.95.3, and those calls can be traced using
  -Wl,-trace-ctors-dtors, using syslog_r(3).

Cheers,
Alexey
Re: Stop printing excessive numbers of ACPI wakeup devices
On Sun, Jun 2, 2013 at 12:05 AM, Theo de Raadt <dera...@cvs.openbsd.org> wrote:
>> Mike Larkin <mlarkin at azathoth.net> writes:
>>> It's sometimes nice to know what devices can wake up a machine, and
>>> from what sleep state. But I'm fine suppressing these also. Don't
>>> want this to end up being a bikeshed :)
>>
>> why not dnprintf them?
>
> good grief. We are displaying the information because we still want
> to see it in dmesglogs so that we can improve suspend/resume ...

is there any possibility to end up with that information being
under [...]?
Re: Stop printing excessive numbers of ACPI wakeup devices
On Sun, Jun 2, 2013 at 12:14 AM, Mark Kettenis <mark.kette...@xs4all.nl> wrote:
>> Date: Sun, 2 Jun 2013 00:09:25 +0300
>> From: Alexey Suslikov <alexey.susli...@gmail.com>
>>
>> On Sun, Jun 2, 2013 at 12:05 AM, Theo de Raadt <dera...@cvs.openbsd.org> wrote:
>>>> Mike Larkin <mlarkin at azathoth.net> writes:
>>>>> It's sometimes nice to know what devices can wake up a machine,
>>>>> and from what sleep state. But I'm fine suppressing these also.
>>>>> Don't want this to end up being a bikeshed :)
>>>>
>>>> why not dnprintf them?
>>>
>>> good grief. We are displaying the information because we still want
>>> to see it in dmesglogs so that we can improve suspend/resume ...
>>
>> is there any possibility to end up with that information being
>> under [...]?
>
> acpidump and disassemble the aml anyway, if somebody actually starts
> working on proper acpi wakeup support, we can temporarily enable
> printing them all again.

just an idea (I know more knobs are not good), but sysctl already has
some acpi related information (indirectly, tho), like

$ sysctl -a | grep -i acpi
kern.malloc.kmemnames=free,,devbuf,debug,pcb,routetbl,,fragtbl,,ifaddr,soopts,sysctl,,,ioctlops,iov,mount,,NFS_req,NFS_mount,,vnodes,namecache,UFS_quota,UFS_mount,shm,VM_map,sem,dirhash,ACPI,VM_pmapfile,file_desc,,proc,subproc,VFS_cluster,,,MFS_node,,,Export_Host,NFS_srvsock,,NFS_daemon,ip_moptions,in_multi,ether_multi,mrt,ISOFS_mount,ISOFS_node,MSDOSFS_mount,MSDOSFS_fat,MSDOSFS_node,ttys,exec,miscfs_mount,,pfkey_data,tdb,xform_data,,pagedep,inodedep,newblk,,,indirdep,VM_swap,,UVM_amap,UVM_aobj,,USB,USB_device,USB_HC,,memdesc,,,crypto_data,,IPsec_credsemuldata,ip6_options,NDP,,,temp,NTFS_mount,NTFS_node,NTFS_fnode,NTFS_dir,NTFS_hash,NTFS_attr,NTFS_data,NTFS_decomp,NTFS_vrun,kqueue,bluetooth,bwmeter,UDF_mount,UDF_file_entry,UDF_file_id,Bluetooth_HID,AGP_Memory,DRM
kern.malloc.kmemstat.ACPI=(inuse = 5429, calls = 20622, memuse = 638K, limblocks = 0, mapblocks = 0, maxused = 660K, limit = 78644K, spare = 0, sizes = (16,32,64,128,256,512,2048))
kern.timecounter.hardware=acpihpet0
kern.timecounter.choice=i8254(0) acpihpet0(1000) acpitimer0(1000) dummy(-100)

so maybe wakeup devices may end up under some sysctl path.
Question about MP safe audio/video
Hi tech@.

Are uvideo(4), bktr(4) and similar also MP safe, or are they somewhat
different in terms of the technique used to make audio MP safe?

Cheers,
Alexey
amd64errata.c,v 1.4
For our crash, v 1.4 of amd64errata.c is a no-op unless we de-static
the function prototypes.

acpiprt0 at acpi0: bus 0 (PCI0)
mpbios0 at bios0: Intel MP Specification 1.4
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD Phenom(tm) 9550 Quad-Core Processor, 3600.54 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE, MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,CX16,POPCHT,HXE,MMXX,FFXSR,LOHG,3DNOW2,3DNOW,LAHF,CMPLEG,SVM,AMCR8,ABM,SSE4A,MASSE,3DNOWP
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
kernel: protection fault trap, code=0
Stopped at amd64_errata_setmsr()+0x10: rdmsr
amd64_errata_setmsr() at amd64_errata_setmsr()+0x10
amd64_errata() at amd64_errata+0xc9
identifycpu() at identifycpu+0x729
cpu_attach() at cpu_attach+0x2ce
config_attach() at config_attach+0x1d4
mpbios_cpu() at mpbios_cpu+0x5b
mpbios_scan() at mpbios_scan+0x355
config_attach() at config_attach+0x1d4
bios_attach() at bios_attach+0x296
config_attach() at config_attach+0x1d4
end trace frame: 0x81de9e30, count: 0
ddb{0}
ddb{0} trace
amd64_errata_setmsr() at amd64_errata_setmsr()+0x10
amd64_errata() at amd64_errata+0xc9
identifycpu() at identifycpu+0x729
cpu_attach() at cpu_attach+0x2ce
config_attach() at config_attach+0x1d4
mpbios_cpu() at mpbios_cpu+0x5b
mpbios_scan() at mpbios_scan+0x355
config_attach() at config_attach+0x1d4
bios_attach() at bios_attach+0x296
config_attach() at config_attach+0x1d4
mainbus_attach() at mainbus_attach+0x5b
config_attach() at config_attach+0x1d4
cpu_configure() at cpu_configure+0x17
main() at main+0x3f5
end trace frame: 0x0, count: -14
ddb{0}

Index: arch/amd64/amd64/amd64errata.c
===
RCS file: /cvs/src/sys/arch/amd64/amd64/amd64errata.c,v
retrieving revision 1.4
diff -u -p -u -p -r1.4 amd64errata.c
--- arch/amd64/amd64/amd64errata.c 20 May 2013 17:34:08 - 1.4
+++ arch/amd64/amd64/amd64errata.c 20 May 2013 20:17:00 -
@@ -129,8 +129,8 @@ static const uint8_t amd64_errata_set9[]
 	DA_C3, HY_D0, HY_D1, HY_D1_G34R1, PH_E0, LN_B0, OINK
 };
 
-static int amd64_errata_setmsr(struct cpu_info *, errata_t *);
-static int amd64_errata_testmsr(struct cpu_info *, errata_t *);
+int amd64_errata_setmsr(struct cpu_info *, errata_t *);
+int amd64_errata_testmsr(struct cpu_info *, errata_t *);
 
 static errata_t errata[] = {
 	/*
Index: arch/i386/i386/amd64errata.c
===
RCS file: /cvs/src/sys/arch/i386/i386/amd64errata.c,v
retrieving revision 1.4
diff -u -p -u -p -r1.4 amd64errata.c
--- arch/i386/i386/amd64errata.c 20 May 2013 17:34:08 - 1.4
+++ arch/i386/i386/amd64errata.c 20 May 2013 20:17:01 -
@@ -129,8 +129,8 @@ static const uint8_t amd64_errata_set9[]
 	DA_C3, HY_D0, HY_D1, HY_D1_G34R1, PH_E0, LN_B0, OINK
 };
 
-static int amd64_errata_setmsr(struct cpu_info *, errata_t *);
-static int amd64_errata_testmsr(struct cpu_info *, errata_t *);
+int amd64_errata_setmsr(struct cpu_info *, errata_t *);
+int amd64_errata_testmsr(struct cpu_info *, errata_t *);
 
 static errata_t errata[] = {
 	/*
Re: Possible relayd memory leak analysis
recent snaps don't have the above mentioned problem. not sure what the
cause was, but the leak is gone.

On Tue, Apr 9, 2013 at 1:47 AM, Alexey Suslikov <alexey.susli...@gmail.com> wrote:
> hi tech@
>
> tools used:
> * ps auxwww | grep relayd
> * httperf --hog --server=192.168.5.201 --wsess=25,1000,0.1 --rate=50 --timeout=5
>
> target machine:
> OpenBSD 5.3-current (GENERIC.MP) #0: Sun Apr 7 15:14:10 EEST 2013
> *@*:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
> /etc/relayd.conf:
> ext_addr=192.168.5.201
> webhost1=192.168.5.202
> webhost2=192.168.5.203
> prefork 2
> table <web> { $webhost1 $webhost2 }
> http protocol proto_pool_http {
>         header append $REMOTE_ADDR to X-Forwarded-For
>         header append $SERVER_ADDR:$SERVER_PORT to X-Forwarded-By
>         header change Connection to close
> }
> relay cluster_pool_http {
>         listen on $ext_addr port www
>         protocol proto_pool_http
>         forward to <web> port www mode roundrobin check http /index.html host test.local code 200
> }
>
> cold ps auxwww:
> root    31403 0.0 0.1  1160  1916 ?? Ss 12:21AM 0:00.03 relayd: parent (relayd)
> _relayd 18684 0.0 0.1  1044  2056 ?? S  12:21AM 0:00.01 relayd: pfe (relayd)
> _relayd 29554 0.0 0.1   912  1948 ?? S  12:21AM 0:00.01 relayd: hce (relayd)
> _relayd  7937 0.0 0.1  1108  2020 ?? S  12:21AM 0:00.02 relayd: relay (relayd)
> _relayd 28352 0.0 0.1  1108  2036 ?? S  12:21AM 0:00.00 relayd: relay (relayd)
>
> ps auxwww after 1st httperf run:
> _relayd 28352 4.1 0.6 10280 11672 ?? S  12:21AM 0:08.83 relayd: relay (relayd)
> _relayd  7937 4.8 0.6 10620 12004 ?? S  12:21AM 0:09.17 relayd: relay (relayd)
> root    31403 0.0 0.1  1160  1916 ?? Is 12:21AM 0:00.03 relayd: parent (relayd)
> _relayd 18684 0.0 0.1  1044  2056 ?? S  12:21AM 0:00.02 relayd: pfe (relayd)
> _relayd 29554 0.0 0.1   912  1948 ?? S  12:21AM 0:00.03 relayd: hce (relayd)
>
> ps auxwww after 2nd httperf run:
> _relayd 28352 1.5 1.0 19424 20816 ?? S  12:21AM 0:17.77 relayd: relay (relayd)
> _relayd  7937 1.4 1.0 19724 21108 ?? S  12:21AM 0:18.11 relayd: relay (relayd)
> root    31403 0.0 0.1  1160  1916 ?? Is 12:21AM 0:00.03 relayd: parent (relayd)
> _relayd 18684 0.0 0.1  1044  2056 ?? S  12:21AM 0:00.02 relayd: pfe (relayd)
> _relayd 29554 0.0 0.1   912  1952 ?? S  12:21AM 0:00.05 relayd: hce (relayd)
>
> on busy production setup relayd continuously leaks and eventually
> crashes.
cvsweb says 'No viewable change' for i915_drv.c diffs
Hi. Try this http://www.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/drm/i915/i915_drv.c.diff?r1=1.26;r2=1.27;f=h and, for instance, this http://www.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/drm/i915/i915_dma.c.diff?r1=1.6;r2=1.7;f=h Former says No viewable change. I think it isn't normal. Am I wrong? Cheers, Alexey
Re: cvsweb says 'No viewable change' for i915_drv.c diffs
googled gnu cvs utf diff and found this

http://stackoverflow.com/questions/778291/how-do-i-diff-utf-16-files-with-gnu-diff

On Wed, May 15, 2013 at 2:03 PM, Stuart Henderson <st...@openbsd.org> wrote:
> On 2013/05/15 11:53, Stuart Henderson wrote:
>> On 2013/05/15 10:43, Alexey E. Suslikov wrote:
>>> Mark Kettenis <mark.kettenis at xs4all.nl> writes:
>>>>> Try this
>>>>> http://www.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/drm/i915/i915_drv.c.diff?r1=1.26;r2=1.27;f=h
>>>>>
>>>>> and, for instance, this
>>>>> http://www.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/drm/i915/i915_dma.c.diff?r1=1.6;r2=1.7;f=h
>>>>>
>>>>> Former says No viewable change. I think it isn't normal. Am I wrong?
>>>>
>>>> Yes that's very annoying. I suspect cvsweb has problems with the
>>>> UTF8 characters in the copyright header.
>>>
>>> cvsweb operates on individual diff chunks while preparing viewable
>>> output, right?
>>>
>>> if so, and you are right about UTF8, only one of these chunks is a
>>> showstopper. maybe cvsweb may say No viewable change for a
>>> problematic chunk only, instead of completely freaking out.
>>
>> it's not cvsweb.
>>
>> $ rcsdiff -u -r1.26 -r1.27 /cvs/src/sys/dev/pci/drm/i915/i915_drv.c,v
>> ===
>> RCS file: /cvs/src/sys/dev/pci/drm/i915/i915_drv.c,v
>> retrieving revision 1.26
>> retrieving revision 1.27
>> diff -u -r1.26 -r1.27
>
> ...and yes, it is due to the UTF8 characters: replacing them in the ,v
> file lets it work.
Re: external ip/tcp (sysctl) variables
> RCS file: /home/ncvs/src/sys/netinet/ip_var.h,v
> retrieving revision 1.44
> diff -u -p -r1.44 ip_var.h
> --- netinet/ip_var.h 16 Jul 2012 18:05:36 - 1.44
> +++ netinet/ip_var.h 8 Apr 2013 13:23:23 -
> @@ -149,8 +149,20 @@ extern struct ipstat ipstat;
>  extern LIST_HEAD(ipqhead, ipq) ipq; /* ip reass. queue */
>  extern int ip_defttl; /* default IP ttl */
> +extern struct socket *ip_mrouter; /* multicast routing daemon */
> +
>  extern int ip_mtudisc; /* mtu discovery */
>  extern u_int ip_mtudisc_timeout; /* seconds to timeout mtu discovery */
> +
> +extern int ipport_firstauto; /* min port for port allocation */
> +extern int ipport_lastauto; /* max port for port allocation */
> +extern int ipport_hifirstauto; /* min dynamic/private port number */
> +extern int ipport_hilastauto; /* max dynamic/private port number */
> +extern int encdebug; /* enable message reporting */
> +extern int ipforwarding; /* enable IP forwarding */
> +extern int ipmforwarding; /* enable multicast forwarding */

previously, ipmforwarding and ip_mrouter were under #ifdef MROUTING

is it normal to have them outside mentioned #ifdef in your diff?
Possible relayd memory leak analysis
hi tech@

tools used:
* ps auxwww | grep relayd
* httperf --hog --server=192.168.5.201 --wsess=25,1000,0.1 --rate=50 --timeout=5

target machine:
OpenBSD 5.3-current (GENERIC.MP) #0: Sun Apr 7 15:14:10 EEST 2013
*@*:/usr/src/sys/arch/amd64/compile/GENERIC.MP

/etc/relayd.conf:
ext_addr=192.168.5.201
webhost1=192.168.5.202
webhost2=192.168.5.203
prefork 2
table <web> { $webhost1 $webhost2 }
http protocol proto_pool_http {
        header append $REMOTE_ADDR to X-Forwarded-For
        header append $SERVER_ADDR:$SERVER_PORT to X-Forwarded-By
        header change Connection to close
}
relay cluster_pool_http {
        listen on $ext_addr port www
        protocol proto_pool_http
        forward to <web> port www mode roundrobin check http /index.html host test.local code 200
}

cold ps auxwww:
root    31403 0.0 0.1  1160  1916 ?? Ss 12:21AM 0:00.03 relayd: parent (relayd)
_relayd 18684 0.0 0.1  1044  2056 ?? S  12:21AM 0:00.01 relayd: pfe (relayd)
_relayd 29554 0.0 0.1   912  1948 ?? S  12:21AM 0:00.01 relayd: hce (relayd)
_relayd  7937 0.0 0.1  1108  2020 ?? S  12:21AM 0:00.02 relayd: relay (relayd)
_relayd 28352 0.0 0.1  1108  2036 ?? S  12:21AM 0:00.00 relayd: relay (relayd)

ps auxwww after 1st httperf run:
_relayd 28352 4.1 0.6 10280 11672 ?? S  12:21AM 0:08.83 relayd: relay (relayd)
_relayd  7937 4.8 0.6 10620 12004 ?? S  12:21AM 0:09.17 relayd: relay (relayd)
root    31403 0.0 0.1  1160  1916 ?? Is 12:21AM 0:00.03 relayd: parent (relayd)
_relayd 18684 0.0 0.1  1044  2056 ?? S  12:21AM 0:00.02 relayd: pfe (relayd)
_relayd 29554 0.0 0.1   912  1948 ?? S  12:21AM 0:00.03 relayd: hce (relayd)

ps auxwww after 2nd httperf run:
_relayd 28352 1.5 1.0 19424 20816 ?? S  12:21AM 0:17.77 relayd: relay (relayd)
_relayd  7937 1.4 1.0 19724 21108 ?? S  12:21AM 0:18.11 relayd: relay (relayd)
root    31403 0.0 0.1  1160  1916 ?? Is 12:21AM 0:00.03 relayd: parent (relayd)
_relayd 18684 0.0 0.1  1044  2056 ?? S  12:21AM 0:00.02 relayd: pfe (relayd)
_relayd 29554 0.0 0.1   912  1952 ?? S  12:21AM 0:00.05 relayd: hce (relayd)

on busy production setup relayd continuously leaks and eventually
crashes.
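
The growth is easier to see as numbers. A small sketch (the helper name `rss_deltas` is made up for this example) that subtracts consecutive RSS samples, taken from the sixth column of the ps auxwww output above and expressed in kilobytes:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Write the growth between consecutive RSS samples into out[].
 * For n samples, out[] receives n - 1 deltas (same unit as the input).
 */
void
rss_deltas(const long *rss, size_t n, long *out)
{
	size_t i;

	assert(rss != NULL && out != NULL);
	for (i = 1; i < n; i++)
		out[i - 1] = rss[i] - rss[i - 1];
}
```

For relay pid 28352 the samples 2036, 11672, 20816 give 9636 and 9144; for pid 7937 the samples 2020, 12004, 21108 give 9984 and 9104. So each httperf run adds roughly 9 to 10 MB of resident memory per relay process, while parent, pfe and hce stay essentially flat.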
SSE4.2 CRC32 question
Hi tech@.

Can OpenBSD use SSE4.2 CRC32 (found on Core i7) to speed up TCP/IP
checksum calculations?

Cheers,
Alexey
Re: goodbye to some isa devices
On Wed, Mar 27, 2013 at 10:04 PM, Miod Vallat <m...@online.fr> wrote:
>> Not sure about ancient 3Com's, but they are Ethernet at least, in
>> contrast to Token-Ring device like tr*.
>>
>> Do we support Token-Ring?
>
> We used to, on TRopic boards, but since public documentation for TR
> hardware amounts to zilch, and there is no interest in changing this
> situation, it was eventually removed from the tree to clear the way of
> other changes.

And with no TR stack, is there any reason for
sys/arch/i386/conf/GENERIC to contain these

#tr0 at isa? port 0xa20 iomem 0xd8000 # IBM TROPIC based Token-Ring
#tr1 at isa? port 0xa24 iomem 0xd # IBM TROPIC based Token-Ring
#tr* at isa? # 3COM TROPIC based Token-Ring

?
Re: goodbye to some isa devices
On Wed, Mar 27, 2013 at 10:24 PM, Miod Vallat m...@online.fr wrote:
>>>> Do we support Token-Ring?
>>>
>>> We used to, on TRopic boards, but since public documentation for TR
>>> hardware amounts to zilch, and there is no interest in changing this
>>> situation, it was eventually removed from the tree to clear the way
>>> of other changes.
>>
>> And with no TR stack, is there any reason for
>> sys/arch/i386/conf/GENERIC to contain these
>>
>> #tr0 at isa? port 0xa20 iomem 0xd8000 # IBM TROPIC based Token-Ring
>> #tr1 at isa? port 0xa24 iomem 0xd # IBM TROPIC based Token-Ring
>> #tr* at isa? # 3COM TROPIC based Token-Ring
>>
>> ?
>
> Definitely not, this is a leftover of the token ring pruning. Thanks
> for noticing!

btw, if you guys are still looking for something to disable in sys/arch/i386/conf/RAMDISK_CD, take a look at these

ie0 at isa? port 0x360 iomem 0xd irq 7 # StarLAN and 3C507
le0 at isa? port 0x360 irq 15 drq 6 # IsoLan, NE2100, and DEPCA
le* at isapnp?
5.3 lyrics
hi tech@.

although the 5.3 lyrics were released recently, 53.html says the opposite. is that normal?

cheers,
alexey
Re: Threads related SIGSEGV in random.c (diff, v2)
On Thu, Mar 14, 2013 at 6:48 PM, Ted Unangst t...@tedunangst.com wrote: On Thu, Mar 14, 2013 at 17:24, Antoine Jacoutot wrote: On Thu, Mar 14, 2013 at 11:41:52AM -0400, Ted Unangst wrote: On Thu, Mar 14, 2013 at 14:30, Antoine Jacoutot wrote: FYI I am seeing a somehow similar crash when using sysutils/bacula (both 5.2 and 5.3). It is 100% reproducible on my setup. Obviously painful since it means I cannot run backups anymore... The following is brought to you without testing or warranty. It did compile at least once though. Awesome, thanks! I ran several batches of concurrent backups and I cannot reproduce the crash anymore :-) I'm going to run with that patch for the time being... if I spot any regression, I'll let you know. Couple fixes. In some error cases, there are early returns I didn't notice before. Fixed diff below, though I don't think a correct program should be affected. Alexey, sorry, I didn't get to your final diff before. It's very similar to the diff below, so you were on the right track. One thing that's different is you created unique special functions. Thanks Ted. Glad to see this diff back. I stopped pushing the diff because of zero feedback. If you don't mind, put some credit to Roman Kravchuk when you will commit, as he did most of work. I just pushed diff.
Re: savecore on swap-less amd64 box
On Mon, Dec 17, 2012 at 2:44 PM, Mark Kettenis mark.kette...@xs4all.nl wrote: Date: Mon, 17 Dec 2012 14:14:40 +0200 From: Alexey Suslikov alexey.susli...@gmail.com Hello tech@. On swap-less amd64 box using 20121213 amd64 snap, I have noticed a difference in how savecore behaves on SP and MP kernels. During boot, I see OpenBSD 5.2-current (GENERIC) #6: Wed Dec 12 23:16:44 MST 2012 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC savecore: /dev/sd0b: Device not configured OpenBSD 5.2-current (GENERIC.MP) #5: Wed Dec 12 23:22:46 MST 2012 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP savecore: can't find device 0/127 $ cat /etc/fstab 93e3a680795f1b55.a / ffs rw,softdep 1 1 Is it normal or I have missed something? reboot that GENERIC.MP kernel once more yep. that fixed the problem. just curious, what was it and why? Cheers, Alexey
Re: MBA remove unneeded quirk
On Tue, Dec 4, 2012 at 2:38 PM, Alexandre Ratchov a...@caoua.org wrote:
> On Sat, Dec 01, 2012 at 05:14:46PM +0800, Ray Lai wrote:
>> I'm not sure why jakemsr's diff[1] has AZ_QRK_GPIO_UNMUTE_1, my MBA
>> works fine without it. I've tested both left and right channels on
>> both speakers and headphones.
>>
>> -Ray-
>>
>> [1]: http://marc.info/?l=openbsd-misc&m=128919130029011&w=2
>>
>> Index: dev/pci/azalia_codec.c
>> ===
>> RCS file: /home/cvs/src/sys/dev/pci/azalia_codec.c,v
>> retrieving revision 1.152
>> diff -u -p -r1.152 azalia_codec.c
>> --- dev/pci/azalia_codec.c	30 Nov 2012 12:05:45 -	1.152
>> +++ dev/pci/azalia_codec.c	1 Dec 2012 08:27:31 -
>> @@ -67,8 +67,7 @@ azalia_codec_init_vtbl(codec_t *this)
>>  	case 0x10134206:
>>  		this->name = "Cirrus Logic CS4206";
>>  		if (this->subid == 0xcb8910de) {	/* APPLE_MBA3_1 */
>> -			this->qrks |= AZ_QRK_GPIO_UNMUTE_1 |
>> -			    AZ_QRK_GPIO_UNMUTE_3;
>> +			this->qrks |= AZ_QRK_GPIO_UNMUTE_3;
>>  		}
>>  		break;
>>  	case 0x10ec0260:
>
> Hey,
>
> Linux hda driver seems to use gpio 1 and 3 by default for most apple
> products. Does the gpio 1 quirk hurt in any way? If it doesn't, I'd
> leave it unless I'm missing the reason why it's not needed.
>
> BTW, did you get any test reports?
>
> -- Alexandre

I agree. Other models/codecs/wiring may require both 1 and 3 gpio, so more general case is preferred here.

Cheers,
Alexey
Re: [PATCH, TEST] Make functions in random.c thread safe
Hi. Me and Roman are curious about zero comments on this. We'll try to improve the diff if it is not ok. Just let us know. Anyone? :)

On Wed, Oct 3, 2012 at 4:06 PM, Alexey Suslikov alexey.susli...@gmail.com wrote:
> Hi. Is there any progress/comments on this?
Re: [PATCH, TEST] Make functions in random.c thread safe
Hi. Is there any progress/comments on this?
[PATCH, TEST] Make functions in random.c thread safe
Hi.

With input from tedu@, guenther@ and others, below are:
1) test case;
2) backtrace for test case;
3) locking diff;
4) dmesg (amd64 GENERIC.MP built from 2012-09-28 CVS).

Diff introduces no changes to srandomdev(): correct me if I'm wrong, but no mutex can be used since sysctl can sleep.

Rebuild and reinstall in src/lib/librthread and src/lib/libc after applying the diff. Expect test case (and Kannel port of course) not crashing after rebuild and reinstall.

Cheers,
Alexey

1) test case.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>

#define NUM_THREADS	1800

void *TaskCode(void *argument)
{
	struct timeval tv;

	gettimeofday(&tv, 0);
	srandom((getpid() << 16) ^ getuid() ^ tv.tv_sec ^ tv.tv_usec);
	return NULL;
}

int main(void)
{
	pthread_t threads[NUM_THREADS];
	int thread_args[NUM_THREADS];
	int rc, i;

	/* create all threads */
	for (i=0; i<NUM_THREADS; ++i) {
		thread_args[i] = i;
		rc = pthread_create(&threads[i], NULL, TaskCode,
		    (void *) &thread_args[i]);
		assert(0 == rc);
	}

	/* wait for all threads to complete */
	for (i=0; i<NUM_THREADS; ++i) {
		rc = pthread_join(threads[i], NULL);
		assert(0 == rc);
	}

	printf("Test srandom success\n");
	exit(EXIT_SUCCESS);
}

2) backtrace for test case.

Program received signal SIGSEGV, Segmentation fault.
[Switching to thread 1030380]
0x19d34d618f8e in random () at /usr/src/lib/libc/stdlib/random.c:387
387		*fptr += *rptr;
(gdb) bt
#0  0x19d34d618f8e in random () at /usr/src/lib/libc/stdlib/random.c:387
#1  0x19d34d619169 in srandom (x=Variable "x" is not available.
) at /usr/src/lib/libc/stdlib/random.c:216
#2  0x19d14fe1 in TaskCode (argument=0x7f7ea004) at test_srandom.c:14
#3  0x19d34999d11e in _rthread_start (v=Variable "v" is not available.
) at /usr/src/lib/librthread/rthread.c:122
#4  0x19d34d5f0f9b in __tfork_thread () at /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75
Cannot access memory at address 0x19d344efb000

3) locking diff.
Index: lib/libc/include/thread_private.h
===
RCS file: /cvs/src/lib/libc/include/thread_private.h,v
retrieving revision 1.25
diff -u -p -r1.25 thread_private.h
--- lib/libc/include/thread_private.h	16 Oct 2011 06:29:56 -	1.25
+++ lib/libc/include/thread_private.h	27 Sep 2012 10:48:45 -
@@ -172,4 +172,16 @@ void	_thread_arc4_unlock(void);
 				_thread_arc4_unlock();\
 			} while (0)
 
+void	_thread_random_lock(void);
+void	_thread_random_unlock(void);
+
+#define _RANDOM_LOCK()		do {					\
+					if (__isthreaded)		\
+						_thread_random_lock();	\
+				} while (0)
+#define _RANDOM_UNLOCK()	do {					\
+					if (__isthreaded)		\
+						_thread_random_unlock();\
+				} while (0)
+
 #endif /* _THREAD_PRIVATE_H_ */
Index: lib/libc/stdlib/random.c
===
RCS file: /cvs/src/lib/libc/stdlib/random.c,v
retrieving revision 1.17
diff -u -p -r1.17 random.c
--- lib/libc/stdlib/random.c	1 Jun 2012 01:01:57 -	1.17
+++ lib/libc/stdlib/random.c	27 Sep 2012 10:48:45 -
@@ -35,6 +35,10 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
+#include "thread_private.h"
+
+static void srandom_unlocked(unsigned int);
+static long random_unlocked(void);
 
 /*
  * random.c:
@@ -186,8 +190,8 @@ static int rand_sep = SEP_3;
  * introduced by the L.C.R.N.G. Note that the initialization of randtbl[]
  * for default usage relies on values produced by this routine.
  */
-void
-srandom(unsigned int x)
+static void
+srandom_unlocked(unsigned int x)
 {
 	int i;
 	int32_t test;
@@ -213,10 +217,18 @@ srandom(unsigned int x)
 		fptr = &state[rand_sep];
 		rptr = &state[0];
 		for (i = 0; i < 10 * rand_deg; i++)
-			(void)random();
+			(void)random_unlocked();
 	}
 }
 
+void
+srandom(unsigned int x)
+{
+	_RANDOM_LOCK();
+	srandom_unlocked(x);
+	_RANDOM_UNLOCK();
+}
+
 /*
  * srandomdev:
  *
@@ -273,12 +285,15 @@ initstate(u_int seed, char *arg_state, s
 {
 	char *ostate = (char *)(&state[-1]);
 
+	_RANDOM_LOCK();
 	if (rand_type == TYPE_0)
 		state[-1] = rand_type;
 	else
 		state[-1] = MAX_TYPES * (rptr - state) + rand_type;
-	if (n < BREAK_0)
+	if (n < BREAK_0) {
+
Re: Threads related SIGSEGV in random.c (diff, v2)
On Thursday, September 27, 2012, Philip Guenther wrote: On Thu, 27 Sep 2012, Alexey Suslikov wrote: Removing only local variables part reverts us to previous behavior (i.e. crashes). My guess is your program is calling srandom(), srandomdev(), initstate() or setstate() as well. Your diff doesn't protect the alteration of state, end_ptr, fptr, and rptr on those paths, so a call to initstate() while another thread is in random() can walk fptr and/or rptr out of the state array. Add the necessary locking in them and run your tests again. If not, well, crank up your debugging skills. What was the line of code that actually triggered the crash? Where did the bogus pointer come from? Crash: Program received signal SIGSEGV, Segmentation fault. [Switching to thread 1006387] 0x0cb33345cf6e in random () at /usr/src/lib/libc/stdlib/random.c:387 387 *fptr += *rptr; Back trace: Thread 10 (thread 1003160): #0 0x0cb33344135a in _thread_sys___thrsleep () at stdin:2 #1 0x0cb3315fac2a in pthread_cond_wait (condp=0xcb32a79c4b0, mutexp=Variable mutexp is not available. ) at /usr/src/lib/librthread/rthread_sync.c:500 #2 0x0cb129f836ba in gwlist_consume () from /usr/local/sbin/bearerbox #3 0x0cb129f121f1 in boxc_sender () from /usr/local/sbin/bearerbox #4 0x0cb129f828dd in new_thread () from /usr/local/sbin/bearerbox #5 0x0cb3315f911e in _rthread_start (v=Variable v is not available. ) at /usr/src/lib/librthread/rthread.c:122 #6 0x0cb333434f9b in __tfork_thread () at /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75 Cannot access memory at address 0xcb32b27c000 0x0cb33345cf6e 387 *fptr += *rptr; I'm starting to believe that static globals are not good. They are incredibly good at what they do. If you're trying to say that they fundamentally can't be thread-safe, you'll need some extraordinary evidence for such a claim. What good they do? Cheers, Alexey
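Philip's point (every path that alters state, end_ptr, fptr and rptr has to take the same lock as random() itself) can be sketched on a miniature additive-feedback generator. This is only an illustration under made-up names: the tiny_* identifiers, the table size and the use of a plain pthread mutex are assumptions, not the libc code or the diff in this thread.

```c
#include <pthread.h>
#include <stdint.h>

#define TINY_DEG 7	/* table size, illustrative */
#define TINY_SEP 3	/* distance between the two pointers */

static pthread_mutex_t tiny_lock = PTHREAD_MUTEX_INITIALIZER;
static uint32_t tiny_state[TINY_DEG];
static uint32_t *tiny_fptr = &tiny_state[TINY_SEP];
static uint32_t *tiny_rptr = &tiny_state[0];
static uint32_t *const tiny_end = &tiny_state[TINY_DEG];

/* Advance the generator. The caller must hold tiny_lock. */
static uint32_t
tiny_next_unlocked(void)
{
	uint32_t v;

	*tiny_fptr += *tiny_rptr;
	v = (*tiny_fptr >> 1) & 0x7fffffff;
	if (++tiny_fptr >= tiny_end) {
		tiny_fptr = tiny_state;
		++tiny_rptr;
	} else if (++tiny_rptr >= tiny_end)
		tiny_rptr = tiny_state;
	return v;
}

/*
 * Reseed: rewrites the table and both pointers, so it takes the same
 * lock as tiny_next(). Without it, a concurrent tiny_next() could
 * observe one pointer reset and the other not.
 */
void
tiny_seed(uint32_t x)
{
	int i;

	pthread_mutex_lock(&tiny_lock);
	tiny_state[0] = x;
	for (i = 1; i < TINY_DEG; i++)
		tiny_state[i] = tiny_state[i - 1] * 1103515245u + 12345u;
	tiny_fptr = &tiny_state[TINY_SEP];
	tiny_rptr = &tiny_state[0];
	for (i = 0; i < 10 * TINY_DEG; i++)
		(void)tiny_next_unlocked();
	pthread_mutex_unlock(&tiny_lock);
}

/* Public entry point: lock, step, unlock. */
uint32_t
tiny_next(void)
{
	uint32_t v;

	pthread_mutex_lock(&tiny_lock);
	v = tiny_next_unlocked();
	pthread_mutex_unlock(&tiny_lock);
	return v;
}
```

The seeding path calls the unlocked step while already holding the lock, which is the same split into locked wrappers and *_unlocked workers that the diffs in this thread use.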
Threads related SIGSEGV in random.c (diff, v2)
On Thursday, September 27, 2012, Alexey Suslikov wrote: On Thursday, September 27, 2012, Philip Guenther wrote: On Thu, 27 Sep 2012, Alexey Suslikov wrote: Removing only local variables part reverts us to previous behavior (i.e. crashes). My guess is your program is calling srandom(), srandomdev(), initstate() or setstate() as well. Your diff doesn't protect the alteration of state, end_ptr, fptr, and rptr on those paths, so a call to initstate() while another thread is in random() can walk fptr and/or rptr out of the state array. Add the necessary locking in them and run your tests again. If not, well, crank up your debugging skills. What was the line of code that actually triggered the crash? Where did the bogus pointer come from? Crash: Program received signal SIGSEGV, Segmentation fault. [Switching to thread 1006387] 0x0cb33345cf6e in random () at /usr/src/lib/libc/stdlib/random.c:387 387 *fptr += *rptr; Back trace: Thread 10 (thread 1003160): #0 0x0cb33344135a in _thread_sys___thrsleep () at stdin:2 #1 0x0cb3315fac2a in pthread_cond_wait (condp=0xcb32a79c4b0, mutexp=Variable mutexp is not available. ) at /usr/src/lib/librthread/rthread_sync.c:500 #2 0x0cb129f836ba in gwlist_consume () from /usr/local/sbin/bearerbox #3 0x0cb129f121f1 in boxc_sender () from /usr/local/sbin/bearerbox #4 0x0cb129f828dd in new_thread () from /usr/local/sbin/bearerbox #5 0x0cb3315f911e in _rthread_start (v=Variable v is not available. ) at /usr/src/lib/librthread/rthread.c:122 #6 0x0cb333434f9b in __tfork_thread () at /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75 Cannot access memory at address 0xcb32b27c000 0x0cb33345cf6e 387 *fptr += *rptr; I'm starting to believe that static globals are not good. They are incredibly good at what they do. If you're trying to say that they fundamentally can't be thread-safe, you'll need some extraordinary evidence for such a claim. What good they do? 
Philip, can you help us write a threaded test case (spawning a number of threads, each calling random())?
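A rough sketch of what such a test could look like, assuming only POSIX threads and random(). The names stress_worker, stress_random and STRESS_MAX_THREADS are made up for this example. Since the failure seen in this thread is a SIGSEGV, surviving the run is the main signal; the range check is a secondary sanity test. On a libc whose random() already locks internally, it is expected to pass.

```c
#include <pthread.h>
#include <stdlib.h>

#define STRESS_MAX_THREADS 64	/* arbitrary cap for the on-stack arrays */

struct stress_arg {
	int	iters;	/* calls to random() per thread */
	long	bad;	/* results outside [0, 2^31 - 1] */
};

/* Thread body: call random() in a tight loop and sanity-check results. */
static void *
stress_worker(void *v)
{
	struct stress_arg *a = v;
	int i;

	for (i = 0; i < a->iters; i++) {
		long r = random();

		if (r < 0 || r > 0x7fffffffL)
			a->bad++;
	}
	return NULL;
}

/*
 * Run nthreads threads concurrently. Returns -1 if the arguments are
 * invalid or a thread could not be created, otherwise the total number
 * of out-of-range results (0 is the expected value).
 */
long
stress_random(int nthreads, int iters)
{
	pthread_t tid[STRESS_MAX_THREADS];
	struct stress_arg arg[STRESS_MAX_THREADS];
	long bad = 0;
	int i, started;

	if (nthreads < 1 || nthreads > STRESS_MAX_THREADS)
		return -1;

	for (started = 0; started < nthreads; started++) {
		arg[started].iters = iters;
		arg[started].bad = 0;
		if (pthread_create(&tid[started], NULL, stress_worker,
		    &arg[started]) != 0)
			break;
	}
	/* join whatever was started, even on partial failure */
	for (i = 0; i < started; i++) {
		pthread_join(tid[i], NULL);
		bad += arg[i].bad;
	}
	return started == nthreads ? bad : -1;
}
```

Calling stress_random(8, 1000000) from a small main() and linking with -pthread would be one way to drive it; mixing srandom() or initstate() calls into some of the workers should make the race more likely to show up.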
Re: Threads related SIGSEGV in random.c (diff, v2)
Hi. Any news on that? On Friday, September 21, 2012, Alexey Suslikov wrote: On Fri, Sep 21, 2012 at 10:36 AM, Alexey Suslikov alexey.susli...@gmail.com javascript:; wrote: On Wed, Sep 19, 2012 at 10:24 PM, Ted Unangst t...@tedunangst.comjavascript:; wrote: On Wed, Sep 19, 2012 at 18:50, Alexey Suslikov wrote: On Wednesday, September 19, 2012, Theo de Raadt wrote: arc4random() is also thread-safe (it has interal locking) and very desirable for other reasons. But no way to save state. The last part of this is intentional. Saving the state of pseudo random number generators is a stupid concept from the 80's. I see many rng functions behaving very differently. Is it a good idea to create a common locking layer on top of need-to-be-safe rng functions? Or we should deal only with original problem (and only port random.c code from netbsd)? just slap a mutex around it. With the diff below Kannel no longer crashes. Only protecting random() for now. Make random() thread-safe by surrounding real call with a mutex locking. Found by and diff from Roman Kravchuk. Mainly from NetBSD. Sorry. Here is correct diff. We kinda unsure about the approach. For now, we follow arc4random pattern. Should we use generic _thread_mutex_lock/_thread_mutex_unlock instead? 
Index: lib/libc/include/thread_private.h === RCS file: /cvs/src/lib/libc/include/thread_private.h,v retrieving revision 1.25 diff -u -p -r1.25 thread_private.h --- lib/libc/include/thread_private.h 16 Oct 2011 06:29:56 - 1.25 +++ lib/libc/include/thread_private.h 21 Sep 2012 07:59:34 - @@ -172,4 +172,16 @@ void _thread_arc4_unlock(void); _thread_arc4_unlock();\ } while (0) +void _thread_random_lock(void); +void _thread_random_unlock(void); + +#define _RANDOM_LOCK() do {\ + if (__isthreaded) \ + _thread_random_lock(); \ + } while (0) +#define _RANDOM_UNLOCK() do {\ + if (__isthreaded) \ + _thread_random_unlock();\ + } while (0) + #endif /* _THREAD_PRIVATE_H_ */ Index: lib/libc/stdlib/random.c === RCS file: /cvs/src/lib/libc/stdlib/random.c,v retrieving revision 1.17 diff -u -p -r1.17 random.c --- lib/libc/stdlib/random.c1 Jun 2012 01:01:57 - 1.17 +++ lib/libc/stdlib/random.c21 Sep 2012 07:59:35 - @@ -35,6 +35,7 @@ #include stdio.h #include stdlib.h #include unistd.h +#include thread_private.h /* * random.c: @@ -376,21 +377,38 @@ setstate(char *arg_state) * * Returns a 31-bit random number. */ -long -random(void) +static long +random_unlocked(void) { int32_t i; + int32_t *f, *r; if (rand_type == TYPE_0) i = state[0] = (state[0] * 1103515245 + 12345) 0x7fff; else { - *fptr += *rptr; - i = (*fptr 1) 0x7fff; /* chucking least random bit */ - if (++fptr = end_ptr) { - fptr = state; - ++rptr; - } else if (++rptr = end_ptr) - rptr = state; + /* +* Use local variables rather than static variables for speed. 
+*/ + f = fptr; r = rptr; + *f += *r; + i = (*f 1) 0x7fff; /* chucking least random bit */ + if (++f = end_ptr) { + f = state; + ++r; + } else if (++r = end_ptr) + r = state; + fptr = f; rptr = r; } return((long)i); +} + +long +random(void) +{ + long r; + + _RANDOM_LOCK(); + r = random_unlocked(); + _RANDOM_UNLOCK(); + return (r); } Index: lib/libc/thread/unithread_malloc_lock.c === RCS file: /cvs/src/lib/libc/thread/unithread_malloc_lock.c,v retrieving revision 1.8 diff -u -p -r1.8 unithread_malloc_lock.c --- lib/libc/thread/unithread_malloc_lock.c 13 Jun 2008 21:18:43 - 1.8 +++ lib/libc/thread/unithread_malloc_lock.c 21 Sep 2012 07:59:35 - @@ -21,6 +21,12 @@ WEAK_PROTOTYPE(_thread_arc4_unlock); WEAK_ALIAS(_thread_arc4_lock); WEAK_ALIAS(_thread_arc4_unlock); +WEAK_PROTOTYPE(_thread_random_lock); +WEAK_PROTOTYPE(_thread_random_unlock); + +WEAK_ALIAS(_thread_random_lock); +WEAK_ALIAS
Re: Threads related SIGSEGV in random.c (diff, v2)
On Wed, Sep 26, 2012 at 9:51 PM, Ted Unangst t...@tedunangst.com wrote:
> On Wed, Sep 26, 2012 at 11:18, Alexey Suslikov wrote:
>> Hi. Any news on that?
>
> Can we do it without the local variables for speed part? I am not
> interested in making this function faster.

Removing only local variables part reverts us to previous behavior (i.e. crashes).

However, leaving current code as is but adding only local variables (see below) passes our test with no crashes. I'm starting to believe that static globals are not good.

Can somebody help us with writing threaded test case? As I mentioned above, we use Kannel port as a test which is somewhat hard to share.

Alexey

Index: lib/libc/stdlib/random.c
===
RCS file: /cvs/src/lib/libc/stdlib/random.c,v
retrieving revision 1.17
diff -u -p -r1.17 random.c
--- lib/libc/stdlib/random.c	1 Jun 2012 01:01:57 -	1.17
+++ lib/libc/stdlib/random.c	26 Sep 2012 20:30:46 -
@@ -380,17 +380,20 @@ long
 random(void)
 {
 	int32_t i;
+	int32_t *f, *r;
 
 	if (rand_type == TYPE_0)
 		i = state[0] = (state[0] * 1103515245 + 12345) & 0x7fffffff;
 	else {
-		*fptr += *rptr;
-		i = (*fptr >> 1) & 0x7fffffff;	/* chucking least random bit */
-		if (++fptr >= end_ptr) {
-			fptr = state;
-			++rptr;
-		} else if (++rptr >= end_ptr)
-			rptr = state;
+		f = fptr; r = rptr;
+		*f += *r;
+		i = (*f >> 1) & 0x7fffffff;	/* chucking least random bit */
+		if (++f >= end_ptr) {
+			f = state;
+			++r;
+		} else if (++r >= end_ptr)
+			r = state;
+		fptr = f; rptr = r;
 	}
 	return((long)i);
 }
Re: Threads related SIGSEGV in random.c
On Wed, Sep 19, 2012 at 10:24 PM, Ted Unangst t...@tedunangst.com wrote: On Wed, Sep 19, 2012 at 18:50, Alexey Suslikov wrote: On Wednesday, September 19, 2012, Theo de Raadt wrote: arc4random() is also thread-safe (it has interal locking) and very desirable for other reasons. But no way to save state. The last part of this is intentional. Saving the state of pseudo random number generators is a stupid concept from the 80's. I see many rng functions behaving very differently. Is it a good idea to create a common locking layer on top of need-to-be-safe rng functions? Or we should deal only with original problem (and only port random.c code from netbsd)? just slap a mutex around it. With the diff below Kannel no longer crashes. Only protecting random() for now. Make random() thread-safe by surrounding real call with a mutex locking. Found by and diff from Roman Kravchuk. Mainly from NetBSD. Index: include/thread_private.h === RCS file: /cvs/src/lib/libc/include/thread_private.h,v retrieving revision 1.25 diff -u -p -r1.25 thread_private.h --- include/thread_private.h16 Oct 2011 06:29:56 - 1.25 +++ include/thread_private.h20 Sep 2012 22:10:49 - @@ -172,4 +172,16 @@ void _thread_arc4_unlock(void); _thread_arc4_unlock();\ } while (0) +void _thread_random_lock(void); +void _thread_random_unlock(void); + +#define _RANDOM_LOCK() do {\ + if (__isthreaded) \ + _thread_random_lock(); \ + } while (0) +#define _RANDOM_UNLOCK() do {\ + if (__isthreaded) \ + _thread_random_unlock();\ + } while (0) + #endif /* _THREAD_PRIVATE_H_ */ Index: stdlib/random.c === RCS file: /cvs/src/lib/libc/stdlib/random.c,v retrieving revision 1.17 diff -u -p -r1.17 random.c --- stdlib/random.c 1 Jun 2012 01:01:57 - 1.17 +++ stdlib/random.c 20 Sep 2012 22:10:50 - @@ -35,6 +35,7 @@ #include stdio.h #include stdlib.h #include unistd.h +#include thread_private.h /* * random.c: @@ -376,21 +377,38 @@ setstate(char *arg_state) * * Returns a 31-bit random number. 
*/ -long -random(void) +static long +random_unlocked(void) { int32_t i; + int32_t *f, *r; if (rand_type == TYPE_0) i = state[0] = (state[0] * 1103515245 + 12345) 0x7fff; else { - *fptr += *rptr; - i = (*fptr 1) 0x7fff; /* chucking least random bit */ - if (++fptr = end_ptr) { - fptr = state; - ++rptr; - } else if (++rptr = end_ptr) - rptr = state; + /* +* Use local variables rather than static variables for speed. +*/ + f = fptr; r = rptr; + *f += *r; + i = (*f 1) 0x7fff; /* chucking least random bit */ + if (++f = end_ptr) { + f = state; + ++r; + } else if (++r = end_ptr) + r = state; + fptr = f; rptr = r; } return((long)i); +} + +long +random(void) +{ + long r; + + _RANDOM_LOCK(); + r = random_unlocked(); + _RANDOM_UNLOCK(); + return (r); } Index: thread/unithread_malloc_lock.c === RCS file: /cvs/src/lib/libc/thread/unithread_malloc_lock.c,v retrieving revision 1.8 diff -u -p -r1.8 unithread_malloc_lock.c --- thread/unithread_malloc_lock.c 13 Jun 2008 21:18:43 - 1.8 +++ thread/unithread_malloc_lock.c 20 Sep 2012 22:10:50 - @@ -21,6 +21,12 @@ WEAK_PROTOTYPE(_thread_arc4_unlock); WEAK_ALIAS(_thread_arc4_lock); WEAK_ALIAS(_thread_arc4_unlock); +WEAK_PROTOTYPE(_thread_random_lock); +WEAK_PROTOTYPE(_thread_random_unlock); + +WEAK_ALIAS(_thread_random_lock); +WEAK_ALIAS(_thread_random_unlock); + void WEAK_NAME(_thread_malloc_lock)(void) { @@ -53,6 +59,18 @@ WEAK_NAME(_thread_arc4_lock)(void) void WEAK_NAME(_thread_arc4_unlock)(void) +{ + return; +} + +void +WEAK_NAME(_thread_random_lock)(void) +{ + return; +} + +void +WEAK_NAME(_thread_random_unlock)(void) { return; }
Threads related SIGSEGV in random.c (diff, v2)
On Fri, Sep 21, 2012 at 10:36 AM, Alexey Suslikov alexey.susli...@gmail.com wrote: On Wed, Sep 19, 2012 at 10:24 PM, Ted Unangst t...@tedunangst.com wrote: On Wed, Sep 19, 2012 at 18:50, Alexey Suslikov wrote: On Wednesday, September 19, 2012, Theo de Raadt wrote: arc4random() is also thread-safe (it has interal locking) and very desirable for other reasons. But no way to save state. The last part of this is intentional. Saving the state of pseudo random number generators is a stupid concept from the 80's. I see many rng functions behaving very differently. Is it a good idea to create a common locking layer on top of need-to-be-safe rng functions? Or we should deal only with original problem (and only port random.c code from netbsd)? just slap a mutex around it. With the diff below Kannel no longer crashes. Only protecting random() for now. Make random() thread-safe by surrounding real call with a mutex locking. Found by and diff from Roman Kravchuk. Mainly from NetBSD. Sorry. Here is correct diff. We kinda unsure about the approach. For now, we follow arc4random pattern. Should we use generic _thread_mutex_lock/_thread_mutex_unlock instead? 
Index: lib/libc/include/thread_private.h === RCS file: /cvs/src/lib/libc/include/thread_private.h,v retrieving revision 1.25 diff -u -p -r1.25 thread_private.h --- lib/libc/include/thread_private.h 16 Oct 2011 06:29:56 - 1.25 +++ lib/libc/include/thread_private.h 21 Sep 2012 07:59:34 - @@ -172,4 +172,16 @@ void _thread_arc4_unlock(void); _thread_arc4_unlock();\ } while (0) +void _thread_random_lock(void); +void _thread_random_unlock(void); + +#define _RANDOM_LOCK() do {\ + if (__isthreaded) \ + _thread_random_lock(); \ + } while (0) +#define _RANDOM_UNLOCK() do {\ + if (__isthreaded) \ + _thread_random_unlock();\ + } while (0) + #endif /* _THREAD_PRIVATE_H_ */ Index: lib/libc/stdlib/random.c === RCS file: /cvs/src/lib/libc/stdlib/random.c,v retrieving revision 1.17 diff -u -p -r1.17 random.c --- lib/libc/stdlib/random.c1 Jun 2012 01:01:57 - 1.17 +++ lib/libc/stdlib/random.c21 Sep 2012 07:59:35 - @@ -35,6 +35,7 @@ #include stdio.h #include stdlib.h #include unistd.h +#include thread_private.h /* * random.c: @@ -376,21 +377,38 @@ setstate(char *arg_state) * * Returns a 31-bit random number. */ -long -random(void) +static long +random_unlocked(void) { int32_t i; + int32_t *f, *r; if (rand_type == TYPE_0) i = state[0] = (state[0] * 1103515245 + 12345) 0x7fff; else { - *fptr += *rptr; - i = (*fptr 1) 0x7fff; /* chucking least random bit */ - if (++fptr = end_ptr) { - fptr = state; - ++rptr; - } else if (++rptr = end_ptr) - rptr = state; + /* +* Use local variables rather than static variables for speed. 
+*/ + f = fptr; r = rptr; + *f += *r; + i = (*f 1) 0x7fff; /* chucking least random bit */ + if (++f = end_ptr) { + f = state; + ++r; + } else if (++r = end_ptr) + r = state; + fptr = f; rptr = r; } return((long)i); +} + +long +random(void) +{ + long r; + + _RANDOM_LOCK(); + r = random_unlocked(); + _RANDOM_UNLOCK(); + return (r); } Index: lib/libc/thread/unithread_malloc_lock.c === RCS file: /cvs/src/lib/libc/thread/unithread_malloc_lock.c,v retrieving revision 1.8 diff -u -p -r1.8 unithread_malloc_lock.c --- lib/libc/thread/unithread_malloc_lock.c 13 Jun 2008 21:18:43 - 1.8 +++ lib/libc/thread/unithread_malloc_lock.c 21 Sep 2012 07:59:35 - @@ -21,6 +21,12 @@ WEAK_PROTOTYPE(_thread_arc4_unlock); WEAK_ALIAS(_thread_arc4_lock); WEAK_ALIAS(_thread_arc4_unlock); +WEAK_PROTOTYPE(_thread_random_lock); +WEAK_PROTOTYPE(_thread_random_unlock); + +WEAK_ALIAS(_thread_random_lock); +WEAK_ALIAS(_thread_random_unlock); + void WEAK_NAME(_thread_malloc_lock)(void) { @@ -53,6 +59,18 @@ WEAK_NAME(_thread_arc4_lock)(void) void WEAK_NAME(_thread_arc4_unlock)(void) +{ + return; +} + +void +WEAK_NAME(_thread_random_lock)(void
Re: Threads related SIGSEGV in random.c
On Wednesday, September 19, 2012, Theo de Raadt wrote: arc4random() is also thread-safe (it has interal locking) and very desirable for other reasons. But no way to save state. The last part of this is intentional. Saving the state of pseudo random number generators is a stupid concept from the 80's. I see many rng functions behaving very differently. Is it a good idea to create a common locking layer on top of need-to-be-safe rng functions? Or we should deal only with original problem (and only port random.c code from netbsd)?
Re: Threads related SIGSEGV in random.c
On Wed, Sep 19, 2012 at 10:24 PM, Ted Unangst t...@tedunangst.com wrote: On Wed, Sep 19, 2012 at 18:50, Alexey Suslikov wrote: On Wednesday, September 19, 2012, Theo de Raadt wrote: arc4random() is also thread-safe (it has interal locking) and very desirable for other reasons. But no way to save state. The last part of this is intentional. Saving the state of pseudo random number generators is a stupid concept from the 80's. I see many rng functions behaving very differently. Is it a good idea to create a common locking layer on top of need-to-be-safe rng functions? Or we should deal only with original problem (and only port random.c code from netbsd)? just slap a mutex around it. Could you guide me how to rebuild/reinstall libc in a proper way?