Re: pthread_getspecific is too slow

2013-09-30 Thread John Carr
Here is a diff relative of OpenBSD 5.3 to avoid taking a lock in
pthread_getspecific in the common case where storage for the key is
already allocated.

I have only tested this patch for my use case where a single key is
created once and used many times.  In that case, the bottleneck I
observed is gone.  My parallel program sees linear speedup.

This patch would be unsafe if elements were ever removed from the
per-thread list of allocated storage.  In the 5.3 implementation the
list can only grow.  Its size is bounded by PTHREAD_KEYS_MAX so I
do not consider this a leak.  Possibly it should be an array instead
of a list.

Index: rthread_tls.c
===
RCS file: /cvs/src/lib/librthread/rthread_tls.c,v
retrieving revision 1.13
diff -c -r1.13 rthread_tls.c
*** rthread_tls.c	6 Nov 2011 11:48:59 -	1.13
--- rthread_tls.c	30 Sep 2013 14:27:13 -
***
*** 96,102 
  	struct rthread_storage *rs;
  	pthread_t self;
  
! 	_spinlock(rkeyslock);
  	if (!rkeys[key].used) {
  		rs = NULL;
  		goto out;
--- 96,107 
  	struct rthread_storage *rs;
  	pthread_t self;
  
! 	/* No lock is needed if
! 	   (1) the key is not valid (undefined behavior)
! 	   (2) storage is already allocated for the key (the linked
! 	   list is only ever modified by adding to the head)
! 	*/
! 
  	if (!rkeys[key].used) {
  		rs = NULL;
  		goto out;
***
*** 109,125 
  			break;
  	}
  	if (!rs) {
! 		rs = calloc(1, sizeof(*rs));
! 		if (!rs)
! 			goto out;
! 		rs-keyid = key;
! 		rs-data = NULL;
! 		rs-next = self-local_storage;
! 		self-local_storage = rs;
  	}
  
  out:
- 	_spinunlock(rkeyslock);
  	return (rs);
  }
  
--- 114,138 
  			break;
  	}
  	if (!rs) {
! 		_spinlock(rkeyslock);
! 		/* Repeat search with lock held */
! 		for (rs = self-local_storage; rs; rs = rs-next) {
! 			if (rs-keyid == key)
! break;
! 		}
! 		if (!rs) {
! 			rs = calloc(1, sizeof(*rs));
! 			if (rs) {
! rs-keyid = key;
! rs-data = NULL;
! rs-next = self-local_storage;
! self-local_storage = rs;
! 			}
! 		}
! 		_spinunlock(rkeyslock);
  	}
  
  out:
  	return (rs);
  }
  
 
 Problem in a nutshell: pthread_getspecific serializes execution of
 threads using it.  Without a better implementation my program doesn't
 work on OpenBSD.
 
 I am trying to port Cilk+ to OpenBSD (5.3).
 
 Cilk+ is a multithreaded extension to C/C++.  It adds some bookkeeping
 operations in the prologue of some functions.  In many Cilk programs
 most function calls do this bookkeeping (dynamic count, not static).
 
 The per-call bookkeeping calls pthread_getspecific.  pthread_getspecific
 takes out a spinlock.  The lock is apparently needed in case of a race
 with pthread_key_delete.  This is unlikely to happen, but I suppose it
 is possible.  Every function call in this multithreaded program
 serializes waiting on the lock.  Also, the cache line with the lock is
 constantly moving between processors.
 
 This is worse than useless for Cilk.  You're much better off with a
 single threaded program.
 
 An older version of Cilk used a thread local storage class (__thread).
 If memory serves, the switch to pthread_getspecific was driven by a few
 considerations:
 
 1. Thread local variables don't get along well with shared libraries.
 
 2. Thread local variables are less portable.  OpenBSD doesn't really
 support them, for example. They are emulated with pthread_getspecific.
 
 3. On Linux/x86 pthread_getspecific is very fast, essentially a move
 instruction with a segment override.
 
 It seems to me the implementation of pthread_getspecific doesn't need to
 be as slow as it is.
 
 It ought to be possible to have multiple readers be always nonblocking
 as long as the key list doesn't change, and possibly even if it does
 change.  pthread_getspecific only needs a read lock rather than a mutex.
 The rwlock in librthread starts with a spinlock, so it's not the answer.
 
 Any thoughts?
 


Re: pthread_getspecific is too slow

2013-09-30 Thread John Carr
I got an email saying unified diff is preferred, so here's a resend
in that format.

Index: rthread_tls.c
===
RCS file: /cvs/src/lib/librthread/rthread_tls.c,v
retrieving revision 1.13
diff -u -r1.13 rthread_tls.c
--- rthread_tls.c	6 Nov 2011 11:48:59 -	1.13
+++ rthread_tls.c	30 Sep 2013 14:48:18 -
@@ -96,7 +96,12 @@
 	struct rthread_storage *rs;
 	pthread_t self;
 
-	_spinlock(rkeyslock);
+	/* No lock is needed if
+	   (1) the key is not valid (undefined behavior)
+	   (2) storage is already allocated for the key (the linked
+	   list is only ever modified by adding to the head)
+	*/
+
 	if (!rkeys[key].used) {
 		rs = NULL;
 		goto out;
@@ -109,17 +114,25 @@
 			break;
 	}
 	if (!rs) {
-		rs = calloc(1, sizeof(*rs));
-		if (!rs)
-			goto out;
-		rs-keyid = key;
-		rs-data = NULL;
-		rs-next = self-local_storage;
-		self-local_storage = rs;
+		_spinlock(rkeyslock);
+		/* Repeat search with lock held */
+		for (rs = self-local_storage; rs; rs = rs-next) {
+			if (rs-keyid == key)
+break;
+		}
+		if (!rs) {
+			rs = calloc(1, sizeof(*rs));
+			if (rs) {
+rs-keyid = key;
+rs-data = NULL;
+rs-next = self-local_storage;
+self-local_storage = rs;
+			}
+		}
+		_spinunlock(rkeyslock);
 	}
 
 out:
-	_spinunlock(rkeyslock);
 	return (rs);
 }
 
 Here is a diff relative of OpenBSD 5.3 to avoid taking a lock in
 pthread_getspecific in the common case where storage for the key is
 already allocated.
 
 I have only tested this patch for my use case where a single key is
 created once and used many times.  In that case, the bottleneck I
 observed is gone.  My parallel program sees linear speedup.
 
 This patch would be unsafe if elements were ever removed from the
 per-thread list of allocated storage.  In the 5.3 implementation the
 list can only grow.  Its size is bounded by PTHREAD_KEYS_MAX so I
 do not consider this a leak.  Possibly it should be an array instead
 of a list.
 


Re: openbsd ioctl fix (in6.c)

2013-09-30 Thread Alexander Bluhm
On Wed, Sep 18, 2013 at 12:01:10AM -0700, Loganaden Velvindron wrote:
 Index: in6.c
 ===
 RCS file: /cvs/src/sys/netinet6/in6.c,v
 retrieving revision 1.118
 diff -u -p -r1.118 in6.c
 --- in6.c   26 Aug 2013 07:15:58 -  1.118
 +++ in6.c   18 Sep 2013 06:54:13 -
 @@ -426,8 +426,11 @@ in6_control(struct socket *so, u_long cm
 sa6 = ifr-ifr_addr;
 break;
 case SIOCSIFADDR:
 +   case SIOCSIFDSTADDR:
 +   case SIOCSIFBRDADDR:
 +   case SIOCSIFNETMASK:
 /*
 -* Do not pass this ioctl to driver handler since it is not
 +* Do not pass those ioctl to driver handler since they are 
 not
  * properly setup. Instead just error out.
  */
 return (EOPNOTSUPP);

This diff uses spaces instead of tabs.  Please use tabs to make
diffs apply cleanly.

The errno EAFNOSUPPORT Address family not supported by protocol
family is more specific for that error, at least if_ppp and if_sl
use that.

anyway, code is correct, OK bluhm@



revision 1.201 of sys/conf/GENERIC (enable mpath, rdac and sym) breaks suspend on Thinkpad X200s

2013-09-30 Thread Andreas Bartelt

Hi,

with this patch enabled on a Thinkpad X200s, the system doesn't wake up 
anymore after being suspended via apm -z. This problem is reproducible.


dmesg before and after this patch is attached to this mail.

Best Regards
Andreas

OpenBSD 5.4-current (GENERIC.MP) #0: Mon Sep 30 22:21:24 CEST 2013
a...@obsd.bartula.de:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 4166717440 (3973MB)
avail mem = 4047679488 (3860MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xe0010 (68 entries)
bios0: vendor LENOVO version 6DET72WW (3.22 ) date 10/25/2012
bios0: LENOVO 746988G
acpi0 at bios0: rev 2
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SSDT ECDT APIC MCFG HPET SLIC BOOT ASF! SSDT TCPA DMAR 
SSDT SSDT SSDT
acpi0: wakeup devices LID_(S3) SLPB(S3) IGBE(S4) EXP0(S4) EXP1(S4) EXP2(S4) 
EXP3(S4) USB0(S3) USB3(S3) USB5(S3) EHC0(S3) EHC1(S3) HDEF(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpiec0 at acpi0
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM)2 Duo CPU L9600 @ 2.13GHz, 2128.28 MHz
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF
cpu0: 6MB 64b/line 16-way L2 cache
cpu0: smt 0, core 0, package 0
cpu0: apic clock running at 266MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM)2 Duo CPU L9600 @ 2.13GHz, 2128.01 MHz
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF
cpu1: 6MB 64b/line 16-way L2 cache
cpu1: smt 0, core 1, package 0
ioapic0 at mainbus0: apid 1 pa 0xfec0, version 20, 24 pins
ioapic0: misconfigured as apic 2, remapped to apid 1
acpimcfg0 at acpi0 addr 0xe000, bus 0-63
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (AGP_)
acpiprt2 at acpi0: bus 2 (EXP0)
acpiprt3 at acpi0: bus 3 (EXP1)
acpiprt4 at acpi0: bus -1 (EXP2)
acpiprt5 at acpi0: bus -1 (EXP3)
acpicpu0 at acpi0: C3, C2, C1, PSS
acpicpu1 at acpi0: C3, C2, C1, PSS
acpipwrres0 at acpi0: PUBS
acpitz0 at acpi0: critical temperature is 127 degC
acpitz1 at acpi0: critical temperature is 104 degC
acpibtn0 at acpi0: LID_
acpibtn1 at acpi0: SLPB
acpibat0 at acpi0: BAT0 model 42T4650 serial   355 type LION oem Panasonic
acpibat1 at acpi0: BAT1 not present
acpiac0 at acpi0: AC unit offline
acpithinkpad0 at acpi0
acpidock0 at acpi0: GDCK not docked (0)
cpu0: Enhanced SpeedStep 2128 MHz: speeds: 2134, 2133, 1600, 800 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 Intel GM45 Host rev 0x07
vga1 at pci0 dev 2 function 0 Intel GM45 Video rev 0x07
intagp0 at vga1
agp0 at intagp0: aperture at 0xd000, size 0x1000
inteldrm0 at vga1
drm0 at inteldrm0
inteldrm0: 1440x900
wsdisplay0 at vga1 mux 1: console (std, vt100 emulation)
wsdisplay0: screen 1-5 added (std, vt100 emulation)
Intel GM45 Video rev 0x07 at pci0 dev 2 function 1 not configured
Intel GM45 HECI rev 0x07 at pci0 dev 3 function 0 not configured
em0 at pci0 dev 25 function 0 Intel ICH9 IGP M AMT rev 0x03: msi, address 
00:1f:16:1c:a3:34
uhci0 at pci0 dev 26 function 0 Intel 82801I USB rev 0x03: apic 1 int 20
uhci1 at pci0 dev 26 function 1 Intel 82801I USB rev 0x03: apic 1 int 21
uhci2 at pci0 dev 26 function 2 Intel 82801I USB rev 0x03: apic 1 int 22
ehci0 at pci0 dev 26 function 7 Intel 82801I USB rev 0x03: apic 1 int 23
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 Intel EHCI root hub rev 2.00/1.00 addr 1
azalia0 at pci0 dev 27 function 0 Intel 82801I HD Audio rev 0x03: msi
azalia0: codecs: Conexant CX20561
audio0 at azalia0
ppb0 at pci0 dev 28 function 0 Intel 82801I PCIE rev 0x03: msi
pci1 at ppb0 bus 2
ppb1 at pci0 dev 28 function 1 Intel 82801I PCIE rev 0x03: msi
pci2 at ppb1 bus 3
iwn0 at pci2 dev 0 function 0 Intel WiFi Link 5300 rev 0x00: msi, MIMO 3T3R, 
MoW, address 00:21:6a:4f:9c:2a
uhci3 at pci0 dev 29 function 0 Intel 82801I USB rev 0x03: apic 1 int 16
uhci4 at pci0 dev 29 function 1 Intel 82801I USB rev 0x03: apic 1 int 17
uhci5 at pci0 dev 29 function 2 Intel 82801I USB rev 0x03: apic 1 int 18
ehci1 at pci0 dev 29 function 7 Intel 82801I USB rev 0x03: apic 1 int 19
usb1 at ehci1: USB revision 2.0
uhub1 at usb1 Intel EHCI root hub rev 2.00/1.00 addr 1
ppb2 at pci0 dev 30 function 0 Intel 82801BAM Hub-to-PCI rev 0x93
pci3 at ppb2 bus 13
pcib0 at pci0 dev 31 function 0 Intel 82801IEM LPC rev 0x03
ahci0 at pci0 dev 31 function 2 Intel 82801I AHCI rev 0x03: msi, AHCI 1.2
scsibus1 at ahci0: 32 targets
sd0 at scsibus1 targ 0 lun 0: ATA, INTEL SSDSA2CW12, 4PC1 SCSI3 0/direct 
fixed naa.5001517bb27a077b
sd0: 114473MB, 512 bytes/sector, 234441648 sectors, thin
ichiic0 at pci0 dev 31 function 3 Intel 82801I SMBus rev 0x03: apic 1 

Re: openbsd ioctl fix (in6.c)

2013-09-30 Thread Loganaden Velvindron
On Mon, Sep 30, 2013 at 10:51:47PM +0200, Alexander Bluhm wrote:
 On Wed, Sep 18, 2013 at 12:01:10AM -0700, Loganaden Velvindron wrote:
  Index: in6.c
  ===
  RCS file: /cvs/src/sys/netinet6/in6.c,v
  retrieving revision 1.118
  diff -u -p -r1.118 in6.c
  --- in6.c   26 Aug 2013 07:15:58 -  1.118
  +++ in6.c   18 Sep 2013 06:54:13 -
  @@ -426,8 +426,11 @@ in6_control(struct socket *so, u_long cm
  sa6 = ifr-ifr_addr;
  break;
  case SIOCSIFADDR:
  +   case SIOCSIFDSTADDR:
  +   case SIOCSIFBRDADDR:
  +   case SIOCSIFNETMASK:
  /*
  -* Do not pass this ioctl to driver handler since it is not
  +* Do not pass those ioctl to driver handler since they are 
  not
   * properly setup. Instead just error out.
   */
  return (EOPNOTSUPP);
 
 This diff uses spaces instead of tabs.  Please use tabs to make
 diffs apply cleanly.
 
 The errno EAFNOSUPPORT Address family not supported by protocol
 family is more specific for that error, at least if_ppp and if_sl
 use that.

Fixed style issues:

Index: netinet6/in6.c
===
RCS file: /cvs/src/sys/netinet6/in6.c,v
retrieving revision 1.118
diff -u -p -r1.118 in6.c
--- netinet6/in6.c  26 Aug 2013 07:15:58 -  1.118
+++ netinet6/in6.c  30 Sep 2013 21:14:43 -
@@ -426,8 +426,11 @@ in6_control(struct socket *so, u_long cm
sa6 = ifr-ifr_addr;
break;
case SIOCSIFADDR:
+   case SIOCSIFDSTADDR:
+   case SIOCSIFBRDADDR:
+   case SIOCSIFNETMASK:
/*
-* Do not pass this ioctl to driver handler since it is not
+* Do not pass those ioctl to driver handler since they are not
 * properly setup. Instead just error out.
 */
return (EOPNOTSUPP);

 
 anyway, code is correct, OK bluhm@

Can you please elaborate a bit concerning the cleanup for if_tun.c that
mpi@ mentioned ?



spamd: greyreader failed (Error 2) (was: Re: CVS: cvs.openbsd.org: src)

2013-09-30 Thread Constantine A. Murenin
Hello,

On OpenBSD 5.2 amd64, my spamd (which is used very selectively through 
pf(4)) seems to have died 20 days ago, after continuously running for 
many months, with the following final words in the logs:

Sep 10 09:49:25 Cns spamd[5220]: 87.225.1.10: connected (1/1), lists: 
spamd-greytrap
Sep 10 09:54:47 Cns spamd[29672]: greyreader failed (Error 2)

There were minor changes to grey.c since 5.2.

Back in 5.2, it would appear that at the end of grey.c#greyreader(), 
after the `grey` pipe is exhausted, greyreader() returns at the end 
of the function, and then grey.c#greywatcher() issues the error 
message, as above, unconditionally after greyreader() returns, 
and also exits.

Whereas it remains to be seen what kind of bug I'm facing here 
(Google reveals I'm not alone), it would appear that changes 
introduced in 5.4-current would no longer cause spamd to report 
such situation, because the 0 that would still be returned at the 
end of greyreader() would no longer cause greywatcher() to produce 
the error message that I have received (it'll still quit, though).

http://www.openbsd.org/cgi-bin/cvsweb/src/libexec/spamd/grey.c.diff?r1=1.52;r2=1.53;f=h
http://www.openbsd.org/cgi-bin/cvsweb/src/libexec/spamd/grey.c.diff?r2=1.53r1=1.52f=u

Basically, the below part of the recent commit seems to modify the 
behaviour that was not advertised to have been modified:


@@ -979,7 +979,7 @@ greyreader(void)
sync = 1;
if (grey == NULL) {
syslog_r(LOG_ERR, sdata, No greylist pipe stream!\n);
-   exit(1);
+   return (-1);
}
 
/* grab trap suffixes */
@@ -1140,10 +1140,11 @@ greywatcher(void)
 */
close(pfdev);
setproctitle((%s update), PATH_SPAMD_DB);
-   greyreader();
-   syslog_r(LOG_ERR, sdata, greyreader failed (%m));
-   /* NOTREACHED */
-   _exit(1);
+   if (greyreader() == -1) {
+   syslog_r(LOG_ERR, sdata, greyreader failed (%m));
+   _exit(1);
+   }
+   _exit(0);
}
 
 

http://bxr.su/o/libexec/spamd/grey.c#greywatcher

Is it indeed intentional that greyreader failed (Error 2) will be 
produced no more as a result of the above change?

Any ideas what caused greyreader to fail on my OpenBSD 5.2 in the first place?

Cheers,
Constantine.


On 2013-W34-3 10:13 -0600, Todd C. Miller wrote:
 CVSROOT:  /cvs
 Module name:  src
 Changes by:   mill...@cvs.openbsd.org 2013/08/21 10:13:30

 Modified files:
   usr.sbin/spamdb: Makefile spamdb.c 
   libexec/spamd  : Makefile grey.c grey.h 
   libexec/spamlogd: Makefile spamlogd.c 
 Added files:
   libexec/spamd  : gdcopy.c 

 Log message:
 Remove the use of time_t in the greylist db file and use int64_t instead
 with backwards compatibility for records with 32-bit times.
 OK deraadt@ beck@