Re: mutex question

2018-07-06 Thread Jason Thorpe



> On Jul 6, 2018, at 4:48 PM, Phil Nelson  wrote:
> 
> Hello,
> 
>The FreeBSD 802.11 code is using a call to mtx_sleep().  The define is:
> 
> #define mtx_sleep(chan, mtx, pri, wmesg, timo)  \
>_sleep((chan), &(mtx)->lock_object, (pri), (wmesg), \
>tick_sbt * (timo), 0, C_HARDCLOCK)

In NetBSD, you could use mtsleep(), but the best way to do this is with 
kcondvar_t's.  Each "wait channel" really becomes a discrete condition variable 
that is initialized separately, and you'd use cv_wait() for non-timeout sleeps 
and cv_timedwait() for timeout sleeps.  The "pri" argument is unneeded because 
the priority comes from the LWP that's sleeping, and the condvars do priority 
inheritance / boosting via turnstiles as needed.


> 
> 
> Just in case I can save time by getting an answer by asking before digging 
> deep ...
> does anyone know what I should translate this to in NetBSD?   Our mutex 
> routines
> do not appear to have any similar call.
> 
> --Phil

-- thorpej



mutex question

2018-07-06 Thread Phil Nelson
Hello,

The FreeBSD 802.11 code is using a call to mtx_sleep().  The define is:

#define mtx_sleep(chan, mtx, pri, wmesg, timo)  \
_sleep((chan), &(mtx)->lock_object, (pri), (wmesg), \
tick_sbt * (timo), 0, C_HARDCLOCK)


Just in case I can save time by getting an answer by asking before digging deep 
...
does anyone know what I should translate this to in NetBSD?   Our mutex routines
do not appear to have any similar call.

--Phil


Re: new errno ?

2018-07-06 Thread Phil Nelson
On Friday 06 July 2018 15:59:12 Jason Thorpe wrote:
> Anyway... in what situations is this absurd error code used in the 802.11 
> code?  EFAULT seems wrong because it means something very specific. 

The code is in ieee80211_output.c and says:

/* locate destination node */
switch (wh->i_fc[1] & IEEE80211_FC1_DIR_MASK) {
case IEEE80211_FC1_DIR_NODS:
case IEEE80211_FC1_DIR_FROMDS:
ni = ieee80211_find_txnode(vap, wh->i_addr1);
break;
case IEEE80211_FC1_DIR_TODS:
case IEEE80211_FC1_DIR_DSTODS:
ni = ieee80211_find_txnode(vap, wh->i_addr3);
break;
default:
senderr(EDOOFUS);
}

I agree,  EINVAL sounds closer.   Thanks.
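
As a rough, self-contained illustration of that substitution (not the actual
ieee80211_output.c code), the sketch below reproduces the switch with EINVAL
in the default arm.  The FC1_DIR_* macros are local stand-ins whose values are
assumed to match the IEEE80211_FC1_DIR_* definitions in the net80211 headers,
and dest_addr_field() is a hypothetical helper that only reports which address
field (1 or 3) the lookup would use.

```c
#include <errno.h>
#include <stdint.h>

/* Local stand-ins; values assumed to match the net80211 definitions. */
#define FC1_DIR_MASK   0x03
#define FC1_DIR_NODS   0x00 /* STA -> STA */
#define FC1_DIR_TODS   0x01 /* STA -> AP */
#define FC1_DIR_FROMDS 0x02 /* AP -> STA */
#define FC1_DIR_DSTODS 0x03 /* AP -> AP */

/*
 * Hypothetical helper: store in *field which address (1 or 3) names the
 * destination for a frame whose second frame-control byte is fc1.
 * Returns 0 on success, or EINVAL where the FreeBSD code uses EDOOFUS.
 */
static int
dest_addr_field(uint8_t fc1, int *field)
{
	switch (fc1 & FC1_DIR_MASK) {
	case FC1_DIR_NODS:
	case FC1_DIR_FROMDS:
		*field = 1;
		return 0;
	case FC1_DIR_TODS:
	case FC1_DIR_DSTODS:
		*field = 3;
		return 0;
	default:
		return EINVAL; /* was EDOOFUS */
	}
}
```

If the constants are as assumed above, the two-bit mask leaves only the four
handled values, so the default arm cannot be reached at all.  That fits the
"programming error" reading of the original code.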

--Phil


Re: new errno ?

2018-07-06 Thread Eitan Adler
On Fri, 6 Jul 2018 at 12:29, Phil Nelson  wrote:
>
> On Friday 06 July 2018 12:09:55 Greg Troxel wrote:
> >  I might just map it to EFAULT or EINVAL.
>
> I like this suggestion.  EFAULT

For those interested in some of the history:
https://lists.freebsd.org/pipermail/freebsd-hackers/2003-May/000791.html


-- 
Eitan Adler


Re: interesting skylake perf tidbit

2018-07-06 Thread Maxime Villard

On 18/06/2018 at 14:29, m...@netbsd.org wrote:

> joerg called it stupid and said we should use monitor, he's probably
> right. new arm also has a similar thing.


The thing is, there are several SPINLOCK_BACKOFF()s that we just can't
replace by monitor. For example because we want to measure the delay under
LOCKDEBUG.


> (Perhaps a tunable read-mostly SPINLOCK_BACKOFF_* would be worthwhile?)


You mean, like, a configurable SPINLOCK_BACKOFF_MIN? Yes, that seems doable.

I guess we should do both; use "monitor" when possible, and in the places
that are still required to use "pause", use a lower BACKOFF_MIN (set at
boot time, depending on the cpu model) to compensate for the increased CPU
latency.

There will still be a loss in this case, because the cycle ratio is 14
(10c -> 140c) but BACKOFF_MIN is only 4, so it cannot be lowered by the
same factor.
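
The arithmetic can be sketched as follows.  The helper names are hypothetical,
and the model is deliberately simple: it assumes the shortest backoff costs
BACKOFF_MIN times the "pause" latency and nothing else.  With the figures
above, the old minimum is 4 * 10 = 40 cycles.  Scaling 4 down by 14 would give
less than one iteration, so the best achievable minimum is one "pause", i.e.
140 cycles, or 3.5 times the old value.

```c
/*
 * Hypothetical helper: scale a backoff count so that the total spin time
 * stays roughly constant when the cost of one "pause" changes.  The result
 * is clamped to 1 because at least one pause is always executed.
 * new_cycles is assumed to be non-zero.
 */
static unsigned
scaled_backoff_min(unsigned old_min, unsigned old_cycles, unsigned new_cycles)
{
	unsigned scaled = (old_min * old_cycles) / new_cycles;

	return scaled > 0 ? scaled : 1;
}

/* Cycles spent in the shortest possible backoff under this model. */
static unsigned
min_backoff_cycles(unsigned min_count, unsigned pause_cycles)
{
	return min_count * pause_cycles;
}
```

For example, scaled_backoff_min(4, 10, 140) yields 1, and
min_backoff_cycles(1, 140) is 140 against min_backoff_cycles(4, 10) at 40.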

By the way, I'm wondering whether KabyLake and later generations have the same
increased latency; if they do, we definitely need to fix that.

I guess you have a Skylake; is there any visible effect?


> Also on the same topic: SPINLOCK_RUN_HOOK seems to be unused?


Yes, please remove.


Re: new errno ?

2018-07-06 Thread Phil Nelson
On Friday 06 July 2018 12:09:55 Greg Troxel wrote:
>  I might just map it to EFAULT or EINVAL.

I like this suggestion.  EFAULT

--Phil


Re: new errno ?

2018-07-06 Thread Greg Troxel

Phil Nelson  writes:

> Hello,
>
> In working on the 802.11 refresh, I ran into a new errno code from 
> FreeBSD:
>
> #define EDOOFUS 88  /* Programming error */
>
> Shall we add this one?  (Most likely with a different number since 88 is 
> taken
> in the NetBSD errno.h.)
>
>I could use EPROTO instead, but 

My immediate reaction is not to add it. It's pretty clearly not in
posix, unlikely to be added, and sounds unprofessional.

It seems like it would be used in cases where there is a KASSERT in the
non-DIAGNOSTIC case.  I might just map it to EFAULT or EINVAL.




Re: new errno ?

2018-07-06 Thread Christos Zoulas
In article <201807061021.35171.p...@netbsd.org>,
Phil Nelson   wrote:
>Hello,
>
>In working on the 802.11 refresh, I ran into a new errno code from FreeBSD:
>
>#define EDOOFUS 88  /* Programming error */
>
>Shall we add this one?  (Most likely with a different number since
>88 is taken
>in the NetBSD errno.h.)
>
>   I could use EPROTO instead, but 

Of all the things to take from FreeBSD, this must be one of the least
desirable. ESTINKS :-)

christos



new errno ?

2018-07-06 Thread Phil Nelson
Hello,

In working on the 802.11 refresh, I ran into a new errno code from FreeBSD:

#define EDOOFUS 88  /* Programming error */

Shall we add this one?  (Most likely with a different number since 88 is 
taken
in the NetBSD errno.h.)

   I could use EPROTO instead, but 

--Phil


Re: 8.0 performance issue when running build.sh?

2018-07-06 Thread Maxime Villard

On 06/07/2018 at 16:48, Martin Husemann wrote:

> On Fri, Jul 06, 2018 at 04:40:48PM +0200, Maxime Villard wrote:
>>> These are all successful builds of HEAD for alpha that happened after June 1:
>>
>> What does that mean, are you building something *on* an Alpha CPU, or are
>> you building the Alpha port on another CPU?
>
> It is all about the releng auto build cluster, which is running amd64.


Then it is likely caused by two things:

 * in EagerFPU I made a mistake initially, and it caused the FPU state to
   be restored when the kernel was leaving a softint. I sent a pullup
   already for netbsd-8, but it hasn't yet been applied. The fix removes a
   save+restore, which improves performance.

 * the XSTATE_BV bitmap is not zeroed at execve time in NetBSD-8. It
   probably causes some performance loss, because XRSTOR always restores
   each FPU state instead of just the ones we used. In recent CPUs there
   are many FPU states and we use only a few in the base system, so the
   extra restoring costs us. Even more so with EagerFPU, I guess, because
   we do save+restore unconditionally, rather than on-demand. I fixed that
   in October 2017 in -current, but it didn't make it to -8. I guess it
   will have to, now.

Beyond that we need to use XSAVEOPT, too.

Maxime


8.0 performance issue when running build.sh?

2018-07-06 Thread Martin Husemann
I have no scientific data yet, but I just noticed that build times on the
auto build cluster rose dramatically after it was updated
to run NetBSD 8.0 RC2.

Since builds move around build slaves sometimes (not exactly randomly,
but anyway) I picked the alpha port as an example (the first few
architectures in the alphabetical list get build slaves assigned pretty
consistently).

These are all successful builds of HEAD for alpha that happened after June 1:

Build Slave Seconds
HEAD:201806011030Z b43 3807
HEAD:201806020940Z b43 3785
HEAD:201806021830Z b43 3797
HEAD:201806040210Z b43 3814
HEAD:201806050210Z b43 3814
HEAD:201806050840Z b43 3766
HEAD:201806052050Z b43 3835
HEAD:201806060330Z b43 3817
HEAD:201806061300Z b43 3771
HEAD:201806062340Z b43 3825
HEAD:201806071500Z b43 3780
HEAD:201806081600Z b43 3800
HEAD:201806090120Z b43 3814
HEAD:201806091820Z b43 3816
HEAD:201806100640Z b43 3803
HEAD:201806101910Z b43 3806
HEAD:201806110430Z b43 3814
HEAD:201806112100Z b43 3822
HEAD:201806120750Z b43 3770
HEAD:201806122120Z b43 3817
HEAD:201806130400Z b43 3794
HEAD:201806131340Z b43 3820
HEAD:201806140020Z b43 3803
HEAD:201806151330Z b43 3806
HEAD:201806152250Z b43 3817
HEAD:201806161530Z b43 3810
HEAD:201806170310Z b43 3826
HEAD:201806171540Z b43 3788
HEAD:201806180110Z b43 3794
HEAD:201806181740Z b43 3827
HEAD:201806190430Z b43 3805
HEAD:201806191750Z b43 3763
HEAD:201806200020Z b43 3777
HEAD:201806200950Z b43 3838
HEAD:201806202030Z b43 3815
HEAD:201806211150Z b43 3775
HEAD:201806221250Z b43 3804
HEAD:20180610Z b43 3840
HEAD:201806231440Z b43 3800
HEAD:201806240250Z b43 3826
HEAD:201806242110Z b43 3797
HEAD:201806250640Z b43 3763
HEAD:201806260450Z b43 3812
HEAD:201806261520Z b43 3801
HEAD:201806270340Z b43 3656
HEAD:201806270810Z b43 3820
HEAD:201806271700Z b43 3763
HEAD:201806280320Z b43 3787
HEAD:201806281810Z b43 3790
HEAD:201806301810Z b43 5583
HEAD:201807012030Z b43 5617
HEAD:201807021930Z b43 5569
HEAD:201807030940Z b43 4246
HEAD:201807031950Z b43 4192
HEAD:201807042110Z b43 4256
HEAD:201807051620Z b43 4212

You will easily guess when the system was upgraded.

Average "alpha" build time was 3800s before and 4811s after the update.
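
For anyone who wants to re-derive those figures, a small sketch follows.  The
array holds the seven post-upgrade times copied from the table above; the
function names are hypothetical.  The mean of those seven values rounds to
4811s, which against 3800s is an increase of roughly 27%.

```c
#include <stddef.h>

/* Post-upgrade build times in seconds, copied from the table above. */
static const int after_upgrade[] = {
	5583, 5617, 5569, 4246, 4192, 4256, 4212
};

/* Arithmetic mean of n build times; returns 0.0 for an empty list. */
static double
mean_seconds(const int *v, size_t n)
{
	long sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return n != 0 ? (double)sum / (double)n : 0.0;
}

/* Relative increase from "before" to "after", in percent. */
static double
slowdown_percent(double before, double after)
{
	return (after / before - 1.0) * 100.0;
}
```

The mean hides some structure: the first three post-upgrade builds took about
5600s each and the last four about 4200s each.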

I would not worry a lot about this, as other changes have been made to HEAD
so the things being built are not the same, but overall we see a very similar
pattern even for builds of branches that did not significantly change.

For some reason the netbsd-7* builds seem to be the worst affected, they went
from ~4h to > 12h.


As I said, no scientific data here - but ideas or explanations welcome!

Martin

P.S.: the CPUs are affected by all the recentish issues, we have svs
and fpu_eager enabled, but also mprotect, aslr, 

P.P.S.: you can see overall build times here:
http://releng.netbsd.org/cgi-bin/builds.cgi