Re: usb/xhci lock issue on HEAD

2018-07-11 Thread Phil Nelson
On Wednesday 11 July 2018 22:32:13 Patrick Welche wrote:
> "boot netbsd -a" ?

No, just "boot netbsd.wifi" to boot my special wifi kernel that I'm sure
will crash and don't want it doing an autoboot to.

--Phil


usb/xhci lock issue on HEAD

2018-07-11 Thread Phil Nelson
Hello,

   Has anyone run into this?   I created a special kernel for my 802.11 work
that removes a lot of unneeded drivers from my setup, stuff like raid, ntfs
and so forth.   I got a working kernel out of it.   Then, to work on the 802.11,
I commented out every 802.11 driver except the urtwn driver.   This new
kernel, without the other 802.11 drivers, now panics as follows:

panic: kernel diagnostic assertion "mutex_owned(&sc->sc_lock)" failed: file
"../../../../dev/usb/xhci.c", line 2049
[6.459749] cpu0: Begin traceback...
[6.459749] vpanic() at netbsd:vpanic+0x16f
[6.459749] ch_voltag_convert_in() at netbsd:ch_voltag_convert_in
[6.459749] xhci_softintr() at netbsd:xhci_softintr+0x5d7
[6.459749] xhci_poll() at netbsd:xhci_poll+0x37
[6.459749] ukbd_cngetc() at netbsd:ukbd_cngetc+0x113
[6.459749] wskbd_cngetc() at netbsd:wskbd_cngetc+0xc8
[6.459749] wsdisplay_getc() at netbsd:wsdisplay_getc+0x2f
[6.459749] cngetc() at netbsd:cngetc+0x4d
[6.459749] cngetsn() at netbsd:cngetsn+0x71
[6.459749] setroot() at netbsd:setroot+0x46f
[6.459749] main() at netbsd:main+0x4a5
[6.459749] cpu0: End traceback...
[6.459749] fatal breakpoint trap in supervisor mode
[6.459749] trap type 1 code 0 rip 0x8021de15 cs 0x8 rflags 0x202 
cr2 0 ilevel 0x8 rsp 0x81451a70
[6.459749] curlwp 0x81020920 pid 0.1 lowest kstack 
0x8144d2c0
fatal protection fault in supervisor mode
[6.459749] trap type 4 code 0 rip 0x8087caea cs 0x8 rflags 0x10282 
cr2 0 ilevel 0x8 rsp 0x81451480
[6.459749] curlwp 0x81020920 pid 0.1 lowest kstack 
0x8144d2c0
rebooting...

I'm not sure why this kernel is calling cngetsn() at setroot() time. 
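One guess is that setroot() cannot find a root device in this stripped-down
configuration and falls back to prompting on the console, which goes through
the USB keyboard's polled path.  For context, the assertion checks the usual
NetBSD locking contract between a soft interrupt handler and any polled path
that calls into it.  Here is a minimal sketch of that pattern; this is not
the actual xhci.c code, and the softc and function names are illustrative
only:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mutex.h>

struct my_softc {
	kmutex_t sc_lock;
	/* ... per-device state ... */
};

static void
my_softintr(void *arg)
{
	struct my_softc *sc = arg;

	/* The handler asserts its lock is held: the assertion that fires. */
	KASSERT(mutex_owned(&sc->sc_lock));
	/* ... process completed work ... */
}

static void
my_poll(struct my_softc *sc)
{
	/* A polled (console) caller must provide the locking the handler expects. */
	mutex_enter(&sc->sc_lock);
	my_softintr(sc);
	mutex_exit(&sc->sc_lock);
}

Judging from the backtrace, the polled console path (ukbd_cngetc ->
xhci_poll -> xhci_softintr) seems to reach the handler without that lock
held.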

Has anyone seen this before?

--Phil


Re: Too many PMC implementations

2018-07-11 Thread Kamil Rytarowski
On 11.07.2018 18:22, Maxime Villard wrote:
> Right now we have three (or more?) different implementations for
> Performance Monitoring Counters:
> 
>  * PMC: this one is MI. It is used only on one ARM model (xscale, I think).
>    There used to be x86 code for it, but it was broken, and I removed it.
>    The implementation comes with libpmc, a library we provide. The code
>    hasn't moved in the last 15 years. I don't like this implementation; it
>    is really invasive (see the numerous pmc.h files that are all empty).
> 
>  * X86PMC: this one is MD, and only available for x86. I wrote it myself.
>    The code is small (x86/pmc.c) and functional. The PMCs are system-wide
>    and retrieved on a per-CPU basis. But this implementation does not
>    support tracking; that is, we get numbers (about cache misses, for
>    example), but we don't know where they happened.
> 
>  * TPROF: this one is MI, but only x86 support is present. TPROF provides
>    the backend needed to support tracking: a device that userland can read
>    from in order to absorb the event samples produced by the kernel.
>    The backend is pretty good, but the frontend (where the user chooses
>    which PMC, etc.) is nonexistent; the CPU/event detection is not there
>    either. The backend is MI (/dev/tprof/tprof.c) and can be used on other
>    architectures. A module already exists, so it can be loaded dynamically
>    with modload.
> 
> I think it would be good to:
> 
>  * Remove PMC entirely. Then remove libpmc too.
> 
>  * Merge X86PMC into the x86 part of TPROF. That is to say, into
>    x86/tprof_*. Then remove X86PMC.
> 
>  * Later, maybe, someone will want to add other architectures in TPROF,
> like
>    all the recent ARMs.
> 
> Maxime

I'm not familiar with the internals myself, but from an API point of view,
something usable for porting rr (https://github.com/mozilla/rr) or even
Linux perf-top is highly desirable. I personally treat perf-top as the
gold standard.





Re: Too many PMC implementations

2018-07-11 Thread Jason Thorpe
Speaking as someone who was peripherally involved in the PMC flavor below, I 
have no objections to this.

> On Jul 11, 2018, at 9:22 AM, Maxime Villard  wrote:
> 
> Right now we have three (or more?) different implementations for Performance
> Monitoring Counters:
> 
> * PMC: this one is MI. It is used only on one ARM model (xscale, I think).
>   There used to be x86 code for it, but it was broken, and I removed it.
>   The implementation comes with libpmc, a library we provide. The code
>   hasn't moved in the last 15 years. I don't like this implementation; it is
>   really invasive (see the numerous pmc.h files that are all empty).
> 
> * X86PMC: this one is MD, and only available for x86. I wrote it myself.
>   The code is small (x86/pmc.c) and functional. The PMCs are system-wide
>   and retrieved on a per-CPU basis. But this implementation does not
>   support tracking; that is, we get numbers (about cache misses, for
>   example), but we don't know where they happened.
> 
> * TPROF: this one is MI, but only x86 support is present. TPROF provides
>   the backend needed to support tracking: a device that userland can read
>   from in order to absorb the event samples produced by the kernel.
>   The backend is pretty good, but the frontend (where the user chooses
>   which PMC, etc.) is nonexistent; the CPU/event detection is not there
>   either. The backend is MI (/dev/tprof/tprof.c) and can be used on other
>   architectures. A module already exists, so it can be loaded dynamically
>   with modload.
> 
> I think it would be good to:
> 
> * Remove PMC entirely. Then remove libpmc too.
> 
> * Merge X86PMC into the x86 part of TPROF. That is to say, into
>   x86/tprof_*. Then remove X86PMC.
> 
> * Later, maybe, someone will want to add other architectures in TPROF, like
>   all the recent ARMs.
> 
> Maxime

-- thorpej



Too many PMC implementations

2018-07-11 Thread Maxime Villard

Right now we have three (or more?) different implementations for Performance
Monitoring Counters:

 * PMC: this one is MI. It is used only on one ARM model (xscale, I think).
   There used to be x86 code for it, but it was broken, and I removed it.
   The implementation comes with libpmc, a library we provide. The code
   hasn't moved in the last 15 years. I don't like this implementation; it is
   really invasive (see the numerous pmc.h files that are all empty).

 * X86PMC: this one is MD, and only available for x86. I wrote it myself.
   The code is small (x86/pmc.c) and functional. The PMCs are system-wide
   and retrieved on a per-CPU basis. But this implementation does not
   support tracking; that is, we get numbers (about cache misses, for
   example), but we don't know where they happened.

 * TPROF: this one is MI, but only x86 support is present. TPROF provides
   the backend needed to support tracking: a device that userland can read
   from in order to absorb the event samples produced by the kernel.
   The backend is pretty good, but the frontend (where the user chooses
   which PMC, etc.) is nonexistent; the CPU/event detection is not there
   either. The backend is MI (/dev/tprof/tprof.c) and can be used on other
   architectures. A module already exists, so it can be loaded dynamically
   with modload.

I think it would be good to:

 * Remove PMC entirely. Then remove libpmc too.

 * Merge X86PMC into the x86 part of TPROF. That is to say, into
   x86/tprof_*. Then remove X86PMC.

 * Later, maybe, someone will want to add other architectures in TPROF, like
   all the recent ARMs.

Maxime


Re: 8.0 performance issue when running build.sh?

2018-07-11 Thread Kamil Rytarowski
On 11.07.2018 11:47, Takeshi Nakayama wrote:
>>>> Martin Husemann wrote
> 
>>> Another observation is that grep(1) on one NetBSD server is
>>> significantly slower between the switch from -7 to 8RC1.
>>
>> Please file separate PRs for each (and maybe provide some input files
>> to reproduce the issue).
> 
> Already filed:
>   http://gnats.netbsd.org/53241
> 
> -- Takeshi Nakayama
> 

Time of "LC_ALL=C grep" query:
0.18 real 0.08 user 0.09 sys

Time of "LC_ALL=pl_PL.UTF-8 grep":
15,94 real 15,74 user 0,18 sys


Good catch! That's roughly 200 times slower (in user time)!





Re: 8.0 performance issue when running build.sh?

2018-07-11 Thread Takeshi Nakayama
>>> Martin Husemann  wrote

> > Another observation is that grep(1) on one NetBSD server is
> > significantly slower between the switch from -7 to 8RC1.
> 
> Please file separate PRs for each (and maybe provide some input files
> to reproduce the issue).

Already filed:
  http://gnats.netbsd.org/53241

-- Takeshi Nakayama


Re: 8.0 performance issue when running build.sh?

2018-07-11 Thread Kamil Rytarowski
On 11.07.2018 09:09, Simon Burge wrote:
> Hi folks,
> 
> Martin Husemann wrote:
> 
>> On Tue, Jul 10, 2018 at 12:11:41PM +0200, Kamil Rytarowski wrote:
>>> After the switch from NetBSD-HEAD (version from 1 year ago) to 8.0RC2,
>>> the ld(1) linker has serious issues with linking Clang/LLVM single
>>> libraries within 20 minutes. This causes frequent timeouts on the NetBSD
>>> buildbot in the LLVM buildfarm. Timeouts were never observed in the
>>> past, today there might be few of them daily.
>>
>> Sounds like a binutils issue (or something like too little RAM available
>> on the machine).
> 
> Probably only tangentially related, but I wasn't able to link an amd64
> GENERIC kernel on a machine with 512MB of RAM, and even using "MKCTF=no
> NOCTF=yes" at least got something built, but it was extraordinarily slow
> to do so.  I never worried about raising a PR for that as I figured the
> world had moved on...
> 
> Cheers,
> Simon.
> 

Excessive RAM usage is certainly a contributing factor, but I have to
defer the profiling work and focus on other things for now. It's better
for me to focus on porting lld (roughly a 10x speedup compared to ld on
larger C++ codebases) before profiling.

Additionally, that LLVM build machine is a remote machine serving a
public service.





Re: 8.0 performance issue when running build.sh?

2018-07-11 Thread Simon Burge
Hi folks,

Martin Husemann wrote:

> On Tue, Jul 10, 2018 at 12:11:41PM +0200, Kamil Rytarowski wrote:
> > After the switch from NetBSD-HEAD (version from 1 year ago) to 8.0RC2,
> > the ld(1) linker has serious issues with linking Clang/LLVM single
> > libraries within 20 minutes. This causes frequent timeouts on the NetBSD
> > buildbot in the LLVM buildfarm. Timeouts were never observed in the
> > past, today there might be few of them daily.
>
> Sounds like a binutils issue (or something like too little RAM available
> on the machine).

Probably only tangentially related, but I wasn't able to link an amd64
GENERIC kernel on a machine with 512MB of RAM, and even using "MKCTF=no
NOCTF=yes" at least got something built, but it was extraordinarily slow
to do so.  I never worried about raising a PR for that as I figured the
world had moved on...

Cheers,
Simon.