Re: Audio buffer positions

2019-12-06 Thread Michael van Elst
dam...@zamaudio.com (Damien Zammit) writes:

>Can we use the arrival of audio IRQs to trigger measurement of the
>current buffer position and a high-resolution timer timestamp combined
>with system time,
>and use these measurements to synchronise a delay-locked loop (DLL)
>to provide sub-sample-accurate estimates of the buffer position between
>arrivals of audio IRQs?

Not sure what a "sub-sample" buffer position would be.

Currently you get the position with "block" precision, and for regular
hardware that's "sampling the buffer pointer in the audio interrupt".

The "block" size is automatically chosen according to the configured
latency parameter (within bounds), the default precision is 40ms. Reduce
it to 4ms (with sysctl) and the result is good enough for anything and
the overhead is still rather small (does a VAX have audio?).

What you could do is interpolate the buffer pointer using the system
time and the configured bitrate; you need to be careful to keep
it monotonic.

Here is a patch that does this:

http://ftp.netbsd.org/pub/NetBSD/misc/mlelstv/audio.diff

It adds a sysctl node that lets you choose among 3 modes for calculating
the buffer position.

0 - report the buffer start
1 - interpolate between start and end using time / bitrate
2 - report the buffer end (that's the unpatched behaviour)
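
Roughly, and with invented names (this is a sketch, not the code in
audio.diff), the three modes plus the monotonic clamp look like this:

#include <stdint.h>

/*
 * Sketch only; every name here is made up.  blk_start/blk_end bound
 * the block being played, t_start_us is the system time at which it
 * started playing, rate_bps is the configured byte rate.
 */
static uint64_t last_pos;		/* keeps the result monotonic */

uint64_t
buffer_position(int mode, uint64_t blk_start, uint64_t blk_end,
    uint64_t t_start_us, uint64_t now_us, uint64_t rate_bps)
{
	uint64_t pos;

	switch (mode) {
	case 0:				/* report the buffer start */
		pos = blk_start;
		break;
	case 1:				/* interpolate with time / bitrate */
		pos = blk_start + (now_us - t_start_us) * rate_bps / 1000000;
		if (pos > blk_end)	/* don't run past the block */
			pos = blk_end;
		break;
	default:			/* 2: report the buffer end */
		pos = blk_end;
		break;
	}
	if (pos < last_pos)		/* never step backwards */
		pos = last_pos;
	last_pos = pos;
	return pos;
}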


The interpolation mode gets you precise timing... if your application uses
the buffer offset correctly. But it won't help for others that just
want precise wakeup times. For these you actually need the smaller buffers.

-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: Why NetBSD x86's bus_space_barrier does not use [sml]fence?

2019-12-06 Thread Shoichi Yamaguchi
Thank you for teaching me about the background.
I was also able to test the driver with the related changes applied.

Thanks,
yamaguchi


Audio buffer positions

2019-12-06 Thread Damien Zammit
Hi there,

I would like to suggest an idea for improving the latency reporting to userland 
for the audio API:

Can we use the arrival of audio IRQs to trigger measurement of the
current buffer position and a high-resolution timer timestamp combined
with system time,
and use these measurements to synchronise a delay-locked loop (DLL)
to provide sub-sample-accurate estimates of the buffer position between
arrivals of audio IRQs?

In this way, the AUDIO_GETIOFFS and AUDIO_GETOOFFS ioctls could return 
sub-sample-accurate buffer positions.

I don't think the external-facing API would need to change at all.
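
To sketch the idea in userland-style C (invented names, and the usual
second-order audio DLL formulation; not a proposed kernel
implementation):

#include <math.h>

/*
 * Each audio IRQ feeds the observed boundary time "t" (in seconds)
 * into the loop, which converges on smoothed estimates t0/t1 of the
 * current and next period boundaries.  Any system time in between can
 * then be mapped to a sub-sample position.
 */
struct dll {
	double t0, t1;		/* filtered boundary times */
	double e2;		/* filtered period estimate */
	double b, c;		/* loop gains */
};

void
dll_init(struct dll *d, double now, double period, double bandwidth)
{
	double w = 2.0 * M_PI * bandwidth * period;

	d->b = sqrt(2.0) * w;
	d->c = w * w;
	d->e2 = period;
	d->t0 = now;
	d->t1 = now + period;
}

void
dll_update(struct dll *d, double t)	/* call once per audio IRQ */
{
	double e = t - d->t1;		/* loop error */

	d->t0 = d->t1;
	d->t1 += d->b * e + d->e2;
	d->e2 += d->c * e;
}

double
dll_position(const struct dll *d, double now)
{
	/* fraction of the current period elapsed at time "now" */
	return (now - d->t0) / (d->t1 - d->t0);
}

Multiplying that fraction by the period length in frames would give the
sub-sample offset the ioctls could report.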

--

Damien Zammit
ZamAudio


Re: modules item #14 revisited

2019-12-06 Thread Christos Zoulas
In article <20191207024224.1b0d417f...@rebar.astron.com>,
Christos Zoulas  wrote:
>
>Hi,
>
>This is a quick and dirty implementation of:
>
>http://mail-index.NetBSD.org/current-users/2009/05/10/msg009372.html
>
>to use:
>$ echo KERNEL_DIR=yes >> /etc/mk.conf
># apply the enclosed patch
>$ mv /netbsd{,.old}
>$ mkdir -p /netbsd/modules

$ make && make install in src/share/mk

># build a new kernel and put it in /netbsd/kernel
>$ make && make install in src/sys/modules
>$ make && make install in sys/arch/i386/stand/boot

>$ cp /usr/mdec/boot /
>
>There are quite a few open issues:
>- All ports need to provide get_booted_kernel()
>  Are there ports where we can't get the booted kernel name? What do we do
>  there?
>- Make install in modules always overwrites the modules in /netbsd/
>- Does each kernel.tgz need to pack all the modules in the sets or
>  do we keep the modules.tgz that unpacks in /netbsd/modules?
>- How about the other kernels? do they unpack in /KERNEL-NAME/kernel?
>- Right now there is fallback code in the boot blocks for just the
>  kernel but not the old module path (this is intentional to keep things
>  simple). You can always boot a kernel by name from a file.
>
>christos
>
>Index: share/mk/bsd.kmodule.mk
>===
>RCS file: /cvsroot/src/share/mk/bsd.kmodule.mk,v
>retrieving revision 1.63
>diff -u -p -u -r1.63 bsd.kmodule.mk
>--- share/mk/bsd.kmodule.mk	1 Dec 2019 20:24:47 -0000	1.63
>+++ share/mk/bsd.kmodule.mk	7 Dec 2019 02:26:01 -0000
>@@ -172,7 +172,13 @@ ${PROG}: ${OBJS} ${DPADD} ${KMODSCRIPT}
> # Install rules
> .if !target(kmodinstall)
> .if !defined(KMODULEDIR)
>+.if ${KERNEL_DIR:Uno} == "yes"
>+KMODULEDIR=   ${DESTDIR}/netbsd/modules/${KMOD}
>+_INST_DIRS=   ${DESTDIR}/netbsd
>+_INST_DIRS+=  ${DESTDIR}/netbsd/modules
>+_INST_DIRS+=  ${DESTDIR}/netbsd/modules/${KMOD}
> _OSRELEASE!=  ${HOST_SH} $S/conf/osrelease.sh -k
>+.else
> # Ensure these are recorded properly in METALOG on unprived installes:
> KMODULEARCHDIR?= ${MACHINE}
> _INST_DIRS=   ${DESTDIR}/stand/${KMODULEARCHDIR}
>@@ -180,6 +186,7 @@ _INST_DIRS+=   ${DESTDIR}/stand/${KMODULEA
> _INST_DIRS+=  ${DESTDIR}/stand/${KMODULEARCHDIR}/${_OSRELEASE}/modules
> KMODULEDIR=   ${DESTDIR}/stand/${KMODULEARCHDIR}/${_OSRELEASE}/modules/${KMOD}
> .endif
>+.endif
> _PROG:=   ${KMODULEDIR}/${PROG} # installed path
> 
> .if ${MKUPDATE} == "no"
>Index: sys/arch/i386/stand/boot/Makefile.boot
>===
>RCS file: /cvsroot/src/sys/arch/i386/stand/boot/Makefile.boot,v
>retrieving revision 1.73
>diff -u -p -u -r1.73 Makefile.boot
>--- sys/arch/i386/stand/boot/Makefile.boot	13 Sep 2019 02:19:45 -0000	1.73
>+++ sys/arch/i386/stand/boot/Makefile.boot	7 Dec 2019 02:26:01 -0000
>@@ -53,6 +53,10 @@ CFLAGS+= -Wall -Wmissing-prototypes -Wst
> CPPFLAGS+= -nostdinc -D_STANDALONE
> CPPFLAGS+= -I$S
> 
>+.if ${KERNEL_DIR:Uno} == "yes"
>+CPPFLAGS+= -DKERNEL_DIR
>+.endif
>+
> CPPFLAGS+= -DSUPPORT_PS2
> CPPFLAGS+= -DDIRECT_SERIAL
> CPPFLAGS+= -DSUPPORT_SERIAL=boot_params.bp_consdev
>Index: sys/arch/i386/stand/boot/boot2.c
>===
>RCS file: /cvsroot/src/sys/arch/i386/stand/boot/boot2.c,v
>retrieving revision 1.72
>diff -u -p -u -r1.72 boot2.c
>--- sys/arch/i386/stand/boot/boot2.c	2 Sep 2019 06:10:24 -0000	1.72
>+++ sys/arch/i386/stand/boot/boot2.c	7 Dec 2019 02:26:01 -0000
>@@ -279,6 +279,12 @@ bootit(const char *filename, int howto)
>   if (howto & AB_VERBOSE)
>   printf("booting %s (howto 0x%x)\n", sprint_bootsel(filename),
>   howto);
>+#ifdef KERNEL_DIR
>+  char path[512];
>+  strcpy(path, filename);
>+  strcat(path, "/kernel");
>+  (void)exec_netbsd(path, 0, howto, boot_biosdev < 0x80, clearit);
>+#endif
> 
>   if (exec_netbsd(filename, 0, howto, boot_biosdev < 0x80, clearit) < 0)
>   printf("boot: %s: %s\n", sprint_bootsel(filename),
>Index: sys/arch/i386/stand/lib/exec.c
>===
>RCS file: /cvsroot/src/sys/arch/i386/stand/lib/exec.c,v
>retrieving revision 1.74
>diff -u -p -u -r1.74 exec.c
>--- sys/arch/i386/stand/lib/exec.c	13 Sep 2019 02:19:46 -0000	1.74
>+++ sys/arch/i386/stand/lib/exec.c	7 Dec 2019 02:26:01 -0000
>@@ -151,7 +151,7 @@ static void	module_add_common(const char
> static void   userconf_init(void);
> 
> static void   extract_device(const char *, char *, size_t);
>-static void   module_base_path(char *, size_t);
>+static void   module_base_path(char *, size_t, const char *);
> static intmodule_open(boot_module_t *, int, const char *, const char *,
>   bool);
> 
>@@ -653,8 +653,18 @@ module_open(boot_module_t *bm, int mode,
> }
> 
> static void
>-module_base_path(char *buf, size_t bufsize)
>+module_base_path(char *buf, size_t bufsize, const char 

modules item #14 revisited

2019-12-06 Thread Christos Zoulas


Hi,

This is a quick and dirty implementation of:

http://mail-index.NetBSD.org/current-users/2009/05/10/msg009372.html

to use:
$ echo KERNEL_DIR=yes >> /etc/mk.conf
# apply the enclosed patch
$ mv /netbsd{,.old}
$ mkdir -p /netbsd/modules
# build a new kernel and put it in /netbsd/kernel
$ make && make install in src/sys/modules
$ make && make install in sys/arch/i386/stand/boot
$ cp /usr/mdec/boot /

There are quite a few open issues:
- All ports need to provide get_booted_kernel()
  Are there ports where we can't get the booted kernel name? What do we do
  there?
- Make install in modules always overwrites the modules in /netbsd/
- Does each kernel.tgz need to pack all the modules in the sets or
  do we keep the modules.tgz that unpacks in /netbsd/modules?
- How about the other kernels? do they unpack in /KERNEL-NAME/kernel?
- Right now there is fallback code in the boot blocks for just the
  kernel but not the old module path (this is intentional to keep things
  simple). You can always boot a kernel by name from a file.

christos

Index: share/mk/bsd.kmodule.mk
===
RCS file: /cvsroot/src/share/mk/bsd.kmodule.mk,v
retrieving revision 1.63
diff -u -p -u -r1.63 bsd.kmodule.mk
--- share/mk/bsd.kmodule.mk	1 Dec 2019 20:24:47 -0000	1.63
+++ share/mk/bsd.kmodule.mk	7 Dec 2019 02:26:01 -0000
@@ -172,7 +172,13 @@ ${PROG}: ${OBJS} ${DPADD} ${KMODSCRIPT}
 # Install rules
 .if !target(kmodinstall)
 .if !defined(KMODULEDIR)
+.if ${KERNEL_DIR:Uno} == "yes"
+KMODULEDIR=${DESTDIR}/netbsd/modules/${KMOD}
+_INST_DIRS=${DESTDIR}/netbsd
+_INST_DIRS+=   ${DESTDIR}/netbsd/modules
+_INST_DIRS+=   ${DESTDIR}/netbsd/modules/${KMOD}
 _OSRELEASE!=   ${HOST_SH} $S/conf/osrelease.sh -k
+.else
 # Ensure these are recorded properly in METALOG on unprived installes:
 KMODULEARCHDIR?= ${MACHINE}
 _INST_DIRS=${DESTDIR}/stand/${KMODULEARCHDIR}
@@ -180,6 +186,7 @@ _INST_DIRS+=${DESTDIR}/stand/${KMODULEA
 _INST_DIRS+=   ${DESTDIR}/stand/${KMODULEARCHDIR}/${_OSRELEASE}/modules
 KMODULEDIR=${DESTDIR}/stand/${KMODULEARCHDIR}/${_OSRELEASE}/modules/${KMOD}
 .endif
+.endif
 _PROG:=${KMODULEDIR}/${PROG} # installed path
 
 .if ${MKUPDATE} == "no"
Index: sys/arch/i386/stand/boot/Makefile.boot
===
RCS file: /cvsroot/src/sys/arch/i386/stand/boot/Makefile.boot,v
retrieving revision 1.73
diff -u -p -u -r1.73 Makefile.boot
--- sys/arch/i386/stand/boot/Makefile.boot	13 Sep 2019 02:19:45 -0000	1.73
+++ sys/arch/i386/stand/boot/Makefile.boot	7 Dec 2019 02:26:01 -0000
@@ -53,6 +53,10 @@ CFLAGS+= -Wall -Wmissing-prototypes -Wst
 CPPFLAGS+= -nostdinc -D_STANDALONE
 CPPFLAGS+= -I$S
 
+.if ${KERNEL_DIR:Uno} == "yes"
+CPPFLAGS+= -DKERNEL_DIR
+.endif
+
 CPPFLAGS+= -DSUPPORT_PS2
 CPPFLAGS+= -DDIRECT_SERIAL
 CPPFLAGS+= -DSUPPORT_SERIAL=boot_params.bp_consdev
Index: sys/arch/i386/stand/boot/boot2.c
===
RCS file: /cvsroot/src/sys/arch/i386/stand/boot/boot2.c,v
retrieving revision 1.72
diff -u -p -u -r1.72 boot2.c
--- sys/arch/i386/stand/boot/boot2.c	2 Sep 2019 06:10:24 -0000	1.72
+++ sys/arch/i386/stand/boot/boot2.c	7 Dec 2019 02:26:01 -0000
@@ -279,6 +279,12 @@ bootit(const char *filename, int howto)
if (howto & AB_VERBOSE)
printf("booting %s (howto 0x%x)\n", sprint_bootsel(filename),
howto);
+#ifdef KERNEL_DIR
+   char path[512];
+   strcpy(path, filename);
+   strcat(path, "/kernel");
+   (void)exec_netbsd(path, 0, howto, boot_biosdev < 0x80, clearit);
+#endif
 
if (exec_netbsd(filename, 0, howto, boot_biosdev < 0x80, clearit) < 0)
printf("boot: %s: %s\n", sprint_bootsel(filename),
Index: sys/arch/i386/stand/lib/exec.c
===
RCS file: /cvsroot/src/sys/arch/i386/stand/lib/exec.c,v
retrieving revision 1.74
diff -u -p -u -r1.74 exec.c
--- sys/arch/i386/stand/lib/exec.c	13 Sep 2019 02:19:46 -0000	1.74
+++ sys/arch/i386/stand/lib/exec.c	7 Dec 2019 02:26:01 -0000
@@ -151,7 +151,7 @@ static void module_add_common(const char
 static void	userconf_init(void);
 
 static void	extract_device(const char *, char *, size_t);
-static void	module_base_path(char *, size_t);
+static void	module_base_path(char *, size_t, const char *);
 static int module_open(boot_module_t *, int, const char *, const char *,
bool);
 
@@ -653,8 +653,18 @@ module_open(boot_module_t *bm, int mode,
 }
 
 static void
-module_base_path(char *buf, size_t bufsize)
+module_base_path(char *buf, size_t bufsize, const char *kernel_path)
 {
+#ifdef KERNEL_DIR
+   /* we cheat here, because %.* does not work with the mini printf */
+   char *ptr = strrchr(kernel_path, '/');
+   if (*ptr)
+   *ptr = '\0';
+
+   

Re: racy acccess in kern_runq.c

2019-12-06 Thread Don Lee
Please educate me. It’s been a while for me.

Writing kernel code sometimes *requires* knowledge of what code is generated. 
In my experience, there have been standard techniques, like pragmas and 
insertions of assembly code, to suppress this sort of undesirable optimization.

Don’t those techniques exist any more? My compiler friends used to put them in 
for just this purpose, and they tried to make them as portable as possible. 
Surely GCC does this. No?

Inquiring minds…

-dgl-

> On Dec 6, 2019, at 1:44 PM, paul.kon...@dell.com wrote:
> 
> 
> 
>> On Dec 6, 2019, at 10:21 AM, Mouse  wrote:
>> 
>> 
>>> Compilers have become much more aggressive over the years.  But they
>>> are allowed to be so by the C standard.  Specifically, in addition to
>>> code-level re-ordering, plain accesses (loads/stores) are subject to
>>> load/store fusing, tearing as well as invented loads/stores.
>> 
>> Then, honestly, it sounds to me as though "the latest revision of C" is
>> no longer an appropriate language for writing kernels.  I see no reason
>> to twist the kernel code into a pretzel to work around latitude a new
>> language gives to its compilers - and that's what C11 is, a new
>> language, albeit one closely related to various previous languages.
>> 
>> One of the prime considerations when choosing a language and/or
>> compiler for building a kernel is that it produce relatively
>> predictable code, for exactly this kind of reason.  If the latest C and
>> the latest gcc no longer do that, then IMO they are no longer
>> appropriate for writing/compiling kernels.
> 
> C11 isn't all that new, of course.  And I don't know if the rules about 
> optimization that you object to are anywhere near that new in any case.  What 
> seems to be the case instead is that compilers are more and more doing what 
> the language has for a long time allowed them to do.
> 
> Consider for example "Undefined behavior -- what happened to my code?" by 
> Wang et al. (MIT and Tsinghua University).  It describes all sorts of allowed 
> transformations that come as a surprise to programmers.
> 
> Yes, it certainly would be possible to create a programming language with 
> less surprising semantics.  C is not in any way shape or form a clean 
> language with clean semantics.  But as for wanting predictable code, that is 
> certainly doable.  Unfortunately, it requires a sufficiently deep 
> understanding of the rules of the language.  Typical textbooks are not all 
> that good for this, they are too superficial.  The language standard tends to 
> be better.  But again, a problem with most programming languages is that 
> their semantics are not well defined and/or much more complex than they 
> should be.  Programming in C++ is particularly scary for this reason, but C 
> is also problematic.  For clean semantics, I like ALGOL; too bad it is no 
> longer used, though it did at one time serve to build successful operating 
> systems.
> 
>   paul
> 



Re: Changes to reduce pressure on uvm_pageqlock

2019-12-06 Thread Andrew Doran
Ok, here's a more complete patch for the change to uvm_pageqlock.  I have
edited it by hand to remove lots of unrelated changes, so it's just to give a
picture of the changes - it won't apply or compile.

http://www.netbsd.org/~ad/2019/partial.diff

(It's time consuming to detangle this from the rest of my changes, which are
per-CPU stat collection, a topology-aware page allocator that dispenses with
the "per-cpu" lists we have now, and yamt's radixtree.)

There are a few more things I want to look at tomorrow, including the
PQ_READAHEAD flag (it's not safe now), whether the pagedaemon needs to know
about the pdpolicy lock at all, and a last pass over it all checking
everything.

The earlier changes to vm_page I mentioned for alignment and the page
replacement policy I am going to postpone because radixtree changes the
picture bigtime and more testing is required. (That's a separate review.)

As to the reason why: at the start of November on my machine system time for
a kernel build was 3200-3300s.  With this plus the remaining changes it's
down to 850s so far and with !DIAGNOSTIC about 580s.

Andrew

On Fri, Dec 06, 2019 at 04:34:50PM +, Andrew Doran wrote:

> I ended up taking a disliking to these changes so I went back and took
> another look at the problem yesterday evening.
> 
> There are two distinct sets of data involved that need protection: page
> identity (connected anon, uobj, wiring, loaning) and page replacement state.
> 
> There are limited points of connection between those two sets, so it's easy
> to do two locks.  We can give the pdpolicy code a lock for its state and let
> it manage that lock, which only it and the pagedaemon need to know about. 
> The lock can be private to the file (uvm_pdpolicy_clock / clockpro.c), with
> a couple of hooks for the pagedaemon to acquire and release it.
> 
> With that done, we'd then only need one version of uvm_pageactivate() etc,
> because those functions will no longer care about uvm_pageqlock.
> 
> I tried it out: the code looks more elegant and the scheme works in practice
> with contention on uvm_pageqlock now quite minor, and the sum of contention
> on the two locks lower than it was on uvm_pageqlock prior.
> 
> With the locking for the pdpolicy code being private to the file I think it
> would then be easier to experiment with different strategies to reduce
> contention further.
> 
> http://www.netbsd.org/~ad/2019/pdpolicy.diff
> 
> Andrew
> 
> On Tue, Dec 03, 2019 at 12:41:25AM +, Andrew Doran wrote:
> > Hi,
> > 
> > I have some straightforward changes for uvm to improve the situation on
> > many-CPU machines.  I'm going to break them into pieces to make them easier
> > to review (only this first piece and what's already in CVS is ready).
> > 
> > I have carefully measured the impact of these over hundreds of kernel builds,
> > using lockstat, tprof and some custom instrumentation so I'm confident that
> > for each, the effects at least are of value.
> > 
> > Anyway I'd be grateful if someone could take a look.  This one is about
> > reducing pressure on uvm_pageqlock, and cache misses on struct vm_page.
> > 
> > Cheers,
> > Andrew
> > 
> > 
> > http://www.netbsd.org/~ad/2019/uvm1.diff
> > 
> > 
> > vm_page: cluster largely static fields used during page lookup in the first
> > 64 bytes.  Increase wire_count to 32 bits, and add a field for use of the
> > page replacement policy.  This leaves 2 bytes spare in a word locked by
> > uvm_fpageqlock/uvm_pageqlock which I want to use later for changes to the
> > page allocator.  It also brings vm_page up to 128 bytes on amd64.
> > 
> > New functions:
> > 
> > => uvmpdpol_pageactivate_p()
> > 
> > For page replacement policy.  Returns true if pdpol thinks activation info
> > would be useful enough to cause disruption to page queues, vm_page and
> > uvm_fpageqlock.  For CLOCK this returns true if page is not active, or was
> > not activated within the last second.
> > 
> > => uvm_pageenqueue1()
> > 
> > Call without uvm_pageqlock.  Acquires the lock and enqueues the page only
> > if not already enqueued.
> > 
> > => uvm_pageactivate1()
> > 
> > Call without uvm_pageqlock.  Acquires the lock and activates the page only
> > if uvmpdpol_pageactivate_p() says yes.  No similar change for deactivate nor
> > dequeue as they are much more definite.
> > 
> > => uvm_pagefree1()
> > 
> > First part of uvm_pagefree() - strip page of identity.  Requires
> > uvm_pageqlock if associated with an object.
> > 
> > => uvm_pagefree2()
> > 
> > Second part of uvm_pagefree().  Send page to free list.  No locks required.
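
For the curious, the shape of uvm_pageactivate1() as described above is
roughly the following sketch (reconstructed from the description only,
not taken from uvm1.diff):

/*
 * Activate a page without the caller holding uvm_pageqlock: ask the
 * policy first and take the lock only when the update is worth the
 * disruption, re-checking once the lock is held.
 */
void
uvm_pageactivate1(struct vm_page *pg)
{

	if (!uvmpdpol_pageactivate_p(pg))
		return;		/* activation info is fresh enough */

	mutex_enter(&uvm_pageqlock);
	if (uvmpdpol_pageactivate_p(pg))
		uvm_pageactivate(pg);
	mutex_exit(&uvm_pageqlock);
}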


Re: racy acccess in kern_runq.c

2019-12-06 Thread Jason Thorpe


> On Dec 6, 2019, at 11:44 AM, paul.kon...@dell.com wrote:
> 
> For clean semantics, I like ALGOL; too bad it is no longer used

There's just too much shouting in ALGOL.

-- thorpej



Re: racy acccess in kern_runq.c

2019-12-06 Thread Paul.Koning



> On Dec 6, 2019, at 10:21 AM, Mouse  wrote:
> 
> 
>> Compilers have become much more aggressive over the years.  But they
>> are allowed to be so by the C standard.  Specifically, in addition to
>> code-level re-ordering, plain accesses (loads/stores) are subject to
>> load/store fusing, tearing as well as invented loads/stores.
> 
> Then, honestly, it sounds to me as though "the latest revision of C" is
> no longer an appropriate language for writing kernels.  I see no reason
> to twist the kernel code into a pretzel to work around latitude a new
> language gives to its compilers - and that's what C11 is, a new
> language, albeit one closely related to various previous languages.
> 
> One of the prime considerations when choosing a language and/or
> compiler for building a kernel is that it produce relatively
> predictable code, for exactly this kind of reason.  If the latest C and
> the latest gcc no longer do that, then IMO they are no longer
> appropriate for writing/compiling kernels.

C11 isn't all that new, of course.  And I don't know if the rules about 
optimization that you object to are anywhere near that new in any case.  What 
seems to be the case instead is that compilers are more and more doing what the 
language has for a long time allowed them to do.

Consider for example "Undefined behavior -- what happened to my code?" by Wang 
et al. (MIT and Tsinghua University).  It describes all sorts of allowed 
transformations that come as a surprise to programmers.

Yes, it certainly would be possible to create a programming language with less 
surprising semantics.  C is not in any way shape or form a clean language with 
clean semantics.  But as for wanting predictable code, that is certainly 
doable.  Unfortunately, it requires a sufficiently deep understanding of the 
rules of the language.  Typical textbooks are not all that good for this, they 
are too superficial.  The language standard tends to be better.  But again, a 
problem with most programming languages is that their semantics are not well 
defined and/or much more complex than they should be.  Programming in C++ is 
particularly scary for this reason, but C is also problematic.  For clean 
semantics, I like ALGOL; too bad it is no longer used, though it did at one 
time serve to build successful operating systems.

paul



Re: racy acccess in kern_runq.c

2019-12-06 Thread Thor Lancelot Simon
On Fri, Dec 06, 2019 at 09:00:33AM +, Andrew Doran wrote:
> 
> Please don't commit this.  These accesses are racy by design.  There is no
> safety issue and we do not want to disturb the activity of other CPUs in
> this code path by locking them.  We also don't want to use atomics either. 

I'm sorry, at this point I can't help myself:

/* You are not expected to understand this. */

Thor


Re: racy acccess in kern_runq.c

2019-12-06 Thread Mindaugas Rasiukevicius
Maxime Villard  wrote:
> On 06/12/2019 at 17:53, Andrew Doran wrote:
> > On Fri, Dec 06, 2019 at 05:22:39PM +0900, Kengo NAKAHARA wrote:
> > 
> >>/* Update the worker */
> >> -  worker_ci = hci;
>> +  atomic_swap_ptr(&worker_ci, hci);
> > 
> > Why atomic_swap_ptr() not atomic_store_relaxed()?  I don't see any bug
> > that it fixes.  Other than that it looks OK to me.
> 
> Because I suggested it; my concern was that if not explicitly atomic, the
> cpu could make two writes internally (even though the compiler produces
> only one instruction), and in that case a page fault would have been
> possible because of garbage dereference.

No, atomic_store_relaxed() is sufficient; that is what it is for.

-- 
Mindaugas


Re: racy acccess in kern_runq.c

2019-12-06 Thread David Young
On Fri, Dec 06, 2019 at 06:33:32PM +0100, Maxime Villard wrote:
> On 06/12/2019 at 17:53, Andrew Doran wrote:
> >Why atomic_swap_ptr() not atomic_store_relaxed()?  I don't see any bug that
> >it fixes.  Other than that it looks OK to me.
> 
> Because I suggested it; my concern was that if not explicitly atomic, the
> cpu could make two writes internally (even though the compiler produces
> only one instruction), and in that case a page fault would have been possible
> because of garbage dereference.

atomic_store_relaxed() is not explicitly atomic?

Dave

-- 
David Young
dyo...@pobox.com    Urbana, IL    (217) 721-9981


Re: racy acccess in kern_runq.c

2019-12-06 Thread Maxime Villard

On 06/12/2019 at 17:53, Andrew Doran wrote:

On Fri, Dec 06, 2019 at 05:22:39PM +0900, Kengo NAKAHARA wrote:


/* Update the worker */
-   worker_ci = hci;
+   atomic_swap_ptr(&worker_ci, hci);


Why atomic_swap_ptr() not atomic_store_relaxed()?  I don't see any bug that
it fixes.  Other than that it looks OK to me.


Because I suggested it; my concern was that if not explicitly atomic, the
cpu could make two writes internally (even though the compiler produces
only one instruction), and in that case a page fault would have been possible
because of garbage dereference.

To be honest, I'm not totally sure whether it is a valid concern;
theoretically, it is.


Re: racy acccess in kern_runq.c

2019-12-06 Thread Andrew Doran
On Fri, Dec 06, 2019 at 05:22:39PM +0900, Kengo NAKAHARA wrote:

>   /* Update the worker */
> - worker_ci = hci;
> + atomic_swap_ptr(&worker_ci, hci);

Why atomic_swap_ptr() not atomic_store_relaxed()?  I don't see any bug that
it fixes.  Other than that it looks OK to me.

Cheers,
Andrew


Re: Changes to reduce pressure on uvm_pageqlock

2019-12-06 Thread Andrew Doran
I ended up taking a disliking to these changes so I went back and took
another look at the problem yesterday evening.

There are two distinct sets of data involved that need protection: page
identity (connected anon, uobj, wiring, loaning) and page replacement state.

There are limited points of connection between those two sets, so it's easy
to do two locks.  We can give the pdpolicy code a lock for its state and let
it manage that lock, which only it and the pagedaemon need to know about. 
The lock can be private to the file (uvm_pdpolicy_clock / clockpro.c), with
a couple of hooks for the pagedaemon to acquire and release it.

With that done, we'd then only need one version of uvm_pageactivate() etc,
because those functions will no longer care about uvm_pageqlock.

I tried it out: the code looks more elegant and the scheme works in practice
with contention on uvm_pageqlock now quite minor, and the sum of contention
on the two locks lower than it was on uvm_pageqlock prior.

With the locking for the pdpolicy code being private to the file I think it
would then be easier to experiment with different strategies to reduce
contention further.

http://www.netbsd.org/~ad/2019/pdpolicy.diff

Andrew

On Tue, Dec 03, 2019 at 12:41:25AM +, Andrew Doran wrote:
> Hi,
> 
> I have some straightforward changes for uvm to improve the situation on
> many-CPU machines.  I'm going to break them into pieces to make them easier
> to review (only this first piece and what's already in CVS is ready).
> 
> I have carefully measured the impact of these over hundreds of kernel builds,
> using lockstat, tprof and some custom instrumentation so I'm confident that
> for each, the effects at least are of value.
> 
> Anyway I'd be grateful if someone could take a look.  This one is about
> reducing pressure on uvm_pageqlock, and cache misses on struct vm_page.
> 
> Cheers,
> Andrew
> 
> 
> http://www.netbsd.org/~ad/2019/uvm1.diff
> 
> 
> vm_page: cluster largely static fields used during page lookup in the first
> 64 bytes.  Increase wire_count to 32 bits, and add a field for use of the
> page replacement policy.  This leaves 2 bytes spare in a word locked by
> uvm_fpageqlock/uvm_pageqlock which I want to use later for changes to the
> page allocator.  It also brings vm_page up to 128 bytes on amd64.
> 
> New functions:
> 
> => uvmpdpol_pageactivate_p()
> 
> For page replacement policy.  Returns true if pdpol thinks activation info
> would be useful enough to cause disruption to page queues, vm_page and
> uvm_fpageqlock.  For CLOCK this returns true if page is not active, or was
> not activated within the last second.
> 
> => uvm_pageenqueue1()
> 
> Call without uvm_pageqlock.  Acquires the lock and enqueues the page only
> if not already enqueued.
> 
> => uvm_pageactivate1()
> 
> Call without uvm_pageqlock.  Acquires the lock and activates the page only
> if uvmpdpol_pageactivate_p() says yes.  No similar change for deactivate nor
> dequeue as they are much more definite.
> 
> => uvm_pagefree1()
> 
> First part of uvm_pagefree() - strip page of identity.  Requires
> uvm_pageqlock if associated with an object.
> 
> => uvm_pagefree2()
> 
> Second part of uvm_pagefree().  Send page to free list.  No locks required.


Re: racy acccess in kern_runq.c

2019-12-06 Thread Andrew Doran
On Fri, Dec 06, 2019 at 02:55:47PM +, Mindaugas Rasiukevicius wrote:

> Compilers have become much more aggressive over the years.  But they are
> allowed to be so by the C standard.  Specifically, in addition to code-level
> re-ordering,

Yup the rules around that are well understood.

> plain accesses (loads/stores) are subject to load/store fusing, tearing as
> well as invented loads/stores.  At least load/store fusing and tearing
> *have* been observed in reality [1] [2] [3].  So, they are *not* merely
> theoretical or esoteric problems, although there are plenty of these in
> the C11 memory model too [4] [5].

Thank you for the references, very interesting - if distressing - reading.
So it does happen.
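
The textbook illustration of load fusing, as a sketch (not code from
our tree):

#include <stdbool.h>

bool done;		/* plain flag shared with another thread */

void
wait_for_done(void)
{
	/*
	 * Nothing tells the compiler that "done" can change behind its
	 * back, so it may fuse the repeated loads into a single load
	 * before the loop - effectively "if (!done) for (;;);" - and
	 * spin forever even after the other thread sets the flag.  A
	 * relaxed atomic (or volatile) load forbids the fusing.
	 */
	while (!done)
		continue;
}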

> Linux kernel developers went through this already.  Perhaps the C standard
> will plug the holes or perhaps compilers will just change their behaviour,
> as they get enough criticism [6] [7].  However, in the mean time, I think
> let's just accept that things moved on and let's embrace the new primitives.
> While these primitives might be slightly verbose, they are in C11, they fix
> real bugs, they definitely make code less error-prone and they have other
> merits too (e.g. they accommodate static analysers which find some real bugs).

Well I'm firmly of the opinion that this is a bug in the compiler.  By all
means a nice behaviour to have as an option and I very much see the use of
it, but having it as the default for compiling old code is bad news indeed.

No argument from me on accommodating static analysers; I think that's a good
thing.  Presumably then we have no other choice than to work around it,
because even if we can tell the compiler not to be a jerk today, that option
won't be feasible to use in the future.

Thanks again,
Andrew


Re: [patch] PT_GET_LWP_PRIVATE and PT_SET_LWP_PRIVATE

2019-12-06 Thread Kamil Rytarowski
On 06.12.2019 06:17, Kamil Rytarowski wrote:
> I have implemented l_private accessor in ptrace(2).
> 
> By default this uses l_private from the lwp struct.
> 
> PT_SET_LWP_PRIVATE uses lwp_setprivate() directly and this call
> abstracts internally MI and MD code.
> 
> The PT_SET_LWP_PRIVATE operation uses by default l_private from the
> struct lwp, however whenever appropriate, picks MD logic that grabs this
> value from an MD field (register or pcb).
> 
> MD code is implemented for:
>  - sparc, sparc64
>  - alpha
>  - hppa
>  - powerpc
>  - sh3
> 
> There is included compat32 support and ATF tests.
> 
> http://netbsd.org/~kamil/patch-00202-PT_GET_LWP_PRIVATE.txt
> 
> This ptrace operation is inspired by PTRACE_GET_THREAD_AREA /
> PTRACE_SET_THREAD_AREA from linux.
> 
> Please review.
> 
> After merging this, I will add an entry in man-page ptrace(2) and
> implement core(5) generation of PT_GET_LWP_PRIVATE for each LWP within a
> process.
> 

Actually, we can do better here and in one go remove one of the issues
in our current ptrace(2) API regarding legacy operation PT_LWPINFO (it
is obsoleted by PT_GET_SIGINFO and confusing with a different PT_LWPINFO
found in FreeBSD).

I will come up with another and better patch.





Re: racy acccess in kern_runq.c

2019-12-06 Thread Mouse
> Compilers have become much more aggressive over the years.  But they
> are allowed to be so by the C standard.  Specifically, in addition to
> code-level re-ordering, plain accesses (loads/stores) are subject to
> load/store fusing, tearing as well as invented loads/stores.

Then, honestly, it sounds to me as though "the latest revision of C" is
no longer an appropriate language for writing kernels.  I see no reason
to twist the kernel code into a pretzel to work around latitude a new
language gives to its compilers - and that's what C11 is, a new
language, albeit one closely related to various previous languages.

One of the prime considerations when choosing a language and/or
compiler for building a kernel is that it produce relatively
predictable code, for exactly this kind of reason.  If the latest C and
the latest gcc no longer do that, then IMO they are no longer
appropriate for writing/compiling kernels.

> While these primitives might be slightly verbose, they are in C11,
> they fix real bugs, they definitely make code less error-prone and
> they have other merits too (e.g. they accommodate static analysers
> which find some real bugs).

How many of those "real bugs" exist only because C11 gave compilers new
latitude to produce unexpected code?

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML        mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: racy acccess in kern_runq.c

2019-12-06 Thread Maxime Villard
On 06/12/2019 at 11:28, Andrew Doran wrote:
> On Fri, Dec 06, 2019 at 10:27:20AM +0100, Maxime Villard wrote:
> 
>> With 'worker_ci', there is an actual safety issue, because the compiler could
>> split the accesses, and the hardware may not, unlike x86, make plain
>> accesses atomic by default.
>> This could cause random page faults; so it needs to be strictly atomic.
> 
> No I don't accept that.
> 
> The ability to load and store a native word sized int (and in more recent
> years a pointer) with a single instruction is a fundamental assumption that
> every operating system written in C rests upon.

1) That it is done with a single instruction does not necessarily mean it is
atomic. There are many stupid things microcode is allowed to do (not on x86,
however).

2) That there is a one-line assignment of a variable in C does not necessarily
mean that there will be one move instruction. There are many things the
compiler is allowed to do in optimization passes, as said already.

3) "every operating system written in C rests upon" I have some difficulty with
that. Linux has our equivalent of relaxed atomics (called {READ,WRITE}_ONCE),
with 1000+ combined references in their tree. So no, here's a big open source
kernel that does not rest upon unclear assumptions for lockless operations. No
way to check Windows, but I wouldn't be surprised if they had a similar
approach.

> If the compiler splits one of those accesses, then you are either using some
> other data type, or have a broken compiler on your hands.

Or you're just using a compiler that knows well enough how to arrange
instructions for optimized pipelining, and this compiler decides to re-order
the accesses based on desirable performance properties, as allowed by the
standard.

> https://www.gnu.org/software/libc/manual/html_node/Atomic-Types.html

I don't see anything in this link that is really reassuring. "you can assume"
that things are mostly ok on "systems we know of".

At least the rationale for {READ,WRITE}_ONCE is based on standards, and not
wild assumptions:

https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE
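
Stripped of the kernel's type machinery, the scalar case of those
macros is essentially a volatile access - roughly:

/*
 * Simplified sketch of the idea behind READ_ONCE/WRITE_ONCE (the real
 * Linux macros handle more cases).  The volatile cast makes the
 * compiler emit exactly one access at this point, with no fusing or
 * invented accesses; freedom from tearing is additionally assumed for
 * naturally aligned word-sized types.
 */
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))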

To me, Kengo's 2nd patch remains needed.

Maxime


Re: racy acccess in kern_runq.c

2019-12-06 Thread Andrew Doran
On Fri, Dec 06, 2019 at 10:27:20AM +0100, Maxime Villard wrote:

> With 'worker_ci', there is an actual safety issue, because the compiler could
> split the accesses, and the hardware may not, unlike x86, make plain
> accesses atomic by default.
> This could cause random page faults; so it needs to be strictly atomic.

No I don't accept that.

The ability to load and store a native word sized int (and in more recent
years a pointer) with a single instruction is a fundamental assumption that
every operating system written in C rests upon.

If the compiler splits one of those accesses, then you are either using some
other data type, or have a broken compiler on your hands.  If the compiler
is broken it's the compiler you should be looking at, not the program it
compiled.  It's as simple as that.

https://www.gnu.org/software/libc/manual/html_node/Atomic-Types.html

Andrew


Re: racy acccess in kern_runq.c

2019-12-06 Thread Kengo NAKAHARA

Hi,

On 2019/12/06 18:00, Andrew Doran wrote:

Hi,

On Fri, Dec 06, 2019 at 03:52:23PM +0900, Kengo NAKAHARA wrote:


There are some racy accesses in kern_runq.c detected by KCSAN.  Those
racy access messages are so frequent that they drown out other messages,
so I want to fix them.  They can be fixed by the following patch.


Please don't commit this.  These accesses are racy by design.  There is no
safety issue and we do not want to disturb the activity of other CPUs in
this code path by locking them.  We also don't want to use atomics either.
This code path is extremely hot; using atomics would impose a severe
performance penalty on the system under certain workloads.


I see, I won't commit it.  Indeed, I am worried about performance degradation.



Also if you change this to use strong ordering, you're quite likely to
change the selection behaviour of LWPs.  This is something delicate that
exhibits reasonable scheduling behaviour in large part through randomness
i.e. by accident, so serialising it is likely to have strong effects on how
LWPs are distributed.

Lastly I have a number of experimental changes to this code which I'll be
committing in the near future allowing people to change the LWP selection
policy to see if we can improve performance under different workloads.
They will also likely show up in KCSAN as racy/dangerous accesses.

My suggestion is to find a way to teach KCSAN that we know something is
racy, we like it being racy, and that it's not a problem, so that it no
longer shows up in the KCSAN output.


I see.  I will add the __nocsan attribute to the functions in my local sources
to find other races.


Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
Product Development Department,
Product Division,
Technology Unit

Kengo NAKAHARA 


Re: racy acccess in kern_runq.c

2019-12-06 Thread Maxime Villard
On 06/12/2019 at 10:00, Andrew Doran wrote:
> Hi,
> 
> On Fri, Dec 06, 2019 at 03:52:23PM +0900, Kengo NAKAHARA wrote:
> 
>> There are some racy accesses in kern_runq.c detected by KCSAN.  Those
>> racy access messages are so frequent that they drown out other messages,
>> so I want to fix them.  They can be fixed by the following patch.
> 
> Please don't commit this.  These accesses are racy by design.  There is no
> safety issue and we do not want to disturb the activity of other CPUs in
> this code path by locking them.  We also don't want to use atomics either. 
> This code path is extremely hot; using atomics would impose a severe
> performance penalty on the system under certain workloads.

With 'worker_ci', there is an actual safety issue, because the compiler could
split the accesses, and the hardware may not, unlike x86, make plain
accesses atomic by default.
This could cause random page faults; so it needs to be strictly atomic.

Apart from that, yes, there is no other safety issue and locking would be too
expensive. All we possibly care about is making sure the accesses aren't split,
for the sake of not basing the scheduling policy on systematic garbage, and
atomic_relaxed seems like the right thing to do (Kengo's 2nd patch),
considering that it costs ~nothing.

> Also if you change this to use strong ordering, you're quite likely to
> change the selection behaviour of LWPs.  This is something delicate that
> exhibits reasonable scheduling behaviour in large part through randomness
> i.e. by accident, so serialising it is likely to have strong effects on how
> LWPs are distributed.
> 
> Lastly I have a number of experimental changes to this code which I'll be
> committing in the near future allowing people to change the LWP selection
> policy to see if we can improve performance under different workloads. 
> They will also likely show up in KCSAN as racy/dangerous accesses.
> 
> My suggestion is to find a way to teach KCSAN that we know something is
> racy, we like it being racy, and that it's not a problem, so that it no
> longer shows up in the KCSAN output.

Maybe we should indeed have a macro to say "yes this access is racy but we
don't care". Currently this macro is atomic_{store,load}_relaxed()...




Re: racy acccess in kern_runq.c

2019-12-06 Thread Maxime Villard
On 06/12/2019 at 07:52, Kengo NAKAHARA wrote:
> Hi,
> 
> There are some racy accesses in kern_runq.c detected by KCSAN.  Those
> racy access messages are so frequent that they drown out other messages,
> so I want to fix them.  They can be fixed by the following patch.
> 
> 
> diff --git a/sys/kern/kern_runq.c b/sys/kern/kern_runq.c
> index 04ff1732016..bb0815689cd 100644
> --- a/sys/kern/kern_runq.c
> +++ b/sys/kern/kern_runq.c
> @@ -627,7 +627,7 @@ sched_balance(void *nocallout)
>  /* Make lockless countings */
>  for (CPU_INFO_FOREACH(cii, ci)) {
>  spc = &ci->ci_schedstate;
> -
> +    spc_lock(ci);
>  /*
>   * Average count of the threads
>   *
> @@ -643,10 +643,11 @@ sched_balance(void *nocallout)
>  hci = ci;
>  highest = spc->spc_avgcount;
>  }
> +    spc_unlock(ci);
>  }
>  
>  /* Update the worker */
> -    worker_ci = hci;
> +    atomic_swap_ptr(&worker_ci, hci);
>  
>  if (nocallout == NULL)
>  callout_schedule(&balance_ch, balance_period);
> @@ -734,12 +735,15 @@ sched_idle(void)
>  spc_unlock(ci);
>  
>  no_migration:
> +    spc_lock(ci);
>  if ((spc->spc_flags & SPCF_OFFLINE) != 0 || spc->spc_count != 0) {
> +    spc_unlock(ci);
>  return;
>  }
>  
>  /* Reset the counter, and call the balancer */
>  spc->spc_avgcount = 0;
> +    spc_unlock(ci);
>  sched_balance(ci);
>  tci = worker_ci;
>  tspc = &tci->ci_schedstate;
> 

It just so happens we were talking about this yesterday.

worker_ci indeed needs to be a real atomic, but it means we also have to
fetch it atomically here:

744 tci = worker_ci;

For the other variables, my understanding was that we don't care about
races, but only care about making sure the accesses are not split, so it
seems to me atomic_relaxed would suffice.
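
Concretely, the fetch at line 744 would become something like:

	/* sketch: pair the atomic store with a matching atomic load */
	tci = atomic_load_relaxed(&worker_ci);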


Re: racy acccess in kern_runq.c

2019-12-06 Thread Andrew Doran
Hi,

On Fri, Dec 06, 2019 at 03:52:23PM +0900, Kengo NAKAHARA wrote:

> There are some racy accesses in kern_runq.c detected by KCSAN.  Those
> racy access messages are so frequent that they drown out other messages,
> so I want to fix them.  They can be fixed by the following patch.

Please don't commit this.  These accesses are racy by design.  There is no
safety issue and we do not want to disturb the activity of other CPUs in
this code path by locking them.  We also don't want to use atomics either. 
This code path is extremely hot; using atomics would impose a severe
performance penalty on the system under certain workloads.

Also if you change this to use strong ordering, you're quite likely to
change the selection behaviour of LWPs.  This is something delicate that
exhibits reasonable scheduling behaviour in large part through randomness
i.e. by accident, so serialising it is likely to have strong effects on how
LWPs are distributed.

Lastly I have a number of experimental changes to this code which I'll be
committing in the near future allowing people to change the LWP selection
policy to see if we can improve performance under different workloads. 
They will also likely show up in KCSAN as racy/dangerous accesses.

My suggestion is to find a way to teach KCSAN that we know something is
racy, we like it being racy, and that it's not a problem, so that it no
longer shows up in the KCSAN output.

Thank you,
Andrew


Re: racy acccess in kern_runq.c

2019-12-06 Thread Kengo NAKAHARA

Hi,

Thank you for your comment.

On 2019/12/06 16:42, Maxime Villard wrote:

On 06/12/2019 at 07:52, Kengo NAKAHARA wrote:

Hi,

There are some racy accesses in kern_runq.c detected by KCSAN.  Those
racy access messages are so frequent that they drown out other messages,
so I want to fix them.  They can be fixed by the following patch.


diff --git a/sys/kern/kern_runq.c b/sys/kern/kern_runq.c
index 04ff1732016..bb0815689cd 100644
--- a/sys/kern/kern_runq.c
+++ b/sys/kern/kern_runq.c
@@ -627,7 +627,7 @@ sched_balance(void *nocallout)
 	/* Make lockless countings */
 	for (CPU_INFO_FOREACH(cii, ci)) {
 		spc = &ci->ci_schedstate;
-
+		spc_lock(ci);
 		/*
 		 * Average count of the threads
 		 *
@@ -643,10 +643,11 @@ sched_balance(void *nocallout)
 			hci = ci;
 			highest = spc->spc_avgcount;
 		}
+		spc_unlock(ci);
 	}
 
 	/* Update the worker */
-	worker_ci = hci;
+	atomic_swap_ptr(&worker_ci, hci);
 
 	if (nocallout == NULL)
 		callout_schedule(&balance_ch, balance_period);
@@ -734,12 +735,15 @@ sched_idle(void)
 	spc_unlock(ci);
 
 no_migration:
+	spc_lock(ci);
 	if ((spc->spc_flags & SPCF_OFFLINE) != 0 || spc->spc_count != 0) {
+		spc_unlock(ci);
 		return;
 	}
 
 	/* Reset the counter, and call the balancer */
 	spc->spc_avgcount = 0;
+	spc_unlock(ci);
 	sched_balance(ci);
 	tci = worker_ci;
 	tspc = &tci->ci_schedstate;



It just so happens we were talking about this yesterday.


Oh, I missed the thread, sorry.


worker_ci indeed needs to be a real atomic, but it means we also have to
fetch it atomically here:

744 tci = worker_ci;

For the other variables, my understanding was that we don't care about
races, but only care about making sure the accesses are not split, so it
seems to me atomic_relaxed would suffice.


I see.  I updated the patch.

diff --git a/sys/kern/kern_runq.c b/sys/kern/kern_runq.c
index 04ff1732016..dbd0f73585a 100644
--- a/sys/kern/kern_runq.c
+++ b/sys/kern/kern_runq.c
@@ -627,6 +627,7 @@ sched_balance(void *nocallout)
 	/* Make lockless countings */
 	for (CPU_INFO_FOREACH(cii, ci)) {
 		spc = &ci->ci_schedstate;
+		u_int avg, mcount;
 
 		/*
 		 * Average count of the threads
@@ -634,19 +635,20 @@ sched_balance(void *nocallout)
 		 * The average is computed as a fixpoint number with
 		 * 8 fractional bits.
 		 */
-		spc->spc_avgcount = (
-		    weight * spc->spc_avgcount + (100 - weight) * 256 * spc->spc_mcount
-		    ) / 100;
+		avg = atomic_load_relaxed(&spc->spc_avgcount);
+		mcount = atomic_load_relaxed(&spc->spc_mcount);
+		avg = (weight * avg + (100 - weight) * 256 * mcount) / 100;
+		atomic_store_relaxed(&spc->spc_avgcount, avg);
 
 		/* Look for CPU with the highest average */
-		if (spc->spc_avgcount > highest) {
+		if (avg > highest) {
 			hci = ci;
-			highest = spc->spc_avgcount;
+			highest = avg;
 		}
 	}
 
 	/* Update the worker */
-	worker_ci = hci;
+	atomic_swap_ptr(&worker_ci, hci);
 
 	if (nocallout == NULL)
 		callout_schedule(&balance_ch, balance_period);
@@ -734,14 +736,15 @@ sched_idle(void)
 	spc_unlock(ci);
 
 no_migration:
-	if ((spc->spc_flags & SPCF_OFFLINE) != 0 || spc->spc_count != 0) {
+	if ((spc->spc_flags & SPCF_OFFLINE) != 0
+	    || atomic_load_relaxed(&spc->spc_count) != 0) {
 		return;
 	}
 
 	/* Reset the counter, and call the balancer */
-	spc->spc_avgcount = 0;
+	atomic_store_relaxed(&spc->spc_avgcount, 0);
 	sched_balance(ci);
-	tci = worker_ci;
+	atomic_swap_ptr(&tci, worker_ci);
 	tspc = &tci->ci_schedstate;
 	if (ci == tci || spc->spc_psid != tspc->spc_psid)
 		return;


Is this appropriate?


Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
Product Development Department,
Product Division,
Technology Unit

Kengo NAKAHARA