On Dec 15, 2010, at 1:53 AM, Jean-Yves Migeon wrote: >>> On Wed, Dec 15, 2010 at 1:49 PM, Matt Thomas <m...@3am-software.com> wrote: >>>> The diffs are at http://www.netbsd.org/uvmexp-diff.txt > > Purely cosmetic comments: > - why are most of the cpu_nsoft count commented out?
Because we were counting ASTs as soft interrupts but they aren't. kern_softint.c already counts soft interrupts. > - in uvmexp_print(), just use PRIu64 instead of %llu (as you did in other > places) Good idea. On Dec 15, 2010, at 5:51 AM, Andrew Doran wrote: >> >> The diffs are at http://www.netbsd.org/uvmexp-diff.txt >> >> Have fun reading them! > > I did! > > I have three concerns around atomicity and concurrency. > > - curcpu()->ci_data.cpu_.... we can be preempted in the middle of this type > of dereference. So sometimes we could end up with the counter going > backwards. Admittedly there are many places where we are counting, > where interrupts are off. but that was true for the previous counters. Though 64 bit makes it a bit worse. > - 64 bit increment isn't atomic on a 32-bit or RISCy platform. So the > same sort of problem as above. Except 32 bit counters could/will eventually overflow. > - In uvm where we tally the counters, we can read them mid-update so > maybe we'll only see half the updated value. The reason I had the > periodic update in the clock interrupt was to avoid this problem. > > So all that said, maybe we don't care about the above problems but I wanted > to point them out. As long as there is not a "really bad" case then > perhaps it's OK. It's not really any worse that happens today. On Dec 15, 2010, at 5:35 AM, Matthew Mondor wrote: > On Tue, 14 Dec 2010 20:49:14 -0800 > Matt Thomas <m...@3am-software.com> wrote: > >> I have a fairly large but mostly simple patch which changes the stats >> collected in >> uvmexp for faults, intrs, softs, syscalls, and traps from 32 bit to 64 bits >> and >> puts them in cpu_data (in cpu_info). This makes more accurate and a little >> cheaper >> to update on 64bit systems. > > I like the cleanliness of the changes; > > A potential issue I see is how heavy this becomes on some 32-bit CPUs > i.e. m68k, where I see for instance 1 instruction being replaced by 9 > instructions (including registers save/restore) to increment a > counter. I'm not sure if in practice this will really affect > performance, or if it's worth benchmarking for those architectures, > however. Actually, that's only true for a spurious interrupts and since those shouldn't happen, who cares? Otherwise the increment happens inside already existing register save/restore. I could get rid of the lea but that would lead a longer instruction stream. I got rid of moveq and use addq.l #1, 4(%a1). So it's down to 5 instructions. Granted that's more one but that as cheap as I can't make > If it turned out to be a problem, I could see two possible solutions: > an option to disable some stat counters on slow systems (values could > simply remain 0 in that case), or a new counter type say, > cpustatcount_t and macros defined by the MD code to use 32-bit > cpu-specific counters where necessary, getting compiled/exported to > userland using 64-bit at statistics request time to avoid > compat/userland complications... > > Thanks, > -- > Matt On Dec 15, 2010, at 5:49 AM, Antti Kantee wrote: > On Tue Dec 14 2010 at 20:49:14 -0800, Matt Thomas wrote: >> I have a fairly large but mostly simple patch which changes the stats >> collected in >> uvmexp for faults, intrs, softs, syscalls, and traps from 32 bit to 64 bits >> and >> puts them in cpu_data (in cpu_info). This makes more accurate and a little >> cheaper >> to update on 64bit systems. > > And probably a lot cheaper on MP systems? Less contention for uvmexp and the cacheline with those stats is likely to stay live. > I was curious about the cache effects of uvmstats recently. Did you by > any chance do a before/after build.sh test on an SMP machine? I did not. On Dec 15, 2010, at 6:30 AM, Martin Husemann wrote: > I have one stupid question: why can't we leave the size of the counters > at 32bit on a per arch basis? Why don't we have the same issue with ev_count being 64bit? I dislike counters that overflow and then lose any meaning. We could have 32-bit per-cpu counters and then add back the merge to uvmexp (except this time doing a 64bit atomic_add) but then we lose per-cpu ness. > At a quick glance the sparc code looked v9 only, so will need some work. sparc--netbsdelf-gcc -mcpu=v7 generates the following for a 64bit increment: ldd [%o0], %g2 addcc %g3, 1, %g3 addx %g2, 0, %g2 std %g2, [%o0] For both SPARC and SPARC64, I've added INCR64 and INCR64X macros. The former uses %o0/%o1/%o2 (%o2 on 32-bit SPARC only) and INCR64X allows you specify the registers to be used. I updated the patch at http://www.netbsd.org/~matt/uvmexp-diff.txt Thanks for the comments. On to round 2!