Re: svn commit: r252032 - head/sys/amd64/include

2013-06-26 Thread Gleb Smirnoff
Bruce, On Wed, Jun 26, 2013 at 11:42:39AM +1000, Bruce Evans wrote: B Anyway, as Gleb said, there is no point in B optimizing the i386 kernel. B B I said that there is every point in optimizing the i386 kernel. This B applies even more to other 32-bit arches. Some CPUs are much slower B

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-26 Thread Dmitry Morozovsky
On Wed, 26 Jun 2013, Gleb Smirnoff wrote: On Wed, Jun 26, 2013 at 11:42:39AM +1000, Bruce Evans wrote: B Anyway, as Gleb said, there is no point in B optimizing the i386 kernel. B B I said that there is every point in optimizing the i386 kernel. This B applies even more to other 32-bit

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-26 Thread Dmitry Morozovsky
On Tue, 25 Jun 2013, Konstantin Belousov wrote: Updates to the counter cannot be done from the interrupt context. This is fragile, however. It prevents using counters for things like counting interrupts. Most interrupt counting is now done directly and doesn't use PCPU_INC().

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-26 Thread Bruce Evans
On Wed, 26 Jun 2013, Gleb Smirnoff wrote: On Wed, Jun 26, 2013 at 11:42:39AM +1000, Bruce Evans wrote: B Anyway, as Gleb said, there is no point in B optimizing the i386 kernel. B B I said that there is every point in optimizing the i386 kernel. This B applies even more to other 32-bit

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-25 Thread Konstantin Belousov
On Tue, Jun 25, 2013 at 12:45:36PM +1000, Bruce Evans wrote: On Mon, 24 Jun 2013, Konstantin Belousov wrote: On Sun, Jun 23, 2013 at 07:57:57PM +1000, Bruce Evans wrote: The case that can't be fixed by rereading the counters is when fetching code runs in between the stores. If the stores

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-25 Thread Bruce Evans
On Tue, 25 Jun 2013, Konstantin Belousov wrote: On Tue, Jun 25, 2013 at 12:45:36PM +1000, Bruce Evans wrote: On Mon, 24 Jun 2013, Konstantin Belousov wrote: ... The following is the prototype for the x86. The other 64bit architectures are handled exactly like amd64. For 32bit !x86 arches,

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-25 Thread Gleb Smirnoff
On Mon, Jun 24, 2013 at 11:16:33PM +1000, Bruce Evans wrote: B K This is quite interesting idea, but I still did not decided if it B K acceptable. The issue is that we could add the carry to the other B K processor counter, if the preemption kicks in at right time between B K two

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-25 Thread Bruce Evans
On Tue, 25 Jun 2013, Gleb Smirnoff wrote: On Mon, Jun 24, 2013 at 11:16:33PM +1000, Bruce Evans wrote: B K This is quite interesting idea, but I still did not decided if it B K acceptable. The issue is that we could add the carry to the other B K processor counter, if the preemption kicks

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-25 Thread Konstantin Belousov
On Tue, Jun 25, 2013 at 08:14:41PM +1000, Bruce Evans wrote: On Tue, 25 Jun 2013, Konstantin Belousov wrote: Updates to the counter cannot be done from the interrupt context. This is fragile, however. It prevents using counters for things like counting interrupts. Most interrupt

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-25 Thread Bruce Evans
On Tue, 25 Jun 2013, Konstantin Belousov wrote: On Tue, Jun 25, 2013 at 08:14:41PM +1000, Bruce Evans wrote: On Tue, 25 Jun 2013, Konstantin Belousov wrote: Updates to the counter cannot be done from the interrupt context. This is fragile, however. It prevents using counters for things

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-24 Thread Gleb Smirnoff
On Sun, Jun 23, 2013 at 10:33:43AM +0300, Konstantin Belousov wrote: K On Sat, Jun 22, 2013 at 06:58:15PM +1000, Bruce Evans wrote: K So the i386 version be simply addl; adcl to memory. Each store in K this is atomic at the per-CPU level. If there is no carry, then the K separate stores are

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-24 Thread Gleb Smirnoff
Bruce, did you run your benchmarks in userland or in kernel? How many parallel threads were updating the same counter? Can you please share your benchmarks? -- Totus tuus, Glebius.

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-24 Thread Bruce Evans
On Mon, 24 Jun 2013, Gleb Smirnoff wrote: did you run your benchmarks in userland or in kernel? How many parallel threads were updating the same counter? Can you please share your benchmarks? Only userland, with 1 thread. I don't have any more benchmarks than the test program in previous

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-24 Thread Bruce Evans
On Mon, 24 Jun 2013, Gleb Smirnoff wrote: On Sun, Jun 23, 2013 at 10:33:43AM +0300, Konstantin Belousov wrote: K On Sat, Jun 22, 2013 at 06:58:15PM +1000, Bruce Evans wrote: K So the i386 version be simply addl; adcl to memory. Each store in K this is atomic at the per-CPU level. If there

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-24 Thread mdf
[snipping everything about counter64, atomic ops, cycles, etc.] I wonder if the idea explained in this paper: http://static.usenix.org/event/usenix03/tech/freenix03/full_papers/mcgarry/mcgarry_html/ Which seems to be used in FreeBSD for some ARM atomics:

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-24 Thread Konstantin Belousov
On Sun, Jun 23, 2013 at 07:57:57PM +1000, Bruce Evans wrote: The case that can't be fixed by rereading the counters is when fetching code runs in between the stores. If the stores are on another CPU that is currently executing them, then we can keep checking that the counters don't change
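
The rereading idea quoted above can be illustrated with a minimal C sketch. It assumes a 64-bit counter stored as two 32-bit words; the names split_counter64 and split_fetch are hypothetical and do not come from the FreeBSD tree. The reader retries while the high word changes under it. As the quoted text points out, this does not help when the fetch runs between the writer's two stores and the writer then makes no progress: both reads of the high word agree and the torn value is returned.

```c
#include <stdint.h>

/* Hypothetical 64-bit counter kept as two 32-bit words (not FreeBSD code). */
struct split_counter64 {
	volatile uint32_t lo;
	volatile uint32_t hi;
};

/*
 * Fetch by rereading: sample the high word, then the low word, then
 * check that the high word has not changed in the meantime.
 */
static uint64_t
split_fetch(const struct split_counter64 *c)
{
	uint32_t hi, lo;

	do {
		hi = c->hi;
		lo = c->lo;
	} while (hi != c->hi);	/* high word changed: sample again */

	return (((uint64_t)hi << 32) | lo);
}
```

If a writer has already wrapped lo from 0xffffffff to 0 but has not yet incremented hi, split_fetch() sees a stable high word and returns a value that is 2^32 too small.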

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-24 Thread Bruce Evans
On Mon, 24 Jun 2013, Konstantin Belousov wrote: On Sun, Jun 23, 2013 at 07:57:57PM +1000, Bruce Evans wrote: The case that can't be fixed by rereading the counters is when fetching code runs in between the stores. If the stores are on another CPU that is currently executing them, then we

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-24 Thread Bruce Evans
On Tue, 25 Jun 2013, I wrote: My current best design: - use ordinary mutexes to protect counter fetches in non-per-CPU contexts. - use native-sized or always 32-bit counters. Counter updates are done by a single addl on i386. Fix pcpu.h on arches other than amd64 and i386 and use the same

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-24 Thread Daniel O'Connor
On 25/06/2013, at 12:54, Bruce Evans b...@optusnet.com.au wrote: - run a daemon every few minutes to fetch all the counters, so that the native-sized counters are in no danger of overflowing on systems that don't run statistics programs often enough to fetch the counters to actually use.
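
A rough C sketch of the daemon idea discussed above, under stated assumptions: each CPU owns a 32-bit slot, and a periodic harvester folds the change since its last pass into a 64-bit total. The struct layout, the NCPU value and the function names are all hypothetical. Locking is omitted; real code would hold a mutex around the harvest.

```c
#include <assert.h>
#include <stdint.h>

#define	NCPU	4	/* hypothetical CPU count for this sketch */

struct wide_counter {
	uint32_t pcpu[NCPU];	/* fast path: each CPU bumps only its slot */
	uint32_t last[NCPU];	/* slot values seen at the previous harvest */
	uint64_t total;		/* 64-bit sum maintained by the harvester */
};

/* Fast path: a single 32-bit add on the given CPU's slot. */
static void
wide_counter_add(struct wide_counter *c, int cpu, uint32_t inc)
{
	assert(cpu >= 0 && cpu < NCPU);
	c->pcpu[cpu] += inc;	/* wraps modulo 2^32 */
}

/* Slow path: run periodically and by anyone who wants the value. */
static uint64_t
wide_counter_harvest(struct wide_counter *c)
{
	for (int cpu = 0; cpu < NCPU; cpu++) {
		uint32_t cur = c->pcpu[cpu];

		/* Unsigned subtraction yields the delta across one wrap. */
		c->total += (uint32_t)(cur - c->last[cpu]);
		c->last[cpu] = cur;
	}
	return (c->total);
}
```

The total stays correct only if every slot is harvested before it advances by 2^32 or more, which is why the harvest has to run often enough on busy systems.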

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-23 Thread Konstantin Belousov
On Sat, Jun 22, 2013 at 01:37:58PM +1000, Bruce Evans wrote: On Sat, 22 Jun 2013, I wrote: ... Here are considerably expanded tests, with noninline tests dropped. Summary of times on Athlon64: simple increment: 4-7 cycles (1) simple increment preceded

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-23 Thread Konstantin Belousov
On Sat, Jun 22, 2013 at 06:58:15PM +1000, Bruce Evans wrote: So the i386 version be simply addl; adcl to memory. Each store in this is atomic at the per-CPU level. If there is no carry, then the separate stores are equivalent to adding separate nonnegative values and the counter value is
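
To make the carry case concrete, here is a small portable C model of a two-step update in the style of addl; adcl. It is only a sketch: the names are hypothetical and the two functions stand in for the two instructions, they are not the kernel's implementation. With no carry, the value read between the steps is still sensible; with a carry, it is temporarily too small by 2^32.

```c
#include <stdint.h>

/* Hypothetical 64-bit counter stored as two 32-bit words, as on i386. */
struct split_counter {
	uint32_t lo;
	uint32_t hi;
};

/* Step 1, modelling addl: add to the low word, return the carry out. */
static uint32_t
split_add_lo(struct split_counter *c, uint32_t inc)
{
	uint32_t old = c->lo;

	c->lo = old + inc;		/* wraps modulo 2^32 */
	return (c->lo < old);		/* 1 if the addition carried */
}

/* Step 2, modelling adcl: add the increment's high word plus carry. */
static void
split_add_hi(struct split_counter *c, uint32_t inc_hi, uint32_t carry)
{
	c->hi += inc_hi + carry;
}

/* What a reader gets if it combines the two words at this instant. */
static uint64_t
split_read(const struct split_counter *c)
{
	return (((uint64_t)c->hi << 32) | c->lo);
}
```

Starting from lo = 0xffffffff and hi = 0, adding 1 with split_add_lo() leaves a combined value of 0 until split_add_hi() applies the carry and brings it to 2^32.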

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-23 Thread Bruce Evans
On Sun, 23 Jun 2013, Konstantin Belousov wrote: On Sat, Jun 22, 2013 at 01:37:58PM +1000, Bruce Evans wrote: On Sat, 22 Jun 2013, I wrote: ... Here are considerably expanded tests, with noninline tests dropped. Summary of times on Athlon64: simple increment:

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-23 Thread Bruce Evans
On Sun, 23 Jun 2013, Konstantin Belousov wrote: On Sat, Jun 22, 2013 at 06:58:15PM +1000, Bruce Evans wrote: So the i386 version be simply addl; adcl to memory. Each store in this is atomic at the per-CPU level. If there is no carry, then the separate stores are equivalent to adding separate

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-23 Thread Bruce Evans
On Sun, 23 Jun 2013, I wrote: I thought of lots of variations, but couldn't find one that works perfectly. One idea (that goes with the sign check on the low 32 bits) is to use a misaligned add to memory to copy the 31st bit as a carry bit to the high word. The value of the counter is

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-22 Thread Bruce Evans
On Sat, 22 Jun 2013, I wrote: On Sat, 22 Jun 2013, I wrote: ... Here are considerably expanded tests, with noninline tests dropped. Summary of times on Athlon64: simple increment: 4-7 cycles (1) simple increment preceded by feature test: 5-8 cycles (1)

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-21 Thread Konstantin Belousov
On Fri, Jun 21, 2013 at 12:15:24PM +1000, Lawrence Stewart wrote: Hi Kostik, On 06/21/13 00:30, Konstantin Belousov wrote: Author: kib Date: Thu Jun 20 14:30:04 2013 New Revision: 252032 URL: http://svnweb.freebsd.org/changeset/base/252032 Log: Allow immediate operand.

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-21 Thread Gleb Smirnoff
Bruce, On Fri, Jun 21, 2013 at 09:04:34AM +1000, Bruce Evans wrote: B The i386 version of the counter asm doesn't support the immediate B constraint for technical reasons. 64 bit counters are too large and B slow to use on i386, especially when they are implemented as they are B without

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-21 Thread Bruce Evans
On Fri, 21 Jun 2013, Gleb Smirnoff wrote: On Fri, Jun 21, 2013 at 09:04:34AM +1000, Bruce Evans wrote: B The i386 version of the counter asm doesn't support the immediate B constraint for technical reasons. 64 bit counters are too large and B slow to use on i386, especially when they are

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-21 Thread Gleb Smirnoff
Bruce, On Fri, Jun 21, 2013 at 09:02:36PM +1000, Bruce Evans wrote: B Not if it is a 32-bit increment on 32-bit systems, as it should be. B B I said to use a daemon to convert small (16 or 32 bit) counters into B larger (32 or 64 bit) ones. It is almost as efficient to call the B accumulation

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-21 Thread Bruce Evans
On Fri, 21 Jun 2013, Gleb Smirnoff wrote: On Fri, Jun 21, 2013 at 09:02:36PM +1000, Bruce Evans wrote: B Not if it is a 32-bit increment on 32-bit systems, as it should be. B B I said to use a daemon to convert small (16 or 32 bit) counters into B larger (32 or 64 bit) ones. It is almost as

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-21 Thread Bruce Evans
On Sat, 22 Jun 2013, I wrote: ... Here are considerably expanded tests, with noninline tests dropped. Summary of times on Athlon64: simple increment: 4-7 cycles (1) simple increment preceded by feature test: 5-8 cycles (1) simple 32-bit increment:

svn commit: r252032 - head/sys/amd64/include

2013-06-20 Thread Konstantin Belousov
Author: kib Date: Thu Jun 20 14:30:04 2013 New Revision: 252032 URL: http://svnweb.freebsd.org/changeset/base/252032 Log: Allow immediate operand. Sponsored by: The FreeBSD Foundation Modified: head/sys/amd64/include/counter.h Modified: head/sys/amd64/include/counter.h

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-20 Thread Bruce Evans
On Thu, 20 Jun 2013, Konstantin Belousov wrote: Log: Allow immediate operand. .. Modified: head/sys/amd64/include/counter.h == --- head/sys/amd64/include/counter.h Thu Jun 20 14:20:03 2013 (r252031) +++

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-20 Thread Bruce Evans
On Fri, 21 Jun 2013, I wrote: On Thu, 20 Jun 2013, Konstantin Belousov wrote: ... @@ -44,7 +44,7 @@ counter_u64_add(counter_u64_t c, int64_t ... The i386 version of the counter asm doesn't support the immediate constraint for technical reasons. 64 bit counters are too large and slow to use

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-20 Thread Bruce Evans
On Fri, 21 Jun 2013, Bruce Evans wrote: On Fri, 21 Jun 2013, I wrote: On Thu, 20 Jun 2013, Konstantin Belousov wrote: ... @@ -44,7 +44,7 @@ counter_u64_add(counter_u64_t c, int64_t ... The i386 version of the counter asm doesn't support the immediate constraint for technical reasons. 64

Re: svn commit: r252032 - head/sys/amd64/include

2013-06-20 Thread Lawrence Stewart
Hi Kostik, On 06/21/13 00:30, Konstantin Belousov wrote: Author: kib Date: Thu Jun 20 14:30:04 2013 New Revision: 252032 URL: http://svnweb.freebsd.org/changeset/base/252032 Log: Allow immediate operand. Sponsored by: The FreeBSD Foundation Modified: