Re: [go-nuts] Fast ConcurrentCounter without memory sharing

2016-08-14 Thread Ian Lance Taylor
On Sun, Aug 14, 2016 at 2:49 PM, Nigel Tao  wrote:
> On Aug 15, 2016 05:21, "Ian Lance Taylor"  wrote:
>> If these approaches seem awkward, I suppose you could get
>> syscall(syscall.GETCPU, ...) when you want to increment a counter.
>
> That still seems racy:
>
> n := syscall(syscall.GETCPU, ...)
> // The goroutine could be re-scheduled here to another CPU, between these
> two lines, so n is out of date.
> array[n]++ // read/writes wrong n, non-atomically.

Yes, you do need to use an atomic add.  But the point is that in the
normal case that particular memory location will only be accessed from
a single CPU, and as such it should be faster than having all the CPUs
access a single shared counter.

Ian

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


RE: [go-nuts] Fast ConcurrentCounter without memory sharing

2016-08-14 Thread John Souvestre
Hello Gaurav.

You might want to look at Jon Gjengset's project: 
https://github.com/jonhoo/drwmutex

Also, I seem to recall Dmitry saying that sync.Pool distributes locking.  So it 
might be worth looking into.

Another type of lock which might be of interest is MCS (or K42, CLH, HCLH).

John

John Souvestre - New Orleans LA

-Original Message-
From: golang-nuts@googlegroups.com [mailto:golang-nuts@googlegroups.com] On 
Behalf Of Ian Lance Taylor
Sent: 2016 August 14, Sun 14:21
To: Gaurav Agarwal
Cc: golang-nuts
Subject: Re: [go-nuts] Fast ConcurrentCounter without memory sharing

On Sun, Aug 14, 2016 at 9:58 AM, Gaurav Agarwal
<gauravagarw...@gmail.com> wrote:
> Ian, thanks for the explanation and the link !
>
> But I am still unclear how to implement such a concurrent counter in Go -
> given that we can't find out what thread/cpu is the goroutine executing.
> Note that in this case there was never the need of pinning a goroutine to a
> thread or cpu; just that we wanted to know which is it, at any given moment
> so as to access the required memory location.
>
> Do you have any alternate ideas for this?

I think the last time a similar though not identical concept was
discussed was this thread:

https://groups.google.com/d/msg/golang-nuts/zt_CQssHw4M/TteNG44geaEJ

It would be interesting to know what the potential speedup is.  It
should be easy enough to write a C program to measure that.  But when
writing such a program, remember that no real program will simply
increment a concurrent counter.  The question is not just how much
speedup you can get from a concurrent counter, but how much it will
matter to a real program.

Anyhow, that aside, if you know how many goroutines you are going to
run, just allocate a slice with that many elements.  If you don't, use
a set of fixed size arrays and have each goroutine grab a pointer to a
possibly-new array when it starts.  Pass down the pointer.

If these approaches seem awkward, I suppose you could get
syscall(syscall.GETCPU, ...) when you want to increment a counter.
But that would admittedly introduce overhead that would distort the
results.

We do not plan to modify the Go runtime to expose a thread or CPU ID,
because although it's true that your application doesn't care much if
the value becomes immediately stale, other applications would care.
It seems unwise to expose a value that is extremely easy to misuse in
a way that will often succeed and occasionally and inexplicably fail.

Ian

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [go-nuts] Fast ConcurrentCounter without memory sharing

2016-08-14 Thread Gaurav Agarwal
Ian, thanks for the explanation and the link !

But I am still unclear how to implement such a concurrent counter in Go -
given that we can't find out what thread/cpu is the goroutine executing.
Note that in this case there was never the need of pinning a goroutine to a
thread or cpu; just that we wanted to know which is it, at any given moment
so as to access the required memory location.

Do you have any alternate ideas for this?

On Sun, Aug 14, 2016 at 10:18 PM, Ian Lance Taylor  wrote:

> On Sun, Aug 14, 2016 at 2:25 AM, gaurav  wrote:
> >
> > How do I get to know the Thread (or CpuCore) id on which the goroutine is
> > running?
>
> Go does not expose this information.  In Go it's also unreliable
> because the language makes no guarantees about when a goroutine may
> move from one thread, or CPU, to another.
>
> > In the get() call of this counter, I need to sum up array values; how
> does
> > one ensure that the  latest values written to the array are visible to
> the
> > goroutine calling the get() method? Are the values written using atomic
> > package made visible to all cores immediately? Or else, does
> > atomic.LoadInt32()guarantee to return the latest values? I think the
> problem
> > is that atomic package does not say anything about visibility of memory
> > updates across cores.
>
> The lack of documentation is https://golang.org/issue/5045.  It's not
> the case that values written are visible to all cores for ordinary
> memory access, but they are visible if you use atomic.LoadNN.  What is
> unclear is what that says about other memory writes, but I think the
> current implementation, and all plausible future implementations,
> treat Store as a store-release and treat Load as a load-acquire.
>
> Ian
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[go-nuts] Fast ConcurrentCounter without memory sharing

2016-08-14 Thread gaurav
Hi there,

Just for heck of it - I am trying to come up with a ConcurrentCounter that 
does not suffer memory sharing or the cost of mutex lock/unlock.

The idea as described in the following Java code snippet is quite 
straightforward:

   - Pin every thread to a unique int32 variable.
   - Put these int32 variable in an array with sufficient 'dummy' elements 
   in between to eliminate false sharing.
   - In the inc() call, find the appropriate location using the thread/core 
   id and then increment it atomically.

https://github.com/ashkrit/blog/blob/master/src/main/java/counter/PaddedAtomicCounter.java

However, being new to Go, I am facing a few obstacles:

I thought of allocating a[numCores*16]uint32 and then use atomic.AddUint32() 
to atomically increment the individual element. However there are few basic 
things missing:

   - How do I get to know the Thread (or CpuCore) id on which the goroutine 
   is running? 
   - In the get() call of this counter, I need to sum up array values; how 
   does one ensure that the  latest values written to the array are visible to 
   the goroutine calling the get() method? Are the values written using 
   atomic package made visible to all cores immediately? Or else, does 
   atomic.LoadInt32()guarantee to return the latest values? I think the 
   problem is that atomic package does not say anything about visibility of 
   memory updates across cores.

--
cheers,
gaurav

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.