If there is likely nothing in the Pool, then maybe it's better to not use 
one at all. Can you compare the internal workload with an implementation 
where all callers just call the `New` function directly? What's the purpose 
of using a pooled memory if there's often nothing in the pool?

On Wednesday, February 14, 2018 at 3:56:34 PM UTC-8, Carlo Alberto Ferraris 
wrote:
>
> In an attempt to reduce the time pprof says is spent in sync.Pool 
> (internal workloads, sorry) I modified Get and getSlow to skip locking the 
> per-P shared pools if the pools are likely to be empty. This yields 
> promising results, but I'm not sure the approach is sound since the check I 
> do is inherently racy.
>
> As a (artificial and contrived) benchmark, I'm using this:
>
> func BenchmarkPoolUnderflow(b *testing.B) {
>   var p Pool
>   b.RunParallel(func(pb *testing.PB) {
>     for pb.Next() {
>       p.Put(1)
>       p.Get()
>       p.Get()
>     }
>   })
> }
>
> This is meant to simulate a pool in which more or objects are Get() than 
> are Put() (it wouldn't make much sense to simulate a pool in which we only 
> get, so to keep things simple I opted for a 2:1 ratio)
>
> The change I applied to Get and getSlow is the following. Starting from 
> the current pattern of:
>
> l := ... # per-P poolLocal
> l.Lock()
> last := len(l.shared) - 1
> if last >= 0 {
>   x = l.shared[last]
>   l.shared = l.shared[:last]
> }
> l.Unlock()
>
> I add a check (not protected by the mutex, that is the expensive op we're 
> trying to skip if it's not necessary) to see if the pool is likely to be 
> non-empty: 
>
> l := ... # per-P poolLocal
> if len(l.shared) > 0 { # the racy check for non-emptiness
>   l.Lock()
>   last := len(l.shared) - 1
>   if last >= 0 {
>     x = l.shared[last]
>     l.shared = l.shared[:last]
>   }
>   l.Unlock()
> }
>
> I know I should not call this a benign race, but in this case I don't see 
> how this can lead to problems. If the racy check gets it right, then it's 
> almost a net win. If if it gets it wrong, either we do what we do now (i.e. 
> we lock, just to find an empty pool), or we skip an otherwise non-empty 
> pool - thereby failing to immediately return an otherwise reusable object 
> (note that 1. there is a per-P shared pool for each P, so I'd estimate the 
> chances of this happening on all of them to be pretty low and 2. the Pool 
> documentation explicitly say that Get is allowed to treat the pool as 
> empty). Also note that the race detector is already explicitly disabled in 
> all sync.Pool methods.
>
> The reason I'm asking is to understand whether my reasoning is sound and, 
> regardless, if anybody has suggestions about how to do this in a better way.
>
> The current results (of the approach above, plus some refactoring to 
> recover some lost performance on the other benchmarks) on my laptop are the 
> following:
>
>     name             old time/op  new time/op  delta
>     Pool-4           14.5ns ± 3%  14.2ns ± 2%   -1.64%  (p=0.023 n=9+10)
>     PoolOverflow-4   1.99µs ±12%  1.78µs ± 1%  -10.62%  (p=0.000 n=10+8)
>     PoolUnderflow-4   152ns ± 6%    30ns ± 1%  -80.00%  (p=0.000 n=10+8)
>
> (the first two benchmarks are already part of sync.Pool, the last one is 
> the one I described above)
>
> Any feedback is welcome. If this is deemed safe I'm going to submit a CL.
>
> Carlo
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to