What's the complexity of bincount or inc_subtensor in Theano on the GPU?

In particular, I assume there is a synchronization problem when two 
different cores must increment the same count. How is this handled on the 
GPU?
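
To make the concern concrete, here is a minimal pure-Python sketch of the problem, not of Theano's GPU implementation. The function names and the two-phase "all threads read, then all threads write" model are assumptions chosen only for illustration; they say nothing about how Theano actually resolves the conflict.

```python
def bincount_reference(indices, num_bins):
    """Sequential reference: every occurrence of an index adds 1 to its bin."""
    counts = [0] * num_bins
    for i in indices:
        counts[i] += 1
    return counts


def bincount_racy_simulation(indices, num_bins):
    """Simulate unsynchronized read-modify-write (hypothetical model).

    Each element plays the role of one thread. All simulated threads read
    their bin before any of them writes, so duplicate indices overwrite
    each other's update instead of accumulating.
    """
    counts = [0] * num_bins
    # Phase 1: every simulated thread reads the current value of its bin.
    reads = [counts[i] for i in indices]
    # Phase 2: every simulated thread writes back (value it read) + 1.
    for i, old in zip(indices, reads):
        counts[i] = old + 1
    return counts


indices = [0, 2, 2, 2, 5]
correct = bincount_reference(indices, 6)
racy = bincount_racy_simulation(indices, 6)
print(correct)
print(racy)
```

Index 2 occurs three times, so the sequential version counts it three times while the simulated unsynchronized version records it only once. This lost-update case is the one I am asking about.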

I hope the complexity is not Omega(n). I could live with Theta(log n) on 
average.

You can also answer the question on Stack Overflow if you prefer:
http://stackoverflow.com/q/41936363/4603642?sem=2
