What's the complexity of bincount or inc_subtensor in Theano on the GPU?
In particular, I assume there's a synchronization problem on the GPU when two different cores must increment the same count. How is this handled on the GPU? I hope the complexity is not Omega(n); I could live with Theta(log n) on average.

You can answer the question on Stack Overflow if you want: http://stackoverflow.com/q/41936363/4603642?sem=2
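
To make the concern concrete, here is a minimal sequential sketch in plain Python of what I mean by the two operations. It is only a reference for the semantics, not how Theano implements them on the GPU. The function names `bincount_reference` and `inc_at_reference` are hypothetical, and I am assuming that repeated indices accumulate in both cases.

```python
def bincount_reference(indices, minlength=0):
    """Sequential reference: counts[i] is how many times i occurs in indices."""
    size = max(max(indices, default=-1) + 1, minlength)
    counts = [0] * size
    for i in indices:
        # Read-modify-write on counts[i]. If two parallel workers did this
        # for the same i without synchronization, one update could be lost.
        counts[i] += 1
    return counts


def inc_at_reference(x, indices, values):
    """Sequential reference for incrementing x at indices (assumed semantics:
    values for repeated indices are summed, none are dropped)."""
    out = list(x)
    for i, v in zip(indices, values):
        out[i] += v
    return out


# Index 1 appears twice, so both updates must land on the same slot.
counts = bincount_reference([1, 1, 3])
incremented = inc_at_reference([0, 0, 0], [1, 1], [5, 7])
```

The repeated index 1 is the case I am worried about: run sequentially this takes n steps for n indices, and my question is what the cost becomes when those increments are spread over GPU cores.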
