Thank you. Basically, bincount is just a call to inc_subtensor. I didn't know about atomicAdd. One day I'll look into CUDA programming :)
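To make that concrete, here is a minimal pure-Python sketch of bincount written as a scatter-add, mirroring the semantics of `inc_subtensor(x[idx], y)` described in the quoted reply. It is not Theano's implementation: `inc_subtensor_sketch` and `bincount_sketch` are hypothetical names made up for illustration, and the loop is sequential. The point is only that the number of updates equals the length of `y`, and that repeated indices must accumulate, which is the step that needs an atomic add once the updates run in parallel on a GPU.

```python
def inc_subtensor_sketch(x, idx, y):
    """Return a copy of x with y[k] added at position idx[k] for every k.

    Hypothetical stand-in for inc_subtensor(x[idx], y): repeated indices
    accumulate, and the loop body runs exactly len(y) times.
    """
    if len(idx) != len(y):
        raise ValueError("idx and y must have the same length")
    out = list(x)
    for i, v in zip(idx, y):
        # On a GPU, two threads may hit the same i at once, so this
        # read-modify-write would have to be an atomic add.
        out[i] += v
    return out


def bincount_sketch(values, minlength=0):
    """Count occurrences of each non-negative integer in values."""
    size = max(minlength, max(values, default=-1) + 1)
    # bincount is a scatter-add of ones into a zero-initialised vector.
    return inc_subtensor_sketch([0] * size, values, [1] * len(values))


counts = bincount_sketch([0, 1, 1, 3, 1])
print(counts)  # [1, 3, 0, 1]
```

In this sequential form the work is proportional to the number of input values; how the parallel version behaves when many values fall into the same bin depends on the hardware's atomic operations, which this sketch does not model.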
On Thursday, February 2, 2017 at 2:50:22 PM UTC+1, nouiz wrote:
>
> I don't recall the implementation of bincount and I'm offline.
>
> For inc_subtensor, we only modify the data as many times as there are
> elements in the new data, y in the code below:
>
> inc_subtensor(x[idx], y)
>
> On the GPU, we use the GPU feature atomicAdd.
>
> On Mon, Jan 30, 2017 at 07:51, Kiuhnm Mnhuik <[email protected]> wrote:
>
>> What's the complexity of bincount or inc_subtensor in Theano on the GPU?
>>
>> In particular, I assume there's a synchronization problem on the GPU when
>> the same count must be incremented by two different cores. How is this
>> handled on the GPU?
>>
>> I hope the complexity is not Omega(n). I could live with Theta(log n) on
>> average.
>>
>> You can answer the question on Stack Overflow if you want:
>> http://stackoverflow.com/q/41936363/4603642?sem=2
>>
>> --
>>
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "theano-users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
