There's an example here for addition; multiplication will look very 
similar:
http://deeplearning.net/software/theano/tutorial/adding.html#adding-two-matrices
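
As a rough sketch of the NumPy comparison from my earlier message (the input values are made up for illustration), the elementwise version would look something like this:

```python
import numpy as np

# Made-up example inputs: all of the i's in one array, all of the j's in another.
i = np.array([1.0, 2.0, 3.0, 4.0])
j = np.array([10.0, 20.0, 30.0, 40.0])

# A single elementwise multiply replaces the per-pair loop (or scan).
products = i * j
print(products)
```

The Theano version should have the same shape: declare two symbolic vectors with theano.tensor.vector, build i * j, and compile it with theano.function([i, j], i * j), as in the linked addition example.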

On Thursday, October 27, 2016 at 10:42:38 AM UTC-7, Jesse Livezey wrote:
>
> It will look similar to creating two NumPy arrays and multiplying them 
> elementwise, except it will perform the multiplications in parallel on the 
> GPU.
>
> On Thursday, October 27, 2016 at 10:41:33 AM UTC-7, Jesse Livezey wrote:
>>
>> If you can create a Theano vector that has all of the i's and a second 
>> Theano vector that has all of the j's, then you can just do i*j and it 
>> will perform all of the multiplications in parallel.
>>
>> On Wednesday, October 26, 2016 at 11:48:06 PM UTC-7, [email protected] 
>> wrote:
>>>
>>> I would like to compute the result of i*j for a number of i's and j's, 
>>> and I would like to do so concurrently. If I use the scan function over my 
>>> sequence of i's and j's, I will get my desired result, but it will not 
>>> perform the operations concurrently. If I have 100 cores in my single GPU, 
>>> I would like there to be 100 asynchronous computations (technically more 
>>> since each core has multiple threads) of the multiplication and final 
>>> assignment to one vector that will be returned. This is similar to how 
>>> multiprocessing works in base python with CPU cores. The Theano tutorial 
>>> claims that it uses GPU asynchronous capabilities, but I am not sure of 
>>> that, as I have run scan functions and they seem to run as fast as or 
>>> slower than the CPU.
>>>
>>> Should I not use scan? Can this even be done in Theano? Do I have to use 
>>> PyCUDA?
>>>
>>
