Hi Fred,

Thank you for the quick reply and clever suggestions, you really hit the
nail on the head !

I tried the suggested vectorized code, but unfortunately still experienced
the max recursion limit exception.
I also tried the workaround, but still get the max recusion limit until I
set it to 20K and I get a segmentation fault:

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)


Can you think of other suggestions that I could try ?
If for-loop approach is doomed because of the max recursion issue, is there
perhaps a way I could rewrite my code such that  scan becomes reasonably
faster ?



On 16 February 2017 at 22:22, Frédéric Bastien <[email protected]>
wrote:

> Is the indices exatly the one you have in your code? If so, you can
> probably take advantage of this to vectorize the code and make the for loop
> case work.
>
> The size are so small then using scan will be super slow. Each iteration
> of scan have big overhead and you only work on scalar mostly at each
> iteration. So it is normal that it is slow with scan.
>
> For the for loop, there was a problem of indices. I fixed it. Here is a
> partial vectorization of the code that will make it faster to execute and
> make a smaller graph:
>
>     coss = tt.cos(theta)
>     sins = tt.sin(theta)
>
>     for k in range(0, K):
>         A = tt.set_subtensor(A[indices[k, 0], indices[k, 0]], coss[k])
>         A = tt.set_subtensor(A[indices[k, 1], indices[k, 1]], coss[k])
>         A = tt.set_subtensor(A[indices[k, 0], indices[k, 1]], sins[k])
>         A = tt.set_subtensor(A[indices[k, 1], indices[k, 0]], -sins[k])
>         B = tt.dot(A, B)
>
> The call to cos and sin are only done once. This make the graph smaller.
>
> You can also do the indexing call only once. THis make the graph still
> smaller:
>
>     coss = tt.cos(theta)
>     sins = tt.cos(theta)
>
>     for k in range(0, K):
>         idx0 = indices[k, 0]
>         idx1 = indices[k, 1]
>         A = tt.set_subtensor(A[idx0, idx0], coss[k])
>         A = tt.set_subtensor(A[idx1, idx1], coss[k])
>         A = tt.set_subtensor(A[idx0, idx1], sins[k])
>         A = tt.set_subtensor(A[idx1, idx0], -sins[k])
>         B = tt.dot(A, B)
>
> The if the indices is exactly what you created and don't change, it can
> become that is even smaller:
>
>     for k in range(0, K):
>         idx0 = K//(N-1)
>         idx1 = indices[k, 1]
>         A = tt.set_subtensor(A[idx0, idx0], coss[k])
>         A = tt.set_subtensor(A[idx1, idx1], coss[k])
>         A = tt.set_subtensor(A[idx0, idx1], sins[k])
>         A = tt.set_subtensor(A[idx1, idx0], -sins[k])
>         B = tt.dot(A, B)
>
> You can do similar with j.
>
> If with that, you still have the max recursion limit, try the work around
> in this issue.
>
> https://github.com/Theano/Theano/issues/3607
>
> Fred
>
> On Thu, Feb 16, 2017 at 7:53 AM Oussama Souihli <[email protected]>
> wrote:
>
>> Hi Fred, Adam, Kiuhnm, all
>>
>> Thank you for the detailed and quick replies and multiple suggestions.
>> I tried bacthed_dot but still experienced both issues (maximum depth
>> recursion when using a for loop and very slow epochs when using
>> theano.scan),
>> so now I'm thinking the problem is also possibly  due to the way I
>> populate the array A (using sub_tensor).
>>
>> Sorry for imposing on your kindness, but would you mind taking a look at
>> the attached, *very short*, minimal examples reproducing what I'm trying
>> to achieve ?
>> The examples are self-contained and only require python with numpy and
>> theano packages, nothing else.
>>
>>    - *Issue_with_for_loop.py*:
>>    - Reproduces the for-loop maximum recursion issue.
>>       - if you run it as is, it will throw the exception on B.eval()
>>
>>       - *Issue_with_scan.py*:
>>    - This one shows the alternative way of implementing the logic using
>>       scan.
>>       - Unfortunately in this case it returns a result and gives the
>>       impression like it works fine, but if I run it in Keras the epoch 
>> duration
>>       becomes prohibitively large (400,000 seconds per epoch !!)
>>
>>
>> If you can think of a way that can achieve the logic implemented in
>> either sheets with maximum efficiency theano-wise, I'd really appreciate
>> it.
>>
>>
>> With gratitude,
>> Oussama
>>
>> --
>>
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "theano-users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
> --
>
> ---
> You received this message because you are subscribed to a topic in the
> Google Groups "theano-users" group.
> To unsubscribe from this topic, visit https://groups.google.com/d/
> topic/theano-users/Yz8g-cejB8c/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> [email protected].
> For more options, visit https://groups.google.com/d/optout.
>

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to