Re: [theano-users] Re: IfElse GPU version

2017-04-19 Thread Frédéric Bastien
It is more efficient than what you describe.

The condition is evaluated first and is either:
- completely evaluated on the CPU,
- or transferred to the CPU.

Then we evaluate one of the two branches depending on the result of the
condition. All of those evaluations can happen on the GPU; they won't be
transferred to the CPU.

The only inefficiency is that sometimes both branches get completely
evaluated. This can be fixed by disabling another optimization, but that has
its own problems. If you want to try that, use the Theano flag:
optimizer_excluding=inplace
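
For example, a minimal sketch (the script name below is just a placeholder;
the flag has to be set before Theano is imported to take effect):

import os
# This replaces any THEANO_FLAGS value already set in the environment.
os.environ['THEANO_FLAGS'] = 'optimizer_excluding=inplace'
import theano

which is equivalent to running
THEANO_FLAGS='optimizer_excluding=inplace' python my_script.py
from the shell.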

The other inefficiency is that if the else branch is taken, we make a copy of
that value. But the copy stays on the device where the original data is (so
CPU or GPU).

Fred

On Fri, Mar 24, 2017 at 12:34 PM Šarūnas S.  wrote:

> I am using Theano version 0.9.0.rc2.dev.
>
>
>
>
> On Friday, 24 March 2017 17:32:33 UTC+1, Šarūnas S. wrote:
>>
>> In my graph I have a few IfElse nodes and I am wondering how and where
>> they are executed.
>>
>> At first I ran the code with linker=cvm in my THEANO_FLAGS but after
>> profiling it looked like the ifelse is being executed on the CPU. Then I
>> forced the linker=c to check whether the IfElse will go through and I got
>> the NotImplementedError: if{inplace, gpu} cannot produce C code. Btw
>> removing inline optimization did not help as it still gave the same error.
>>
>> So does IfElse have a GPU implementation? If yes how do I use it? Also,
>> does it do lazy evaluation or not?
>>


Re: [theano-users] Re: IfElse GPU version

2017-04-13 Thread Šarūnas S .
OK, I think it's getting clearer now with your help. Thanks.

As far as I understand, on each ifelse call the condition gets evaluated on
the CPU and then the branches on the GPU. But if the condition is based on
some variable on the GPU, the data would be transferred back to the CPU,
evaluated there, and then transferred back to the GPU to be processed in the
branches, right?

Thus this implementation sounds like it could easily bottleneck the whole
computation.

On Tuesday, 11 April 2017 18:02:21 UTC+3, nouiz wrote:
>
> ifelse works on the GPU. The "PY" just means it uses the Python interface,
> but it still works on the GPU. Only the condition stays on the CPU; both
> branches are moved to the GPU.
>
> If you want to make sure of that, put a Python breakpoint in the file
> ifelse.py, in the method thunk(). You will see the input data isn't a
> numpy.ndarray.
>
> Fred
>
On Sun, Mar 26, 2017 at 10:51 AM Šarūnas S. wrote:
>
>> Indeed this was my first approach, but due to many small variations the
>> number of graphs is a bit too big to manage. Currently I precompile a few
>> trees, and for the remainder of the variations I do an ifelse variant with
>> boolean operations, which reduces the number of trees at the cost of
>> computational inefficiency.
>>
>> But I am most interested in the ifelse status, since I could use a single
>> tree and with lazy evaluation would have full computational efficiency.
>> That's what GPUs and Theano are all about, right?
>>
>>
>>
>>
>>
>> On Sunday, 26 March 2017 05:26:37 UTC+2, Jesse Livezey wrote:
>>>
>>>> I have decided to precompile a general graph in which all the possible
>>>> graphs are nested. Then at runtime I would select which parts of the
>>>> general graph to use via the *allowed_branch* variables and *if* nodes.
>>>> Since afaik ifs are evaluated lazily, in each case I would only be using
>>>> the relevant part of the graph, so my computational cost is minimal.
>>>
>>>
>>> Have you considered precompiling all possible graphs individually and 
>>> then just using python conditionals to choose a graph? Maybe this won't 
>>> work for your system, but it might be easier to get right.
>>>
>>> On Saturday, March 25, 2017 at 2:21:27 AM UTC-7, Šarūnas S. wrote:

 Nouiz, sorry, now I understand what you were referring to by "is constant".
 I've misled you with my example.

 This is a more realistic example:

 import numpy as np
 import theano as th
 import theano.ifelse
 import theano.tensor as T

 allowed_branch = th.shared( np.cast['float32']( 0 ) )

 x = T.matrix('x')
 y = T.matrix('y')
 f = x ** 2 + y ** 2 + 2 * x * y

 result = th.ifelse.ifelse( T.gt( allowed_branch, T.constant( 0 ) ), f,
 T.zeros( (2,2) ) )


 I am working on a realtime system which, in a given situation, will
 construct a relevant computational graph, compute its result, and display
 it.
 However, the graphs are relatively big and each compilation takes too long,
 so I can't compile in realtime. Thus I have to somehow precompile.

 I have decided to precompile a general graph in which all the possible
 graphs are nested. Then at runtime I would select which parts of the
 general graph to use via the *allowed_branch* variables and *if* nodes.
 Since afaik ifs are evaluated lazily, in each case I would only be using
 the relevant part of the graph, so my computational cost is minimal.


 On Saturday, 25 March 2017 10:04:21 UTC+1, Šarūnas S. wrote:
>
> I suspect that ifelse is running on GPU because this is the profile I 
> get
>
> ==
>   Message: Sum of all(44) printed profiles at exit excluding Scan op 
> profile.
>   Time in 95 calls to Function.__call__: 2.309995e-01s
>   Time in Function.fn.__call__: 2.25e-01s (99.567%)
>   Time in thunks: 2.307765e-01s (99.903%)
>   Total compile time: 1.360100e+01s
> Number of Apply nodes: 416
> Theano Optimizer time: 6.314001e+00s
>Theano validate time: 9.200015e-01s
> Theano Linker time (includes C, CUDA code generation/compiling): 
> 1.169000e+00s
>Import time 2.799892e-02s
>Node make_thunk time 1.108999e+00s
>Node GpuElemwise{Composite{(i0 * ((i1 * i2) + (i1 * 
> i3)))}}[(0, 2)](CudaNdarrayConstant{0.5}, 
> CudaNdarrayConstant{0.83313465}, GpuCAReduce{add}{1,1}.0, 
> GpuCAReduce{add}{1,1}.0) time 6.69e-03s
>Node GpuElemwise{Composite{(-minimum(i0, 
> maximum(minimum(i0, (maximum((i1 - i2), i3) + i2)), (((i1 + i2) * i4) + 
> i1},no_inplace}(, 
> , , 
> CudaNdarrayConstant{120.0}, ) time 
> 4.999876e-03s
>Node GpuElemwise{mul,no_inplace}(

Re: [theano-users] Re: IfElse GPU version

2017-04-11 Thread Frédéric Bastien
ifelse works on the GPU. The "PY" just means it uses the Python interface,
but it still works on the GPU. Only the condition stays on the CPU; both
branches are moved to the GPU.

If you want to make sure of that, put a Python breakpoint in the file
ifelse.py, in the method thunk(). You will see the input data isn't a
numpy.ndarray.
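
Another quick check is to look at the compiled graph. A self-contained sketch
(it reuses the example from earlier in the thread and assumes a run with
device=gpu):

import numpy as np
import theano
import theano.tensor as T
from theano.ifelse import ifelse

allowed_branch = theano.shared(np.float32(0))
x = T.matrix('x')
y = T.matrix('y')
expr = x ** 2 + y ** 2 + 2 * x * y
result = ifelse(T.gt(allowed_branch, 0), expr, T.zeros((2, 2)))

fn = theano.function([x, y], result)
# Print every op in the optimized graph: on a GPU run the branch
# computations show up as Gpu* ops (GpuElemwise, GpuCAReduce, ...), even
# though the IfElse node itself is listed as a Python ("Py") op.
for node in fn.maker.fgraph.toposort():
    print(node.op)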

Fred

On Sun, Mar 26, 2017 at 10:51 AM Šarūnas S.  wrote:

Indeed this was my first approach, but due to many small variations the
number of graphs is a bit too big to manage. Currently I precompile a few
trees, and for the remainder of the variations I do an ifelse variant with
boolean operations, which reduces the number of trees at the cost of
computational inefficiency.

But I am most interested in the ifelse status, since I could use a single
tree and with lazy evaluation would have full computational efficiency.
That's what GPUs and Theano are all about, right?





On Sunday, 26 March 2017 05:26:37 UTC+2, Jesse Livezey wrote:

I have decided to precompile a general graph in which all the possible
graphs are nested. Then at runtime I would select which parts of the
general graph to use via the *allowed_branch* variables and *if* nodes.
Since afaik ifs are evaluated lazily, in each case I would only be using
the relevant part of the graph, so my computational cost is minimal.


Have you considered precompiling all possible graphs individually and then
just using python conditionals to choose a graph? Maybe this won't work for
your system, but it might be easier to get right.

On Saturday, March 25, 2017 at 2:21:27 AM UTC-7, Šarūnas S. wrote:

Nouiz, sorry, now I understand what you were referring to by "is constant".
I've misled you with my example.

This is a more realistic example:

import numpy as np
import theano as th
import theano.ifelse
import theano.tensor as T

allowed_branch = th.shared( np.cast['float32']( 0 ) )

x = T.matrix('x')
y = T.matrix('y')
f = x ** 2 + y ** 2 + 2 * x * y

result = th.ifelse.ifelse( T.gt( allowed_branch, T.constant( 0 ) ), f,
T.zeros( (2,2) ) )


I am working on a realtime system which, in a given situation, will
construct a relevant computational graph, compute its result, and display
it.
However, the graphs are relatively big and each compilation takes too long,
so I can't compile in realtime. Thus I have to somehow precompile.

I have decided to precompile a general graph in which all the possible
graphs are nested. Then at runtime I would select which parts of the
general graph to use via the *allowed_branch* variables and *if* nodes.
Since afaik ifs are evaluated lazily, in each case I would only be using
the relevant part of the graph, so my computational cost is minimal.


On Saturday, 25 March 2017 10:04:21 UTC+1, Šarūnas S. wrote:

I suspect that ifelse is running on GPU because this is the profile I get

==
  Message: Sum of all(44) printed profiles at exit excluding Scan op
profile.
  Time in 95 calls to Function.__call__: 2.309995e-01s
  Time in Function.fn.__call__: 2.25e-01s (99.567%)
  Time in thunks: 2.307765e-01s (99.903%)
  Total compile time: 1.360100e+01s
Number of Apply nodes: 416
Theano Optimizer time: 6.314001e+00s
   Theano validate time: 9.200015e-01s
Theano Linker time (includes C, CUDA code generation/compiling):
1.169000e+00s
   Import time 2.799892e-02s
   Node make_thunk time 1.108999e+00s
   Node GpuElemwise{Composite{(i0 * ((i1 * i2) + (i1 * i3)))}}[(0,
2)](CudaNdarrayConstant{0.5}, CudaNdarrayConstant{0.83313465},
GpuCAReduce{add}{1,1}.0, GpuCAReduce{add}{1,1}.0) time 6.69e-03s
   Node GpuElemwise{Composite{(-minimum(i0, maximum(minimum(i0,
(maximum((i1 - i2), i3) + i2)), (((i1 + i2) * i4) +
i1},no_inplace}(,
, ,
CudaNdarrayConstant{120.0}, ) time
4.999876e-03s
   Node GpuElemwise{mul,no_inplace}(, GpuElemwise{TrueDiv}[(0, 0)].0) time 4.000187e-03s
   Node HostFromGpu() time
3.49e-03s
   Node GpuElemwise{Mul}[(0, 1)](GpuDimShuffle{x,x}.0,
GpuDimShuffle{x,0}.0) time 3.49e-03s

Time in all call to theano.grad() 0.00e+00s
Time since theano import 28.959s
Class
---
<% time> <#call> <#apply>

  55.4%55.4%   0.128s   8.71e-05s C 1468 301
theano.sandbox.cuda.basic_ops.GpuElemwise
  25.6%81.0%   0.059s   1.03e-04s C  571 106
theano.sandbox.cuda.basic_ops.GpuCAReduce
   9.1%90.1%   0.021s   3.72e-05s C  564 150
theano.sandbox.cuda.basic_ops.HostFromGpu
   5.6%95.7%   0.013s   6.04e-06s Py2148 168
theano.ifelse.IfElse
   3.5%99.1%   0.008s   2.16e-04s C   37   4
theano.compile.ops.DeepCopyOp
   0.4%99.6%   0.001s   1.60e-06s C  623 122
theano.sandbox.cuda.basic_ops.GpuDimShuffle
   0.4%   100.0% 

Re: [theano-users] Re: IfElse GPU version

2017-03-26 Thread Šarūnas S .
Indeed this was my first approach, but due to many small variations the
number of graphs is a bit too big to manage. Currently I precompile a few
trees, and for the remainder of the variations I do an ifelse variant with
boolean operations, which reduces the number of trees at the cost of
computational inefficiency.

But I am most interested in the ifelse status, since I could use a single
tree and with lazy evaluation would have full computational efficiency.
That's what GPUs and Theano are all about, right?
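
(For reference, by "an ifelse variant with boolean operations" I mean roughly
the sketch below; the names are made up, and T.switch evaluates both branches,
which is exactly the computational inefficiency mentioned above:)

import numpy as np
import theano
import theano.tensor as T

allowed_branch = theano.shared(np.float32(0))
x = T.matrix('x')
branch_a = x ** 2           # the "real" subgraph for this variation
branch_b = T.zeros_like(x)  # placeholder result when the branch is disabled
# switch is element-wise, so both branches are always computed,
# unlike a lazy ifelse
result = T.switch(T.gt(allowed_branch, 0), branch_a, branch_b)
fn = theano.function([x], result)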




On Sunday, 26 March 2017 05:26:37 UTC+2, Jesse Livezey wrote:
>
>> I have decided to precompile a general graph in which all the possible
>> graphs are nested. Then at runtime I would select which parts of the
>> general graph to use via the *allowed_branch* variables and *if* nodes.
>> Since afaik ifs are evaluated lazily, in each case I would only be using
>> the relevant part of the graph, so my computational cost is minimal.
>
>
> Have you considered precompiling all possible graphs individually and then 
> just using python conditionals to choose a graph? Maybe this won't work for 
> your system, but it might be easier to get right.
>
> On Saturday, March 25, 2017 at 2:21:27 AM UTC-7, Šarūnas S. wrote:
>>
>> Nouiz, sorry, now I understand what you were referring to by "is constant".
>> I've misled you with my example.
>>
>> This is a more realistic example:
>>
>> import numpy as np
>> import theano as th
>> import theano.ifelse
>> import theano.tensor as T
>>
>> allowed_branch = th.shared( np.cast['float32']( 0 ) )
>>
>> x = T.matrix('x')
>> y = T.matrix('y')
>> f = x ** 2 + y ** 2 + 2 * x * y
>>
>> result = th.ifelse.ifelse( T.gt( allowed_branch, T.constant( 0 ) ), f,
>> T.zeros( (2,2) ) )
>>
>>
>> I am working on a realtime system which, in a given situation, will
>> construct a relevant computational graph, compute its result, and display
>> it.
>> However, the graphs are relatively big and each compilation takes too long,
>> so I can't compile in realtime. Thus I have to somehow precompile.
>>
>> I have decided to precompile a general graph in which all the possible
>> graphs are nested. Then at runtime I would select which parts of the
>> general graph to use via the *allowed_branch* variables and *if* nodes.
>> Since afaik ifs are evaluated lazily, in each case I would only be using
>> the relevant part of the graph, so my computational cost is minimal.
>>
>>
>> On Saturday, 25 March 2017 10:04:21 UTC+1, Šarūnas S. wrote:
>>>
>>> I suspect that ifelse is running on GPU because this is the profile I get
>>>
>>> ==
>>>   Message: Sum of all(44) printed profiles at exit excluding Scan op 
>>> profile.
>>>   Time in 95 calls to Function.__call__: 2.309995e-01s
>>>   Time in Function.fn.__call__: 2.25e-01s (99.567%)
>>>   Time in thunks: 2.307765e-01s (99.903%)
>>>   Total compile time: 1.360100e+01s
>>> Number of Apply nodes: 416
>>> Theano Optimizer time: 6.314001e+00s
>>>Theano validate time: 9.200015e-01s
>>> Theano Linker time (includes C, CUDA code generation/compiling): 
>>> 1.169000e+00s
>>>Import time 2.799892e-02s
>>>Node make_thunk time 1.108999e+00s
>>>Node GpuElemwise{Composite{(i0 * ((i1 * i2) + (i1 * 
>>> i3)))}}[(0, 2)](CudaNdarrayConstant{0.5}, 
>>> CudaNdarrayConstant{0.83313465}, GpuCAReduce{add}{1,1}.0, 
>>> GpuCAReduce{add}{1,1}.0) time 6.69e-03s
>>>Node GpuElemwise{Composite{(-minimum(i0, maximum(minimum(i0, 
>>> (maximum((i1 - i2), i3) + i2)), (((i1 + i2) * i4) + 
>>> i1},no_inplace}(, 
>>> , , 
>>> CudaNdarrayConstant{120.0}, ) time 
>>> 4.999876e-03s
>>>Node GpuElemwise{mul,no_inplace}(>> matrix)>, GpuElemwise{TrueDiv}[(0, 0)].0) time 4.000187e-03s
>>>Node HostFromGpu() time 
>>> 3.49e-03s
>>>Node GpuElemwise{Mul}[(0, 1)](GpuDimShuffle{x,x}.0, 
>>> GpuDimShuffle{x,0}.0) time 3.49e-03s
>>>
>>> Time in all call to theano.grad() 0.00e+00s
>>> Time since theano import 28.959s
>>> Class
>>> ---
>>> <% time> <#call> <#apply> 
>>> 
>>>   55.4%55.4%   0.128s   8.71e-05s C 1468 301   
>>> theano.sandbox.cuda.basic_ops.GpuElemwise
>>>   25.6%81.0%   0.059s   1.03e-04s C  571 106   
>>> theano.sandbox.cuda.basic_ops.GpuCAReduce
>>>9.1%90.1%   0.021s   3.72e-05s C  564 150   
>>> theano.sandbox.cuda.basic_ops.HostFromGpu
>>>5.6%95.7%   0.013s   6.04e-06s Py2148 168   
>>> theano.ifelse.IfElse
>>>3.5%99.1%   0.008s   2.16e-04s C   37   4   
>>> theano.compile.ops.DeepCopyOp
>>>0.4%99.6%   0.001s   1.60e-06s C  623 122   
>>> theano.sandbox.cuda.basic_ops.GpuDimShuffle
>>>0.4%   

Re: [theano-users] Re: IfElse GPU version

2017-03-25 Thread Jesse Livezey

>
> I have decided to precompile a general graph in which all the possible
> graphs are nested. Then at runtime I would select which parts of the
> general graph to use via the *allowed_branch* variables and *if* nodes.
> Since afaik ifs are evaluated lazily, in each case I would only be using
> the relevant part of the graph, so my computational cost is minimal.


Have you considered precompiling all possible graphs individually and then 
just using python conditionals to choose a graph? Maybe this won't work for 
your system, but it might be easier to get right.
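
Something along these lines (just a sketch with made-up variants):

import theano
import theano.tensor as T

x = T.matrix('x')
y = T.matrix('y')

# compile every variant once, up front
compiled = {
    'sum_sq': theano.function([x, y], x ** 2 + y ** 2),
    'cross':  theano.function([x, y], 2 * x * y),
}

def run(case, x_val, y_val):
    # a plain Python conditional (here a dict lookup) picks the
    # precompiled graph to call
    return compiled[case](x_val, y_val)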

On Saturday, March 25, 2017 at 2:21:27 AM UTC-7, Šarūnas S. wrote:
>
> Nouiz, sorry, now I understand what you were referring to by "is constant".
> I've misled you with my example.
>
> This is a more realistic example:
>
> import numpy as np
> import theano as th
> import theano.ifelse
> import theano.tensor as T
>
> allowed_branch = th.shared( np.cast['float32']( 0 ) )
>
> x = T.matrix('x')
> y = T.matrix('y')
> f = x ** 2 + y ** 2 + 2 * x * y
>
> result = th.ifelse.ifelse( T.gt( allowed_branch, T.constant( 0 ) ), f,
> T.zeros( (2,2) ) )
>
>
> I am working on a realtime system which, in a given situation, will
> construct a relevant computational graph, compute its result, and display
> it.
> However, the graphs are relatively big and each compilation takes too long,
> so I can't compile in realtime. Thus I have to somehow precompile.
>
> I have decided to precompile a general graph in which all the possible
> graphs are nested. Then at runtime I would select which parts of the
> general graph to use via the *allowed_branch* variables and *if* nodes.
> Since afaik ifs are evaluated lazily, in each case I would only be using
> the relevant part of the graph, so my computational cost is minimal.
>
>
> On Saturday, 25 March 2017 10:04:21 UTC+1, Šarūnas S. wrote:
>>
>> I suspect that ifelse is running on GPU because this is the profile I get
>>
>> ==
>>   Message: Sum of all(44) printed profiles at exit excluding Scan op 
>> profile.
>>   Time in 95 calls to Function.__call__: 2.309995e-01s
>>   Time in Function.fn.__call__: 2.25e-01s (99.567%)
>>   Time in thunks: 2.307765e-01s (99.903%)
>>   Total compile time: 1.360100e+01s
>> Number of Apply nodes: 416
>> Theano Optimizer time: 6.314001e+00s
>>Theano validate time: 9.200015e-01s
>> Theano Linker time (includes C, CUDA code generation/compiling): 
>> 1.169000e+00s
>>Import time 2.799892e-02s
>>Node make_thunk time 1.108999e+00s
>>Node GpuElemwise{Composite{(i0 * ((i1 * i2) + (i1 * 
>> i3)))}}[(0, 2)](CudaNdarrayConstant{0.5}, 
>> CudaNdarrayConstant{0.83313465}, GpuCAReduce{add}{1,1}.0, 
>> GpuCAReduce{add}{1,1}.0) time 6.69e-03s
>>Node GpuElemwise{Composite{(-minimum(i0, maximum(minimum(i0, 
>> (maximum((i1 - i2), i3) + i2)), (((i1 + i2) * i4) + 
>> i1},no_inplace}(, 
>> , , 
>> CudaNdarrayConstant{120.0}, ) time 
>> 4.999876e-03s
>>Node GpuElemwise{mul,no_inplace}(> matrix)>, GpuElemwise{TrueDiv}[(0, 0)].0) time 4.000187e-03s
>>Node HostFromGpu() time 
>> 3.49e-03s
>>Node GpuElemwise{Mul}[(0, 1)](GpuDimShuffle{x,x}.0, 
>> GpuDimShuffle{x,0}.0) time 3.49e-03s
>>
>> Time in all call to theano.grad() 0.00e+00s
>> Time since theano import 28.959s
>> Class
>> ---
>> <% time> <#call> <#apply> 
>> 
>>   55.4%55.4%   0.128s   8.71e-05s C 1468 301   
>> theano.sandbox.cuda.basic_ops.GpuElemwise
>>   25.6%81.0%   0.059s   1.03e-04s C  571 106   
>> theano.sandbox.cuda.basic_ops.GpuCAReduce
>>9.1%90.1%   0.021s   3.72e-05s C  564 150   
>> theano.sandbox.cuda.basic_ops.HostFromGpu
>>5.6%95.7%   0.013s   6.04e-06s Py2148 168   
>> theano.ifelse.IfElse
>>3.5%99.1%   0.008s   2.16e-04s C   37   4   
>> theano.compile.ops.DeepCopyOp
>>0.4%99.6%   0.001s   1.60e-06s C  623 122   
>> theano.sandbox.cuda.basic_ops.GpuDimShuffle
>>0.4%   100.0%   0.001s   1.97e-06s C  506 110   
>> theano.tensor.elemwise.Elemwise
>>... (remaining 0 Classes account for   0.00%(0.00s) of the runtime)
>>
>> Ops
>> ---
>> <% time> <#call> <#apply> > name>
>>   16.9%16.9%   0.039s   1.22e-04s C  319   58   
>> GpuElemwise{mul,no_inplace}
>>   10.0%26.9%   0.023s   1.49e-04s C  155   30   
>> GpuCAReduce{add}{1,0}
>>9.1%36.0%   0.021s   3.72e-05s C  564  150   
>> HostFromGpu
>>8.2%44.2%   0.019s   1.23e-04s C  154   30   
>> GpuCAReduce{add}{0,1}
>>6.9%51.1%   0.016s   6.61e-05s C  

Re: [theano-users] Re: IfElse GPU version

2017-03-24 Thread Frédéric Bastien
What tells you that the ifelse is on the CPU?

Anyway, as the condition is constant, Theano will remove it during
compilation.
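
A quick way to see that (a sketch using the example you posted): compile it
and print the optimized graph; since the condition is constant, no if{} node
should appear.

import theano
import theano.tensor as T
from theano.ifelse import ifelse

retval = ifelse(T.gt(T.constant(2.0), T.constant(1.0)),
                T.ones((500, 1)), T.zeros((250, 1)))
f = theano.function([], retval)
# The constant condition lets Theano fold the IfElse away during
# compilation, so only the taken branch remains in this printout.
theano.printing.debugprint(f)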

Fred

On Fri, Mar 24, 2017 at 12:41, Šarūnas S. wrote:

> Please find a code example:
>
> import theano as th
> import theano.ifelse
> import theano.tensor as T
>
> retval = th.ifelse.ifelse( T.gt(T.constant(2.0), T.constant(1.0)),
> T.ones((500,1)), T.zeros((250,1)))
>
> On Friday, 24 March 2017 17:33:59 UTC+1, Šarūnas S. wrote:
>
> I am using Theano version 0.9.0.rc2.dev.
>
>
>
> On Friday, 24 March 2017 17:32:33 UTC+1, Šarūnas S. wrote:
>
> In my graph I have a few IfElse nodes and I am wondering how and where
> they are executed.
>
> At first I ran the code with linker=cvm in my THEANO_FLAGS but after
> profiling it looked like the ifelse is being executed on the CPU. Then I
> forced the linker=c to check whether the IfElse will go through and I got
> the NotImplementedError: if{inplace, gpu} cannot produce C code. Btw
> removing inline optimization did not help as it still gave the same error.
>
> So does IfElse have a GPU implementation? If yes how do I use it? Also,
> does it do lazy evaluation or not?
>


[theano-users] Re: IfElse GPU version

2017-03-24 Thread Šarūnas S .
Please find a code example:

import theano as th
import theano.ifelse
import theano.tensor as T

retval = th.ifelse.ifelse( T.gt(T.constant(2.0), T.constant(1.0)),
                           T.ones((500,1)), T.zeros((250,1)))

On Friday, 24 March 2017 17:33:59 UTC+1, Šarūnas S. wrote:
>
> I am using Theano version 0.9.0.rc2.dev.
>
>
>
> On Friday, 24 March 2017 17:32:33 UTC+1, Šarūnas S. wrote:
>>
>> In my graph I have a few IfElse nodes and I am wondering how and where 
>> they are executed. 
>>
>> At first I ran the code with linker=cvm in my THEANO_FLAGS but after 
>> profiling it looked like the ifelse is being executed on the CPU. Then I 
>> forced the linker=c to check whether the IfElse will go through and I got 
>> the NotImplementedError: if{inplace, gpu} cannot produce C code. Btw 
>> removing inline optimization did not help as it still gave the same error. 
>>
>> So does IfElse have a GPU implementation? If yes how do I use it? Also, 
>> does it do lazy evaluation or not? 
>>
>



[theano-users] Re: IfElse GPU version

2017-03-24 Thread Šarūnas S .
I am using Theano version 0.9.0.rc2.dev.



On Friday, 24 March 2017 17:32:33 UTC+1, Šarūnas S. wrote:
>
> In my graph I have a few IfElse nodes and I am wondering how and where 
> they are executed. 
>
> At first I ran the code with linker=cvm in my THEANO_FLAGS but after 
> profiling it looked like the ifelse is being executed on the CPU. Then I 
> forced the linker=c to check whether the IfElse will go through and I got 
> the NotImplementedError: if{inplace, gpu} cannot produce C code. Btw 
> removing inline optimization did not help as it still gave the same error. 
>
> So does IfElse have a GPU implementation? If yes how do I use it? Also, 
> does it do lazy evaluation or not? 
>
