Hi,

do you use float? I meant float32. The old back-end only supports
float32, so if you use float64 or int32, nothing will be computed on the GPU.

The new back-end supports many dtypes, including float64 and int*, so it
should work better.
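
For example, you can select the new back-end and catch float64 at the same
time (a minimal sketch, assuming the 0.10 beta with the libgpuarray back-end
is installed; the device index cuda0 is an assumption):

    # The flags must be set before the first "import theano";
    # device=cuda* selects the new back-end, device=gpu* the old one.
    import os
    os.environ['THEANO_FLAGS'] = 'device=cuda0,floatX=float32,warn_float64=pdb'

    import theano
    print(theano.config.device)        # should report cuda0, not gpu0
    print(theano.config.warn_float64)  # pdb -> drop into the debugger where a float64 appears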

Note that if you do an operation between float32 and int32, the result is
float64. These are the normal C/NumPy casting rules; float32 with int16
returns float32. So if you end up with float64, this is frequently the cause.
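
You can check those promotion rules directly (a quick sketch using the
standard Theano tensor shortcuts fvector/ivector/wvector):

    import numpy as np
    import theano.tensor as T

    x = T.fvector('x')   # float32
    i = T.ivector('i')   # int32
    s = T.wvector('s')   # int16

    print((x + i).dtype)   # float64: int32 does not fit exactly in float32, so it upcasts
    print((x + s).dtype)   # float32: int16 fits, no upcast
    print(np.result_type(np.float32, np.int32))   # float64, the same NumPy rule
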
Fred

On Wed, Aug 9, 2017 at 2:48 PM Haining Yu <hainin...@gmail.com> wrote:

> Thank you Fred.
>
> Yes I am using device=gpu0. I will switch to the new backend and test
> again.
>
> On float64, do you mean int64? If yes, I am puzzled by that too. In my code
> I never explicitly cast to int64. Instead I use tensor.ivector() to index
> matrices and cast them explicitly to int32. For example:
>
> import theano.tensor as T
>
> x = T.ivector()                # symbolic int32 vector (used to index matrices)
>
> z = T.cast(y, dtype='int32')   # y is a tensor variable defined elsewhere
>
> Do you think these things cause the problem?
>
> Thank you,
> Haining
>
> Haining Yu on Gmail
>
> On Wed, Aug 9, 2017 at 2:36 PM, Frédéric Bastien <
> frederic.bast...@gmail.com> wrote:
>
>> My guess is that you use the old GPU backend. Can you confirm that you use
>> the Theano flag device=gpu, and that you also have float64 in the graph? The
>> old backend doesn't support float64. I suggest that you install the
>> just-released 0.10 beta and use the new backend with device=cuda.
>>
>> Also, you can use the flag warn_float64=pdb to find where the float64s come
>> from and make sure they are float32. This will be faster.
>>
>> Fred
>>
>> Le lun. 31 juil. 2017 14:42, Haining Yu <hainin...@gmail.com> a écrit :
>>
>>> Hi,
>>>
>>> I am running an RNN/GRU model on a fairly large dataset with the goal
>>> of sequence prediction. When I profile my code, I find that one GpuFromHost
>>> call takes ~30% of the computation time. See part of the profiling results below:
>>>
>>> <% time> <sum %> <apply time> <time per call> <#call> <id> <Mflops> <Gflops/s> <Apply name>
>>>   30.2%    73.0%     462.776s       3.71e-01s   1248   221   GpuFromHost(Subtensor{:int64:}.0)
>>>     input 0: dtype=float32, shape=(512, 1024, 2048), strides=(-4096, 4, 2097152)
>>>     output 0: dtype=float32, shape=(512, 1024, 2048), strides=(2097152, 2048, 1)
>>>
>>> theano.printing.debugprint shows that the call is generated in the gradient
>>> calculation; see the snippet below. There is also a HostFromGpu a couple of
>>> layers below.
>>>
>>>  | | | | |GpuFromHost [id FN] ''   221
>>>  | | | |   |Subtensor{:int64:} [id FO] ''   220
>>>  | | | |     |Subtensor{::int64} [id FP] ''   219
>>>  | | | |     | |InplaceDimShuffle{1,2,0} [id FQ] ''   218
>>>  | | | |     | | |Reshape{3} [id FR] ''   217
>>>  | | | |     | |   |CrossentropyCategorical1HotGrad [id FS] ''   216
>>>  | | | |     | |   | |Elemwise{Second}[(0, 0)] [id FT] ''   215
>>>  | | | |     | |   | | |CrossentropyCategorical1Hot [id FU] ''   209
>>>  | | | |     | |   | | | |HostFromGpu [id FV] ''   206
>>>
>>> I have heard about the cost of GpuFromHost (and its counterpart
>>> HostFromGpu) and had already moved almost all data to the GPU (via shared
>>> variables), so I don't understand why the call is needed. In particular:
>>>
>>> 1. If all my data are on the GPU and Theano is optimized for the GPU, why is
>>> GpuFromHost even generated?
>>> 2. Is the call generated because the array is too large? The call moves
>>> 512 x 1024 x 2048 x 4 bytes (about 4.3 GB), but my Tesla K80 should have
>>> 12 GB of memory, so running out of room seems unlikely on the surface. Overall
>>> memory consumption looks fine in the profiling.
>>> 3. Does the call have anything to do with CrossentropyCategorical1Hot? I
>>> assume CrossentropyCategorical1Hot has been optimized for the GPU, but the
>>> debugprint shows that a HostFromGpu is called before CrossentropyCategorical1Hot
>>> is applied. I am not sure whether CrossentropyCategorical1Hot has any memory
>>> requirement (e.g., c-contiguity).
>>> 4. Should I try any GPU assertion to debug the root cause of the problem?
>>>
>>> Any hint is appreciated.
>>>
>>> Thank you,
>>> Haining
>>>
