I don't see any float64 in the debugprint output.
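
Roughly how I am checking (a sketch; fn stands for my compiled Theano 
function):

import theano
# print_type=True prints the type of every node in the graph, so any
# float64 would stand out
theano.printing.debugprint(fn, print_type=True)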

Inspecting the code, I am only using floatX, e.g.:
self.x = theano.shared(name='gx', value=x1.astype(theano.config.floatX))
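
For context, a minimal self-contained version of that pattern (x1 stands 
for data loaded elsewhere in my code):

import numpy as np
import theano

x1 = np.random.randn(100, 50)  # NumPy arrays default to float64
gx = theano.shared(name='gx', value=x1.astype(theano.config.floatX))
print(gx.dtype)                # 'float32' when floatX=float32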

I did cast various indices to int32, but in the profiling output they seem 
to have been converted to int64.

I will make all the changes based on your suggestions and test one more time. 

Thanks again.

On Wednesday, August 9, 2017 at 5:38:26 PM UTC-4, nouiz wrote:
>
> Hi,
>
> do you use floats? I meant float32. The old back-end only supports 
> float32, so if you use float64 or int32, nothing will be computed on the GPU.
>
> The new back-end supports many dtypes, including float64 and int*, so it 
> should work better.
>
> Note: if you do an operation between float32 and int32, the result is 
> float64. These are the normal C/NumPy casting rules; float32 and int16 
> return float32. So if you end up with float64, that is frequently the cause.
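>
> For example, a quick sketch of those rules with symbolic variables:
>
> import theano.tensor as T
> f = T.fvector()       # float32
> i = T.ivector()       # int32
> w = T.wvector()       # int16
> print((f * i).dtype)  # 'float64' -- float32 combined with int32 upcasts
> print((f * w).dtype)  # 'float32' -- int16 fits within float32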
> Fred
>
> On Wed, Aug 9, 2017 at 2:48 PM Haining Yu <hain...@gmail.com> wrote:
>
>> Thank you Fred.
>>
>> Yes, I am using device=gpu0. I will switch to the new back-end and test 
>> again.
>>
>> On float64, do you mean int64? If so, I am puzzled by that too. In my code 
>> I never explicitly cast to int64. Instead I use tensor.ivector() to index 
>> matrices and cast indices explicitly to int32. For example:
>>
>> x = T.ivector()               # int32 index vector
>> z = T.cast(y, dtype='int32')  # y: an existing integer tensor to downcast
>>
>> Do you think these things cause the problem?
>>
>> Thank you,
>> Haining
>>
>> Haining Yu on Gmail
>>
>> On Wed, Aug 9, 2017 at 2:36 PM, Frédéric Bastien <frederic...@gmail.com> wrote:
>>
>>> My guess is that you are using the old GPU back-end. Can you confirm 
>>> that you use the Theano flag device=gpu, and that you have float64 in the 
>>> graph? The old back-end doesn't support float64. I suggest that you install 
>>> the just-released 0.10 beta and use the new back-end with device=cuda.
>>>
>>> Also, you can use the flag warn_float64=pdb to find where the float64s 
>>> appear and make sure they are float32. This will be faster.
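>>>
>>> For example (a sketch; the flags must be set before theano is first 
>>> imported, and your entry point may differ):
>>>
>>> import os
>>> os.environ['THEANO_FLAGS'] = 'device=cuda,warn_float64=pdb'
>>> import theano  # flags are read here, at first import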
>>>
>>> Fred
>>>
>>> On Mon, Jul 31, 2017 at 14:42, Haining Yu <hain...@gmail.com> 
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am running an RNN/GRU model on a fairly large dataset for sequence 
>>>> prediction. When I profiled the code, I found that a single GpuFromHost 
>>>> call takes ~30% of the computation time. See part of the profiling results 
>>>> below:
>>>>
>>>> <% time> <sum %> <apply time> <time per call> <#call> <id> <Mflops> <Gflops/s> <Apply name>
>>>>   30.2%    73.0%     462.776s       3.71e-01s   1248   221                     GpuFromHost(Subtensor{:int64:}.0)
>>>>     input 0:  dtype=float32, shape=(512, 1024, 2048), strides=(-4096, 4, 2097152)
>>>>     output 0: dtype=float32, shape=(512, 1024, 2048), strides=(2097152, 2048, 1)
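>>>>
>>>> (For reference, the profile above was collected roughly like this, 
>>>> where inputs and outputs stand for my actual graph:)
>>>>
>>>> fn = theano.function(inputs, outputs, profile=True)
>>>> # ... run fn over the training data; the per-Apply summary is then
>>>> # printed via fn.profile.summary() or at interpreter exit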
>>>>
>>>> theano.printing.debugprint shows that the call is generated in the 
>>>> gradient calculation; see the snippet below. There is also a HostFromGpu 
>>>> a couple of nodes further down.
>>>>
>>>>  | | | | |GpuFromHost [id FN] ''   221
>>>>  | | | |   |Subtensor{:int64:} [id FO] ''   220
>>>>  | | | |     |Subtensor{::int64} [id FP] ''   219
>>>>  | | | |     | |InplaceDimShuffle{1,2,0} [id FQ] ''   218
>>>>  | | | |     | | |Reshape{3} [id FR] ''   217
>>>>  | | | |     | |   |CrossentropyCategorical1HotGrad [id FS] ''   216
>>>>  | | | |     | |   | |Elemwise{Second}[(0, 0)] [id FT] ''   215
>>>>  | | | |     | |   | | |CrossentropyCategorical1Hot [id FU] ''   209
>>>>  | | | |     | |   | | | |HostFromGpu [id FV] ''   206
>>>>
>>>> I have heard about the cost of GpuFromHost (and its counterpart 
>>>> HostFromGpu) and have moved almost all data to the GPU (via shared 
>>>> variables), so I don't understand why this call is needed. In particular:
>>>>
>>>> 1. If all my data is on the GPU and Theano is optimized for the GPU, why 
>>>> is the GpuFromHost generated at all?
>>>> 2. Is the call generated because the tensor is too large? The call moves 
>>>> 512 x 1024 x 2048 x 4 bytes = 4 GiB (about 4.3 GB), but my Tesla K80 
>>>> should have 12 GB of memory, so a shortage seems unlikely on the surface. 
>>>> Overall memory consumption looks OK under profiling.
>>>> 3. Does the call have anything to do with CrossentropyCategorical1Hot? I 
>>>> assume CrossentropyCategorical1Hot has been optimized for the GPU, but the 
>>>> graph shows that a HostFromGpu is applied before 
>>>> CrossentropyCategorical1Hot. I am not sure whether 
>>>> CrossentropyCategorical1Hot has any memory-layout requirement (e.g., 
>>>> C-contiguity).
>>>> 4. Should I try a GPU assertion to debug the root cause of the problem?
>>>>
>>>> Any hint is appreciated.
>>>>
>>>> Thank you,
>>>> Haining 
>>>>
