Hi, do you use float? I meant float32. The old back-end only supports float32, so if you use float64 or int32, those operations will not be computed on the GPU.

The new back-end supports many dtypes, including float64 and the int* types, so it should work better. Note that if you do an operation between float32 and int32, the result is float64; these are the normal C/NumPy casting rules. float32 and int16 return float32. So if you end up with float64, it is frequently because of that.
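If you want to check which dtype an operation will produce, you can look at the dtype of the symbolic result directly; a minimal sketch using the standard tensor constructors:

import theano.tensor as T

f32 = T.fvector()  # float32
i32 = T.ivector()  # int32
i16 = T.wvector()  # int16

print((f32 + i32).dtype)  # 'float64': float32 combined with int32 upcasts, per the C/NumPy rules
print((f32 + i16).dtype)  # 'float32': int16 fits in float32, so no upcast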
Fred

On Wed, Aug 9, 2017 at 2:48 PM Haining Yu <[email protected]> wrote:

> Thank you Fred.
>
> Yes, I am using device=gpu0. I will switch to the new backend and test again.
>
> On float64, do you mean int64? If yes, I am puzzled by that too. In my code I never explicitly cast to int64. Instead I use tensor.ivector() to index matrices and cast them explicitly into int32. For example:
>
> x = T.ivector()
>
> z = T.cast(y, dtype='int32')
>
> Do you think these things cause the problem?
>
> Thank you,
> Haining
>
> Haining Yu on Gmail
>
> On Wed, Aug 9, 2017 at 2:36 PM, Frédéric Bastien <[email protected]> wrote:
>
>> My guess is that you use the old GPU backend. Can you confirm that you use the Theano flag device=gpu, and also that you have float64 in the graph? The old backend doesn't support them. I suggest that you install the just-released 0.10 beta and use the new backend with device=cuda.
>>
>> Also, you can use the flag warn_float64=pdb to find where you have them and make sure they are float32. This will be faster.
>>
>> Fred
>>
>> Le lun. 31 juil. 2017 14:42, Haining Yu <[email protected]> a écrit :
>>
>>> Hi,
>>>
>>> I am running an RNN/GRU model on a fairly large dataset with the goal of sequence prediction. When I profile my code, I find that one GpuFromHost takes ~30% of the computation time. See part of the profiling results below:
>>>
>>> <% time> <sum %> <apply time> <time per call> <#call> <id> <Mflops> <Gflops/s> <Apply name>
>>> 30.2% 73.0% 462.776s 3.71e-01s 1248 221 GpuFromHost(Subtensor{:int64:}.0)
>>> input 0: dtype=float32, shape=(512, 1024, 2048), strides=(-4096, 4, 2097152)
>>> output 0: dtype=float32, shape=(512, 1024, 2048), strides=(2097152, 2048, 1)
>>>
>>> theano.printing.debugprint shows that the call is generated in the gradient calculation; see the snippet below. There is also a HostFromGpu a couple of layers below.
>>>
>>> | | | | |GpuFromHost [id FN] '' 221
>>> | | | | |Subtensor{:int64:} [id FO] '' 220
>>> | | | | |Subtensor{::int64} [id FP] '' 219
>>> | | | | | |InplaceDimShuffle{1,2,0} [id FQ] '' 218
>>> | | | | | | |Reshape{3} [id FR] '' 217
>>> | | | | | | |CrossentropyCategorical1HotGrad [id FS] '' 216
>>> | | | | | | | |Elemwise{Second}[(0, 0)] [id FT] '' 215
>>> | | | | | | | | |CrossentropyCategorical1Hot [id FU] '' 209
>>> | | | | | | | | | |HostFromGpu [id FV] '' 206
>>>
>>> I have heard about the cost of GpuFromHost (and its counterpart HostFromGpu) and have already moved almost all data to the GPU (via shared variables), so I don't understand why the call is needed. In particular, I don't understand:
>>>
>>> 1. If all my data are on the GPU and Theano is optimized for the GPU, why is the GpuFromHost even generated?
>>> 2. Is the call generated because the memory is too large? The call tries to move 512 x 1024 x 2048 x 4 = 4.2 GB of memory, but my Tesla K80 should have 12 GB of memory, so on the surface the need to move data seems remote. Overall memory consumption seems OK under profiling.
>>> 3. Does the call have anything to do with CrossentropyCategorical1Hot? I assume CrossentropyCategorical1Hot has been optimized for the GPU.
>>> But the code shows that a HostFromGpu is called before CrossentropyCategorical1Hot is applied. I am not sure whether CrossentropyCategorical1Hot has any memory requirement (e.g., c-contiguous).
>>> 4. Should I try any GPU assertion to debug the root cause of the problem?
>>>
>>> Any hint is appreciated.
>>>
>>> Thank you,
>>> Haining
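The flags mentioned above can be combined when launching the script; a minimal sketch (the script name is just a placeholder, and floatX=float32 is the usual companion setting):

THEANO_FLAGS='device=cuda,floatX=float32,warn_float64=pdb' python train.py

The same settings can also be put in the [global] section of ~/.theanorc.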
