Hi,
I am training an RNN/GRU model on a fairly large dataset for sequence
prediction. When I profiled my code, I found that a single GpuFromHost node
takes ~30% of the computation time. Part of the profiling output is below:
<% time> <sum %> <apply time> <time per call> <#call> <id> <Mflops> <Gflops/s> <Apply name>
30.2% 73.0% 462.776s 3.71e-01s 1248 221 GpuFromHost(Subtensor{:int64:}.0)
    input 0: dtype=float32, shape=(512, 1024, 2048), strides=(-4096, 4, 2097152)
    output 0: dtype=float32, shape=(512, 1024, 2048), strides=(2097152, 2048, 1)
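For scale, the numbers in that profile line can be checked with a few lines
of plain Python. This is only arithmetic on the values printed above (shape,
dtype, apply time, number of calls); the throughput figure assumes the whole
tensor is copied on every call, which the profile does not confirm.

```python
# Values copied from the profile line for node 221.
shape = (512, 1024, 2048)   # "input 0" shape
itemsize = 4                # bytes per float32 element
n_calls = 1248              # <#call>
apply_time = 462.776        # <apply time>, in seconds

# Total bytes occupied by one tensor of this shape.
n_bytes = itemsize
for dim in shape:
    n_bytes *= dim

per_call = apply_time / n_calls          # seconds spent per transfer
throughput = n_bytes / per_call / 1e9    # GB/s, assuming a full copy per call

print(n_bytes)                 # 4294967296
print(n_bytes / 2.0 ** 30)     # 4.0 (GiB)
print(round(per_call, 3))      # 0.371, matching <time per call>
print(round(throughput, 1))
```

So each call moves exactly 4 GiB (about 4.3 GB), and the 0.371 s per call
agrees with the 3.71e-01s reported by the profiler.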
theano.printing.debugprint shows that the node is created during the
gradient computation; see the snippet below. There is also a HostFromGpu a
few levels further down the graph.
| | | | |GpuFromHost [id FN] '' 221
| | | | |Subtensor{:int64:} [id FO] '' 220
| | | | |Subtensor{::int64} [id FP] '' 219
| | | | | |InplaceDimShuffle{1,2,0} [id FQ] '' 218
| | | | | | |Reshape{3} [id FR] '' 217
| | | | | | |CrossentropyCategorical1HotGrad [id FS] '' 216
| | | | | | | |Elemwise{Second}[(0, 0)] [id FT] '' 215
| | | | | | | | |CrossentropyCategorical1Hot [id FU] '' 209
| | | | | | | | | |HostFromGpu [id FV] '' 206
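The strides in the profile look consistent with this part of the graph. The
sketch below redoes the stride arithmetic in plain Python without allocating
anything. It rests on assumptions that are not stated in the output above:
that the Reshape{3} result is a C-contiguous float32 array of shape
(2048, 512, 1024), that Subtensor{::int64} is a [::-1] reversal of the first
axis, and that the GPU side reports strides in elements rather than bytes.
The helper names are made up for illustration.

```python
ITEMSIZE = 4  # bytes per float32 element


def c_strides(shape, itemsize=ITEMSIZE):
    """Strides of a C-contiguous array with the given shape."""
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def dimshuffle(shape, strides, order):
    """Permute axes as a transposing view does: no data is copied."""
    return (tuple(shape[i] for i in order),
            tuple(strides[i] for i in order))


def reverse_first_axis(shape, strides):
    """A [::-1] view on axis 0 only flips the sign of that stride."""
    return shape, (-strides[0],) + tuple(strides[1:])


# Assumed host layout of the Reshape{3} output (hypothetical).
shape = (2048, 512, 1024)
strides = c_strides(shape)

shape, strides = dimshuffle(shape, strides, (1, 2, 0))  # InplaceDimShuffle{1,2,0}
shape, strides = reverse_first_axis(shape, strides)     # Subtensor{::int64}, if step == -1

print(shape)    # (512, 1024, 2048)
print(strides)  # (-4096, 4, 2097152), the "input 0" strides in bytes

# A contiguous copy of the same shape, with strides counted in elements.
print(c_strides((512, 1024, 2048), itemsize=1))  # (2097152, 2048, 1)
```

If these assumptions hold, the input to GpuFromHost is a reversed, transposed
view of a host array, and the output is a freshly laid out contiguous copy on
the GPU.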
I have read about the cost of GpuFromHost (and its counterpart HostFromGpu)
and have moved almost all of my data to the GPU (via shared variables), so I
don't understand why this transfer is needed. In particular, I don't
understand:
1. If all my data is on the GPU and Theano is optimizing for the GPU, why is
a GpuFromHost generated at all?
2. Is the transfer generated because the tensor is too large? The call moves
512 x 1024 x 2048 x 4 bytes = 4 GiB (about 4.3 GB). But my Tesla K80 should
have 12 GB of memory, so running out of space seems unlikely. Overall
memory consumption looks fine in the profile.
3. Does the transfer have anything to do with CrossentropyCategorical1Hot? I
assumed CrossentropyCategorical1Hot had been optimized for the GPU, but the
graph shows a HostFromGpu feeding into CrossentropyCategorical1Hot. I am
not sure whether CrossentropyCategorical1Hot has any memory layout
requirement (e.g., C-contiguous).
4. Should I use any GPU assertion to track down the root cause of the problem?
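For question 4, one possible starting point is to list every transfer node in
the compiled function so that each HostFromGpu can be matched with the op that
consumes it. The sketch below is a minimal helper, not an official Theano
utility: find_transfers is a made-up name, it matches ops by class name only,
and it assumes fn is a compiled theano.function exposing
fn.maker.fgraph.toposort().

```python
TRANSFER_OPS = ("GpuFromHost", "HostFromGpu")


def find_transfers(fn):
    """Return (position, op name, inputs) for each host/GPU transfer node.

    `fn` is assumed to be a compiled theano.function; only
    fn.maker.fgraph.toposort(), node.op and node.inputs are used.
    """
    found = []
    for position, node in enumerate(fn.maker.fgraph.toposort()):
        name = type(node.op).__name__  # match on the op's class name
        if name in TRANSFER_OPS:
            found.append((position, name, [str(v) for v in node.inputs]))
    return found
```

With a compiled training function (called train_fn here purely as a
placeholder), usage would be something like
`for row in find_transfers(train_fn): print(row)`. Theano also has an
assert_no_cpu_op config flag (for example
THEANO_FLAGS=assert_no_cpu_op=raise) that may be worth trying, provided the
installed version supports it.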
Any hint is appreciated.
Thank you,
Haining