Hi,
I have a simple bilinear interpolation code, which is part of a larger
algorithm. To simplify things, I created a minimal example of my problem.
When using *two-dimensional array indexing* in the interpolation, the
profiler shows a lot of time spent in *AdvancedIncSubtensor*, which is
Python code. After some research I switched to *linear indexing* and got
the expected speedup on the CPU. The code now uses *AdvancedIncSubtensor1*,
which has a C implementation:
*CPU before*:
THEANO function evaluation took 0.124027 seconds.
THEANO gradient evaluation took 0.248830 seconds.

*CPU after*:
THEANO function evaluation took 0.087603 seconds.
THEANO gradient evaluation took 0.180211 seconds.
However, on the GPU (GTX 980), the same change leads to a >200x increase in
the gradient runtime (!):

*GPU before* (floatX=float32, device=cuda0):
THEANO function evaluation took 0.286931 seconds.
THEANO gradient evaluation took 0.203636 seconds.

*GPU after*:
THEANO function evaluation took 0.191236 seconds.
THEANO gradient evaluation took 44.522054 seconds.

Nearly all of that time is spent in *GpuAdvancedIncSubtensor1*. The same
behavior shows up with the old GPU backend. I am using Theano 0.9.0rc4.
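In case it helps to reproduce the per-op breakdown, this is roughly how the
time can be attributed to individual ops with Theano's built-in profiler (a
minimal sketch; it reuses the x, grad and y variables from the full example
below, and profiled_df is just a name I picked for this snippet):

profiled_df = theano.function([x], grad, allow_input_downcast=True, profile=True)
profiled_df(y)
# Prints per-op timings; in the "after" case GpuAdvancedIncSubtensor1 dominates.
profiled_df.profile.summary()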
Here is the code; you can comment/uncomment one of the two indexing blocks in
interp2D to switch between the before/after behavior:
import theano
import numpy
import time


def interpolationTest():
    numpy.random.seed(42)
    m1 = [2048, 2048]
    m2 = [1025, 1025]
    m2Grid = numpy.meshgrid(numpy.linspace(0, m1[0], m2[0]),
                            numpy.linspace(0, m1[1], m2[1]))
    y = m2Grid[0].flatten() + (numpy.random.random(m2Grid[0].size) - 0.5)

    x = theano.tensor.vector()
    fval = theano.tensor.sum(interp2D(x, m1, m2))
    f = theano.function([x], fval, allow_input_downcast=True)
    grad = theano.tensor.grad(fval, x)
    df = theano.function([x], grad, allow_input_downcast=True)

    t0 = time.time()
    val_y = f(y)
    t1 = time.time()
    print("THEANO function evaluation took {0:f} seconds.".format(t1 - t0))

    t0 = time.time()
    dF_y = df(y)
    t1 = time.time()
    print("THEANO gradient evaluation took {0:f} seconds.".format(t1 - t0))


def interp2D(I, m1, m2):
    # Coordinates at which the input I (of size m2[0]*m2[1]) is sampled.
    m1Grid = numpy.meshgrid(numpy.linspace(0.5, m1[1] - 0.5, m1[1]) * 0.5,
                            numpy.linspace(0.5, m1[0] - 0.5, m1[0]) * 0.5)
    xPoint = theano.shared(m1Grid[1].flatten().astype(theano.config.floatX))
    yPoint = theano.shared(m1Grid[0].flatten().astype(theano.config.floatX))
    integerX = theano.tensor.cast(theano.tensor.floor(xPoint), 'int32')
    remainderX = xPoint - theano.tensor.cast(integerX, 'floatX')
    integerY = theano.tensor.cast(theano.tensor.floor(yPoint), 'int32')
    remainderY = yPoint - theano.tensor.cast(integerY, 'floatX')

    # 2D indexing ("before"): gradient goes through AdvancedIncSubtensor
    # I = theano.tensor.reshape(I, m2)
    # I00 = I[integerX, integerY]
    # I01 = I[integerX, integerY + 1]
    # I10 = I[integerX + 1, integerY]
    # I11 = I[integerX + 1, integerY + 1]
    ##########
    # Linear indexing ("after"): gradient goes through AdvancedIncSubtensor1
    I00 = I[integerX * m2[1] + integerY]
    I01 = I[integerX * m2[1] + (integerY + 1)]
    I10 = I[(integerX + 1) * m2[1] + integerY]
    I11 = I[(integerX + 1) * m2[1] + (integerY + 1)]
    ##########

    invRemainderX = 1.0 - remainderX
    invRemainderY = 1.0 - remainderY
    return invRemainderX * (invRemainderY * I00 + remainderY * I01) + \
        remainderX * (invRemainderY * I10 + remainderY * I11)


if __name__ == '__main__':
    interpolationTest()
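For context, my understanding (an assumption on my part, I have not checked
the Theano sources) is that the gradient of a read like I[idx] is a
scatter-add of the upstream gradient into a zero vector at possibly repeated
indices, which is what (Gpu)AdvancedIncSubtensor1 computes. In plain NumPy
terms, roughly:

import numpy
I_size = 1025 * 1025                                # size of the flat input vector
idx = numpy.random.randint(0, I_size, 2048 * 2048)  # linear indices with repeats (illustrative)
g_out = numpy.random.random(idx.size)               # upstream gradient for each read
g_I = numpy.zeros(I_size)
numpy.add.at(g_I, idx, g_out)                       # duplicate indices accumulate into g_I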
Any ideas what is happening here, and how to fix this?
Thanks!