Hi there,
I've recently started getting into Theano and wanted to try moving my
existing code onto a GPU. I tried the simple example on the website and got
similar improvements to those quoted, so I was quite hopeful going in! The
code I'm evaluating is the following, hopefully commented sufficiently to
make it clear what's going on:
import time
import numpy as np
import theano
import theano.tensor as tt
amps = tt.vector('amps', dtype=theano.config.floatX)
offs = tt.vector('offs', dtype=theano.config.floatX)
sigs = tt.vector('sigs', dtype=theano.config.floatX)
phase = tt.scalar('phase', dtype=theano.config.floatX)
#TFlatTimes is a float32 shared vector of length 1024*useToAs that contains
#the observed times of useToAs light curves, each sampled with 1024 bins.
#useToAs is set to 100 in this case but will eventually be tens of thousands.
#ReferencePeriod is a float32 shared scalar.
#Tg1width, Tg2amp and Tg2width are float32 shared scalars that define a
#double Gaussian model.
#phase is a single free parameter that defines when to evaluate the
#Gaussian model jointly for each light curve.
#First shift TFlatTimes by phase and wrap the result into the range
#-ReferencePeriod/2 to +ReferencePeriod/2; store it as x.
#Then evaluate the first Gaussian as y.
#Repeat for the position of the second Gaussian (offset by gsep) and
#evaluate it as y2.
x = (TFlatTimes - phase + ReferencePeriod/2) % ReferencePeriod - ReferencePeriod/2
y = tt.exp(-0.5*x**2/Tg1width**2)
x2 = (TFlatTimes - phase - gsep + ReferencePeriod/2) % ReferencePeriod - ReferencePeriod/2
y2 = Tg2amp*tt.exp(-0.5*x2**2/Tg2width**2)
#AmpVec, OffVec and SigVec contain the overall amplitude, offset and noise
#level of each curve. Each has length 1024*useToAs and holds a single number
#(i.e. amps[0]) repeated 1024 times, then amps[1] repeated 1024 times, etc.
AmpVec = theano.tensor.extra_ops.repeat(amps, 1024)
OffVec = theano.tensor.extra_ops.repeat(offs, 1024)
SigVec = theano.tensor.extra_ops.repeat(sigs, 1024)
Nbins=Nbins.astype(int)
TNbins=theano.shared(Nbins)
#construct the final signal vector: the sum of the two Gaussians multiplied
#by the overall amplitude for that curve, plus the offset
s = AmpVec*(y+y2) + OffVec
#calculate the negative log likelihood (up to an additive constant)
like = (0.5*tt.sum(((TFlatData-s)/SigVec)**2)
        + 0.5*tt.sum(TNbins[:useToAs]*tt.log(sigs**2)))
#calculate gradient with respect to the parameters
glike = tt.grad(like, [phase, amps, offs, sigs])
#define functions to return likelihood, gradient, and the signal vector
getS = theano.function([phase, amps, offs], s)
getX = theano.function([phase, amps, offs, sigs], like)
getG = theano.function([phase, amps, offs, sigs], glike)
#Wrap these in a single function that is passed vectors of parameters
def TheanoFunc2(phaseval, ampvec, offvec, sigvec):
    l = getX(phaseval, ampvec, offvec, sigvec)*1
    g = getG(phaseval, ampvec, offvec, sigvec)
    return l, g
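For cross-checking the Theano output on a small problem, the model described in the comments above can be sketched in plain NumPy. This is only a sketch under stated assumptions: `wrap`, `model_signal` and `reference_likelihood` are hypothetical names that do not appear in my code, the shared variables are passed in as ordinary arguments, and every curve is assumed to have exactly `nbins` samples (the Theano version reads the counts from TNbins instead).

```python
import numpy as np

def wrap(t, period):
    """Wrap times into the interval [-period/2, period/2)."""
    return (t + period / 2.0) % period - period / 2.0

def model_signal(times, phase, amps, offs, period,
                 g1width, g2amp, g2width, gsep, nbins):
    """Double-Gaussian signal, one amplitude and offset per light curve."""
    x = wrap(times - phase, period)
    x2 = wrap(times - phase - gsep, period)
    y = np.exp(-0.5 * x**2 / g1width**2)
    y2 = g2amp * np.exp(-0.5 * x2**2 / g2width**2)
    # Each per-curve value is repeated once per bin, as with extra_ops.repeat.
    return np.repeat(amps, nbins) * (y + y2) + np.repeat(offs, nbins)

def reference_likelihood(data, signal, sigs, nbins):
    """Same quantity as `like`, assuming nbins samples in every curve."""
    sig_vec = np.repeat(sigs, nbins)
    chi2 = np.sum(((data - signal) / sig_vec) ** 2)
    return 0.5 * chi2 + 0.5 * nbins * np.sum(np.log(sigs**2))
```

Feeding the same small inputs to this and to getS/getX should give matching values up to float32 rounding, provided the assumptions above hold.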
I then wanted to test this by evaluating TheanoFunc2 20000 times using
random numbers as the input:
pval = np.float32(0.00288206)
Tpval = theano.shared(pval)
ltot = 0
#define random number functions
from theano.sandbox.rng_mrg import MRG_RandomStreams as RandomStreams
theano_rng = RandomStreams(189)
avals = theano.function([], theano_rng.normal(size=(useToAs,), avg=0.0,
                        std=1.0, dtype=theano.config.floatX))
ovals = theano.function([], theano_rng.normal(size=(useToAs,), avg=0.0,
                        std=1.0, dtype=theano.config.floatX))
nvals = theano.function([], theano_rng.normal(size=(useToAs,), avg=0.0,
                        std=1.0, dtype=theano.config.floatX)**2)
start = time.clock()
for i in range(20000):
    if i % 100 == 0:
        print i
    l, g = TheanoFunc2(pval, avals(), ovals(), nvals())
    ltot += l
end = time.clock()
print "time", end - start
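To narrow down where the time goes, the pieces of the loop could also be timed separately. The sketch below uses a hypothetical helper, `time_calls`, which is not part of Theano. It measures elapsed wall-clock time with `time.time()`; on Unix, `time.clock()` reports processor time instead, which may differ from elapsed time.

```python
import time

def time_calls(func, n_calls):
    """Return (total, per_call) elapsed wall-clock seconds for n_calls calls."""
    start = time.time()
    for _ in range(n_calls):
        func()
    total = time.time() - start
    return total, total / n_calls
```

For example, `time_calls(avals, 1000)` would time the random number generation alone, and calling it on a lambda that evaluates getX or getG with fixed inputs would time the likelihood and the gradient separately.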
I then timed this on the CPU and on the GPU using:
setenv THEANO_FLAGS 'mode=FAST_RUN,device=cpu,floatX=float32'
and
setenv THEANO_FLAGS 'mode=FAST_RUN,device=gpu,floatX=float32'
and got times of 469.33 s on the CPU and 561.29 s on the GPU, so the GPU
run is about 20% slower.
Unfortunately I have no idea why that might be. Is there any way to see how
much data is being copied to and from the GPU, and when? In principle all I
need to do is copy my initial vector of parameters to the GPU and then
return the likelihood and gradient; everything else can be created and
kept on the GPU.
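One possible way to look for transfers is to inspect the ops in a compiled function. The sketch below rests on two assumptions that should be checked against the installed Theano version: that the compiled graph is reachable as `fn.maker.fgraph`, and that host/device copies show up as ops whose class names are `HostFromGpu` and `GpuFromHost`. `op_names` and `count_transfers` are hypothetical helpers, not Theano API.

```python
from collections import Counter

# Assumed class names of the ops that copy data between host and GPU.
TRANSFER_OPS = ("HostFromGpu", "GpuFromHost")

def op_names(compiled_fn):
    """Class names of the ops in a compiled Theano function, in execution order."""
    return [type(node.op).__name__
            for node in compiled_fn.maker.fgraph.toposort()]

def count_transfers(names):
    """Count how many times each transfer op appears in a list of op names."""
    counts = Counter(names)
    return {op: counts[op] for op in TRANSFER_OPS}
```

Under those assumptions, `count_transfers(op_names(getG))` would show how many copies happen on each call to the gradient function.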
If anyone is able to look through this and shed some light, I would
greatly appreciate it!
Thanks