On Mon, Nov 21, 2016, Seppo Enarvi wrote:
> Ok. Is random number generation working in the new GPU backend yet? I can
> see some code related to it, but a call to *uniform()* produces the error
> messages "context name None not defined" and "Could not infer context from
> inputs". Looks like it's not possible to specify the target device to
> *uniform()*.
It should work.
In fact I just tried with the latest master, and I did not have any issue
with the following, with THEANO_FLAGS=device=cuda0,floatX=float32:
>>> import theano
Mapped name None to device cuda0 [...]
>>> from theano.sandbox.rng_mrg import MRG_RandomStreams as RandomStreams
>>> rng = RandomStreams(23)
>>> u = rng.uniform((12,))
>>> f = theano.function([], u)
HostFromGpu(gpuarray) [id A] <TensorType(float32, vector)> ''   1
 |GPUA_mrg_uniform{GpuArrayType<None>(float32, (False,)),inplace}.1 [id B] <GpuArrayType<None>(float32, (False,))> ''   0
   |<GpuArrayType<None>(int32, (False, False))> [id C] <GpuArrayType<None>(int32, (False, False))>
   |TensorConstant{(1,) of 12} [id D] <TensorType(int64, (True,))>
GPUA_mrg_uniform{GpuArrayType<None>(float32, (False,)),inplace}.0 [id B] <GpuArrayType<None>(int32, (False, False))> ''   0
>>> f()
array([ 0.04422134, 0.93608665, 0.04399569, 0.95211482, 0.39980391,
0.23936224, 0.31680474, 0.9962666 , 0.46095091, 0.72883427,
0.13103466, 0.61714345], dtype=float32)
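
If uniform() still complains about "context name None not defined", my guess is
that the new back-end did not actually initialize (the "Mapped name None to
device cuda0" line above is the tell-tale). A quick check from the same session
(just a sketch; pygpu_activated assumes libgpuarray/pygpu is installed):
>>> theano.config.device              # should be 'cuda0', not 'gpu0' or 'cpu'
'cuda0'
>>> import theano.gpuarray
>>> theano.gpuarray.pygpu_activated   # True once the gpuarray back-end is up
True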
>
> On Monday, November 21, 2016 at 2:40:03 AM UTC+2, Pascal Lamblin wrote:
> >
> > Right, now I remember that the _dev20 version only works on a limited
> > number of dimensions. That would explain why adding a new axis helped.
> >
> > It may already be fixed in the new GPU back-end (it needs libgpuarray;
> > then use device=cudaX instead of gpuX); otherwise, this is where we
> > should fix it.
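> >
> > For instance (just a sketch; THEANO_FLAGS has to be set before theano is
> > first imported, here via os.environ):
> >
> > import os
> > # new back-end (needs libgpuarray/pygpu); the old back-end used device=gpu0
> > os.environ['THEANO_FLAGS'] = 'device=cuda0,floatX=float32'
> > import theano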
> >
> > On Fri, Nov 18, 2016, Seppo Enarvi wrote:
> > >
> > >
> > > That's interesting, because this function is not supposed to update the
> > > bias. It just computes the cost and its gradient. Maybe that op is used
> > > to update the gradient.
> > >
> > > My GPU is a Quadro K2000. I don't think it's too old, because the graph
> > > contains other instances of GpuAdvancedIncSubtensor1_dev20.
> > >
> > > Anyway, I started to wonder why I don't have this problem with the weight
> > > matrix, where I'm selecting vectors in the same manner. So I tried
> > > converting the bias vector into a matrix and selecting rows from the
> > > matrix (each of which contains only one element):
> > >
> > > bias = bias[class_ids]
> > > =>
> > > bias = bias[:, None]
> > > bias = bias[class_ids, 0]
> > >
> > > It's a lot faster this way. I updated to the latest version of Theano
> > > from Git and I still see the huge speed difference.
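> > >
> > > For reference, the whole workaround as a self-contained snippet (just a
> > > sketch; the vocabulary size and variable names are placeholders):
> > >
> > > import numpy as np
> > > import theano
> > > import theano.tensor as tt
> > >
> > > num_classes = 10001   # placeholder size
> > > bias = theano.shared(np.zeros(num_classes, dtype='float32'), name='bias')
> > > class_ids = tt.lvector('class_ids')
> > >
> > > # original formulation: index the bias vector directly (the slow path)
> > > selected_slow = bias[class_ids]
> > >
> > > # workaround: view the vector as an (N, 1) matrix and index its rows
> > > bias_matrix = bias[:, None]
> > > selected_fast = bias_matrix[class_ids, 0]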
> > >
> > > Seppo
> > >
> > >
> > >
> > > On Friday, November 18, 2016 at 6:49:56 PM UTC+2, Pascal Lamblin wrote:
> > > >
> > > > Hi,
> > > >
> > > > This operation is actually the _update_ of the selected elements of the
> > > > bias.
> > > >
> > > > There is a faster implementation (named GpuAdvancedIncSubtensor1_dev20
> > > > IIRC) that uses atomic addition to speed up that operation. It has the
> > > > downside of not yielding a deterministic order of summation if the same
> > > > element is updated more than once in the same operation.
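> > > >
> > > > The semantics being sped up are those of inc_subtensor with (possibly
> > > > repeated) integer indices; a minimal CPU illustration (sketch, made-up
> > > > numbers):
> > > >
> > > > import numpy as np
> > > > import theano
> > > > import theano.tensor as tt
> > > >
> > > > vec = tt.fvector('vec')
> > > > idx = tt.lvector('idx')
> > > > inc = tt.fvector('inc')
> > > >
> > > > # all increments for a repeated index are summed; the _dev20 GPU kernel
> > > > # does these sums with atomic adds, so their order (and hence the float
> > > > # rounding) can vary from run to run
> > > > out = tt.inc_subtensor(vec[idx], inc)
> > > > f = theano.function([vec, idx, inc], out)
> > > >
> > > > f(np.zeros(4, dtype='float32'),
> > > >   np.array([1, 1, 3], dtype='int64'),
> > > >   np.array([0.5, 0.25, 1.0], dtype='float32'))
> > > > # -> array([ 0.  ,  0.75,  0.  ,  1.  ], dtype=float32)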
> > > >
> > > > One of the issues seems to be that this faster implementation is not
> > > > selected. Could it be that you have an old GPU?
> > > >
> > > > Another potential issue is that your graph seems to first apply updates
> > > > on a tensor of zeros, and then apply another update on the bias itself.
> > > > There may be a way of simplifying that.
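> > > >
> > > > For example, even the bare indexing gradient builds that structure: the
> > > > gradient of bias[class_ids] is an increment into a freshly allocated
> > > > zero vector (sketch):
> > > >
> > > > import numpy as np
> > > > import theano
> > > > import theano.tensor as tt
> > > >
> > > > bias = theano.shared(np.zeros(10001, dtype='float32'), name='bias')
> > > > class_ids = tt.lvector('class_ids')
> > > >
> > > > cost = bias[class_ids].sum()      # stand-in for the real cost
> > > > grad = theano.grad(cost, bias)    # AdvancedIncSubtensor1 into zeros
> > > > theano.printing.debugprint(grad)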
> > > >
> > > > On Fri, Nov 18, 2016, Seppo Enarvi wrote:
> > > > >
> > > > > I'm implementing sampling-based softmax alternatives, where I compute
> > > > > the preactivations only for certain output classes. I get very bad
> > > > > performance due to a GpuAdvancedIncSubtensor1 op, which consumes 90%
> > > > > of the processing time of the update function:
> > > > >
> > > > > <% time> <sum %> <apply time> <time per call> <#call> <id> <Mflops> <Gflops/s> <Apply name>
> > > > >   89.0%   89.0%   725.413s   2.44e-01s   2968   115
> > > > > GpuAdvancedIncSubtensor1{inplace,inc}(GpuAdvancedIncSubtensor1{inplace,inc}.0, GpuFromHost.0, Elemwise{Cast{int64}}.0)
> > > > >   input 0: dtype=float32, shape=(10001,), strides=(1,)
> > > > >   input 1: dtype=float32, shape=(25600,), strides=(1,)
> > > > >   input 2: dtype=int64, shape=(25600,), strides=c
> > > > >   output 0: dtype=float32, shape=(10001,), strides=(1,)
> > > > >
> > > > > Looking at the computation graph of that function, I noticed it's
> > > > > operating on the bias vector:
> > > > >
> > > > > GpuAdvancedIncSubtensor1{inplace,inc} [id FL] '' 115
> > > > > |GpuAdvancedIncSubtensor1{inplace,inc} [id FM] '' 112
> > > > > | |GpuAlloc{memset_0=True} [id FN] '' 17
> > > > > | | |CudaNdarrayConstant{[ 0.]} [id FO]
> > > > > | | |Shape_i{0} [id FP] '' 7
> > > > > | | |bias [id BU]
> > > > >
> > > > > More precisely, the performance hit seems to come from selecting from
> > > > > the bias vector those values that correspond to the output classes
> > > > > (bias = bias[class_ids]). Is that a particularly expensive operation?
> > > > > class_ids can be large (1,000 - 10,000). If I don't use the bias, my
> > > > > speed improves tenfold. Is there a way to circumvent that problem?
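> > > > >
> > > > > (For concreteness, a stripped-down sketch of the kind of computation
> > > > > I mean; the sizes and variable names here are arbitrary:)
> > > > >
> > > > > import numpy as np
> > > > > import theano
> > > > > import theano.tensor as tt
> > > > >
> > > > > hidden = tt.fmatrix('hidden')        # (batch, dim) hidden states
> > > > > weight = theano.shared(np.zeros((10001, 256), dtype='float32'))
> > > > > bias = theano.shared(np.zeros(10001, dtype='float32'))
> > > > > class_ids = tt.lvector('class_ids')  # sampled output classes, (batch,)
> > > > >
> > > > > # preactivations only for the sampled classes, one per example
> > > > > logits = (hidden * weight[class_ids]).sum(axis=1) + bias[class_ids]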
> > > >
> > >
> >
> >
> > --
> > Pascal
> >
>
--
Pascal