I had normalized the input so that all the values were between 0 and 1.
I've had no luck using either a ReLU activation unit or tanh, nor with
multiplying the initial weights by a factor of 0.01.
There seems to be some other problem with my implementation that I'm
unable to figure out.
The error is telling you the issue: your original shared variables are
float32, but the updates you produce are float64. I'm guessing you don't
have floatX set to float32 in your Theano config, so when you multiply the
gradient by the learning rate it gets upcast to float64. You can either set
floatX=float32, or explicitly cast the updates back to float32.
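A minimal sketch of the cast fix, assuming the learning rate is a float64
scalar; all the names here are placeholders, not the original poster's code:

import numpy as np
import theano
import theano.tensor as T

w = theano.shared(np.float32(1.0), name='w')   # float32 shared variable
cost = w ** 2
grad = T.grad(cost, w)                         # float32
lr = T.scalar('lr')                            # dtype = floatX = float64
# lr * grad is upcast to float64; casting restores w's dtype
updates = [(w, T.cast(w - lr * grad, w.dtype))]
train = theano.function([lr], cost, updates=updates)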
On this machine I have another process using theano with memory allocated
on the GPU. Normally I can launch many processes and import theano, but
now I get this error.
$ python
Python 2.7.12 |Anaconda 2.3.0 (64-bit)| (default, Jul 2
I did not test this, because I don't have data or code for one_step, but it
would be something like:
def loop_over_examples(x):
    # hidden states and outputs of the entire sequence
    [h_vals, o_vals], inner_updates = theano.scan(
        fn=one_step,
        sequences=dict(input=x, taps=[0]),
        outputs_info=[h0, None])  # h0: initial hidden state, defined elsewhere
    return o_vals, inner_updates
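For context, a one_step compatible with that call might look like the
following; the weights, sizes, and nonlinearities are all assumptions on my
part, since the original code was not posted:

import numpy as np
import theano
import theano.tensor as T

n_in, n_hid, n_out = 5, 8, 2   # hypothetical sizes
rng = np.random.RandomState(0)
W_xh = theano.shared(rng.randn(n_in, n_hid).astype('float32'))
W_hh = theano.shared(rng.randn(n_hid, n_hid).astype('float32'))
W_ho = theano.shared(rng.randn(n_hid, n_out).astype('float32'))

# scan passes the current input slice and the previous hidden state,
# and expects the new hidden state and the output back, matching
# outputs_info=[h0, None] above.
def one_step(x_t, h_tm1):
    h_t = T.tanh(T.dot(x_t, W_xh) + T.dot(h_tm1, W_hh))
    o_t = T.nnet.sigmoid(T.dot(h_t, W_ho))
    return h_t, o_t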
Hi,
My guess is that:
- without cnmem, allocation and deallocation of intermediate results
force synchronization of the GPU more often, so the overall run is
slower
- with cnmem and borrow=False, there is no synchronization at all, and
what is measured is just the time to launch the GPU kernels, not to
run them
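One (untested) way to check is to time with and without forcing a copy back
to the host; theano.Out(..., borrow=True) skips the output copy, so the call
can return before the GPU has actually finished:

import time
import numpy as np
import theano
import theano.tensor as T

x = theano.shared(np.random.rand(2000, 2000).astype('float32'))
f = theano.function([], theano.Out(T.dot(x, x), borrow=True))

t0 = time.time()
r = f()                    # may return right after the launch
print('after launch: %f' % (time.time() - t0))
np.asarray(r)              # host copy forces synchronization
print('after sync:   %f' % (time.time() - t0))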
The sigmoid activation function tends to saturate and block gradient
propagation, so the gradient wrt W1 is probably very close to zero in
your case.
Potential solutions include using another activation function (ReLU for
instance, or tanh), initializing W1 with smaller weights, and making sure
your inputs are normalized.
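You can see the saturation numerically: the gradient through a sigmoid is
s * (1 - s), which peaks at 0.25 and all but vanishes once the unit
saturates. A quick check:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for z in (0.0, 2.0, 5.0, 10.0):
    s = sigmoid(z)
    # ds/dz = s * (1 - s): 0.25, 0.105, 0.0066, 0.000045
    print('z=%5.1f  ds/dz=%.6f' % (z, s * (1.0 - s)))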
Hi Fred, thanks for the info, will try that.
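(For anyone else hitting this: I believe updating to the dev version is just

pip install --upgrade --no-deps git+https://github.com/Theano/Theano.git

per the bleeding-edge install instructions in the Theano docs.)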
On Tuesday, 11 October 2016 20:51:49 UTC+8, nouiz wrote:
>
> Update to the Theano dev version. There have been updates to it since the
> last release that could help you.
>
> Fred
>
> On 11 Oct 2016 at 01:31, "狄凯" wrote:
>
>> Hi guys, I'm