Hi,
Yes, it is an actual problem that we never managed to fix in a
satisfactory way. The current behaviour is inconsistent.
Doing the substitutions one at a time is a workaround (I think Blocks
does that for dropout), but it can be cumbersome to have everything
cloned over and over again.
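If I remember correctly, the trick Blocks uses looks roughly like the
sketch below (from memory, not tested code, so do verify): apply one
replacement per clone call, and pass the not-yet-applied keys and values
through each clone as extra outputs, so their counterparts in the new
graph can be recovered for the next round.

    import theano
    import theano.tensor as T

    v = T.lscalar()
    exp1 = 2 * v
    exp2 = 4 * exp1
    exp3 = 6 * exp2
    exp4 = 8 * exp3

    outputs = [exp4]
    pairs = [(exp1, 3 * exp1), (exp2, 5 * exp2), (exp3, 7 * exp3)]
    while pairs:
        (key, value), rest = pairs[0], pairs[1:]
        keys = [k for k, _ in rest]
        values = [val for _, val in rest]
        # Clone with a single replacement; the remaining keys and values
        # are passed along as extra outputs so the clone tells us where
        # they ended up in the new graph.
        cloned = theano.clone(outputs + keys + values,
                              replace={key: value})
        outputs = cloned[:len(outputs)]
        pairs = list(zip(cloned[len(outputs):len(outputs) + len(keys)],
                         cloned[len(outputs) + len(keys):]))

    print(theano.pp(outputs[0]))
    print(outputs[0].eval({v: 1}))  # expected: 40320, i.e. 8!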
Another option, still experimental, may be the `map_variables` function
in theano.scan_module.scan_utils.
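If I remember its experimental interface correctly (please check the
docstring, this is from memory), it takes a callable that is applied to
every variable in the graph and returns either a replacement or the
variable itself. Reusing v, exp1..exp4 from the sketch above, something
like:

    from theano.scan_module import scan_utils

    factors = {exp1: 3, exp2: 5, exp3: 7}

    def replacer(var):
        # Scale the variables we are interested in, leave the rest alone.
        if var in factors:
            return factors[var] * var
        return var

    new_exp4, = scan_utils.map_variables(replacer, [exp4])
    print(theano.pp(new_exp4))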
Finally, it is actually possible to replace an Apply node's inputs manually.
In your case, you could do something like:
>>> exp2.owner.inputs[1] = 3*exp1
>>> exp3.owner.inputs[1] = 5*exp2
>>> exp4.owner.inputs[1] = 7*exp3
>>> print(theano.pp(exp4))
(TensorConstant{8} * (TensorConstant{7} * (TensorConstant{6} *
(TensorConstant{5} * (TensorConstant{4} * (TensorConstant{3} *
(TensorConstant{2} * <TensorType(int64, scalar)>)))))))
>>> exp4.eval({v: 1})
array(40320)
But this can get hard to get right if the same expression is re-used
several times, because the mutation affects every graph that shares the
node.
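For instance, continuing the snippet above:

>>> other = exp2 + 1  # built before the mutation, shares the exp2 node
>>> exp2.owner.inputs[1] = 3*exp1

Now `other` computes (4 * (3 * exp1)) + 1 as well, whether that was
intended or not.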
On Fri, Oct 14, 2016, John Coolidge wrote:
> Hello,
>
> I'm trying to use theano.clone to implement dropout in my MLP network.
> Because I want to apply dropout at multiple layers, I pass the clone call
> multiple key-value pairs in its replace parameter:
> replace={layer1: mask*layer1, layer2: mask*layer2, etc.}. However, the
> returned graph seems to have only actually made one of the replacements.
> I suspect this is because clone is doing the replacements sequentially, and
> once it has done one replacement it generates a new graph to which the other
> key-value pairs no longer correspond.
>
> Here is some example code that demonstrates the unexpected behavior:
>
> import theano
> import theano.tensor as T
>
> v = T.lscalar()
> exp1 = 2*v
> exp2 = 4*exp1
> exp3 = 6*exp2
> exp4 = 8*exp3
>
> print(theano.pp(exp4))
> exp5 = theano.clone(exp4, replace={exp1: (3*exp1), exp2: (5*exp2),
>                                    exp3: (7*exp3)})
> print(theano.pp(exp5))
> t = theano.function(inputs=[v], outputs=exp5)
> print(t(1))
>
>
> The output is:
> (TensorConstant{8} * (TensorConstant{6} * (TensorConstant{4} *
> (TensorConstant{2} * <TensorType(int64, scalar)>))))
> (TensorConstant{8} * (TensorConstant{7} * (TensorConstant{6} *
> (TensorConstant{4} * (TensorConstant{2} * <TensorType(int64, scalar)>)))))
> 2688
>
> Although the clone adds the 7 factor to the new graph, it does not add the
> 3 or 5 factors, so the output for an input value of 1 is 8*7*6*4*2*1 = 2688
> instead of 8! = 40320, as I would have expected.
>
> I'm guessing this is how the clone function is supposed to work, but does
> anyone see how to get the behavior I'm looking for? Perhaps I could apply
> the replacements one at a time and, after each replacement, update the
> remaining key-value pairs to point to the corresponding points in the new
> graph, but I'm not sure how to find these corresponding points. Or perhaps
> there's a function like clone that actually makes the replacements in
> place, so that the other key-value pairs would not be invalidated by the
> first replacement? Any ideas would be greatly appreciated!
>
--
Pascal