On Tue, 6 Mar 2018 12:52:14, Robert Kern wrote:
> I would just recommend using one of the codebases to initialize the
> network, save the network out to disk, and load up the initialized network
> in each of the different codebases for training. That way you are sure
> they are both starting from the same exact network parameters.
> Even if you do rewrite a precisely equivalent np.random.randn() for
> Scala/Java, you ought to write the code to serialize the initialized
> network anyways so that you can test that the two initialization routines
> are equivalent. But if you're going to do that, you might as well take my
> recommended approach.
Thanks for the suggestion! I decided to use the approach you proposed.
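For reference, that workflow can be sketched roughly as follows (layer sizes,
variable names, and the file name are made up for illustration; the real
network will have its own parameters):

```python
import numpy as np

# Hypothetical sketch of the suggested workflow: initialize once in NumPy,
# serialize the parameters to disk, and load the same file from every
# implementation so all of them start from identical weights.
n_x, n_h = 3, 4  # example layer sizes, chosen arbitrarily

np.random.seed(0)
W1 = np.random.randn(n_h, n_x) * 0.01  # same init as in the post below
b1 = np.zeros((n_h, 1))

# Save the initialized parameters; the Scala side can read this file
# (e.g. via an .npz/.npy reader) instead of re-implementing randn.
np.savez("init_params.npz", W1=W1, b1=b1)

# Loading them back gives bit-identical arrays.
loaded = np.load("init_params.npz")
assert np.array_equal(loaded["W1"], W1)
assert np.array_equal(loaded["b1"], b1)
```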
Still, I'm puzzled by an issue that seems to be related to random
initialization. I have three different NN implementations: two in Scala and one in NumPy.
When using the exact same initialization parameters I get the same
cost after each training iteration from each implementation. Based on that,
I'd infer that the implementations work equivalently.
However, the results look very different when using random initialization.
With respect to the exact cost this is of course expected, but what I find
puzzling is that after N training iterations the cost starts approaching zero
with the NumPy code (most of the time), whereas with the Scala-based
implementations it fails to converge (most of the time).
With NumPy I'm simply using the following random initialization code:
np.random.randn(n_h, n_x) * 0.01
I'm trying to emulate the same behaviour in my Scala code by sampling from
a Gaussian distribution with mean = 0 and std dev = 1.
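One detail worth noting: because of the `* 0.01` factor, the NumPy weights
are effectively drawn from a Gaussian with mean 0 and std dev 0.01, not 1.
A quick NumPy check of that (sample size chosen arbitrarily):

```python
import numpy as np

np.random.seed(0)

# np.random.randn draws from a standard normal (mean 0, std dev 1);
# multiplying by 0.01 rescales the std dev to 0.01.
sample = np.random.randn(100_000) * 0.01

print(round(sample.mean(), 3))  # close to 0.0
print(round(sample.std(), 3))   # close to 0.01, not 1.0
```

So whatever the Scala code samples should end up with that same 0.01 scale,
or the two initializations won't be comparable.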
NumPy-Discussion mailing list