On Tue, 6 Mar 2018 12:52:14, Robert Kern wrote:
> I would just recommend using one of the codebases to initialize the
> network, save the network out to disk, and load up the initialized network
> in each of the different codebases for training. That way you are sure that
> they are both starting from the same exact network parameters.
>
> Even if you do rewrite a precisely equivalent np.random.randn() for
> Scala/Java, you ought to write the code to serialize the initialized
> network anyways so that you can test that the two initialization routines
> are equivalent. But if you're going to do that, you might as well take my
> recommended approach.
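
A minimal sketch of that save-and-load approach on the NumPy side, assuming a plain CSV file as the exchange format. The layer sizes, the seed, and the file name are made up for illustration; the Scala code would parse the same file with whatever CSV reader it already uses.

```python
import os
import tempfile

import numpy as np

# Hypothetical layer sizes, chosen only for illustration.
n_x, n_h = 4, 3

# Initialize the weights once, in a single codebase.
rng = np.random.RandomState(0)
W1 = rng.randn(n_h, n_x) * 0.01

# Write the matrix as text so that any other codebase can read it.
# "%.17g" prints enough significant digits to round-trip a float64 exactly.
path = os.path.join(tempfile.mkdtemp(), "W1.csv")
np.savetxt(path, W1, delimiter=",", fmt="%.17g")

# Reading it back gives bit-identical values.
W1_loaded = np.loadtxt(path, delimiter=",", ndmin=2)
roundtrip_exact = np.array_equal(W1, W1_loaded)
```

Text is used here instead of a binary format only because it is the easiest thing to read from another language; any format both sides can parse without losing precision would do.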

Thanks for the suggestion! I decided to use the approach you proposed.
Still, I'm puzzled by an issue that seems to be related to random
initialization.
I have three different NN implementations, two in Scala and one in NumPy.
When using the exact same initialization parameters, I get the same
cost after each training iteration from each implementation. Based on
this, I'd infer that the implementations are equivalent.
However, the results look very different when using random initialization.
With respect to the exact cost this is of course expected, but what I find
troublesome is that after N training iterations the cost starts approaching
zero with the NumPy code (most of the time), whereas with the Scala-based
implementations the cost fails to converge (most of the time).
With NumPy I'm simply using the following random initialization code:
np.random.randn(n_h, n_x) * 0.01
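
For reference, a minimal sketch of what that expression produces. The layer sizes and the seed are hypothetical and only there to make the check reproducible. `randn` draws from a Gaussian with mean 0 and std dev 1, and multiplying by 0.01 scales the std dev of the resulting weights down to about 0.01.

```python
import numpy as np

# Hypothetical layer sizes, large enough for stable sample statistics.
n_x, n_h = 50, 40

rng = np.random.RandomState(0)

# Standard normal draws: mean 0, std dev 1.
Z = rng.randn(n_h, n_x)

# The initialization from the post: the same draws scaled by 0.01.
W = Z * 0.01

z_std = Z.std()  # close to 1.0
w_std = W.std()  # close to 0.01

print("std of randn(...):        %.4f" % z_std)
print("std of randn(...) * 0.01: %.4f" % w_std)
```

So an equivalent sampler in another language would need to draw from a standard Gaussian and then apply the same 0.01 factor, or equivalently sample with std dev = 0.01.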
I'm trying to emulate the same behaviour in my Scala code by sampling from
a Gaussian distribution with mean = 0 and std dev = 1.
Any ideas?
Marko

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion