Mike,

I agree that 'symbol' works better than 'dot' in plot.

I can't think why you got the nonce error, but I was trying to reproduce
the following sequence from Python. This sequence surprised me a little,
as I had originally used something more like yours.

  dhidden = np.dot(dscores, W2.T)
  # backprop the ReLU non-linearity
  dhidden[hidden_layer <= 0] = 0
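
For what it's worth, zeroing dhidden where hidden_layer is non-positive is
not the same as clamping dhidden itself to be non-negative (your 0 >.
replacement below), so the two versions can propagate different gradients.
A small Python sketch with made-up numbers, just to show the difference:

  import numpy as np

  hidden_layer = np.array([0.0, 0.7, 0.0, 1.2])   # post-ReLU activations
  dhidden = np.array([0.5, -0.3, -0.2, 0.4])      # gradient arriving from above

  masked = dhidden.copy()
  masked[hidden_layer <= 0] = 0      # their rule: zero where the unit was off
  clamped = np.maximum(0, dhidden)   # 0 >. dhidden: zeroes every negative entry

  # masked  -> [0, -0.3, 0, 0.4]
  # clamped -> [0.5, 0, 0, 0.4]

The masked form keeps negative gradients wherever the unit was active, which
is what backpropagating through the ReLU needs.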

I have read your follow-on messages, and they reminded me that I renamed
their y as classes because J verbs don't like the name y for non-arguments.
And yes, classes are 0s, 1s and 2s.

They are using normal variates with µ=0 and variance 1, so I attempted to
supply "uniform" variates with a similar mean and variance, though of course
not exactly the same relative frequencies. Uniform variates on (a,b) have
variance (b-a)^2/12, so on (_1,1) that is (1-_1)^2/12 = 4/12 = 1/3, and I
multiplied each uniform variate by %:3 so that the scaled variance becomes
3 * 1/3 = 1. I hope that makes sense and hope it does not produce such a
great difference, but I suppose I can get normal variates and experiment.
Good idea.
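
In case a numerical check is useful, here is a rough sketch of that scaling
(NumPy stand-ins, not the actual J data):

  import numpy as np

  u = np.random.uniform(-1, 1, 100000)    # mean ~0, variance ~1/3
  scaled = np.sqrt(3) * u                 # variance ~ 3 * 1/3 = 1
  normal = np.random.randn(100000)        # what they actually use

  print(u.var(), scaled.var(), normal.var())   # roughly 0.33, 1.0, 1.0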

Wrt your question "Does indexing by 'classes' (all 3s) have the same effect
as their y?", I hope so.
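
For reference, the indexing on their side is (as I read it) NumPy fancy
indexing that picks one probability per row according to the label, roughly
(probs, num_examples and y are their names from the source):

  # row i contributes the probability it assigns to its true class y[i]
  correct_logprobs = -np.log(probs[range(num_examples), y])

so the J version needs to select one entry per row of the scores according
to classes in the same way.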

Thanks very much,

On Wed, May 15, 2019 at 4:51 PM 'Mike Day' via Programming <
[email protected]> wrote:

> I've had a look at your example and the source you cite.  You differ
> from the source in seeming to need explicit handling of the hidden layer
> with both W & b AND W2 & b2, which I can't understand right now.
>
> Ah - I've just found a second listing, lower down the page, which does
> have W2 and b2 and a hidden layer!
>
> I found, at least in Windows 10, that 'dot' plot ... shows more or less
> white space; 'symbol' plot is better.
>
> Anyway, when I first ran train, I got:
>
>     train 100
> |nonce error: train
> |   dhidden=.0     indx}dhidden
>
> The trouble arose from this triplet of lines:
>
>      dhidden =. dscores dot |:W2
>      indx =. I. hidden_layer <: 0
>      dhidden =. 0 indx}dhidden
>
> Since you seem to be restricting dhidden to be non-negative, I replaced
> these three with:
>
>      dhidden =. 0 >. dscores dot |:W2   NB. is this what you meant?
>
> I've also changed the loop so that we get a report for the first cycle,
> as in Python:
>
> for_i. i. >: y do.
>
> and added this line after smoutput i,loss - might not be necessary in
> Darwin...
>
> wd'msgs'
>
> With these changes, train ran as follows:
>
> cc =: train 10000    NB. loss starts ok,  increases slightly, still
> unlike the Python ex!
>
> 0 1.09856
> 1000 1.10522
> 2000 1.10218
> 3000 1.0997
> 4000 1.09887
> 5000 1.09867
> 6000 1.09862
> 7000 1.09861
> 8000 1.09861
> 9000 1.09861
>
>     $h_l =. 0>.(>1{cc) +"1 X dot >0{cc
> 300 100
>     $sc =. (>3{cc) +"1 h_l dot >2{cc
> 300 3
>     $predicted_class =. (i.>./)"1 sc
> 300
>     mean predicted_class = classes
> 0.333333
>
> Why are the cycle 0 losses different, if only slightly? They report
> 1.098744 cf. your 1.09856.
>
> Sorry - only minor problems found - they don't explain why you don't
> reproduce their results more closely,
>
> Mike
>
>
>