Yes, thanks for the warning.

I remember this one:

(^&_2) d. _1

and it seems this one is a bug, too:

(%@:*:) d. _1

I once tried to figure out how J implements d. and D., but unfortunately I 
found J's C source code impenetrable.

I assume fixing d. and D. is not a priority for new J releases.
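
A loop-free Jacobian that avoids D. entirely can be sketched directly from the formula in the original post quoted below (diagonal entries s_i * (1 - s_i), off-diagonal entries -s_i * s_j, where s is the softmax of the array). This is only a sketch: the name dsm is hypothetical and does not appear anywhere in the thread.

```j
a  =: 0.5 0.6 0.23 0.66
sm =: (] % +/ )@:^            NB. softmax, as defined in the thread

NB. dsm is a hypothetical name, not from the thread.
NB. For s = sm y the Jacobian is diag(s) - (s */ s):
NB.   (* =@i.@#) s   scales row i of the identity matrix by s_i, giving diag(s)
NB.   */~ s          is the outer product table of s_i * s_j
dsm =: ((* =@i.@#) - */~)@:sm

dsm a                         NB. 4x4 matrix of partial derivatives
```

For the vector a this should agree with (i.# a) dsoftmax"0 _ a up to floating-point rounding, since it computes the same two cases without iterating or branching.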

Regards,
Jon
--------------------------------------------
On Tue, 2/28/17, Henry Rich <[email protected]> wrote:

 Subject: Re: [Jprogramming] Fast derivative of Softmax function
 To: [email protected]
 Date: Tuesday, February 28, 2017, 8:47 AM
 
 The Bugs page is festooned with anomalies in D. and d., especially
 higher derivatives.  Have a look at it before you get too far committed.
 
 Henry Rich
 
 On 2/27/2017 6:45 PM, 'Jon Hough' via Programming wrote:
 > Thanks, the dsoftmax is going to be used for a toy Multilayer Perceptron Classifier I am writing. Using my original dsoftmax, which is being called hundreds of thousands of times, it took way too long.
 >
 > Using sm D. 1 has made things quite a bit faster.
 >
 > Thanks.
 >
 > --------------------------------------------
 > On Tue, 2/28/17, Raul Miller <[email protected]> wrote:
 >
 >   Subject: Re: [Jprogramming] Fast derivative of Softmax function
 >   To: "Programming forum" <[email protected]>
 >   Date: Tuesday, February 28, 2017, 12:24 AM
 >   
 >   I was about to point out the same thing.
 >   
 >      a =: 0.5 0.6 0.23 0.66
 >      sm=:(] % +/ )@:^ NB. softmax
 >      softmax=: { sm
 >      dsoftmax=: 4 : 0
 >        idx=. x
 >        vals=. y
 >        smx=. idx softmax vals
 >        rx=. ''
 >        for_j. i.#vals do.
 >          if. j = idx do. rx=. rx , smx * (1 - smx)
 >          elseif. 1 do. rx=. rx ,(j softmax vals)* (0 - smx) end.
 >        end.
 >        rx
 >     )
 >   
 >      sm D.1 a
 >     0.186192 _0.0676431 _0.0467234 _0.0718259
 >   _0.0676431    0.19866 _0.0516374 _0.0793799
 >   _0.0467234 _0.0516374   0.153191 _0.0548305
 >   _0.0718259 _0.0793799 _0.0548305   0.206036
 >      (i.# a) dsoftmax"0 _  a
 >     0.186192 _0.0676431 _0.0467234 _0.0718259
 >   _0.0676431    0.19866 _0.0516374 _0.0793799
 >   _0.0467234 _0.0516374   0.153191 _0.0548305
 >   _0.0718259 _0.0793799 _0.0548305   0.206036
 >   
 >   The speedup is not too impressive, but it is a speedup (probably
 >   because we are retaining and reusing all results from sm rather than
 >   recomputing it so many times -- I imagine using sm directly and
 >   lifting it out of the loop):
 >   
 >      timespacex '(i.# a) dsoftmax"0 _  a'
 >   5.5e_5 6528
 >      timespacex 'sm D.1 a'
 >   2.3e_5 7424
 >   
 >   That said, note that we can approximate this speedup by using a
 >   variant on what Pascal proposed:
 >   
 >   d_softmax=: 4 : 0
 >     rx=. i.0 0
 >     smv=. sm y
 >     for_i. x do.
 >       ry=. i.0
 >       smx=. i { smv
 >       for_j. i.#y do.
 >         if. j_index = i_index do. ry=. ry , smx * (1 - smx)
 >         else.  ry=. ry ,(j {smv)* (0 - smx) end.
 >       end.
 >       rx=.rx, ry
 >     end.
 >     rx
 >   )
 >   
 >      (i.# a) d_softmax  a
 >     0.186192 _0.0676431 _0.0467234 _0.0718259
 >   _0.0676431    0.19866 _0.0516374 _0.0793799
 >   _0.0467234 _0.0516374   0.153191 _0.0548305
 >   _0.0718259 _0.0793799 _0.0548305   0.206036
 >      timespacex '(i.# a) d_softmax  a'
 >   3e_5 6016
 >   
 >   (Remember that it's generally a good idea to ignore speedups which are
 >   less than a factor of 2, because of scheduling issues within the
 >   machine itself - you can see this by inspecting multiple timing runs)
 >   
 >      timespacex '(i.# a) d_softmax  a'
 >   3e_5 6016
 >      timespacex 'sm D.1 a'
 >   2.3e_5 7424
 >      timespacex '(i.# a) d_softmax  a'
 >   2.9e_5 6016
 >      timespacex 'sm D.1 a'
 >   2.3e_5 7424
 >      timespacex '(i.# a) d_softmax  a'
 >   2.8e_5 6016
 >      timespacex 'sm D.1 a'
 >   3.7e_5 7424
 >      timespacex '(i.# a) d_softmax  a'
 >   3.2e_5 6016
 >   
 >   I hope this helps,
 >   
 >   --
 >   Raul
 >   
 >   On Mon, Feb 27, 2017 at 8:05 AM, Louis de Forcrand <[email protected]> wrote:
 >   > You probably know about it, but I'll mention it anyway: there's a primitive partial derivative operator in J. I think it would do exactly what you want (numerically), and it's probably reasonably fast. It's not too hard to use either:
 >   >
 >   > dsoftmax=: sm D.1
 >   >
 >   > Louis
 >   >
 >   >> On 27 Feb 2017, at 10:20, 'Pascal Jasmin' via Programming <[email protected]> wrote:
 >   >>
 >   >> one optimization is removing the rank"0 _, so that the function does not need to be reparsed for each x
 >   >>
 >   >> untested.
 >   >>
 >   >>
 >   >> dsoftmax=: 4 : 0
 >   >> rx=. ''
 >   >> for_i. x do.
 >   >> smx=. i softmax y
 >   >> for_j. i.#y do.
 >   >> if. j_index = i_index do. rx=. rx , smx * (1 - smx)
 >   >> else. rx=. rx ,(j softmax y)* (0 - smx) end.
 >   >> end. end.
 >   >> rx
 >   >> )
 >   >>
 >   >>
 >   >> ----- Original Message -----
 >   >> From: 'Jon Hough' via Programming <[email protected]>
 >   >> To: Programming Forum <[email protected]>
 >   >> Sent: Monday, February 27, 2017 3:09 AM
 >   >> Subject: [Jprogramming] Fast derivative of Softmax function
 >   >>
 >   >> Given an array, we can calculate the softmax function
 >   >> https://en.wikipedia.org/wiki/Softmax_function
 >   >>
 >   >> a =: 0.5 0.6 0.23 0.66
 >   >> sm=:(] % +/ )@:^ NB. softmax
 >   >>
 >   >> sm a
 >   >> 0.247399 0.273418 0.188859 0.290325
 >   >>
 >   >>
 >   >> The (partial) derivative of softmax is a little more complicated:
 >   >>
 >   >> If the array is of length N, we need an NxN matrix of partial derivatives where (in pseudo code)
 >   >>
 >   >> derivatives[i,j] = sm(array[i]) * (1 - sm(array[j]))   if i == j
 >   >> or
 >   >> derivatives[i,j] = -1 * sm(array[i]) * sm(array[j])   if i != j
 >   >>
 >   >> ( see here for the reasoning: http://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative/ )
 >   >>
 >   >> My implementation of the partial derivatives is this:
 >   >>
 >   >>
 >   >> NB. x value is index, y value is the whole array
 >   >> dsoftmax=: 4 : 0
 >   >> idx=. x
 >   >> vals=. y
 >   >> smx=. idx softmax vals
 >   >> rx=. ''
 >   >> for_j. i.#vals do.
 >   >>  if. j = idx do. rx=. rx , smx * (1 - smx)
 >   >>  elseif. 1 do. rx=. rx ,(j softmax vals)* (0 - smx) end.
 >   >> end.
 >   >> rx
 >   >> )
 >   >>
 >   >>
 >   >> Then, for example using above array a,
 >   >>
 >   >> (i.# a) dsoftmax"0 _  a
 >   >>
 >   >> gives the values, in a 4x4 matrix.
 >   >>
 >   >> This is quite slow. I have tried to do this without iterating and branching, but cannot figure out a way to do it.
 >   >> Any help appreciated.
 >   >> Thanks,
 >   >>
 >   >> Jon
 >   >>
 >   ----------------------------------------------------------------------
 >   >> For information about J forums see http://www.jsoftware.com/forums.htm