Yes, I think the consensus was that it was a mistake to have implemented it in C, since it would have been possible to do the symbolic math in J.

If you look at the source again you'll see that there's more commentary now. If you have questions I can help.

Henry Rich

On 2/27/2017 7:13 PM, 'Jon Hough' via Programming wrote:
Yes, thanks for the warning.

I remember this one:

(^&_2) d. _1

and it seems this, too, is a bug:

(%@:*:) d. _1

I once tried to figure out how J implements d. and D., but unfortunately I
found J's C source code impenetrable.

I assume fixing d. and D. is not a priority for new J releases.

Regards,
Jon
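
Since the thread asks for a loop-free version and D. has the anomalies mentioned above, here is a minimal sketch of the closed-form table, assuming the identity derivatives[i,j] = s[i] * ((i = j) - s[j]) with s = sm array. The name jsm is hypothetical and does not appear in the thread; sm is the definition from the original post.

```j
sm =: (] % +/)@:^        NB. softmax, as defined in the thread

NB. jsm (hypothetical name): NxN table of partial derivatives of sm at y
NB. computed as diag(s) - (s outer-product s), with no loops or branches
jsm =: 3 : 0
  s =. sm y
  (s * =/~ i.#s) - s */ s
)
```

With a =: 0.5 0.6 0.23 0.66, the result of jsm a should agree with the sm D.1 a table quoted further down the thread. To compare timings, something like 1000 timespacex 'jsm a' averages over repeated runs, which helps with the run-to-run noise Raul mentions.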
--------------------------------------------
On Tue, 2/28/17, Henry Rich <[email protected]> wrote:

  Subject: Re: [Jprogramming] Fast derivative of Softmax function
  To: [email protected]
  Date: Tuesday, February 28, 2017, 8:47 AM

  The Bugs page is festooned with anomalies in D. and d., especially
  higher derivatives.  Have a look at it before you get too far committed.

  Henry Rich

  On 2/27/2017 6:45 PM, 'Jon Hough' via Programming wrote:
  > Thanks, the dsoftmax is going to be used for a toy Multilayer Perceptron
  > Classifier I am writing. Using my original dsoftmax, which is being called
  > hundreds of thousands of times, it took way too long.
  >
  > Using sm D. 1 has made things quite a bit faster.
  >
  > Thanks.
  >
  > --------------------------------------------
  > On Tue, 2/28/17, Raul Miller <[email protected]> wrote:
  >
  >   Subject: Re: [Jprogramming] Fast derivative of Softmax function
  >   To: "Programming forum" <[email protected]>
  >   Date: Tuesday, February 28, 2017, 12:24 AM
  >
  >   I was about to point out the same thing.
  >
  >      a =: 0.5 0.6 0.23 0.66
  >      sm=:(] % +/ )@:^ NB. softmax
  >      softmax=: { sm
  >      dsoftmax=: 4 : 0
  >        idx=. x
  >        vals=. y
  >        smx=. idx softmax vals
  >        rx=. ''
  >        for_j. i.#vals do.
  >          if. j = idx do. rx=. rx , smx * (1 - smx)
  >          elseif. 1 do. rx=. rx ,(j softmax vals)* (0 - smx) end.
  >        end.
  >        rx
  >      )
  >
  >      sm D.1 a
  >     0.186192 _0.0676431 _0.0467234 _0.0718259
  >   _0.0676431    0.19866 _0.0516374 _0.0793799
  >   _0.0467234 _0.0516374   0.153191 _0.0548305
  >   _0.0718259 _0.0793799 _0.0548305   0.206036
  >      (i.# a) dsoftmax"0 _  a
  >     0.186192 _0.0676431 _0.0467234 _0.0718259
  >   _0.0676431    0.19866 _0.0516374 _0.0793799
  >   _0.0467234 _0.0516374   0.153191 _0.0548305
  >   _0.0718259 _0.0793799 _0.0548305   0.206036
  >
  >   The speedup is not too impressive, but it is a speedup (probably
  >   because we are retaining and reusing all results from sm rather than
  >   recomputing it so many times -- I imagine using sm directly and
  >   lifting it out of the loop):
  >
  >      timespacex '(i.# a) dsoftmax"0 _  a'
  >   5.5e_5 6528
  >      timespacex 'sm D.1 a'
  >   2.3e_5 7424
  >
  >   That said, note that we can approximate this speedup by using a
  >   variant on what Pascal proposed:
  >
  >   d_softmax=: 4 : 0
  >     rx=. i.0 0
  >     smv=. sm y
  >     for_i. x do.
  >       ry=. i.0
  >       smx=. i { smv
  >       for_j. i.#y do.
  >         if. j_index = i_index do. ry=. ry , smx * (1 - smx)
  >         else.  ry=. ry ,(j {smv)* (0 - smx) end.
  >       end.
  >       rx=.rx, ry
  >     end.
  >     rx
  >   )
  >
  >
  >      (i.# a) d_softmax  a
  >     0.186192 _0.0676431 _0.0467234 _0.0718259
  >   _0.0676431    0.19866 _0.0516374 _0.0793799
  >   _0.0467234 _0.0516374   0.153191 _0.0548305
  >   _0.0718259 _0.0793799 _0.0548305   0.206036
  >      timespacex '(i.# a) d_softmax  a'
  >   3e_5 6016
  >
  >   (Remember that it's generally a good idea to ignore speedups which are
  >   less than a factor of 2, because of scheduling issues within the
  >   machine itself - you can see this by inspecting multiple timing runs)
  >
  >      timespacex '(i.# a) d_softmax  a'
  >   3e_5 6016
  >      timespacex 'sm D.1 a'
  >   2.3e_5 7424
  >      timespacex '(i.# a) d_softmax  a'
  >   2.9e_5 6016
  >      timespacex 'sm D.1 a'
  >   2.3e_5 7424
  >      timespacex '(i.# a) d_softmax  a'
  >   2.8e_5 6016
  >      timespacex 'sm D.1 a'
  >   3.7e_5 7424
  >      timespacex '(i.# a) d_softmax  a'
  >   3.2e_5 6016
  >
  >   I hope this helps,
  >
  >   --
  >   Raul
  >
  >   On Mon, Feb 27, 2017 at 8:05 AM, Louis de Forcrand <[email protected]> wrote:
  >   > You probably know about it, but I'll mention it anyway: there's a
  >   > primitive partial derivative operator in J. I think it would do
  >   > exactly what you want (numerically), and it's probably reasonably
  >   > fast. It's not too hard to use either:
  >   >
  >   > dsoftmax=: sm D.1
  >   >
  >   > Louis
  >   >
  >   >> On 27 Feb 2017, at 10:20, 'Pascal Jasmin' via Programming <[email protected]> wrote:
  >   >>
  >   >> one optimization is removing the rank "0 _, so that the function
  >   >> does not need to be reparsed for each x
  >   >>
  >   >> untested.
  >   >>
  >   >>
  >   >> dsoftmax=: 4 : 0
  >   >> rx=. ''
  >   >> for_i. x do.
  >   >> smx=. i softmax y
  >   >> for_j. i.#y do.
  >   >> if. j_index = i_index do. rx=. rx , smx * (1 - smx)
  >   >> else. rx=. rx ,(j softmax y)* (0 - smx) end.
  >   >> end. end.
  >   >> rx
  >   >> )
  >   >>
  >   >>
  >   >> ----- Original Message -----
  >   >> From: 'Jon Hough' via Programming <[email protected]>
  >   >> To: Programming Forum <[email protected]>
  >   >> Sent: Monday, February 27, 2017 3:09 AM
  >   >> Subject: [Jprogramming] Fast derivative of Softmax function
  >   >>
  >   >> Given an array, we can calculate the softmax function
  >   >> https://en.wikipedia.org/wiki/Softmax_function
  >   >>
  >   >> a =: 0.5 0.6 0.23 0.66
  >   >> sm=:(] % +/ )@:^ NB. softmax
  >   >>
  >   >> sm a
  >   >> 0.247399 0.273418 0.188859 0.290325
  >   >>
  >   >>
  >   >> The (partial) derivative of softmax is a little more complicated:
  >   >>
  >   >> If the array is of length N, we need an NxN matrix of partial
  >   >> derivatives where (in pseudo code)
  >   >>
  >   >> derivatives[i,j] = sm(array[i]) * (1 - sm(array[j]))   if i == j
  >   >> or
  >   >> derivatives[i,j] = -1 * sm(array[i]) * sm(array[j])   if i != j
  >   >>
  >   >> (see here for the reasoning:
  >   >> http://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative/ )
  >   >>
  >   >> My implementation of the partial derivatives is this:
  >   >>
  >   >>
  >   >> NB. x value is index, y value is the whole array
  >   >> dsoftmax=: 4 : 0
  >   >> idx=. x
  >   >> vals=. y
  >   >> smx=. idx softmax vals
  >   >> rx=. ''
  >   >> for_j. i.#vals do.
  >   >>  if. j = idx do. rx=. rx , smx * (1 - smx)
  >   >>  elseif. 1 do. rx=. rx ,(j softmax vals)* (0 - smx) end.
  >   >> end.
  >   >> rx
  >   >> )
  >   >>
  >   >>
  >   >> Then, for example using above array a,
  >   >>
  >   >> (i.# a) dsoftmax"0 _  a
  >   >>
  >   >> gives the values, in a 4x4 matrix.
  >   >>
  >   >> This is quite slow. I have tried to do this without iterating and
  >   >> branching, but cannot figure out a way to do it.
  >   >> Any help appreciated.
  >   >>
  >   >> Thanks,
  >   >> Jon
  >   >> ----------------------------------------------------------------------
  >   >> For information about J forums see http://www.jsoftware.com/forums.htm