The convolution kernel function is just a straight elementwise multiply followed by a sum over all entries; it is not a dot product or matrix product. A nice illustration is found here: https://mlnotebook.github.io/post/CNN1/
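For reference, here is that kernel step on a single 3x3 patch as a minimal session sketch (the numbers are illustrative, not from any example below):

   patch  =: 3 3 $ 1 2 3 4 5 6 7 8 9
   kernel =: 3 3 $ 1 0 _1            NB. cycled to three rows of 1 0 _1
   patch +/@:,@:* kernel             NB. multiply elementwise, then sum everything
_6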
So +/@:,@:* works. I don't know if there is a faster way to do it.

Thanks,
Jon

On Friday, April 19, 2019, 5:54:24 AM GMT+9, Raul Miller <[email protected]> wrote:

They're also not equivalent. For example:

   (i.2 3 4) +/@:,@:* i.2 3
970
   (i.2 3 4) +/ .* i.2 3
|length error

I haven't studied the possibilities of this code base enough to know how relevant this might be, but if you're working with rank 4 arrays, this kind of thing might matter.

On the other hand, if the arrays handled by +/@:,@:* are the same shape, then +/ .*&, might be what you want.

(Then again... any change introduced on "performance" grounds should get at least enough testing to show that there's a current machine where that change provides significant benefit for plausible data.)

Thanks,
-- Raul

On Thu, Apr 18, 2019 at 4:16 PM Henry Rich <[email protected]> wrote:

FYI: +/@:*"1 and +/ . * are two ways of doing dot-products fast. +/@:,@:* is not as fast.

Henry Rich
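When the two arguments do have the same shape, Raul's +/ .*&, ravels both and computes a single dot product, which matches the elementwise form; a quick check with illustrative values:

   a =: i. 2 3 4
   b =: 100 + i. 2 3 4
   (a +/@:,@:* b) = a +/ .*&, b
1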
On 4/18/2019 10:38 AM, jonghough via Programming wrote:

Regarding the test network I sent in the previous email, it will not work. This one should:

NB. =========================================================================

NB. 3 classes:
NB. horizontal lines (A), vertical lines (B), diagonal lines (C).
NB. each class is a 3-channel matrix, shape 3 8 8

A1=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0
A2=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0
A3=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0
A4=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0
A5=: 2 |. A4

B1=: |:"2 A1
B2=: |:"2 A2
B3=: |:"2 A3
B4=: |:"2 A4
B5=: |:"2 A5

C1=: 3 8 8 $ 1 0 0 0 0 0 0 1, 0 1 0 0 0 0 1 0, 0 0 1 0 0 1 0 0, 0 0 0 1 1 0 0 0, 0 0 0 1 1 0 0 0, 0 0 1 0 0 1 0 0, 0 1 0 0 0 0 1 0, 1 0 0 0 0 0 0 1
C2=: 3 8 8 $ 1 0 0 0 0 0 0 0, 0 1 0 0 0 0 0 0, 0 0 1 0 0 0 0 0, 0 0 0 1 0 0 0 0, 0 0 0 0 1 0 0 0, 0 0 0 0 0 1 0 0, 0 0 0 0 0 0 1 0, 0 0 0 0 0 0 0 1
C3=: 3 8 8 $ 1 0 1 0 1 0 0 0, 0 1 0 1 0 1 0 0, 0 0 1 0 1 0 1 0, 0 0 0 1 0 1 0 1, 1 0 0 0 1 0 1 0, 0 1 0 0 0 1 0 1, 1 0 1 0 0 0 1 0, 0 1 0 1 0 0 0 1
C4=: |."1 C3
C5=: 3 8 8 $ 1 1 1 1 0 0 0 0, 0 0 1 1 1 1 0 0, 0 0 0 0 1 1 1 1, 1 1 0 0 0 0 1 1, 1 1 1 1 0 0 0 0, 0 0 1 1 1 1 0 0, 0 0 0 0 1 1 1 1, 1 1 0 0 0 0 1 1

A=: 5 3 8 8 $, A1, A2, A3, A4, A5
B=: 5 3 8 8 $, B1, B2, B3, B4, B5
C=: 5 3 8 8 $, C1, C2, C3, C4, C5

INPUT=: A,B,C
OUTPUT=: 15 3 $ 1 0 0, 1 0 0, 1 0 0, 1 0 0, 1 0 0, 0 1 0, 0 1 0, 0 1 0, 0 1 0, 0 1 0, 0 0 1, 0 0 1, 0 0 1, 0 0 1, 0 0 1

pipe=: (10;10;'softmax';1;'l2';0.0001) conew 'NNPipeline'
c1=: ((10 3 4 4);2;'relu';'adam';0.01;0) conew 'Conv2D'
b1=: (0; 1 ;0.0001;10;0.01) conew 'BatchNorm2D'
a1=: 'relu' conew 'Activation'

c2=: ((12 10 2 2); 1;'relu';'adam';0.01;0) conew 'Conv2D'
b2=: (0; 1 ;0.0001;5;0.01) conew 'BatchNorm2D'
a2=: 'relu' conew 'Activation'
p1=: 2 conew 'PoolLayer'

fl=: 3 conew 'FlattenLayer'
fc=: (12;3;'softmax';'adam';0.01) conew 'SimpleLayer'
b3=: (0; 1 ;0.0001;2;0.01) conew 'BatchNorm'
a3=: 'softmax' conew 'Activation'

addLayer__pipe c1
addLayer__pipe p1
NB.addLayer__pipe b1
addLayer__pipe a1
addLayer__pipe c2
NB.addLayer__pipe b2
addLayer__pipe a2
addLayer__pipe fl
addLayer__pipe fc
NB.addLayer__pipe b3
addLayer__pipe a3

require 'plot viewmat'
NB. check the input images (per channel)
NB. viewmat"2 A1
NB. viewmat"2 B1
NB. viewmat"2 C1

OUTPUT fit__pipe INPUT  NB. <--- should get 100%-ish accuracy after only a few iterations
NB. =========================================================================

Running the above doesn't prove much, as there is no training/testing set split. It is just to see *if* the training will push the network's parameters in the correct direction. Getting accurate predictions on all the A, B, C images will at least show that the network is not doing anything completely wrong. It is also useful as a playground to see if different ideas work.

You can test the accuracy with

   OUTPUT -:"1 1 (=>./)"1 >{: predict__pipe INPUT
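In that check, (= >./)"1 one-hot encodes each prediction row at its maximum, which -:"1 1 then matches against the target rows; a small session with illustrative scores:

   (= >./)"1 ] 2 3 $ 0.1 0.7 0.2 0.5 0.3 0.2
0 1 0
1 0 0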
On Thursday, April 18, 2019, 11:36:35 AM GMT+9, Brian Schott <[email protected]> wrote:

I have renamed this message because the topic has changed, and I considered moving it to jchat as well. However, I settled on jprogramming because there are definitely some J programming issues to discuss.

Jon,

Your script code is beautifully commented and very valuable, imho. The lack of an example slowed down my study of the script, but now I have some questions and comments.

I gather from your comments that the word tensor is used to designate a 4-dimensional array. That's new to me, but it is very logical.

Your definition convFunc=: +/@:,@:* works very well. However, for some reason I wish I could think of a way to define convFunc in terms of dot=: +/ . * .

The main insight I have gained from your code is that (x u;._3 y) can be used with x of shape 2 n where n>2 (and not just 2 2). This is great information (see the short session after this message). And that you built convFunc directly into cf is also very enlightening.

I have created a couple of examples of the use of your function `cf` to better understand how it works. [The data is borrowed from the fine example at http://cs231n.github.io/convolutional-networks/#conv . Beware that the dynamic example seen at the link changes every time the page is refreshed, so you will not see the exact data I present, but the shapes of the data are constant.]

Notice that in my first experiments both `filter` and the RHA (right-hand argument) of cf"3 are arrays and not tensors. Consequently(?) the result is an array, not a tensor, either.

i=: _7]\".;._2 (0 : 0)
0 0 0 0 0 0 0
0 0 0 1 2 2 0
0 0 0 2 1 0 0
0 0 0 1 2 2 0
0 0 0 0 2 0 0
0 0 0 2 2 2 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 2 1 2 2 2 0
0 0 1 0 2 0 0
0 1 1 1 1 1 0
0 2 0 0 0 2 0
0 0 0 2 2 2 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 1 2 1 0
0 1 1 0 0 0 0
0 2 1 2 0 2 0
0 1 0 0 2 2 0
0 1 0 1 2 2 0
0 0 0 0 0 0 0
)

k =: _3]\".;._2(0 :0)
1 0 0
1 _1 0
_1 _1 1
0 _1 1
0 0 1
0 _1 1
1 0 1
0 _1 0
0 _1 0
)

$i NB. 3 7 7
$k NB. 3 3 3

filter =: k
convFunc=: +/@:,@:*

cf=: 4 : '|:"2 |: +/ x filter&(convFunc"3 3);._3 y'
(1 2 2,:3 3 3) cf"3 i NB. 3 3 $ 1 1 _2 _2 3 _7 _3 1 0

My next example makes both `filter` and the right argument into tensors. And notice the shape of the result shows it is a tensor, also.

filter2 =: filter,:_1+filter
cf2=: 4 : '|:"2 |: +/ x filter2&(convFunc"3 3);._3 y'
$ (1 2 2,:3 3 3) cf2"3 i,:5+i NB. 2 2 3 3

Much of my effort regarding CNNs has been spent studying the literature that discusses efficient ways of computing these convolutions by translating the filters and the image data into flattened (and somewhat sparse) forms that can be restated in matrix formats. These matrices accomplish the convolution and deconvolution as *efficient* matrix products. Your demonstration of the way that J's ;._3 can be so effective challenges the need for such efficiencies.

On the other hand, I could use some help understanding how the 1 0 2 3 |: transpose you apply to `filter` is effective in the backpropagation stage. Part of my confusion is that I would have thought the transpose would have been 0 1 3 2 |:, instead. Can you say more about that?

I have yet to try to understand your verbs `forward` and `backward`, but I look forward to doing so.

I could not find definitions for the following functions and wonder if you can say more about them, please?

bmt_jLearnUtil_
setSolver

I noticed that your definitions of relu and derivRelu were more complicated than mine, so I attempted to test yours out against mine as follows.

   relu =: 0&>.
   derivRelu =: 0&<
   (relu -: 0:`[@.>&0) i: 4
1
   (derivRelu -: 0:`1:@.>&0) i: 4
1
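As promised above, here is a short session isolating the x u;._3 cut from the network code, showing how an x of shape 2 n sets the movement vector (first row) and the window size (second row); shapes only, with illustrative data:

   $ (2 2 ,: 3 3) <;._3 i. 5 5           NB. 3x3 windows moving by 2: a 2 2 grid of boxes
2 2
   $ (1 2 2 ,: 3 3 3) <;._3 i. 3 7 7     NB. move 1 across channels, 2 in each spatial axis
1 3 3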
On Sun, Apr 14, 2019 at 8:31 AM jonghough via Programming <[email protected]> wrote:

I had a go at writing conv nets in J. See
https://github.com/jonghough/jlearn/blob/master/adv/conv2d.ijs

This uses ;._3 to do the convolutions. Using a version of this, with a couple of fixes, I managed to get 88% accuracy on the cifar-10 image set. It took several days to run, as my algorithms are not optimized in any way, and no GPU was used. If you look at the references in the above link, you may get some ideas.

The convolution verb is defined as:

cf=: 4 : 0
|:"2 |: +/ x filter&(convFunc"3 3);._3 y
)

Note that since the input is a batch of images, each 3-d (width, height, channels), we are actually doing the whole forward pass over a 4d array, and outputting another 4d array of different shape, depending on output channels, filter width, and filter height.

Thanks,
Jon

Thank you,
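To make the 4d shape bookkeeping concrete, here is a minimal sketch in the spirit of cf, with a single random filter and a random batch as placeholder data (cf1 is a hypothetical stand-in for jlearn's Conv2D layer, not a copy of it):

   convFunc =: +/@:,@:*
   filter   =: ? 3 3 3 $ 0                    NB. one illustrative 3x3 filter spanning 3 channels
   cf1      =: 4 : '|:"2 |: +/ x filter&(convFunc"3 3);._3 y'
   $ (1 2 2 ,: 3 3 3) cf1"3 ? 5 3 7 7 $ 0    NB. batch of 5 images -> one 3x3 map each
5 3 3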
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm