The convolution kernel function is just a straight elementwise multiply followed by a sum over all entries; it is not a dot product or matrix product. A nice illustration is found here: https://mlnotebook.github.io/post/CNN1/
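For reference, here is that kernel step on a single 3x3 patch as a minimal session sketch (the numbers are illustrative, not from any example below):

   patch  =: 3 3 $ 1 2 3 4 5 6 7 8 9
   kernel =: 3 3 $ 1 0 _1            NB. cycled to three rows of 1 0 _1
   patch +/@:,@:* kernel             NB. multiply elementwise, then sum everything
_6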
So +/@:,@:* works. I don't know if there is a faster way to do it.

Thanks,
Jon

On Friday, April 19, 2019, 5:54:24 AM GMT+9, Raul Miller <[email protected]> wrote:

They're also not equivalent. For example:

   (i.2 3 4) +/@:,@:* i.2 3
970
   (i.2 3 4) +/ .* i.2 3
|length error

I haven't studied the possibilities of this code base enough to know how relevant this might be, but if you're working with rank 4 arrays, this kind of thing might matter.

On the other hand, if the arrays handled by +/@:,@:* are the same shape, then +/ .*&, might be what you want.

(Then again... any change introduced on "performance" grounds should get at least enough testing to show that there's a current machine where that change provides significant benefit for plausible data.)

Thanks,
-- Raul

On Thu, Apr 18, 2019 at 4:16 PM Henry Rich <[email protected]> wrote:

FYI: +/@:*"1 and +/ . * are two ways of doing dot-products fast. +/@:,@:* is not as fast.

Henry Rich
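When the two arguments do have the same shape, Raul's +/ .*&, ravels both and computes a single dot product, which matches the elementwise form; a quick check with illustrative values:

   a =: i. 2 3 4
   b =: 100 + i. 2 3 4
   (a +/@:,@:* b) = a +/ .*&, b
1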
On 4/18/2019 10:38 AM, jonghough via Programming wrote:

Regarding the test network I sent in the previous email, it will not work. This one should:

NB. =========================================================================

NB. 3 classes:
NB. horizontal lines (A), vertical lines (B), diagonal lines (C).
NB. each class is a 3-channel matrix, shape 3 8 8

A1=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0
A2=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0
A3=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0
A4=: 3 8 8 $ 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0, 0 0 0 0 0 0 0 0, 1 1 1 1 1 1 1 1, 1 1 1 1 1 1 1 1, 0 0 0 0 0 0 0 0
A5=: 2 |. A4

B1=: |:"2 A1
B2=: |:"2 A2
B3=: |:"2 A3
B4=: |:"2 A4
B5=: |:"2 A5

C1=: 3 8 8 $ 1 0 0 0 0 0 0 1, 0 1 0 0 0 0 1 0, 0 0 1 0 0 1 0 0, 0 0 0 1 1 0 0 0, 0 0 0 1 1 0 0 0, 0 0 1 0 0 1 0 0, 0 1 0 0 0 0 1 0, 1 0 0 0 0 0 0 1
C2=: 3 8 8 $ 1 0 0 0 0 0 0 0, 0 1 0 0 0 0 0 0, 0 0 1 0 0 0 0 0, 0 0 0 1 0 0 0 0, 0 0 0 0 1 0 0 0, 0 0 0 0 0 1 0 0, 0 0 0 0 0 0 1 0, 0 0 0 0 0 0 0 1
C3=: 3 8 8 $ 1 0 1 0 1 0 0 0, 0 1 0 1 0 1 0 0, 0 0 1 0 1 0 1 0, 0 0 0 1 0 1 0 1, 1 0 0 0 1 0 1 0, 0 1 0 0 0 1 0 1, 1 0 1 0 0 0 1 0, 0 1 0 1 0 0 0 1
C4=: |."1 C3
C5=: 3 8 8 $ 1 1 1 1 0 0 0 0, 0 0 1 1 1 1 0 0, 0 0 0 0 1 1 1 1, 1 1 0 0 0 0 1 1, 1 1 1 1 0 0 0 0, 0 0 1 1 1 1 0 0, 0 0 0 0 1 1 1 1, 1 1 0 0 0 0 1 1

A=: 5 3 8 8 $, A1, A2, A3, A4, A5
B=: 5 3 8 8 $, B1, B2, B3, B4, B5
C=: 5 3 8 8 $, C1, C2, C3, C4, C5

INPUT=: A,B,C
OUTPUT=: 15 3 $ 1 0 0, 1 0 0, 1 0 0, 1 0 0, 1 0 0, 0 1 0, 0 1 0, 0 1 0, 0 1 0, 0 1 0, 0 0 1, 0 0 1, 0 0 1, 0 0 1, 0 0 1

pipe=: (10;10;'softmax';1;'l2';0.0001) conew 'NNPipeline'
c1=: ((10 3 4 4);2;'relu';'adam';0.01;0) conew 'Conv2D'
b1=: (0; 1 ;0.0001;10;0.01) conew 'BatchNorm2D'
a1=: 'relu' conew 'Activation'

c2=: ((12 10 2 2); 1;'relu';'adam';0.01;0) conew 'Conv2D'
b2=: (0; 1 ;0.0001;5;0.01) conew 'BatchNorm2D'
a2=: 'relu' conew 'Activation'
p1=: 2 conew 'PoolLayer'

fl=: 3 conew 'FlattenLayer'
fc=: (12;3;'softmax';'adam';0.01) conew 'SimpleLayer'
b3=: (0; 1 ;0.0001;2;0.01) conew 'BatchNorm'
a3=: 'softmax' conew 'Activation'

addLayer__pipe c1
addLayer__pipe p1
NB.addLayer__pipe b1
addLayer__pipe a1
addLayer__pipe c2
NB.addLayer__pipe b2
addLayer__pipe a2
addLayer__pipe fl
addLayer__pipe fc
NB.addLayer__pipe b3
addLayer__pipe a3

require 'plot viewmat'
NB. check the input images (per channel)
NB. viewmat"2 A1
NB. viewmat"2 B1
NB. viewmat"2 C1

OUTPUT fit__pipe INPUT  NB. <--- should get 100%-ish accuracy after only a few iterations
NB. =========================================================================

Running the above doesn't prove much, as there is no training/testing set split. It is just to see *if* the training will push the network's parameters in the correct direction. Getting accurate predictions on all the A, B, C images will at least show that the network is not doing anything completely wrong. It is also useful as a playground to see if different ideas work.

You can test the accuracy with

   OUTPUT -:"1 1 (=>./)"1 >{: predict__pipe INPUT
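In that check, (= >./)"1 one-hot encodes each prediction row at its maximum, which -:"1 1 then matches against the target rows; a small session with illustrative scores:

   (= >./)"1 ] 2 3 $ 0.1 0.7 0.2 0.5 0.3 0.2
0 1 0
1 0 0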
On Thursday, April 18, 2019, 11:36:35 AM GMT+9, Brian Schott <[email protected]> wrote:

I have renamed this message because the topic has changed, and I considered moving it to jchat as well. However, I settled on jprogramming because there are definitely some J programming issues to discuss.

Jon,

Your script code is beautifully commented and very valuable, imho. The lack of an example slowed down my study of the script, but now I have some questions and comments.

I gather from your comments that the word tensor is used to designate a 4-dimensional array. That's new to me, but it is very logical.

Your definition convFunc=: +/@:,@:* works very well. However, for some reason I wish I could think of a way to define convFunc in terms of dot=: +/ . * .

The main insight I have gained from your code is that (x u;._3 y) can be used with x of shape 2 n where n>2 (and not just 2 2). This is great information (see the short session after this message). And that you built convFunc directly into cf is also very enlightening.

I have created a couple of examples of the use of your function `cf` to better understand how it works. [The data is borrowed from the fine example at http://cs231n.github.io/convolutional-networks/#conv . Beware that the dynamic example seen at the link changes every time the page is refreshed, so you will not see the exact data I present, but the shapes of the data are constant.]

Notice that in my first experiments both `filter` and the RHA (right-hand argument) of cf"3 are arrays and not tensors. Consequently(?) the result is an array, not a tensor, either.

i=: _7]\".;._2 (0 : 0)
0 0 0 0 0 0 0
0 0 0 1 2 2 0
0 0 0 2 1 0 0
0 0 0 1 2 2 0
0 0 0 0 2 0 0
0 0 0 2 2 2 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 2 1 2 2 2 0
0 0 1 0 2 0 0
0 1 1 1 1 1 0
0 2 0 0 0 2 0
0 0 0 2 2 2 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 1 2 1 0
0 1 1 0 0 0 0
0 2 1 2 0 2 0
0 1 0 0 2 2 0
0 1 0 1 2 2 0
0 0 0 0 0 0 0
)

k =: _3]\".;._2(0 :0)
1 0 0
1 _1 0
_1 _1 1
0 _1 1
0 0 1
0 _1 1
1 0 1
0 _1 0
0 _1 0
)

$i NB. 3 7 7
$k NB. 3 3 3

filter =: k
convFunc=: +/@:,@:*

cf=: 4 : '|:"2 |: +/ x filter&(convFunc"3 3);._3 y'
(1 2 2,:3 3 3) cf"3 i NB. 3 3 $ 1 1 _2 _2 3 _7 _3 1 0

My next example makes both `filter` and the right argument into tensors. And notice the shape of the result shows it is a tensor, also.

filter2 =: filter,:_1+filter
cf2=: 4 : '|:"2 |: +/ x filter2&(convFunc"3 3);._3 y'
$ (1 2 2,:3 3 3) cf2"3 i,:5+i NB. 2 2 3 3

Much of my effort regarding CNNs has been spent studying the literature that discusses efficient ways of computing these convolutions by translating the filters and the image data into flattened (and somewhat sparse) forms that can be restated in matrix formats. These matrices accomplish the convolution and deconvolution as *efficient* matrix products. Your demonstration of the way that J's ;._3 can be so effective challenges the need for such efficiencies.

On the other hand, I could use some help understanding how the 1 0 2 3 |: transpose you apply to `filter` is effective in the backpropagation stage. Part of my confusion is that I would have thought the transpose would have been 0 1 3 2 |:, instead. Can you say more about that?

I have yet to try to understand your verbs `forward` and `backward`, but I look forward to doing so.

I could not find definitions for the following functions and wonder if you can say more about them, please?

bmt_jLearnUtil_
setSolver

I noticed that your definitions of relu and derivRelu were more complicated than mine, so I attempted to test yours out against mine as follows.

   relu =: 0&>.
   derivRelu =: 0&<
   (relu -: 0:`[@.>&0) i: 4
1
   (derivRelu -: 0:`1:@.>&0) i: 4
1
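As promised above, here is a short session isolating the x u;._3 cut from the network code, showing how an x of shape 2 n sets the movement vector (first row) and the window size (second row); shapes only, with illustrative data:

   $ (2 2 ,: 3 3) <;._3 i. 5 5           NB. 3x3 windows moving by 2: a 2 2 grid of boxes
2 2
   $ (1 2 2 ,: 3 3 3) <;._3 i. 3 7 7     NB. move 1 across channels, 2 in each spatial axis
1 3 3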
On Sun, Apr 14, 2019 at 8:31 AM jonghough via Programming <[email protected]> wrote:

I had a go at writing conv nets in J. See
https://github.com/jonghough/jlearn/blob/master/adv/conv2d.ijs

This uses ;._3 to do the convolutions. Using a version of this, with a couple of fixes, I managed to get 88% accuracy on the cifar-10 image set. It took several days to run, as my algorithms are not optimized in any way, and no GPU was used. If you look at the references in the above link, you may get some ideas.

The convolution verb is defined as:

cf=: 4 : 0
|:"2 |: +/ x filter&(convFunc"3 3);._3 y
)

Note that since the input is a batch of images, each 3-d (width, height, channels), we are actually doing the whole forward pass over a 4d array, and outputting another 4d array of different shape, depending on output channels, filter width, and filter height.

Thanks,
Jon

Thank you,
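To make the 4d shape bookkeeping concrete, here is a minimal sketch in the spirit of cf, with a single random filter and a random batch as placeholder data (cf1 is a hypothetical stand-in for jlearn's Conv2D layer, not a copy of it):

   convFunc =: +/@:,@:*
   filter   =: ? 3 3 3 $ 0                    NB. one illustrative 3x3 filter spanning 3 channels
   cf1      =: 4 : '|:"2 |: +/ x filter&(convFunc"3 3);._3 y'
   $ (1 2 2 ,: 3 3 3) cf1"3 ? 5 3 7 7 $ 0    NB. batch of 5 images -> one 3x3 map each
5 3 3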
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm