Re: [Jprogramming] averages of non-zero elements of matrices

Dan Bron Wed, 02 Dec 2009 09:00:30 -0800

>   mask #"1 mat
> produces
>   1  2  3  0 0

You used the right phrase.  And if you wrote:


           mask avg@:#"1 mat    NB.  note  @:
        2 7 0 16.5

or
           mask avg@(#"1) mat   NB.  note  @  but  ()
        2 7 0 16.5
           
you'd have the answer you want.
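
For anyone replaying this, here is a minimal session sketch.  This message
does not show the definitions of  avg  or  mask , so both are assumptions
below:  avg  is taken to be the usual mean fork, and  mask  is simply one
boolean table consistent with the results above (the original may have
been computed differently).  The last two sentences show what happens when
the average sees padded data rather than each filtered row.

           avg  =. +/ % #          NB. assumed: arithmetic mean
           mat  =. i. 4 5
           NB. assumed mask, chosen to match the results shown above
           mask =. 4 5 $ 0 1 1 1 0  1 1 1 1 1  0 0 0 0 0  1 1 1 1 0
           mask avg@:#"1 mat       NB. filter and average row by row
        2 7 0 16.5
           avg"1 mask #"1 mat      NB. averages padded rows: fill is counted
        1.2 7 0 13.2
           mask avg@:(#"1) mat     NB. avg sees the whole padded table
        5.25 6 6.75 6.5 2.25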

Remember that all J arrays are rectangular (orthotopes).  All rows of a
matrix must be the same length -- you cannot have rows with items
"missing".  If you try to create rows with "holes", J will simply collect
all the remaining data in each row (as vectors, even empty vectors), and
then pad each vector with as many 0s as needed to make it as long as the
longest vector left after filtering.

To visualize the mechanism:

           ] remainders =. mask <@:#"1  mat
        +-----+---------++-----------+
        |1 2 3|5 6 7 8 9||15 16 17 18|
        +-----+---------++-----------+
           ] lengths =. #&> remainders
        3 5 0 4

           ] max =. >./ lengths
        5
           ] padded=. max {.&.> remainders
        +---------+---------+---------+-------------+
        |1 2 3 0 0|5 6 7 8 9|0 0 0 0 0|15 16 17 18 0|
        +---------+---------+---------+-------------+
           > padded
         1  2  3  0 0
         5  6  7  8 9
         0  0  0  0 0
        15 16 17 18 0
           
This behavior is general.  It doesn't just apply to numeric arrays; literal
arrays are padded with ' ', boxed arrays are padded with  a:  etc.  And it
doesn't just apply to tables; all arrays, of any rank, must be
rectangular, meaning all their items must have the same shape (which means
the items' items must have the same shape, and so on), and so in general
new axes may be introduced in addition to padding with fill elements.  See
http://www.jsoftware.com/help/dictionary/dictb.htm#fill .
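
A short sketch of fill beyond numeric tables.  The sample data is made up
purely for illustration.

           $ > 'ab' ; 'cdef'           NB. two literal rows, padded to length 4
        2 4
           ' ' = > 'ab' ; 'cdef'       NB. 1 marks the blanks added as fill
        0 0 1 1
        0 0 0 0
           a: = 3 {. 'ab' ; 'cdef'     NB. overtake fills a boxed list with  a:
        0 0 1
           $ > (i. 2 3) ; 7            NB. the atom gains two axes, then is padded
        2 2 3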

As the example above may have hinted, one may deal with ragged arrays in J
by using boxes to collect arrays of different shapes (or even types), and
then applying the function of interest within each box.  If the results of
the function are homogeneous (in both shape and type), then you may open
the boxes to get a natural J rectangular array.  Otherwise you may leave
them closed.  For the former, the idiom is  foo&> boxes  and for the
latter  foo&.>  boxes  .  The similarity is not coincidental.
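
A minimal sketch of the two idioms, reusing the row data from the example
above.  The definition of  avg  is an assumption (the usual mean fork), and
reverse ( |. ) merely stands in for any function whose results differ in
shape.

           avg   =. +/ % #         NB. assumed: arithmetic mean
           boxes =. 1 2 3 ; 5 6 7 8 9 ; (0$0) ; 15 16 17 18
           avg&> boxes             NB. atom results: open into a plain list
        2 7 0 16.5
           |.&.> boxes             NB. ragged results: keep them boxed
        +-----+---------++-----------+
        |3 2 1|9 8 7 6 5||18 17 16 15|
        +-----+---------++-----------+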

In many cases, like your average, you may skip the intermediate boxing, and
simply use rank.  This is what I did above, for your row averages.  Rank
can be thought of as an implicit kind of boxing (and unboxing).  If you
apply a function at a rank, it is applied to each cell separately, so you
have the ability to process each of its outputs individually.  Again, if
all of the outputs of your processing are homogeneous, then rank will not
need to pad them.
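
As a small illustration of when rank does and does not pad, here is a
sketch using  i.  at rank 0 (the argument  1 2 3  is arbitrary):

           i."0 (1 2 3)            NB. results of lengths 1 2 3 are padded with 0
        0 0 0
        0 1 0
        0 1 2
           <@i."0 (1 2 3)          NB. boxing each result avoids the fill
        +-+---+-----+
        |0|0 1|0 1 2|
        +-+---+-----+
           #@i."0 (1 2 3)          NB. atom results need no fill
        1 2 3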

>  Ultimately, I will be working with multi-dimensional matrices and extracting
>  info of different shapes and sizes from the various dimensions. Rather than
>  doing these selections on a case by case basis, I am hoping to use some form
>  of a multi-dimensional mask to select the data and then do the calculations.

Depending on exactly what you're trying to do, this may be cumbersome.  J
makes it easiest to address leading axes.  For example, it is much more
straightforward and common to box the rows of a table than the columns. 
To box the columns, one usually transposes the table, so that the columns
become the rows (and hence, a leading axis).  Note that J's  +"3  (apply to
each rank-3 cell) is different from APL's  +[3]  (act along the third axis).
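
A minimal sketch of the rows-versus-columns point, on a small made-up
table  m :

           m =. i. 2 3
           <"1 m                   NB. box the rows directly
        +-----+-----+
        |0 1 2|3 4 5|
        +-----+-----+
           <"1 |: m                NB. box the columns by transposing first
        +---+---+---+
        |0 3|1 4|2 5|
        +---+---+---+
           +/ m                    NB. insert works along the leading axis
        3 5 7
           +/"1 m                  NB. rank 1 applies it within each row
        3 12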

Addressing various dimensions and extracting strange shapes might be
troublesome.   

>  There may be other more efficient approaches which would be great to hear
>  about.

With that said, it is possible.  I suggest you look at the cut conjunction,
 ;.  .  In particular, see the definition for the case where the left hand
argument (to the derived verb) is boxed.  For example, we can directly box
the columns of a matrix, without transposing, with  ('';1) <;.1  matrix  .  

You might find value in all three flavors of  ;.  (0, 3 _3, and 1 2 _1 _2 ).
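
As a starting point, here is a sketch of the simplest (unboxed, boolean)
left argument to  ;.1  and  ;._1  on a made-up list.  The boxed form from
the paragraph above is repeated on a small table for you to try; its
output is left out here rather than guessed at.

           1 0 1 0 0 <;.1 i. 5     NB. each 1 on the left starts a new piece
        +---+-----+
        |0 1|2 3 4|
        +---+-----+
           1 0 1 0 0 <;._1 i. 5    NB. same cut, but the frets are dropped
        +-+---+
        |1|3 4|
        +-+---+
           ('';1) <;.1 i. 2 3      NB. boxed left argument, as described above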

-Dan

PS:  Instead of  mat=. 4 5 $ i.100  we can say  mat=. i.4 5  (a nice
upgrade from iota).
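
A quick check that the two spellings agree:

           (i. 4 5) -: 4 5 $ i. 100
        1
           $ i. 4 5
        4 5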


