Re: [Jprogramming] avoid boxing with fills - ?

Dan Bron Fri, 24 Jul 2009 15:13:06 -0700

Alexander Mikhailov wrote:
>  I wonder how in J one should properly solve the following problem.


Before I show you a workaround, I want to make 3 points:

    (1)  J is designed for processing rectilinear arrays 
         (arrays where the items along each dimension have 
         the same length)

Therefore it is best to design your J programs to deal in rectilinear
arrays.  For the parts of your program where this is impossible and you
need "ragged" arrays, boxes are provided.  

But processing large numbers of boxes is slow, so box handling is expected
to represent only a small part of your program.  If your data is inherently
ragged, perhaps it is not a problem suited for a J solution.
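
As a small illustration of point (1), here is a sketch using the primitive
i.  (not part of the original question) to produce results of different
lengths.  Assembled directly, the shorter result is padded with a fill;
boxed, each result keeps its own length:

           i."0 ] 2 3          NB. results of length 2 and 3, assembled with fill
        0 1 0
        0 1 2

           #&> <@i."0 ] 2 3    NB. boxed results keep their own lengths
        2 3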

    (2)  J's syntax is designed around one- and two-argument 
         functions.  Handling three (or more) is cumbersome. 

Your sentence,  f2 & (3 4) 2 3 f1 4   contains 3 noun arguments (one of
which is hard coded).  That is, it's in the form  z F x G y  where  x G y 
isn't rectilinear (before padding & fill) and you want to pair the items
of  z  with each output of  G  before it is padded/filled, but without
boxing.  That is difficult (though not impossible, as I'll demonstrate).  
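
To see why the padding matters, here is what the sentence produces as
written, assuming the definitions of  f1  and  f2  given later in this
message (with  f2  as multiplication):

           f1 =: 4 : 'y + i. x' "0 0
           f2 =: *

           2 3 f1 4              NB. x G y : the shorter first row is padded
        4 5 0
        4 5 6

           f2 & (3 4) 2 3 f1 4   NB. items of 3 4 pair off with the padded rows
        12 15  0
        16 20 24

The fill has become part of the data, and each item of  3 4  meets only one
row rather than every unpadded result.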

However, if you wanted  F x G y  (that is, without the  z  ), then the
solution is trivial (in fact primitive), and doesn't involve boxing:   
f...@g y  .  There is also a way to achieve  z F G y  (no  x  ) without using
boxes.
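
A minimal sketch of the two-noun case, assuming the composition meant here
is the conjunction  @  and taking tally ( # ) as a stand-in for  F  (chosen
only for illustration).  Because  f1  has rank 0 0,  #@f1  sees each result
before it is padded:

           f1 =: 4 : 'y + i. x' "0 0

           2 3 #@f1 4          NB. F applied to each unpadded result
        2 3

           #"1 ] 2 3 f1 4      NB. F applied after assembly sees the fill
        3 3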

I assume you know this (i.e., that it's difficult to solve "3 argument"
problems in J), so you hard-coded one of the arguments.  But even with
that degree of freedom I couldn't come up with a straightforward
(non-tricky) solution.

   (3)  Later, you said:  "does it mean J forces you
        to have a performance penalty here, or the 
        whole program design isn't good?"

Don't avoid boxing just because you think it *might* impose a performance
penalty.  For example, a small number of boxes, each containing a large
amount of (unboxed) data, doesn't carry much of a penalty (in contrast to a
large number of boxes, each containing only a few items).
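
One way to check this on your own machine is to time the same operation on
both layouts with  6!:2 .  The names  few  and  many  and the sizes are made
up for illustration; timings vary, so none are shown:

           few  =: <@i."0 ] 2 # 500000    NB. 2 boxes of 500000 numbers each
           many =: <@i."0 ] 500000 # 2    NB. 500000 boxes of 2 numbers each

           6!:2 '+/&.> few'               NB. seconds to sum inside each box
           6!:2 '+/&.> many'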

Put another way:  J provides boxes specifically to address heterogeneous
data (J considers data with different shapes to be heterogeneous).  That
is, the way to handle ragged data in J is with boxes.  Lots of data is
ragged and over the years many people have successfully used J to process
it, using boxes.  Box performance has been improved several times, and
some common boxy idioms are supported by special code.

So, instead of asking for a way to avoid boxing a priori, I suggest you try
writing your program in the natural (boxy) way.  If the performance is
unacceptable, profile your code.  If the problem turns out to be
processing boxes, complain on the Forum.
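
For reference, a sketch of what the natural boxy version might look like
for the example in this thread, assuming the goal is to combine every item
of  3 4  with every unpadded result of  f1  ( f1  and  f2  as defined below
in this message, with  f2  as multiplication):

           f1 =: 4 : 'y + i. x' "0 0
           f2 =: *

           r =: 3 4 f2&.>/ 2 3 <@f1 4   NB. 2 by 2 table of boxes, no fill
           $ r
        2 2
           #&> r                        NB. each box keeps its own length
        2 3
        2 3

Row 0 holds  12 15  and  12 15 18 ; row 1 holds  16 20  and  16 20 24 .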

For one thing, if the "natural" way to write the solution is slow, and you
make this fact public, the implementors of J might be motivated to improve
its performance.  For another, other J users might be able to suggest an
optimization in the context of your specific program.  

This is likely to be more helpful than an answer to the general question of
"how can I process ragged arrays in J, without using boxes?".

Having said all that, here's a way to answer that question:


           f1 =: 4 : 'y + i. x' "0 0
           f2 =: *

           NB.  Avoid boxing using sparse arrays
           ab =: (2 : '((u"2 0)&) ([:`) ($.`) (0:`) (`($...@v)) (`:6) ')   

           NB.  Note relationship to f2 & (3 4) 2 3 f1 4           
           (3 4) f2 ab f1
        0: $. [: f2"2 0&3 4 $...@f1

           2 3 (3 4) f2 ab f1 4
        12 15  0
        12 15 18
        
        16 20  0
        16 20 24

... but I bet this will be even more expensive, performance-wise, than the
natural boxy solution.  
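
For readers who have not used  $. : monadic  $.  converts a dense array to
sparse, and  0 $. y  converts back to dense, which is the job of the  0: $.
at the left of the train displayed above.  A minimal round trip (the names
m  and  s  are illustrative):

           m =: 2 3 $ 4 5 0 4 5 6
           s =: $. m          NB. dense to sparse
           m -: 0 $. s        NB. back to dense; 1 means it matches the original
        1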

Furthermore, since sparse arrays get less attention in J, if you try to
extend the solution, you're likely to hit gaps and bugs in the
implementation.  For example:
   
           NB.  In general  "2 0  should be  "_1  and in this case they
           NB.  should be identical
           ab =: (2 : '((u"_1)&) ([:`) ($.`) (0:`) (`($...@v)) (`:6) ')
           
           NB.  Whoops, that's not right
           2 3 (3 4) f2 ab f1 4
        12 15  0
        16 20 24
   
   
           NB.  Ok, reverting...
           ab =: (2 : '((u"2 0)&) ([:`) ($.`) (0:`) (`($...@v)) (`:6) ')   
           
           NB.  Now let's try the original f2  ...
           f2 =: 4 : 'x , y'"0 0

           2 3 (3 4) f2 ab f1 4
        |non-unique sparse elements: f2
        |   2 3    (3 4)f2 ab f1 4
        |[-18] 


           NB.  This crashes J  [1]
           2 3 (3 4) [ ab f1 4
   
I'm pretty sure you could work around these limitations by effectively
reimplementing  $.  but that would be a lot of work, and it is very
unlikely to perform better than the boxy solutions.

-Dan

[1]  This crash wasn't even the coup de grâce -- I actually got one of the
rare "scheck" errors, but I forgot to record the sentence that produced it.



----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
