On Mon, 2 Aug 2021, Henry Rich wrote:

1a. What's an example that is greatly sped up by bitarrays?

Logical boolean ops (and/or/xor).
Perhaps more importantly, lower cache usage improves overall program locality. (Though this becomes doubly false every time you have to promote a bitarray to a bytearray.)
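
To make the comparison concrete, here is a minimal sketch in C. The function names and the layout (bit i stored in word i/64 at position i%64) are assumptions for illustration, not anything taken from JE.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte-per-boolean AND: one boolean per array element. */
void and_bytes(const uint8_t *a, const uint8_t *b, uint8_t *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] & b[i];
}

/* Bit-packed AND: each 64-bit word carries 64 booleans, so the loop runs
   over ceil(nbits / 64) words and touches one eighth of the memory. */
void and_bits(const uint64_t *a, const uint64_t *b, uint64_t *out, size_t nbits) {
    size_t nwords = (nbits + 63) / 64;
    for (size_t i = 0; i < nwords; i++)
        out[i] = a[i] & b[i];
}

/* Promotion to one byte per boolean: the step that gives the cache
   savings back. Assumed layout: bit i is in word i/64 at position i%64. */
void unpack_bits(const uint64_t *bits, uint8_t *out, size_t nbits) {
    for (size_t i = 0; i < nbits; i++)
        out[i] = (uint8_t)((bits[i / 64] >> (i % 64)) & 1u);
}
```

The packed loop does the same work per iteration on 64 atoms instead of one; the unpack loop is the cost paid whenever a consumer needs bytes.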


1b. J uses bitarrays internally in dyad i. when the arguments are large and integral.

That is 'third class' in vi.c? I see--somewhat. I will study that code more closely.


1d. With bitarrays, a single pointer is not enough to designate an atom of an array.

Yes, and you cannot associate a size with an atom either. I have been tentatively supporting them in my own (nascent) implementation, and there are a lot of special cases to support. Hence the initial question. Obviously it's much easier to design a system from the ground up to support such a feature than to add it after the fact. I certainly wasn't proposing that JE add such a feature; I was just curious in general about the tradeoffs there.
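
As a sketch of what designating a bit atom involves, here is one possible representation in C. The `bitref` type and its helpers are hypothetical names for illustration; the assumed layout is bit i in word i/64 at position i%64.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "fat pointer" to one atom of a bit array: the address of
   the containing word plus the bit position inside that word. */
typedef struct {
    uint64_t *word;
    unsigned  bit;   /* 0..63 */
} bitref;

/* Build a reference to atom i of the bit array starting at base. */
bitref bit_at(uint64_t *base, size_t i) {
    bitref r = { base + i / 64, (unsigned)(i % 64) };
    return r;
}

/* Read one atom as 0 or 1. */
int bit_load(bitref r) {
    return (int)((*r.word >> r.bit) & 1u);
}

/* Writing one atom is a read-modify-write of the whole word. */
void bit_store(bitref r, int v) {
    uint64_t mask = (uint64_t)1 << r.bit;
    *r.word = v ? (*r.word | mask) : (*r.word & ~mask);
}
```

Every code path that would otherwise take a plain `char *` to an atom needs a variant taking something like this pair, which is where the special cases come from.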


compelling example.

Single instruction 8x8 bitmatrix transpose on powerpc ;)
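
For comparison, on hardware without such an instruction, an 8x8 bit matrix held in one 64-bit word can be transposed with three swap steps. This is the well-known shift-and-mask formulation, not the PowerPC instruction itself; the convention that bit 8*r + c holds row r, column c is an assumption of this sketch.

```c
#include <stdint.h>

/* Transpose an 8x8 bit matrix packed into a 64-bit word, where bit
   8*r + c holds the element at row r, column c (assumed convention).
   Each step swaps off-diagonal blocks: first single bits inside 2x2
   blocks, then 2x2 blocks inside 4x4 blocks, then the two 4x4 blocks. */
uint64_t transpose8(uint64_t x) {
    uint64_t t;
    t = (x ^ (x >> 7))  & 0x00AA00AA00AA00AAULL;  x ^= t ^ (t << 7);
    t = (x ^ (x >> 14)) & 0x0000CCCC0000CCCCULL;  x ^= t ^ (t << 14);
    t = (x ^ (x >> 28)) & 0x00000000F0F0F0F0ULL;  x ^= t ^ (t << 28);
    return x;
}
```

With byte-per-boolean storage the same operation moves 64 bytes; here it is a dozen or so register operations on a single word.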


2a. Padding to what boundary?  The boundary that would satisfy all needs is 32 or 64 bytes, and that would make an nx2 character array pretty big.

Good point. I guess you could do it heuristically, when rows are large enough that the space overhead is negligible.
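
A minimal sketch of such a heuristic in C. The function name and the 1/8 overhead threshold are hypothetical choices for illustration only.

```c
#include <stddef.h>

/* Hypothetical policy: pad each row up to a multiple of `align` bytes
   only if the padding costs at most 1/8 of the row's own size;
   otherwise keep the rows packed. Returns the row stride in bytes. */
size_t row_stride(size_t row_bytes, size_t align) {
    size_t padded = (row_bytes + align - 1) / align * align;
    size_t waste  = padded - row_bytes;
    return (waste * 8 <= row_bytes) ? padded : row_bytes;
}
```

With a 64-byte boundary, an nx2 character array keeps a stride of 2 (padding would waste 62 bytes per row), while a 1000-byte row is padded to 1024.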


2f. As it happens, I tried rewriting (x |: y) recently to execute in place & had to scrap the project because the unaligned stores were too expensive.  But that's the ONLY case I've found where alignment caused trouble.

Here's the paper I was referring to: https://www.researchgate.net/publication/273912700_In-Place_Matrix_Transposition_on_GPUs

(It refers specifically to monadic transpose; I'm not sure whether it can be applied to the dyadic form.)

It wasn't actually to do with alignment, but rather with getting more composite array dimensions. I believe that those aspects of the technique are still useful on the CPU; the authors' primary focus is on parallelism, but the techniques also improve locality. Particularly:

the dimensions of an M × N matrix are factorized as M' × m × N' × n or a 4D array, where M = M' × m and N = N' × n. This factorization defines a blocked format on the matrix
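
To illustrate the factorization, here is a sketch of the index arithmetic in C. The `blocking` struct and function names are hypothetical, and the tiled layout shown is just one permutation of the four axes; the paper's actual staging of transpositions is not reproduced here.

```c
#include <stddef.h>

/* Blocking of an M x N matrix with M = Mp * m and N = Np * n. */
typedef struct { size_t Mp, m, Np, n; } blocking;

/* Offset of element (i, j) when the row-major buffer is read as a 4D
   array [Mp][m][Np][n]. This equals i * N + j: the factorization by
   itself is only a reshape and moves no data. */
size_t offset_rowmajor(blocking b, size_t i, size_t j) {
    size_t ip = i / b.m, ii = i % b.m;   /* block row, row within block */
    size_t jp = j / b.n, jj = j % b.n;   /* block col, col within block */
    return ((ip * b.m + ii) * b.Np + jp) * b.n + jj;
}

/* Offset of element (i, j) after swapping the two middle axes, giving
   [Mp][Np][m][n], so that each m x n tile is contiguous in memory. */
size_t offset_tiled(blocking b, size_t i, size_t j) {
    size_t ip = i / b.m, ii = i % b.m;
    size_t jp = j / b.n, jj = j % b.n;
    return ((ip * b.Np + jp) * b.m + ii) * b.n + jj;
}
```

Once each tile is contiguous, work on a tile stays within a small region of memory, which is the locality benefit that carries over to the CPU.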

 -E
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
