Re: [Jprogramming] Two implementation questions

Elijah Stone Mon, 02 Aug 2021 17:37:42 -0700

On Mon, 2 Aug 2021, Henry Rich wrote:

1a. What's an example that is greatly sped up by bitarrays?


Logical boolean ops (and/or/xor).

Perhaps more importantly, lower cache usage improves overall program locality. (Though this becomes doubly false every time you have to promote a bitarray to a bytearray.)
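To make the tradeoff concrete, here is a minimal sketch in C, assuming 64 booleans packed per word and low-order-bit-first numbering. The names (and_bytes, and_bits, promote_bits, words_for_bits) are hypothetical and not taken from any existing implementation. The packed AND touches one eighth of the memory and handles 64 booleans per instruction, while promotion has to visit every bit individually.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte-per-boolean AND: one boolean per byte, so n bytes per operand. */
void and_bytes(uint8_t *z, const uint8_t *x, const uint8_t *y, size_t n) {
    for (size_t i = 0; i < n; i++) z[i] = x[i] & y[i];
}

/* Number of 64-bit words needed to hold nbits packed booleans. */
size_t words_for_bits(size_t nbits) { return (nbits + 63) / 64; }

/* Bit-packed AND: 64 booleans per word. Bits past nbits in the last
   word are combined too; this sketch assumes they are don't-cares. */
void and_bits(uint64_t *z, const uint64_t *x, const uint64_t *y, size_t nbits) {
    size_t nw = words_for_bits(nbits);
    for (size_t i = 0; i < nw; i++) z[i] = x[i] & y[i];
}

/* Promotion back to one boolean per byte: every bit is extracted
   separately, which is where the cache advantage is given back. */
void promote_bits(uint8_t *z, const uint64_t *x, size_t nbits) {
    assert(z != NULL && x != NULL);
    for (size_t i = 0; i < nbits; i++)
        z[i] = (uint8_t)((x[i / 64] >> (i % 64)) & 1u);
}
```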

1b. J uses bitarrays internally in dyad i. when the arguments are large and integral.

That is 'third class' in vi.c? I see--somewhat. I will study that code more closely.

1d. With bitarrays, a single pointer is not enough to designate an atom of an array.

Yes, and you cannot associate a size with an atom either. I have been tentatively supporting them in my own (nascent) implementation and there are a lot of special cases to support. Hence the initial question. Obviously it's much easier to design a system from the ground up to support such a feature than to add it after the fact; I certainly wasn't proposing JE add such a feature, just curious in general about the tradeoffs there.
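As an illustration of why a bare pointer falls short, here is a sketch of an atom reference for a bit-packed array. The bitref type and its helpers are hypothetical names, and the layout (64-bit words, low-order bit first) is an assumption. The reference has to carry a word address plus a bit position, and reads and writes go through shift-and-mask rather than a plain load or store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A reference to one atom of a bit-packed array: the address alone
   only identifies the containing word, so a bit position is needed too. */
typedef struct {
    uint64_t *word;  /* word containing the atom */
    unsigned  bit;   /* position within that word, 0..63 */
} bitref;

/* Build a reference to atom i of the packed array starting at base. */
bitref bitref_at(uint64_t *base, size_t i) {
    bitref r = { base + i / 64, (unsigned)(i % 64) };
    return r;
}

/* Read the referenced atom as 0 or 1. */
int bitref_get(bitref r) {
    assert(r.bit < 64);
    return (int)((*r.word >> r.bit) & 1u);
}

/* Write the referenced atom: a read-modify-write of the whole word. */
void bitref_set(bitref r, int v) {
    assert(r.bit < 64);
    uint64_t mask = (uint64_t)1 << r.bit;
    *r.word = v ? (*r.word | mask) : (*r.word & ~mask);
}
```

Every generic routine that otherwise takes a pointer and an element size would need a variant that takes something like this instead, which is where the special cases come from.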

compelling example.


Single instruction 8x8 bitmatrix transpose on powerpc ;)

2a. Padding to what boundary? The boundary that would satisfy all needs is 32 or 64 bytes, and that would make an nx2 character array pretty big.

Good point. I guess you could do it heuristically, when rows are large enough that the space overhead is negligible.
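One way such a heuristic might look, as a sketch only: pad a row up to the alignment boundary when the wasted bytes stay under some percentage of the row length, and otherwise leave it unpadded. The function name padded_row_bytes and the percentage threshold are assumptions made for illustration; nothing like this is claimed to exist in JE.

```c
#include <assert.h>
#include <stddef.h>

/* Return the row stride in bytes: the row length rounded up to a multiple
   of align when that costs at most max_overhead_pct percent extra space,
   and the unpadded row length otherwise. align must be a power of two. */
size_t padded_row_bytes(size_t row_bytes, size_t align, size_t max_overhead_pct) {
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t padded = (row_bytes + align - 1) & ~(align - 1);
    size_t waste = padded - row_bytes;
    return (waste * 100 <= row_bytes * max_overhead_pct) ? padded : row_bytes;
}
```

With a 64-byte boundary and a 5% budget, a 2-byte row stays at 2 bytes (so the nx2 character array does not blow up), while a 1000-byte row is padded to 1024.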

2f. As it happens, I tried rewriting (x |: y) recently to execute in place & had to scrap the project because the unaligned stores were too expensive. But that's the ONLY case I've found where alignment caused trouble.

Here's the paper I was referring to: https://www.researchgate.net/publication/273912700_In-Place_Matrix_Transposition_on_GPUs


(It refers specifically to monadic transpose; I'm not sure whether it
 can be applied to the dyadic form.)

It wasn't actually to do with alignment, but rather with getting more composite array dimensions. I believe that those aspects of the technique are still useful on the CPU; their primary focus is on parallelism, but they also improve locality. Particularly:

the dimensions of an M × N matrix are factorized as M' × m × N' × n or a 4D array, where M = M' × m and N = N' × n. This factorization defines a blocked format on the matrix
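The factorization in that quote can be sketched as index arithmetic. The code below splits a (row, col) index into the four coordinates (I, i, J, j) with row = I*m + i and col = J*n + j, then linearizes them two ways: in M' × m × N' × n order, which reproduces the ordinary row-major offset, and in M' × N' × m × n order, which stores each m × n tile contiguously. The names are hypothetical, and the tiled order is one common choice assumed here for illustration, not necessarily the exact layout the paper uses.

```c
#include <assert.h>
#include <stddef.h>

/* Coordinates of one element in the 4D view M' x m x N' x n. */
typedef struct { size_t I, i, J, j; } idx4;

/* Split a 2D index so that row = I*m + i and col = J*n + j. */
idx4 split_index(size_t row, size_t col, size_t m, size_t n) {
    assert(m != 0 && n != 0);
    idx4 k = { row / m, row % m, col / n, col % n };
    return k;
}

/* Linearize in M' x m x N' x n order; Np is N' = N / n.
   This equals the plain row-major offset row*N + col. */
size_t rowmajor_offset(idx4 k, size_t m, size_t Np, size_t n) {
    return ((k.I * m + k.i) * Np + k.J) * n + k.j;
}

/* Linearize in M' x N' x m x n order (assumed tiled layout):
   each m x n tile occupies m*n consecutive elements. */
size_t tiled_offset(idx4 k, size_t m, size_t Np, size_t n) {
    return ((k.I * Np + k.J) * m + k.i) * n + k.j;
}
```

Moving between the two orders is itself a transpose of the middle two axes, which is why more composite dimensions give more ways to break the work into cache-sized pieces.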


 -E
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
