I guess it is obvious to all involved, but the transpose can be avoided and in C++ the +./ can be terminated if one true bit is found.

Are the value ranges of X1 and X2 limited in some way? If not, a lookup table for X2 would be too big. /Erling


On 2016-10-14 05:22, Michal Dobrogost wrote:
Thanks Mike!

I'm scared of diving off the deep end here, nevertheless, what you describe
is exactly how I solve "one side" of the problem (in C++, J is doing the
design work for now. A pipe dream would be to write my own mini-J-to-C
compiler with only the operations required here - thus the desire to
minimize the number of different operations used).

So in the full problem we have:

17      = $X1
1024    = $X2
17 1024 = $Y

and we want to compute both X1 (+./@:*.) Y and X2 (+./@:*.) |:Y many many
times with Y being static.

For X2 your approach works great, however for X1 it's better to take
advantage of the bit-level parallelism (like you demonstrated for X2) which
on modern SIMD machines can be 256 bits (or even 512 on the latest and
greatest). I guess in J my suggested non-LUT approach for X1 would go
something like:

    Y=. 1 > ? 17 1024 $ 10
    X=. ? 17 $ 2

    YDense=. _32 #.\"1 Y
    ResultDense=. X ((23 b.)/ @: *) YDense
    Result=. , #: ResultDense

    NB. Everything checks out
    ~. Result = (X (+./@:*.) Y)
1

For X1, the LUT version of ((0~:(17 b.))&(#.@|:) requires 1024 ANDs, 1024
compares and then 1024 of the associated bit shuffling operations to get
those results into contiguous register values.

For X1, ((23 b.)/@:*) requires 68 ORs and 17 bit shuffling operations (to
extract 0 or 1 from the bits of XDense).

The LUT method I was originally suggesting goes one step further and
precomputes ORs between adjacent rows of Y. The 3-bit version would
precompute 3 rows at a time. The number of bits in the lut, linearly
decreases the number of ORs and bit shuffling ops required (and memory
accesses) at the cost of increased memory use.

Cheers,

Mike

On Thu, Oct 13, 2016 at 2:09 AM, 'Mike Day' via Programming <
programm...@jsoftware.com> wrote:

While I'm also not too clear what you're really trying to achieve,
it might be worth offering this solution to your particular example
knowing your required solution:

    x  ((0~:(17 b.))&(#.@|:)) y
1 1 0 0 1 1

It uses the boolean and operation (17 b.) on bits at word level.
0~: could be 0< if you're sure the sign bit isn't involved.
Inducing |: on x is redundant,  but it makes for a neater expression.

If you were doing lots of these,  it would be better to
set up your "look-up" vector as

     [LU =: #.|:y
16 13 1 9 10 7

and then do

     LU(0~:(17 b.)) #.x
1 1 0 0 1 1

For repeated operation on several x s,  perhaps define a verb:

    lu =: LU &(0 ~: 17 b.)@#."1

eg

     lu x,:|.x
1 1 0 0 1 1
0 1 1 1 1 1

Binding LU into lu might be excessive for spatially large arrays of y.

This approach should work well for fewer that 32 or 64 bits.  Otherwise,
you'd need to break the arrays up into word-sized chunks. Perhaps that
is what you were getting at?

Any help,

Mike



On 13/10/2016 08:11, Michal Dobrogost wrote:

Sorry for spamming.

In fact my paranoid use of @: can be replaced with @ everywhere. Giving:

L3=. _3((|."1@#:@i.@(2^#)) (+./@#) ])\Y

So I should also say that the real goal is to minimize the number of
different operations used. Right now this stands at: #: i. 2^ # +. / @ # \
if we assume #: is implemented in reverse.

On Thu, Oct 13, 2016 at 12:05 AM, Michal Dobrogost <
michal.dobrog...@gmail.com> wrote:

And +./"2@:# can be rewritten as +./@#.
Anything else?

On Wed, Oct 12, 2016 at 11:55 PM, Michal Dobrogost <
michal.dobrog...@gmail.com> wrote:

You can rewrite *."0 1"1 _ as #
     ]L3=. _3((|."1@:#:@:i.@:(2^#)) (+./"2 @: #) ])\Y
0 0 0 0 0 NB. zero
0 0 0 0 1 NB.         0 { Y
0 0 0 1 0 NB.         1 { Y
0 0 0 1 1 NB. +./   0 1 { Y
0 0 1 0 0 NB.         2 { y
0 0 1 0 1 NB.       0 2 { Y
0 0 1 1 0 NB. +./   0 1 { Y
0 0 1 1 1 NB. +./ 0 1 2 { Y

0 0 0 0 0 NB. zero
0 1 0 0 0 NB.       3 { Y
1 0 0 0 0 NB.       4 { Y
1 1 0 0 0 NB. +./ 3 4 { Y
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0

On Wed, Oct 12, 2016 at 11:34 PM, Michal Dobrogost <
michal.dobrog...@gmail.com> wrote:

I've tried to clean up the code to be more explanatory. All we really
care about is generating L2 given an arbitrary Y. Rewriting this has
also
helped me catch a subtle bug with arrays going left-to-right but #.
and #:
interpreting numbers right-to-left.

     NB. X and Y are completely arbitary and are chosen to be
     NB. easy to trace.
     ] X =. 0 1 1 0 1
0 1 1 0 1
     ] Y =. |.=/~i.5
0 0 0 0 1
0 0 0 1 0
0 0 1 0 0
0 1 0 0 0
1 0 0 0 0

     NB. This is the final result but we are actually trying to
     NB. generate lookup tables for highly optimized C++ code.
     X (+./@:*.) Y
1 0 1 1 0

     NB. Break up x into 2-bit chunks.
     NB. Note: reversed with |. because #. below interprets right-most
     NB.       digit as most significant but J's array has first
     NB.       item as left-most.
     X
0 1 1 0 1
     _2<@|.\X
┌───┬───┬─┐
│1 0│0 1│1│
└───┴───┴─┘

     NB. We convert 2-bit chunks of x into indices.
     _2#.@|.\X
2 1 1

     ]L2=. _2((|."1@:#:@:i.@:(2^#)) (+./"2 @: (*."0 1"1 _)) ])\Y
0 0 0 0 0 NB. zero
0 0 0 0 1 NB. 0 { y
0 0 0 1 0 NB. 1 { y
0 0 0 1 1 NB. +./ 0 1 { y

0 0 0 0 0 NB. zero
0 0 1 0 0 NB. 2 { y
0 1 0 0 0 NB. 3 { y
0 1 1 0 0 NB. +./ 2 3 { y

0 0 0 0 0 NB. zero
1 0 0 0 0 NB. 4 { y
0 0 0 0 0
0 0 0 0 0

     NB. Compute by 2-bit lookups (same result as +./@:*. above).
     NB. We don't really care, this just demonstrates that the LUT
works.
     +./ (_2#.@|.\x) {"0 2 L2
1 0 1 1 0

Cheers,

Mike

On Wed, Oct 12, 2016 at 7:00 PM, Michal Dobrogost <
michal.dobrog...@gmail.com> wrote:

Hi Raul,
What is the definition of *inds*?


I would describe the higher level operation as: select the rows of Y
where X is 1. Then take the column-wise OR of them.

However the real thing I'm interested in is generating the lookup
tables which serve as an intermediate step in the higher level
operation.

On Wed, Oct 12, 2016 at 9:35 AM, Raul Miller <rauldmil...@gmail.com>
wrote:

I'm not quite sure I understand what you are doing here.
Here's what I understand from your description:

You're looking for rows where a bit set in X has a bit set in Y, and
you want to split up X and Y into smaller pieces for your
intermediate
result, and you want to use indices rather than bit operations for
your final operation.

That gives me something like this.

     X =: ? 5 $ 2
     Y =: 30 > ? 5 6 $ 100
     X
0 1 1 1 1
     Y
0 0 0 0 0 0
1 0 0 0 0 0
1 1 1 0 0 1
1 0 0 1 0 0
0 0 0 0 0 1
     (_2 {. X) inds _2 {. Y
0 3 5
     (3 {. X) inds 3 {. Y
0 1 2 5
     0 3 5 ([ -. -.) 0 1 2 5
0 5

So maybe you would be looking at making an adverb along the lines of:

L=: adverb define
:
     (m {. x) inds m {. y
)

Or, more concisely:

L=: adverb define
     inds&(m&{.)
)

But when I look at your calculations, I don't see anything like this.

If I am off base here, could you describe how what you are looking
for
differs from what I am understanding?

Thanks,

--
Raul


On Wed, Oct 12, 2016 at 12:14 PM, Michal Dobrogost
<michal.dobrog...@gmail.com> wrote:

Hi All,

I've been mucking around generating look up tables in C++. Getting
frustrated, I wrote up this J single liner. Can you think of ways to
simplify the LUT computation (the expression we assign to L2 and L3

below)?

*Original Operator (no LUT)*

     ] x =. ? 5 $ 2
1 0 1 1 0
     ] y =. 30 > ? 5 6 $ 100
1 0 0 0 0 0
0 1 0 1 1 0
0 1 0 0 0 1
0 0 0 0 1 1
0 1 1 1 0 1
     x (+./@:*.) y
1 1 0 0 1 1

*LUT Explanation*

The idea is to break up x into smaller chunks (2-bits, 3-bits, etc)

and

precompute the operation for the corresponding chunks of y. Then we

just

convert the chunks into indices and look them up in the LUT.

     _2<\x
┌───┬───┬─┐
│1 0│1 1│0│
└───┴───┴─┘
     _2#.\x
2 3 0

*2-bit LUT*

     ]L2=. _2((#:@:i.@:(2&^)@:#) (+./"2 @: (*."0 1"1 _)) ])\y
0 0 0 0 0 0
0 1 0 1 1 0
1 0 0 0 0 0
1 1 0 1 1 0

0 0 0 0 0 0
0 0 0 0 1 1
0 1 0 0 0 1
0 1 0 0 1 1

0 0 0 0 0 0
0 1 1 1 0 1
0 0 0 0 0 0
0 0 0 0 0 0

     +./ (_2#.\x) {"0 2 L2    NB. Compute by 2-bit lookups
1 1 0 0 1 1

*3-Bit LUT*

     ]L3=. _3((#:@:i.@:(2&^)@:#) (+./"2 @: (*."0 1"1 _)) ])\y
0 0 0 0 0 0
0 1 0 0 0 1
0 1 0 1 1 0
0 1 0 1 1 1
1 0 0 0 0 0
1 1 0 0 0 1
1 1 0 1 1 0
1 1 0 1 1 1

0 0 0 0 0 0
0 1 1 1 0 1
0 0 0 0 1 1
0 1 1 1 1 1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0

     +./ (_3#.\x) {"0 2 L3    NB. Compute by 3-bit lookups
1 1 0 0 1 1
------------------------------------------------------------

----------

For information about J forums see http://www.jsoftware.com/forum

s.htm
------------------------------------------------------------
----------
For information about J forums see http://www.jsoftware.com/forum
s.htm


------------------------------------------------------------
----------
For information about J forums see http://www.jsoftware.com/forums.htm


---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus


----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to