hi dan,

> OK, based on the existing nova code, attached is what I've written for
> arm neon. It's *entirely theoretical* so far, 100% untested. I'm
> sending it now in case of useful comments - I don't know how soon I'll
> be able to get the code actually in use/testing on an ARMv7+NEON
> machine.

looks quite good for now, some comments:
- gen_zero can usually be implemented efficiently with an xor operation:
zero = something ^ something
- set_vec: can you do this with vdupq_n_f32?
- the relational operators provide bitmasks, that can be used for argument
selection. for code like:

    if (val > 1)
        val = 1;

you can write vectorized:

    vec = {0, 2, 0, 2};
    bitmask = vec > {1, 1, 1, 1};     // gives [0, 0xffffffff, 0, 0xffffffff]
    result = {1, 1, 1, 1} & bitmask;  // gives [0, 1, 0, 1]

the bitmask is used to compute the desired result (this shortcut only works
here because the lanes that fail the test are already 0). in more general
cases, a vectorized algorithm basically computes both sides of the `if' clause
and uses the bitmask to select (compare vec::select) the result.
in general, the implementation looks fine to me. if i can be of any further
help, please let me know ...
cheers, tim
--
[email protected]
http://tim.klingt.org
Desperation is the raw material of drastic change. Only those who can
leave behind everything they have ever believed in can hope to escape.
William S. Burroughs
_______________________________________________
nova-dev mailing list
[email protected]
http://klingt.org/cgi-bin/mailman/listinfo/nova-dev
http://tim.klingt.org/nova
