https://bugs.kde.org/show_bug.cgi?id=385409

--- Comment #44 from Julian Seward <jsew...@acm.org> ---
(In reply to Vadim Barkov from comment #43)

> What do you think about it?

Honestly, I do not understand.  You need to explain much more clearly:

* what these instructions actually do

* what your implementation strategy is

* what the problem cases are

I have the impression from comment 34 that vfenezbs %v0, %v0, %v0 finds
the index of the first zero in the operand and writes that somewhere in
the result register.  (But why do they all have to be v0?)  We have
solved such problems in the past for amd64, so it would help if you
could relate your work to the amd64 implementation.

For amd64, we generate the sequence

  t1 = CmpEQ8x16(vec, zero-vector)
  t2 = pmovmskb(t1), which moves one bit from each lane into t2
  t3 = count-leading-zeroes(t2)

The count-leading-zeroes operation can be done in a way in which the
result is defined even if there are undefined bits to the right of the
leftmost 1 bit.  As a consequence, the resulting t3 value will be
defined if the input vector consists of defined non-zero bytes
terminated by a defined zero, even if the bytes after that are
undefined.
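
To make the definedness argument concrete, here is a minimal sketch in
plain C of the three steps together with a simplified shadow
computation.  It is not Valgrind's actual code: the names ShadowVec,
FirstZero, zero_mask, undef_mask, clz16 and first_zero are hypothetical,
definedness is tracked per lane rather than per bit, and lane 0 is
placed in the leftmost mask bit so that "count leading zeroes" matches
the wording above (the real pmovmskb puts lane 0 in bit 0, where a
trailing-zero count would be the equivalent operation).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: 16 byte lanes plus one shadow flag per lane.
   undef[i] != 0 means lane i holds an undefined value. */
typedef struct {
    uint8_t value[16];
    uint8_t undef[16];
} ShadowVec;

typedef struct {
    unsigned index;   /* lane of the first zero byte, 16 if none */
    int      defined; /* 1 if index cannot depend on undefined lanes */
} FirstZero;

/* Steps t1 and t2: compare every lane with zero and gather one bit
   per lane.  Assumption: lane 0 maps to the leftmost bit (bit 15). */
static uint16_t zero_mask(const uint8_t v[16])
{
    uint16_t m = 0;
    for (unsigned i = 0; i < 16; i++)
        if (v[i] == 0)
            m = (uint16_t)(m | (0x8000u >> i));
    return m;
}

/* Shadow of t2: a bit is undefined if its source lane is undefined. */
static uint16_t undef_mask(const uint8_t undef[16])
{
    uint16_t m = 0;
    for (unsigned i = 0; i < 16; i++)
        if (undef[i] != 0)
            m = (uint16_t)(m | (0x8000u >> i));
    return m;
}

/* Step t3: count leading zeroes of a 16-bit value (16 if m == 0). */
static unsigned clz16(uint16_t m)
{
    unsigned n = 0;
    while (n < 16 && !(m & (0x8000u >> n)))
        n++;
    return n;
}

/* The count is treated as defined when every mask bit from the left
   edge up to and including the leftmost 1 is defined.  Bits to the
   right of that 1 cannot change the count, so they are ignored. */
FirstZero first_zero(const ShadowVec *v)
{
    uint16_t t2 = zero_mask(v->value);
    uint16_t u2 = undef_mask(v->undef);
    FirstZero r;

    r.index = clz16(t2);
    if (r.index == 16) {
        /* No 1 bit at all: every lane influenced the result. */
        r.defined = (u2 == 0);
    } else {
        uint16_t prefix = (uint16_t)(0xFFFFu << (15 - r.index));
        r.defined = ((u2 & prefix) == 0);
    }
    return r;
}

/* Small self-check: "abc\0" followed by twelve undefined lanes. */
void first_zero_demo(void)
{
    ShadowVec v = { { 'a', 'b', 'c', 0 },
                    { 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 } };
    FirstZero r = first_zero(&v);
    assert(r.index == 3 && r.defined == 1);
}
```

With a defined "abc" followed by a defined zero and undefined trailing
lanes, the sketch reports index 3 as defined.  Marking any of lanes 0
to 3 as undefined makes the result undefined, which is the behaviour
the paragraph above describes.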
