Don,

The HSA on the z10 and z196 is 16GiB, more than enough room for a 64KiB
table if that is the way the POPCNT is implemented on the z196.

I think it is likely that if the POPCNT instruction is millicoded, then it
takes advantage of the translate hardware. And if so, there is no way to
match the speed of POPCNT using the instructions available to us.

I need to run some benchmarks on the z10 to see if using TRTO is faster than
TR/TROO for the same number of bytes.  If TRTO is faster, then my guess is
that a 64KiB table is used for POPCNT.
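
For illustration only, here is roughly what the 64KiB-table approach looks
like in C. It is a sketch of the technique, not a claim about how POPCNT is
actually implemented on the z196, and the names bits16, init_bits16 and
popcount64_table are made up for the example.

```c
#include <stdint.h>

/* Hypothetical lookup table: one byte per possible halfword value,
   65536 entries x 1 byte = 64 KiB. */
static uint8_t bits16[65536];

/* Fill the table once. The bit count of i is the bit count of i with
   its low bit shifted out, plus that low bit. */
static void init_bits16(void)
{
    bits16[0] = 0;
    for (uint32_t i = 1; i < 65536; i++)
        bits16[i] = (uint8_t)(bits16[i >> 1] + (i & 1));
}

/* Count the one bits in a doubleword with four halfword lookups.
   init_bits16() must have been called first. */
static unsigned popcount64_table(uint64_t x)
{
    return (unsigned)bits16[x & 0xFFFF]
         + (unsigned)bits16[(x >> 16) & 0xFFFF]
         + (unsigned)bits16[(x >> 32) & 0xFFFF]
         + (unsigned)bits16[(x >> 48) & 0xFFFF];
}
```

The trade-off is the one Don raises below: four lookups per doubleword
instead of eight with a 256-byte table, at the cost of 64KiB of storage.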

David

On Sun, 1 Aug 2010 20:59:27 -0400, Don Higgins wrote:
>Of course you are right that a larger static reference table such as a 64k
>table to count bits in halfword would speed up simple loop solution, but I
>don't think we will see that sort of solution in microcode for z196
>instruction.  It just seems like too much of a memory requirement for one
>instruction.
