Don,

The HSA on the z10 and z196 is 16GiB, more than enough room for a 64KiB table, if that is how POPCNT is implemented on the z196.
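To make the size of such a table concrete, here is a minimal sketch in C of a halfword lookup table for counting bits. It is only an illustration of the technique under discussion, not a claim about how POPCNT is actually implemented; the names bits_in_halfword, build_table and popcount64 are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only.
   One byte per possible halfword value: 65,536 entries = 64KiB. */
uint8_t bits_in_halfword[65536];
int table_ready = 0;

/* Fill the table once; entry i holds the number of 1 bits in i. */
void build_table(void)
{
    bits_in_halfword[0] = 0;
    for (uint32_t i = 1; i < 65536; i++) {
        /* Bits in i = bits in (i >> 1) plus the low-order bit of i.
           Entry i >> 1 is always filled in before entry i. */
        bits_in_halfword[i] = (uint8_t)(bits_in_halfword[i >> 1] + (i & 1));
    }
    table_ready = 1;
}

/* Count the 1 bits in a doubleword with four halfword lookups. */
unsigned popcount64(uint64_t x)
{
    assert(table_ready);  /* build_table() must have been called first */
    return (unsigned)bits_in_halfword[x & 0xFFFF]
         + (unsigned)bits_in_halfword[(x >> 16) & 0xFFFF]
         + (unsigned)bits_in_halfword[(x >> 32) & 0xFFFF]
         + (unsigned)bits_in_halfword[(x >> 48) & 0xFFFF];
}
```

Call build_table() once at startup; after that each popcount64() call costs four table references and three additions, at the price of keeping 64KiB resident.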
I think it is likely that if the POPCNT instruction is millicoded, it takes advantage of the translate hardware. If so, there is no way to match the speed of POPCNT using the instructions available to us. I need to run some benchmarks on the z10 to see whether TRTO is faster than TR/TROO for the same number of bytes. If TRTO is faster, then my guess is that a 64KiB table is used for POPCNT.

David

On Sun, 1 Aug 2010 20:59:27 -0400, Don Higgins wrote:

>Of course you are right that a larger static reference table such as a 64k
>table to count bits in halfword would speed up simple loop solution, but I
>don't think we will see that sort of solution in microcode for z196
>instruction. It just seems like too much of a memory requirement for one
>instruction.
