On Fri, Feb 6, 2026 at 11:13 PM Nathan Bossart <[email protected]> wrote:
>
> On Thu, Feb 05, 2026 at 02:48:44PM +0700, John Naylor wrote:
> > On Thu, Feb 5, 2026 at 4:43 AM Nathan Bossart <[email protected]> wrote:
> >> Sure.  I'm tempted to suggest that we only use the plain C version here,
> >> too.  The SSE4.2 bms_num_members() test I did yesterday used it and showed
> >> improvement at one word.  If we do that, we can rip out even more code
> >> since we no longer need the popcount built-ins.
> >
> > Unlike the 32-bit case, people do run production on 64-bit platforms
> > that are not Arm/x86, so that would require effort to see if the
> > builtins are worth it for them. That seems like a separate effort. I
> > can help with that, but let's get the tested stuff in first.
>
> Alright.  I moved that to a new 0004 patch that we can consider separately
> once 0001-0003 have been committed.

Okay, this is looking good. I have just one more suggestion: For 0002,
just copy the word-wise functions verbatim. That way, it's a pure
refactoring commit and the exception doesn't need explaining. With
that, I'd say go ahead and commit 0001 and 0002.

Then after a bit more research, the final form of the inline functions
can be visible in a single commit. I've tested S390X already and hope
to test one other platform.

-- 
John Naylor
Amazon Web Services

