On Tue, Jul 31, 2012 at 3:14 PM, Ben Pfaff <b...@nicira.com> wrote:
> On Tue, Jul 31, 2012 at 10:38:21AM -0700, Ethan Jackson wrote:
>> How performance critical is this popcount implementation going to be?
>> I assume you've put all this work into testing it because the
>> classifier will be relying on it heavily?
>
> Yes, I think it's going to be at least fairly common in the
> classifier.  I didn't measure that yet, because I think that there are
> opportunities to avoid some of them.
>
>> Why do you think the gcc builtin is slow?  That's surprising to me.  Is
>> it possible that in newer versions of gcc (i.e. 4.7 and later) would
>> simply generate the assembly instruction?
>
> The GCC builtin is portable.  I guess it's the same code as popcount4,
> since they run at the same speed.
>
> The assembly instruction isn't portable.  It isn't an architectural
> instruction, that is, you can't rely on say, anything newer than Core
> 2 to have it.  There is a separate CPU feature bit for it that you
> need to check before using it.  So my guess is that GCC will never
> generate it, even in the future, without some kind of specific
> compiler option that says "CPU has popcnt instruction".
>
>> If it's so performance critical, could we simply check for the
>> assembly instruction in the configure script, and if it exists use it.
>> Of course, if it doesn't exist we would fall back to what you
>> currently have.
>
> Configure time wouldn't be good enough, because we need to know about
> the machine we're going to run on, not the one that we're building
> on.  We'd have to check at runtime instead.

Ah yes this makes sense.  Figuring out whether or not the instruction
exists at runtime would be a mess.

Ethan
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
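
For reference, the runtime check discussed in this thread could look roughly like the sketch below. It is not code from the patch under review. It assumes GCC or a compatible compiler on x86, where <cpuid.h> provides __get_cpuid() and the POPCNT feature flag is bit 23 of ECX for CPUID leaf 1. The names popcount_portable, popcount_hw, cpu_has_popcnt, and popcount_dispatch are hypothetical, and the fallback is a generic bit-parallel count, not necessarily the popcount4 mentioned above.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_X86_CPUID 1
#endif

/* Portable fallback: bit-parallel population count of a 32-bit word. */
static unsigned int
popcount_portable(uint32_t x)
{
    x -= (x >> 1) & 0x55555555;
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x = (x + (x >> 4)) & 0x0f0f0f0f;
    return (x * 0x01010101) >> 24;
}

#ifdef HAVE_X86_CPUID
/* Built with POPCNT enabled for this one function only, so the rest of
 * the program still runs on CPUs that lack the instruction. */
static __attribute__((target("popcnt"))) unsigned int
popcount_hw(uint32_t x)
{
    return (unsigned int) __builtin_popcount(x);
}

/* Returns nonzero if CPUID leaf 1 reports POPCNT (ECX bit 23). */
static int
cpu_has_popcnt(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return 0;
    }
    return (ecx >> 23) & 1;
}
#endif

/* Implementation chosen on first use (hypothetical dispatch scheme). */
static unsigned int (*popcount_impl)(uint32_t);

unsigned int
popcount_dispatch(uint32_t x)
{
    if (!popcount_impl) {
        popcount_impl = popcount_portable;
#ifdef HAVE_X86_CPUID
        if (cpu_has_popcnt()) {
            popcount_impl = popcount_hw;
        }
#endif
    }
    return popcount_impl(x);
}
```

The feature bit is read once and cached in a function pointer, so the steady-state cost is one indirect call per count. The lazy initialization here is not synchronized; a real implementation would probably pick the function once at startup.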