Re: value range propagation for _bitwise_ OR

Adam D. Ruppe Tue, 13 Apr 2010 08:35:31 -0700

On Tue, Apr 13, 2010 at 11:10:24AM -0400, Clemens wrote:
> That's strange. Looking at src/backend/cod4.c, function cdbscan, in the dmd 
> sources, bsr seems to be implemented in terms of the bsr opcode [1] (which I 
> guess is the reason it's an intrinsic in the first place). I would have 
> expected this to be much, much faster than a user function. Anyone care 
> enough to check the generated assembly?


The opcode is fairly slow anyway (as far as opcodes go) - odds are the
implementation inside the processor is similar to Jerome's method, and
the main savings come from it loading fewer bytes into the pipeline.
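
For reference, here is a rough C sketch of the kind of shift-and-test
routine a user-level bsr replacement might use. It is not Jerome's code
and not what dmd emits; the name soft_bsr is made up for illustration,
and it assumes a 32-bit nonzero input.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from dmd or the thread): returns the index of
   the highest set bit in x by binary-searching over shift widths.
   The result is undefined for x == 0, just like the bsr opcode, so the
   sketch simply asserts on that case. */
static int soft_bsr(uint32_t x)
{
    int index = 0;

    assert(x != 0);

    /* Each step checks whether any bit is set in the upper half of the
       remaining window; if so, drop the lower half and record the shift. */
    if (x & 0xFFFF0000u) { x >>= 16; index += 16; }
    if (x & 0x0000FF00u) { x >>= 8;  index += 8;  }
    if (x & 0x000000F0u) { x >>= 4;  index += 4;  }
    if (x & 0x0000000Cu) { x >>= 2;  index += 2;  }
    if (x & 0x00000002u) { index += 1; }

    return index;
}
```

That is five compare-and-shift steps for 32 bits, so it is easy to
imagine a microcoded bsr doing roughly the same amount of work.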

I remember a line from a blog (IIRC written by the author of the C++ FQA)
saying that hardware and software are pretty much the same thing:
moving an instruction to hardware doesn't mean it will be any faster,
since it is the same algorithm, just done in processor microcode instead
of user opcodes.

-- 
Adam D. Ruppe
http://arsdnet.net
