On Apr 6, 2012, at 11:26 PM, Jonathan Sauer wrote:
>> The speedup from vectorization isn't very large, as we fall back to bytewise
>> scanning when we hit a newline. There might be a way to avoid leaving the sse
>> loop but everything I tried didn't work out because a call to push_back
>> clobbers xmm registers.
> 
> Wouldn't that indicate a codegen bug? Judging from
> <http://www.agner.org/optimize/calling_conventions.pdf>, most calling
> conventions specify that at least some xmm registers are scratch registers,
> i.e. must be saved on the stack before calling a function and restored
> afterwards, if they should keep their value. If that doesn't happen, it
> seems to me to be a bug in the compiler you used to compile clang.

When we talk about clobbering a register, we usually assume that the compiler
understands that the register has been invalidated and is taking precautions.
Those precautions are usually expensive, e.g. repeatedly spilling the register
to the stack and reloading it. In this case, Benjamin is saying that the
performance impact of vectorization isn't as large as it could be because the
compiler has to spill and rematerialize his vectors across a call that occurs
at every newline.
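
To make the shape of the problem concrete, here is a rough sketch of that kind
of loop. It is my own illustration and not Benjamin's patch: findNewlines and
its parameters are made-up names, the output is a plain std::vector, and
__builtin_ctz assumes GCC or Clang on an x86 target with SSE2.

```cpp
#include <cassert>
#include <cstddef>
#include <emmintrin.h>  // SSE2 intrinsics
#include <vector>

// Hypothetical helper (not taken from the patch): appends the offset of
// every '\n' in buf[0, len) to offsets.
void findNewlines(const char *buf, std::size_t len,
                  std::vector<std::size_t> &offsets) {
  assert(buf != nullptr || len == 0);

  // Loop-invariant vector; ideally it stays in one xmm register throughout.
  const __m128i newlines = _mm_set1_epi8('\n');

  std::size_t i = 0;
  for (; i + 16 <= len; i += 16) {
    __m128i chunk =
        _mm_loadu_si128(reinterpret_cast<const __m128i *>(buf + i));
    // One bit per byte of the chunk that equals '\n'.
    unsigned mask = static_cast<unsigned>(
        _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, newlines)));
    while (mask != 0) {
      unsigned bit = static_cast<unsigned>(__builtin_ctz(mask));
      // push_back can end up as an out-of-line call (at least when the
      // vector grows). If every xmm register is caller-saved, `newlines`
      // has to be spilled or recreated around that call.
      offsets.push_back(i + bit);
      mask &= mask - 1;  // clear the lowest set bit
    }
  }

  // Scalar tail for the last len % 16 bytes.
  for (; i < len; ++i)
    if (buf[i] == '\n')
      offsets.push_back(i);
}
```

Nothing here is miscompiled: the result is correct either way, and the only
cost is the extra stores and loads around the call. On Win64, where xmm6
through xmm15 are callee-saved, the constant could in principle live in one of
those registers across the call.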

The only ABI I know of that makes any of the XMM registers non-scratch is Win64.

John.
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
