On Apr 6, 2012, at 11:26 PM, Jonathan Sauer wrote:
>> The speedup from vectorization isn't very large, as we fall back to bytewise
>> scanning when we hit a newline. There might be a way to avoid leaving the sse
>> loop but everything I tried didn't work out because a call to push_back
>> clobbers xmm registers.
>
> Wouldn't that indicate a codegen bug? Judging from
> <http://www.agner.org/optimize/calling_conventions.pdf>,
> most calling conventions specify that at least some xmm registers are scratch
> registers, i.e. must be saved on the stack before calling a function and
> restored afterwards, if they should keep their value.
> If that doesn't happen, it seems to me to be a bug in the compiler you used
> to compile clang.

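To make the situation concrete, here is a minimal sketch of the kind of loop in question. It is not the code from the patch: `findNewlines` and its signature are made up for illustration, and it assumes an x86 target with SSE2 (it falls back to the bytewise loop elsewhere). The point is only to show a vector value that stays live across a call to push_back.

```cpp
#include <cstddef>
#include <vector>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper, not the code from the patch under discussion:
// records the offset of every '\n' in [buf, buf + len).
std::vector<std::size_t> findNewlines(const char *buf, std::size_t len) {
  std::vector<std::size_t> offsets;
  std::size_t i = 0;
#if defined(__SSE2__)
  // This vector is live across every iteration, including the push_back
  // calls below. If XMM registers are scratch in the target ABI and the
  // call is not inlined, the compiler may have to spill or rematerialize
  // it around each call.
  const __m128i newlines = _mm_set1_epi8('\n');
  for (; i + 16 <= len; i += 16) {
    const __m128i chunk =
        _mm_loadu_si128(reinterpret_cast<const __m128i *>(buf + i));
    // One bit per byte: bit k is set if buf[i + k] == '\n'.
    unsigned mask = static_cast<unsigned>(
        _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, newlines)));
    for (unsigned bit = 0; mask != 0; ++bit, mask >>= 1) {
      if (mask & 1u)
        offsets.push_back(i + bit); // the call made at every newline
    }
  }
#endif
  // Bytewise scan for the tail (or for everything without SSE2).
  for (; i < len; ++i) {
    if (buf[i] == '\n')
      offsets.push_back(i);
  }
  return offsets;
}
```

Whether the compiler actually spills here depends on inlining and on the target ABI, so the generated assembly is the only reliable way to tell.
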
When we talk about clobbering a register, we usually assume that the compiler
understands that the register has been invalidated and is taking precautions.
Usually those precautions are expensive, e.g. repeatedly spilling the register
to the stack. In this case, Benjamin is saying that the performance impact of
vectorization isn't as large as it could be because the compiler has to spill
and rematerialize his vectors across a call that occurs at every newline.

The only ABI I know of that makes any of the XMM registers non-scratch is
Win64.

John.
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
