Hello,

> The speedup from vectorization isn't very large, as we fall back to bytewise
> scanning when we hit a newline. There might be a way to avoid leaving the sse
> loop but everything I tried didn't work out because a call to push_back
> clobbers xmm registers.

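To make the situation concrete, here is a rough sketch of a newline scan that stays in the SSE loop and calls push_back from inside it. This is not the code from the patch: findNewlines, lowestSetBit and Offsets are names made up for illustration, and the scalar fallback is only there so the sketch also builds on targets without SSE2.

```cpp
#include <cstddef>
#include <vector>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define HAVE_SSE2 1
#endif

#ifdef HAVE_SSE2
// Index of the lowest set bit; the caller guarantees Mask != 0.
static unsigned lowestSetBit(unsigned Mask) {
  unsigned Bit = 0;
  while (!(Mask & 1u)) {
    Mask >>= 1;
    ++Bit;
  }
  return Bit;
}
#endif

// Hypothetical helper: collect the offset of every '\n' in [Buf, Buf + Len).
std::vector<std::size_t> findNewlines(const char *Buf, std::size_t Len) {
  std::vector<std::size_t> Offsets;
  std::size_t I = 0;
#ifdef HAVE_SSE2
  // NL is live across every push_back call in the loop below.
  const __m128i NL = _mm_set1_epi8('\n');
  for (; I + 16 <= Len; I += 16) {
    __m128i Chunk =
        _mm_loadu_si128(reinterpret_cast<const __m128i *>(Buf + I));
    // One bit per byte of the chunk that equals '\n'.
    unsigned Mask =
        static_cast<unsigned>(_mm_movemask_epi8(_mm_cmpeq_epi8(Chunk, NL)));
    while (Mask) {
      Offsets.push_back(I + lowestSetBit(Mask)); // call inside the SSE loop
      Mask &= Mask - 1;                          // clear the bit just handled
    }
  }
#endif
  // Scalar tail (and the whole input when SSE2 is unavailable).
  for (; I < Len; ++I)
    if (Buf[I] == '\n')
      Offsets.push_back(I);
  return Offsets;
}
```

The vector constant NL has to hold the same value before and after each push_back, so the compiler must either keep it in a register the callee preserves, spill and reload it, or rebuild it. A call that really left it clobbered would make the scan return wrong offsets.
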
Wouldn't that indicate a codegen bug? Judging from
<http://www.agner.org/optimize/calling_conventions.pdf>, most calling
conventions treat at least some xmm registers as scratch (caller-saved)
registers: if the caller still needs their values after a call, it must save
them on the stack before the call and restore them afterwards. If that isn't
happening, it looks to me like a bug in the compiler you used to build clang.

Jonathan

_______________________________________________
cfe-commits mailing list
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
