On Mon, Dec 11, 2017 at 06:00:11PM +0100, Kilian Verhetsel wrote:
> Jakub Jelinek <ja...@redhat.com> writes:
> > Of course it can be done efficiently, what we care most is that the body of
> > the vectorized loop is efficient.
>
> That's fair, I was looking at the x86 assembly being generated when a single
> vectorized iteration was enough (because that is the context in which I
> first encountered this bug):
>
>   int f(unsigned int *x, unsigned int k) {
>     unsigned int result = 8;
>     for (unsigned int i = 0; i < 8; i++) {
>       if (x[i] == k) result = i;
>     }
>     return result;
>   }
>
> where the vpand instruction this generates would have to be replaced
> with a variable blend if the default value weren't 0, although I had
> not realized even SSE4.1 on x86 includes such an instruction, making
> this point less relevant.

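A scalar model may make the vpand remark above more concrete. The sketch below is only an illustration under stated assumptions: it is not GCC's generated code and not the approach of either patch. f_ref repeats the quoted loop, reduce_with_and models each lane as "compare mask AND index" followed by a max reduction, and reduce_with_select is a hypothetical variant that shifts the index by one so that 0 can serve as a "no match" sentinel. The helper names and the LANES constant are made up for this example.

```c
#define LANES 8

/* Scalar reference: the loop from the quoted message (default value 8). */
unsigned int f_ref(const unsigned int *x, unsigned int k)
{
    unsigned int result = 8;
    for (unsigned int i = 0; i < LANES; i++) {
        if (x[i] == k)
            result = i;
    }
    return result;
}

/* Hypothetical lane-wise AND model: a lane keeps its index on a match and
   becomes 0 otherwise; a max reduction then picks the last matching index.
   A result of 0 is ambiguous: "matched at index 0" or "no match at all". */
unsigned int reduce_with_and(const unsigned int *x, unsigned int k)
{
    unsigned int best = 0;
    for (unsigned int i = 0; i < LANES; i++) {
        /* Model of a vector compare: all ones on a match, all zeros otherwise. */
        unsigned int mask = (x[i] == k) ? ~0u : 0u;
        unsigned int lane = mask & i;
        if (lane > best)
            best = lane;
    }
    return best;
}

/* Hypothetical select model: a lane holds i + 1 on a match and 0 otherwise,
   so 0 is reserved as the "no match" sentinel and any default can be used. */
unsigned int reduce_with_select(const unsigned int *x, unsigned int k,
                                unsigned int dflt)
{
    unsigned int best = 0;
    for (unsigned int i = 0; i < LANES; i++) {
        unsigned int lane = (x[i] == k) ? i + 1 : 0;
        if (lane > best)
            best = lane;
    }
    return best ? best - 1 : dflt;
}
```

When no element equals k, f_ref returns the default 8 while reduce_with_and returns 0, which is also what it returns for a match at index 0. reduce_with_select agrees with f_ref in both cases when called with a default of 8.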
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80631#c6 where I've attached
a so far untested prototype.  If it is added before your patch makes it in,
your patch would start by introducing another kind (say
SIMPLE_INTEGER_INDUC_COND_REDUCTION) and would use that for the spots that
are handled by the PR80631 patch as INTEGER_INDUC_COND_REDUCTION right now,
and your code for the rest.
E.g. the above testcase with my patch, because i is unsigned and base is the
minimum of the type, is emitted as COND_REDUCTION, which is what your patch
would improve.

> > Another thing is, as your patch is quite large, we need a copyright
> > assignment for the changes before we can accept it, see
> > https://gcc.gnu.org/contribute.html for details.
> >
> > If you are already covered by an assignment of some company, please tell
> > us which one it is, otherwise contact us and we'll get you the needed
> > forms.
>
> I am not covered by any copyright assignment yet. Do I need to send you
> any additional information?

I'll send it offlist.

	Jakub