On Mon, 29 Aug 2011, Uros Bizjak wrote:

> On Mon, Aug 29, 2011 at 4:31 PM, Richard Guenther <rguent...@suse.de> wrote:
> 
> > This patch makes a conversion optab from the direct optabs vcond
> > and vcondu.  This allows to specify different modes for the
> > actual comparison and the value that is selected.
> >
> > All targets but i386 are trivially converted by
> > s/vcond<mode>/vcond<mode><mode>/.  The i386 port is enhanced
> > to support a OP b ? c : d as ({ mask = a OP b; (c & mask) | (d & ~mask);
> > }), constraining it to what the middle-end constrained itself to
> > (matching number of vector elements in the comparison operands with
> > the result vector types) would explode patterns too much.
> > Thus, only a subset of mode combinations will be excercised
> > (but none at the moment - a followup will fix the vectorizer,
> > and generic vectors from the C extensions have a patch pending).
> >
> > Bootstrapped on x86_64-unknown-linux-gnu, tests are currently
> > running for {,-m32}.
> >
> > Ok if that succeeds?
> >
> > Thanks,
> > Richard.
> >
> > 2011-08-29  Richard Guenther  <rguent...@suse.de>
> >
> >        * genopinit.c (optabs): Turn vcond{,u}_optab into a conversion
> >        optab with two modes.
> >        * optabs.h (enum convert_optab_index): Add COI_vcond, COI_vcondu.
> >        (enum direct_optab_index): Remove DOI_vcond, DOI_vcondu.
> >        (vcond_optab): Adjust.
> >        (vcondu_optab): Likewise.
> >        (expand_vec_cond_expr_p): Adjust prototype.
> >        * optabs.c (get_vcond_icode): Adjust.
> >        (expand_vec_cond_expr_p): Likewise.
> >        (expand_vec_cond_expr): Likewise.
> >        * tree-vect-stmt.c (vectorizable_condition): Adjust.
> >
> >        * config/i386/sse.md (vcond<mode>): Split to
> >        vcond<V_256:mode><VF_256:mode>, vcond<V_128:mode><VF_128:mode>,
> >        vcond<V_128:mode><VI124_128:mode> and
> >        vcondu<V_128:mode><VI124_128:mode>.
> >        (vcondv2di): Change to vcond<VI8F_128:mode>v2di.
> >        (vconduv2di): Likewise.
> >        * config/arm/neon.md (vcond<mode>): Change to vcond*<mode><mode>.
> >        (vcondu<mode>): Likewise.
> >        * config/ia64/vect.md (vcond<mode>): Likewise.
> >        (vcondu<mode>): Likewise.
> >        (vcondv2sf): Likewise.
> >        * config/mips/mips-ps-3d.md (vcondv2sf): Likewise.
> >        * config/rs6000/paired.md (vcondv2sf): Likewise.
> >        * config/rs6000/vector.md (vcond<mode>): Likewise.
> >        (vcondu<mode>): Likewise.
> >        * config/spu/spu.md (vcond<mode>): Likewise.
> >        (vcondu<mode>): Likewise.
> 
> Do we really want to introduce stuff like:
> 
> ! (define_expand "vcond<V_128:mode><VF_128:mode>"
> 
> You are in fact introducing 6x2 = 12 patterns, many of them (i.e.
> v16qiv2df combination) invalid.

Well, in principle they are not "invalid" they would be a short-hand -
the example of (subreg:V16QI (vcond:V2DI (... )).

> I'd prefer a pattern with mode-less operands 4 and 5, rejected in insn
> constraints for invalid combinations:

Hm, ok - that was the first variant I tried (well, but with modeless
operands 1 and 2, to keep 4 and 5 selcting int vs. fp compare).  But
modeless operands get you that annoying warning from the gen* programs.

How'd you ask if a pattern is available for vcondv4si with v4sf
operands 4 and 5?  The vectorizer would need to be able to ask this
question.

In the end putting both mask generation and apply into one
instruction pattern causes all this issues - but it helps
not exposing the mask representation of the HW.

> (define_expand "vcond<mode>"
>   [(set (match_operand:VI124_128 0 "register_operand" "")
>       (if_then_else:VI124_128
>         (match_operator 3 ""
>           [(match_operand 4 "nonimmediate_operand" "")
>            (match_operand 5 "nonimmediate_operand" "")])
>          (match_operand:VI124_128 1 "general_operand" "")
>          (match_operand:VI124_128 2 "general_operand" "")))]
>    "TARGET_SSE && (... some constraint involving modes of op4 and op5 ...)"
> 
> We could also introduce VI1248_128 with conditional V2DImode for ops 0, 1 and 
> 2:
> 
> (define_mode_iterator VI1248_128 [V16QI V8HI V4SI (V2DI "TARGET_SSE4_2")])

Yes, that sounds like a cleanup that is independently possible.

Richard.

Reply via email to