On Mon, May 07, 2012 at 01:04:33AM +0200, Uros Bizjak wrote:
> Index: i386.md
> ===================================================================
> --- i386.md (revision 187217)
> +++ i386.md (working copy)
> @@ -12112,9 +12112,22 @@
> (set (match_operand:SWI48 0 "register_operand" "=r")
> (ctz:SWI48 (match_dup 1)))]
> ""
> - "bsf{<imodesuffix>}\t{%1, %0|%0, %1}"
> +{
> + if (optimize_function_for_size_p (cfun))
> + return "bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
> + else if (TARGET_BMI)
> + return "tzcnt{<imodesuffix>}\t{%1, %0|%0, %1}";
> + else
> + /* tzcnt expands to rep;bsf and we can use it even if !TARGET_BMI. */
> + return "rep; bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
> +}
Shouldn't that be done only for generic tuning? If somebody uses
-mtune=native, then emitting rep; bsf is overkill: the code is intended
to be run on a CPU that is known either to lack tzcnt support or, with
TARGET_BMI, to have it.
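
As a rough sketch of that suggestion, here is the selection order written
as a standalone C model. It is not i386.md code: the function name and the
three boolean parameters are hypothetical stand-ins for
optimize_function_for_size_p (cfun), TARGET_BMI and a "tuning is generic"
test, and how that last test would be spelled in the backend is left open.

```c
#include <stdbool.h>

/* Hypothetical model of the insn choice; none of these names exist in GCC.
   Returns the mnemonic that the output template would start with.  */
const char *
ctz_mnemonic (bool optimize_size, bool target_bmi, bool tune_generic)
{
  /* When optimizing for size, plain bsf is the shortest encoding.  */
  if (optimize_size)
    return "bsf";

  /* The target is known to support BMI, so tzcnt can be used directly.  */
  if (target_bmi)
    return "tzcnt";

  /* Generic tuning: the binary may run on CPUs with or without tzcnt.
     rep; bsf is the tzcnt encoding and still executes as bsf elsewhere.  */
  if (tune_generic)
    return "rep; bsf";

  /* Tuning for a specific CPU without BMI: the prefix buys nothing.  */
  return "bsf";
}
```

With this ordering, rep; bsf is emitted only for the generic, non-BMI,
optimize-for-speed case; every other combination gets bsf or tzcnt.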
Jakub