On Wed, 25 Jul 2012, Luca Barbato wrote:
> On 07/25/2012 07:40 AM, Jason Garrett-Glaser wrote:
>
>>> Do the x264 functions sign-extend all their integer arguments? Or put
>>> differently, does the problem occur for 32-bit builds also, or only
>>> for 64-bit builds?
>>
>> Yes they do, and such a problem wouldn't target solely _avx functions;
>> that wouldn't make any sense.
>>
>> How about start by checking for a missing vzeroupper?
>
> Is there a simple way to do that? We could fix that part sooner than later.

Done.
... And it does find a hit, ff_mix_1_to_2_fltp_flt_avx.

--Loren Merritt

#!/usr/bin/perl -w
$exe = $ARGV[0];
@ARGV==1 and -f $exe and -x $exe or die
    "usage: missing_vzeroupper.pl avconv\n".
    "Finds functions that use ymm and fail to reset the ymm state with vzeroupper,\n".
    "which thus incur a large speed penalty when mixed with non-AVX (legacy SSE) xmm code.\n";

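# Disassemble the binary with objdump, using Intel syntax.
open FH, "-|", "objdump", "-d", "-M", "intel", $exe or die "failed to run objdump: $!\n";
# Split on symbol headers such as "0000000000401000 <name>:". Because the pattern
# has a capture group, split returns: preamble, name, body, name, body, ...
# Symbols whose names contain '.' or a space do not start a new entry, so their
# code stays with the preceding function.
@funcs = split /^[0-9a-f]+ <([^. ]+)>:/m, join "", <FH>;
close FH or die "objdump exited with an error\n";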

# The fft functions are ok because the vzeroupper happens in fft_dispatch.
$exceptions = qr/^_?(fft\d+|pass)_/;

$err = 0;
# Discard the objdump output that precedes the first symbol header.
shift @funcs;
while(@funcs) {
    my $funcname = shift @funcs;
    my $asm = shift @funcs;
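    # Report any function that references a ymm register but contains no
    # vzeroupper anywhere in its body, unless it is a known exception.
    if($asm =~ /\bymm\d/ and $asm !~ /\bvzeroupper\b/ and $funcname !~ $exceptions) {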
        print "$funcname\n";
        $err = 1;
    }
}
exit $err;
