On Wed, 25 Jul 2012, Luca Barbato wrote:
> On 07/25/2012 07:40 AM, Jason Garrett-Glaser wrote:
>
>>> Do the x264 functions sign-extend all their integer arguments? Or put
>>> differently, does the problem occur for 32-bit builds also, or only
>>> for 64-bit builds?
>>
>> Yes they do, and such a problem wouldn't target solely _avx functions;
>> that wouldn't make any sense.
>>
>> How about start by checking for a missing vzeroupper?
>
> Is there a simple way to do that? We could fix that part sooner than later.

Done.
... And it does find a hit, ff_mix_1_to_2_fltp_flt_avx.

--Loren Merritt

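#!/usr/bin/perl -w
use strict;

my $exe = $ARGV[0];
@ARGV==1 and -f $exe and -x $exe or die
    "usage: missing_vzeroupper.pl avconv\n".
    "Finds functions that use ymm and fail to reset the ymm state with vzeroupper,\n".
    "which thus incur a large speed penalty when mixed with non-avx xmm.\n";

# Disassemble in Intel syntax; the list form of open bypasses the shell.
open my $fh, "-|", "objdump", "-d", "-M", "intel", $exe or die "failed to run objdump: $!\n";
# split with a capture group yields: preamble, name, body, name, body, ...
# Labels containing '.' or a space are not treated as function starts.
my @funcs = split /^[0-9a-f]+ <([^. ]+)>:/m, join "", <$fh>;
close $fh or die "objdump failed\n";

# The fft functions are ok because the vzeroupper happens in fft_dispatch.
my $exceptions = qr/^_?(fft\d+|pass)_/;
my $err = 0;
shift @funcs;  # drop the text before the first function label
while(@funcs) {
    my $funcname = shift @funcs;
    my $asm = shift @funcs;
    # Flag functions that touch a ymm register but never execute vzeroupper.
    if($asm =~ /\bymm\d/ and $asm !~ /\bvzeroupper\b/ and $funcname !~ $exceptions) {
        print "$funcname\n";
        $err = 1;
    }
}
exit $err;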
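
For anyone who wants to try the matching logic without a binary or objdump at hand, here is a minimal sketch that runs the same split-and-scan over a hand-written sample. The function names (good_avx, bad_avx, plain_sse), the addresses, and the helper missing_vzeroupper() are made up for illustration and are not part of the script above; the fft exception list is left out to keep it short.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, hand-written objdump-style text; names and addresses are
# invented for this example.
my $disasm = <<'EOT';
0000000000001000 <good_avx>:
    1000:   c5 fc 58 c1     vaddps ymm0,ymm0,ymm1
    1004:   c5 f8 77        vzeroupper
    1007:   c3              ret
0000000000001010 <bad_avx>:
    1010:   c5 fc 58 c1     vaddps ymm0,ymm0,ymm1
    1014:   c3              ret
0000000000001020 <plain_sse>:
    1020:   0f 58 c1        addps  xmm0,xmm1
    1023:   c3              ret
EOT

# Returns the names of functions that reference a ymm register but contain
# no vzeroupper anywhere in their body.
sub missing_vzeroupper {
    my ($text) = @_;
    # Because the pattern has a capture group, split returns:
    # preamble, name1, body1, name2, body2, ...
    my @funcs = split /^[0-9a-f]+ <([^. ]+)>:/m, $text;
    shift @funcs;  # drop the (here empty) preamble before the first label
    my @hits;
    while (@funcs) {
        my $name = shift @funcs;
        my $asm  = shift @funcs;
        push @hits, $name if $asm =~ /\bymm\d/ and $asm !~ /\bvzeroupper\b/;
    }
    return @hits;
}

print "$_\n" for missing_vzeroupper($disasm);   # prints "bad_avx"
```

Note that this is a purely textual, per-function check: a function that has a vzeroupper on one exit path but not on another would still pass, so a clean run is a good sign rather than a proof.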