Diego Biurrun <[email protected]> writes:
> On Fri, Jan 27, 2012 at 07:54:36AM +0800, Ronald S. Bultje wrote:
>> Also convert to yasm, add a loop to add_png_paeth_pred() to handle
>> the case where bpp>4 (e.g. RGB48), and add a SSE2 version for
>> add_bytes_l2().
>
> I'm with Mans on this one, the patch would benefit from splitting.
He already sent a split set.
>> --- /dev/null
>> +++ b/libavcodec/pngdsp.c
>> @@ -0,0 +1,50 @@
>> +
>> +static void add_bytes_l2_c(uint8_t *dst, uint8_t *src1, uint8_t *src2, int
>> w)
>> +{
>> + long i;
>> +
>> + for (i = 0; i <= w - sizeof(long); i += sizeof(long)){
>> + long a = *(long *)(src1 + i);
>> + long b = *(long *)(src2 + i);
>> + *(long *)(dst + i) = ((a & pb_7f) + (b & pb_7f)) ^ ((a ^ b) &
>> pb_80);
>
> Gah, pointer type punning ..
Sure, but it's not Ronald's fault. Moving it to another file doesn't
make it worse.
>> +void ff_pngdsp_init(PNGDSPContext *dsp)
>> +{
>> + dsp->add_bytes_l2 = add_bytes_l2_c;
>> + dsp->add_png_paeth_prediction = ff_add_png_paeth_prediction;
>
> nit: align
>
>> +
>> + if (HAVE_MMX)
>> + ff_pngdsp_x86_init(dsp);
>
> The arch-independent code should not have to care about x86 implementation
> details like MMX, YASM or whatever, so just ARCH_X86 should be enough.
> That said, we are not currently consistent about this. See also my remark
> on the x86 build part below.
>
> And yes, I have made a mental remark to look into this.
Almost all the x86 init functions use if(HAVE_MMX). I looked into it
once, and changing it to ARCH_X86 got rather messy.
>> --- a/libavcodec/x86/Makefile
>> +++ b/libavcodec/x86/Makefile
>> @@ -42,6 +42,8 @@ MMX-OBJS-$(CONFIG_ENCODERS) +=
>> x86/dsputilenc_mmx.o
>> YASM-OBJS-$(CONFIG_ENCODERS) += x86/dsputilenc_yasm.o
>> MMX-OBJS-$(CONFIG_GPL) += x86/idct_mmx.o
>> MMX-OBJS-$(CONFIG_LPC) += x86/lpc_mmx.o
>> +YASM-OBJS-$(CONFIG_PNG_DECODER) += x86/pngdsp.o
>> +MMX-OBJS-$(CONFIG_PNG_DECODER) += x86/pngdsp-init.o
>
> Everything in pngdsp-init.c depends on yasm, so I think both should
> be YASM-OBJS.
No, that won't work. The init function itself has to be available, or
the call to it would need a HAVE_YASM protection, which you (rightly)
disallowed just a moment ago.
>> +void ff_pngdsp_x86_init(PNGDSPContext *dsp)
>> +{
>> +#if HAVE_YASM
>> + int flags = av_get_cpu_flags();
>> +
>> +#if ARCH_X86_32
>> + if (flags & AV_CPU_FLAG_MMX) {
>> + dsp->add_bytes_l2 = ff_add_bytes_l2_mmx;
>> + }
>> +#endif
>> +
>> + if (flags & AV_CPU_FLAG_MMX2) {
>> + dsp->add_png_paeth_prediction = ff_add_png_paeth_prediction_mmx2;
>> + }
>> +
>> + if (flags & AV_CPU_FLAG_SSE2) {
>> + dsp->add_bytes_l2 = ff_add_bytes_l2_sse2;
>> + }
>> +
>> + if (flags & AV_CPU_FLAG_SSSE3) {
>> + dsp->add_png_paeth_prediction = ff_add_png_paeth_prediction_ssse3;
>> + }
>
> Doesn't ARCH_X86_64 imply SSE2?
Yes.
> What's the point of the ARCH_X86_32 #ifdef?
Maybe the function is only built for x86_32, since x86_64 is guaranteed
to have SSE2, which is better.
--
Måns Rullgård
[email protected]
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel