Hi,

On Sat, Nov 12, 2011 at 2:50 PM, Martin Storsjö <[email protected]> wrote:
> On Sat, 12 Nov 2011, Ronald S. Bultje wrote:
>
>> Hi,
>>
>> On Sat, Nov 12, 2011 at 2:25 PM, Martin Storsjö <[email protected]> wrote:
>>>
>>> On Sat, 12 Nov 2011, Ronald S. Bultje wrote:
>>>
>>>> On Sat, Nov 12, 2011 at 2:18 PM, Martin Storsjö <[email protected]>
>>>> wrote:
>>>>>
>>>>> On Fri, 11 Nov 2011, Ronald S. Bultje wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On Fri, Nov 11, 2011 at 3:27 PM, Martin Storsjö <[email protected]>
>>>>>> wrote:
>>>>>>>
>>>>>>> On Fri, 11 Nov 2011, Ronald S. Bultje wrote:
>>>>>>>
>>>>>>>> As said, this is for the case where we read planar YUV, then
>>>>>>>> SwsContext->convertData[] is not used, instead the planar pointer is
>>>>>>>> used directly without any conversion. This is user-provided and may
>>>>>>>> thus be unaligned (we currently make no alignment requirements).
>>>>>>>
>>>>>>> Hmm, since this commit last Saturday
>>>>>>>
>>>>>>>> Commit: c435653627529e22d74214c2266f571255e404d6
>>>>>>>>
>>>>>>>> Author: Ronald S. Bultje <[email protected]>
>>>>>>>> Committer: Ronald S. Bultje <[email protected]>
>>>>>>>> Date: Sat Nov 5 17:31:40 2011 -0700
>>>>>>>>
>>>>>>>> swscale: write yuv2plane1 MMX/SSE2/SSE4/AVX functions.
>>>>>>>
>>>>>>> code that passes unaligned buffers to swscale started crashing. Do I read
>>>>>>> the comment above correctly, this is a bug - calling code shouldn't need
>>>>>>> to provide any particular alignment?
>>>>>>
>>>>>> Yeah, if you tell me which instruction fails I'll fix it, or a
>>>>>> backtrace + disass is fine also.
>>>>>
>>>>> On one machine, I get this backtrace + disassembly:
>>>>>
>>>>> #0 ff_yuv2plane1_8_avx.loop_a () at libswscale/x86/scale.asm:804
>>>>> #1 0x00000000007acdd1 in swScale (c=0xed3300, src=0x7fffffffe4e0,
>>>>>    srcStride=0x7fffffffe500, srcSliceY=0, srcSliceH=<value optimized out>,
>>>>>    dst=0x7fffffffe4c0, dstStride=0x7fffffffe510) at
>>>>>    libswscale/swscale.c:2510
>>>>> #2 0x0000000000792d5d in sws_scale (c=0xed3300, srcSlice=0xec5c00,
>>>>>    srcStride=<value optimized out>, srcSliceY=0, srcSliceH=19,
>>>>>    dst=<value optimized out>, dstStride=0xecb240)
>>>>>    at libswscale/swscale_unscaled.c:913
>>>>>
>>>>> Dump of assembler code for function ff_yuv2plane1_8_avx.loop_a:
>>>>>    0x00000000007c264e <+0>:  vpaddsw (%rdi,%rdx,2),%xmm2,%xmm0
>>>>>    0x00000000007c2653 <+5>:  vpaddsw 0x10(%rdi,%rdx,2),%xmm3,%xmm1
>>>>>    0x00000000007c2659 <+11>: psraw $0x7,%xmm0
>>>>>    0x00000000007c265e <+16>: psraw $0x7,%xmm1
>>>>>    0x00000000007c2663 <+21>: packuswb %xmm1,%xmm0
>>>>> => 0x00000000007c2667 <+25>: movdqa %xmm0,(%rsi,%rdx,1)
>>>>>    0x00000000007c266c <+30>: add $0x10,%rdx
>>>>>    0x00000000007c2670 <+34>: jl 0x7c264e <ff_yuv2plane1_8_avx.loop_a>
>>>>>    0x00000000007c2672 <+36>: repz retq
>>>>> End of assembler dump.
>>>>
>>>> info all-registers? I wonder what rsi/rdx are here.
>>>
>>> rax 0xed9370 15569776
>>> rbx 0xed3300 15545088
>>> rcx 0x9212f0 9573104
>>> rdx 0xffffffffffffff9c -100
>>
>> Ah, your width is not a multiple of 16. Does adding these two lines at
>> the bottom of scale.asm fix it?
>>
>> %macro yuv2plane1_fn 3
>> cglobal yuv2plane1_%1, %3, %3, %2
>> + add r2, mmsize - 1
>> + and r2, ~(mmsize - 1)
>> %if %1 == 8
>> add r1, r2
>
> Yes, that seems to fix the issue - it doesn't crash any longer in normal use
> (haven't tested all configurations, but some simple use works fine) and the
> output looks correct. Thanks!
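
For the archives, a rough C sketch of what the two added lines do, and why a width of 100 can trip the aligned store. It assumes the aligned/unaligned loop is picked by testing the destination pointer after the width has been added to it, which the snippet quoted above does not show. round_up_pow2 and first_store_aligned are made-up names for illustration, not anything in libswscale.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round n up to the next multiple of align (align must be a power of two).
 * Same arithmetic as "add r2, mmsize - 1" / "and r2, ~(mmsize - 1)". */
ptrdiff_t round_up_pow2(ptrdiff_t n, ptrdiff_t align)
{
    assert(align > 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical helper: for a loop that indexes from -w up to 0 off an end
 * pointer (end = dst + w), report whether the first store address,
 * end - w, is 16-byte aligned. */
int first_store_aligned(uintptr_t end, ptrdiff_t w)
{
    return ((end - (uintptr_t)w) & 15) == 0;
}
```

With a 16-byte-aligned end pointer such as 0x1000, a width of 100 puts the first store at 0xf9c, which is not 16-byte aligned, while the rounded-up width of 112 puts it at 0xf90, which is. Once the width is a multiple of 16, the end pointer and every store address share the same alignment, so a test on the end pointer is enough.
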
Feel free to submit the patch; I'm working on something else right now, so I'll be another few days.

Ronald
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
