On Sat, 12 Nov 2011, Ronald S. Bultje wrote:

On Sat, Nov 12, 2011 at 2:18 PM, Martin Storsjö <[email protected]> wrote:
On Fri, 11 Nov 2011, Ronald S. Bultje wrote:

Hi,

On Fri, Nov 11, 2011 at 3:27 PM, Martin Storsjö <[email protected]> wrote:

On Fri, 11 Nov 2011, Ronald S. Bultje wrote:

As said, this is for the case where we read planar YUV, then
SwsContext->convertData[] is not used, instead the planar pointer is
used directly without any conversion. This is user-provided and may
thus be unaligned (we currently make no alignment requirements).

Hmm, since this commit last Saturday

Commit: c435653627529e22d74214c2266f571255e404d6

Author:    Ronald S. Bultje <[email protected]>
Committer: Ronald S. Bultje <[email protected]>
Date:      Sat Nov  5 17:31:40 2011 -0700

swscale: write yuv2plane1 MMX/SSE2/SSE4/AVX functions.

code that passes unaligned buffers to swscale started crashing. Do I read
the comment above correctly: this is a bug, and calling code shouldn't
need to provide any particular alignment?

Yeah, if you tell me which instruction fails I'll fix it, or a
backtrace + disass is fine also.

On one machine, I get this backtrace + disassembly:

#0  ff_yuv2plane1_8_avx.loop_a () at libswscale/x86/scale.asm:804
#1  0x00000000007acdd1 in swScale (c=0xed3300, src=0x7fffffffe4e0,
   srcStride=0x7fffffffe500, srcSliceY=0, srcSliceH=<value optimized out>,
   dst=0x7fffffffe4c0, dstStride=0x7fffffffe510)
   at libswscale/swscale.c:2510
#2  0x0000000000792d5d in sws_scale (c=0xed3300, srcSlice=0xec5c00,
   srcStride=<value optimized out>, srcSliceY=0, srcSliceH=19,
   dst=<value optimized out>, dstStride=0xecb240)
   at libswscale/swscale_unscaled.c:913

Dump of assembler code for function ff_yuv2plane1_8_avx.loop_a:
  0x00000000007c264e <+0>:     vpaddsw (%rdi,%rdx,2),%xmm2,%xmm0
  0x00000000007c2653 <+5>:     vpaddsw 0x10(%rdi,%rdx,2),%xmm3,%xmm1
  0x00000000007c2659 <+11>:    psraw  $0x7,%xmm0
  0x00000000007c265e <+16>:    psraw  $0x7,%xmm1
  0x00000000007c2663 <+21>:    packuswb %xmm1,%xmm0
=> 0x00000000007c2667 <+25>:    movdqa %xmm0,(%rsi,%rdx,1)
  0x00000000007c266c <+30>:    add    $0x10,%rdx
  0x00000000007c2670 <+34>:    jl     0x7c264e <ff_yuv2plane1_8_avx.loop_a>
  0x00000000007c2672 <+36>:    repz retq
End of assembler dump.
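
For readers following the disassembly, here is a rough scalar C sketch of what one pass of that loop appears to compute per pixel: a saturating 16-bit add (vpaddsw), an arithmetic shift right by 7 (psraw), and an unsigned-saturating pack to bytes (packuswb). The function names and the 16-entry dither array are hypothetical, and treating xmm2/xmm3 as 16 per-lane dither/rounding words is an assumption read off the dump, not something taken from scale.asm.

```c
#include <stdint.h>

/* Signed saturating 16-bit add, mirroring what paddsw does per lane. */
static int16_t sat_add16(int16_t a, int16_t b)
{
    int32_t s = (int32_t)a + (int32_t)b;
    if (s > INT16_MAX) return INT16_MAX;
    if (s < INT16_MIN) return INT16_MIN;
    return (int16_t)s;
}

/* Clamp to 0..255, mirroring the unsigned saturation of packuswb. */
static uint8_t clip_u8(int v)
{
    if (v < 0)   return 0;
    if (v > 255) return 255;
    return (uint8_t)v;
}

/*
 * Hypothetical scalar equivalent of the vector loop.
 * dither[] stands in for the 16 words assumed to live in xmm2/xmm3.
 * Note: >> on a negative int is implementation-defined in C; this
 * sketch assumes the usual arithmetic shift, which is what psraw does.
 */
static void plane1_8_scalar(const int16_t *src, uint8_t *dst, int width,
                            const int16_t dither[16])
{
    for (int i = 0; i < width; i++) {
        int16_t v = sat_add16(src[i], dither[i & 15]);
        dst[i] = clip_u8(v >> 7);
    }
}
```

The scalar version has no alignment requirement on dst, which is the behaviour callers currently expect from the vector path too.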

info all-registers? I wonder what rsi/rdx are here.

rax            0xed9370 15569776
rbx            0xed3300 15545088
rcx            0x9212f0 9573104
rdx            0xffffffffffffff9c       -100
rsi            0xecabd0 15510480
rdi            0xed9588 15570312
rbp            0xed8e88 0xed8e88
rsp            0x7fffffffe1b8   0x7fffffffe1b8
r8             0x0      0
r9             0x38     56
r10            0x0      0
r11            0x0      0
r12            0x1      1
r13            0xed8b04 15567620
r14            0xed8c10 15567888
r15            0xed8e10 15568400
rip            0x7c2667 0x7c2667 <ff_yuv2plane1_8_avx.loop_a+25>
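
As a quick check of the numbers above, this small C sketch computes the effective address of the faulting store, (%rsi,%rdx,1), and tests it for the 16-byte alignment that movdqa requires. The helper names are made up for illustration; only the register values come from the dump.

```c
#include <stdint.h>

/* Effective address of a (base,index,1) memory operand. */
static uint64_t effective_address(uint64_t base, int64_t index)
{
    return base + (uint64_t)index;   /* wraps correctly for negative index */
}

/* Byte offset of addr within its 16-byte block; 0 means aligned. */
static unsigned misalignment16(uint64_t addr)
{
    return (unsigned)(addr & 15);
}

/* Nonzero if addr is suitable for an aligned 16-byte store (movdqa). */
static int is_aligned16(uint64_t addr)
{
    return misalignment16(addr) == 0;
}
```

With rsi = 0xecabd0 and rdx = -100, effective_address() gives 0xecab6c, which sits 12 bytes into a 16-byte block, so the aligned store faults. rsi itself is 16-byte aligned; the misalignment comes from the offset of -100, which is not a multiple of 16.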

// Martin