Re: [FFmpeg-devel] [PATCH 2/3] x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3, avx2}

2015-02-06 Thread James Almer
On 06/02/15 1:02 PM, Christophe Gisquet wrote:
> Sure. Hopefully, the difference in number of gpr-passed args between
> UNIX64 and WIN64 can be handled.

I'll send a patch in a moment. Thanks.

Re: [FFmpeg-devel] [PATCH 2/3] x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3, avx2}

2015-02-06 Thread Christophe Gisquet
Hi,

2015-02-06 16:41 GMT+01:00 James Almer:
> Wouldn't it be better to just use the same code as UNIX64 instead? Now that
> we're going to load the address of the table to a reg, there's no point in
> having a whole separate init path for WIN64.
> One for X86_64 and one for X86_32 (Where a

Re: [FFmpeg-devel] [PATCH 2/3] x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3, avx2}

2015-02-06 Thread James Almer
On 06/02/15 9:49 AM, Christophe Gisquet wrote:
> diff --git a/libavcodec/x86/hevc_sao.asm b/libavcodec/x86/hevc_sao.asm
> index 5136121..8619716 100644
> --- a/libavcodec/x86/hevc_sao.asm
> +++ b/libavcodec/x86/hevc_sao.asm
> @@ -296,14 +296,16 @@ HEVC_SAO_BAND_FILTER_16 12, 64, 2
> %if WIN64
> c

Re: [FFmpeg-devel] [PATCH 2/3] x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3, avx2}

2015-02-06 Thread James Almer
On 06/02/15 7:07 AM, Hendrik Leppkes wrote:
> For some reason, the non-WIN64 version does just this, but the WIN64
> version does not.
> Any particular reason for this difference?

We support PIC on WIN64 but not with others. Or at least x86inc.asm enables it that way. According to msvc documenta

Re: [FFmpeg-devel] [PATCH 2/3] x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3, avx2}

2015-02-06 Thread Christophe Gisquet
Hi,

2015-02-06 11:07 GMT+01:00 Hendrik Leppkes:
> I looked into the MSVC 64bit failure from this patch, and from what I
> can tell doing this doesn't work:
> movsx a_strideq, byte [pb_eo+eoq*4+1]
>
> I'm not entirely sure on the specifics why it breaks however..
> But all I could find sugges
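
For illustration, a minimal standalone sketch of the two addressing forms under discussion, assuming plain NASM/yasm syntax without x86inc.asm. The function name load_a_stride, the use of rcx for the index (the first WIN64 integer argument), and the table contents are hypothetical placeholders, not the code or values from hevc_sao.asm. An operand that combines a symbol with an index register cannot be encoded RIP-relative, so it needs a 32-bit absolute address, which would be consistent with the MSVC 64-bit failure described above. Loading the table address into a register first avoids that.

```nasm
default rel                             ; make plain [symbol] operands RIP-relative

section .data
; Placeholder 16-byte table (4 entries of 4 bytes), NOT the real pb_eo values.
pb_eo:  db 0, 10, 0, 11,  0, 20, 0, 21,  0, 30, 0, 31,  0, 40, 0, 41

section .text
global load_a_stride

; Hypothetical helper: rcx = table index (0..3), result returned in rax.
load_a_stride:
    ; Problematic form: symbol + scaled index forces an absolute 32-bit
    ; displacement, because RIP-relative addressing has no index register.
    ;   movsx rax, byte [pb_eo+rcx*4+1]

    lea     rax, [pb_eo]                ; RIP-relative load of the table address
    movsx   rax, byte [rax+rcx*4+1]     ; index off the register instead
    ret
```

The same two-instruction pattern works unchanged on UNIX64, which is why a single X86_64 init path becomes possible once the table address lives in a register.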

Re: [FFmpeg-devel] [PATCH 2/3] x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3, avx2}

2015-02-06 Thread Hendrik Leppkes
On Thu, Feb 5, 2015 at 7:07 PM, James Almer wrote:
>
> On 05/02/15 12:49 PM, Christophe Gisquet wrote:
> > Hi,
> >
> > 2015-02-05 5:18 GMT+01:00 James Almer:
> >> Original x86 intrinsics code and initial yasm port by Pierre-Edouard
> >> Lepere.
> >> Refactoring and optimizations by James Almer.

Re: [FFmpeg-devel] [PATCH 2/3] x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3, avx2}

2015-02-05 Thread James Almer
On 05/02/15 12:49 PM, Christophe Gisquet wrote:
> Hi,
>
> 2015-02-05 5:18 GMT+01:00 James Almer:
>> Original x86 intrinsics code and initial yasm port by Pierre-Edouard Lepere.
>> Refactoring and optimizations by James Almer.
>
> No further comment from me.

Pushed, thanks.

Re: [FFmpeg-devel] [PATCH 2/3] x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3, avx2}

2015-02-05 Thread Christophe Gisquet
Hi,

2015-02-05 5:18 GMT+01:00 James Almer:
> Original x86 intrinsics code and initial yasm port by Pierre-Edouard Lepere.
> Refactoring and optimizations by James Almer.

No further comment from me.

--
Christophe

[FFmpeg-devel] [PATCH 2/3] x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3, avx2}

2015-02-04 Thread James Almer
Original x86 intrinsics code and initial yasm port by Pierre-Edouard Lepere.
Refactoring and optimizations by James Almer.

Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U

Width 32
158583 decicycles in edge, sao_edge_filter_8 runs, 0 skips
5205 decicycles in ff_hevc_sao_e