Hi,
On Sat, Jul 28, 2012 at 4:57 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
---
libavresample/x86/audio_convert.asm| 49
libavresample/x86/audio_convert_init.c |9 ++
2 files changed, 58 insertions(+), 0 deletions(-)
OK.
Ronald
From: Ronald S. Bultje rsbul...@gmail.com
This way, the code looks less like spaghetti, and is easier to parse
for external preprocessors.
---
avconv.c | 44
1 file changed, 28 insertions(+), 16 deletions(-)
diff --git a/avconv.c b/avconv.c
index
From: Ronald S. Bultje rsbul...@gmail.com
This way, the code looks less like spaghetti, and is easier to parse
for external preprocessors.
---
libavformat/utils.c |9 +++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/libavformat/utils.c b/libavformat/utils.c
index
From: Ronald S. Bultje rsbul...@gmail.com
This way, the code looks less like spaghetti, and is easier to parse
for external preprocessors.
---
libavfilter/avfilter.c | 10 +++---
libavfilter/vsrc_testsrc.c |7 +--
2 files changed, 12 insertions(+), 5 deletions(-)
diff --git
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/h264_ps.c |3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/libavcodec/h264_ps.c b/libavcodec/h264_ps.c
index 3f53af8..7d9d596 100644
--- a/libavcodec/h264_ps.c
+++ b/libavcodec/h264_ps.c
@@ -431,6 +431,7 @@ int
Hi,
On Thu, Jul 26, 2012 at 11:40 PM, Luca Barbato lu_z...@gentoo.org wrote:
On 07/27/2012 07:16 AM, Ronald S. Bultje wrote:
From: Ronald S. Bultje rsbul...@gmail.com
64-bit CPUs always have SSE2, and a SSE2 version exists, thus the MMX
version will never be used.
---
libavcodec/x86
Hi,
On Fri, Jul 27, 2012 at 8:54 AM, Justin Ruggles
justin.rugg...@gmail.com wrote:
On 07/21/2012 05:39 PM, Justin Ruggles wrote:
---
Updated patch to allow float vs. dword min/max as a parameter to CLIPD
instead of using 2 separate macros.
libavcodec/x86/dsputil_mmx.c|6 ++--
Hi,
On Sat, Jul 21, 2012 at 2:39 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
%macro VECTOR_CLIP_INT32 2
cglobal vector_clip_int32, 5,5,11, dst, src, min, max, len
+SPLATD_LOW m4, minm
+SPLATD_LOW m5, maxm
%if notcpuflag(sse4) && cpuflag(sse2) && notcpuflag(atom)
-cvtsi2ss
Hi,
On Thu, Jul 26, 2012 at 6:42 AM, Måns Rullgård m...@mansr.com wrote:
Ronald S. Bultje rsbul...@gmail.com writes:
Hi,
On Thu, Jul 26, 2012 at 2:23 AM, Måns Rullgård m...@mansr.com wrote:
Ronald S. Bultje rsbul...@gmail.com writes:
From: Ronald S. Bultje rsbul...@gmail.com
Hi,
On Thu, Jul 26, 2012 at 9:46 AM, Ronald S. Bultje rsbul...@gmail.com wrote:
On Thu, Jul 26, 2012 at 9:05 AM, Måns Rullgård m...@mansr.com wrote:
Ronald S. Bultje rsbul...@gmail.com writes:
On Thu, Jul 26, 2012 at 7:30 AM, Martin Storsjö mar...@martin.st wrote:
On Thu, 26 Jul 2012, Ronald
From: Loren Merritt lor...@u.washington.edu
This allows us to unconditionally set the cglobal num_args
parameter to a bigger value, thus making writing yasm code
even easier than before.
Signed-off-by: Ronald S. Bultje rsbul...@gmail.com
---
libavutil/x86/x86inc.asm |3 +++
1 file changed
From: Ronald S. Bultje rsbul...@gmail.com
This completes the conversion of h264dsp to yasm; note that h264 also
uses some dsputil functions, most notably qpel. Performance-wise, the
yasm-version is ~10 cycles faster (182-172) on x86-64, and ~8 cycles
faster (201-193) on x86-32.
---
libavcodec
Hi,
On Fri, Jul 27, 2012 at 11:39 AM, Diego Biurrun di...@biurrun.de wrote:
On Thu, Jul 26, 2012 at 08:38:27PM -0700, Ronald S. Bultje wrote:
--- a/libavcodec/x86/proresdsp.asm
+++ b/libavcodec/x86/proresdsp.asm
@@ -406,27 +405,25 @@ cglobal prores_idct_put_10_%1, 4, 4, %2
-INIT_XMM
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/x86/h264_deblock.asm | 126 +++--
libavcodec/x86/h264_deblock_10bit.asm | 77 ++--
libavcodec/x86/h264dsp_mmx.c | 60
3 files changed, 141 insertions(+), 122
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/x86/vp3dsp.asm | 94 ++---
1 file changed, 47 insertions(+), 47 deletions(-)
diff --git a/libavcodec/x86/vp3dsp.asm b/libavcodec/x86/vp3dsp.asm
index af2f60c..5877520 100644
--- a/libavcodec/x86
Hi,
On Fri, Jul 27, 2012 at 2:49 PM, Diego Biurrun di...@biurrun.de wrote:
On Thu, Jul 26, 2012 at 08:54:30PM -0700, Ronald S. Bultje wrote:
--- a/libavcodec/x86/h264_idct_10bit.asm
+++ b/libavcodec/x86/h264_idct_10bit.asm
@@ -72,25 +72,25 @@ SECTION .text
;;; NO FATE SAMPLES TRIGGER
Hi,
On Fri, Jul 27, 2012 at 4:45 PM, Diego Biurrun di...@biurrun.de wrote:
On Fri, Jul 27, 2012 at 03:08:26PM -0700, Ronald S. Bultje wrote:
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -282,8 +282,8 @@ cextern pb_A1
Hi,
On Fri, Jul 27, 2012 at 5:04 PM, Diego Biurrun di...@biurrun.de wrote:
On Fri, Jul 27, 2012 at 04:49:18PM -0700, Ronald S. Bultje wrote:
On Fri, Jul 27, 2012 at 4:45 PM, Diego Biurrun di...@biurrun.de wrote:
On Fri, Jul 27, 2012 at 03:08:26PM -0700, Ronald S. Bultje wrote
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/x86/h264_deblock.asm | 104 -
libavcodec/x86/h264_deblock_10bit.asm | 77
libavcodec/x86/h264dsp_mmx.c | 60 +--
3 files changed, 120 insertions
Hi,
On Fri, Jul 27, 2012 at 2:43 PM, Måns Rullgård m...@mansr.com wrote:
However, the question still remains why it is in generic code.
That's hard to say in hindsight, but it seems it was for simplicity so
that you don't have to add it to each individual mmx function, thus
making the assumption
Hi,
On Wed, Jul 25, 2012 at 10:32 PM, Luca Barbato lu_z...@gentoo.org wrote:
From: Ronald S. Bultje rsbul...@gmail.com
---
Here my initial twist about it, ideally I'd consider moving os_support
in libavu and include it automagically from config.h
I'm not sure why, we do similar hacks
Hi,
On Wed, Jul 25, 2012 at 11:05 PM, Alex Converse alex.conve...@gmail.com wrote:
On Wed, Jul 25, 2012 at 8:42 PM, Ronald S. Bultje rsbul...@gmail.com wrote:
From: Ronald S. Bultje rsbul...@gmail.com
This fixes make fate-eval on MSVC builds. Without this, the test outputs
-1.#NaN instead
Hi,
On Thu, Jul 26, 2012 at 2:06 AM, Diego Biurrun di...@biurrun.de wrote:
On Thu, Jul 26, 2012 at 05:10:10AM +0200, Luca Barbato wrote:
On 07/26/2012 04:27 AM, Ronald S. Bultje wrote:
From: Ronald S. Bultje rsbul...@gmail.com
---
libswscale/swscale.c |2 +-
1 file changed, 1
Hi,
On Thu, Jul 26, 2012 at 2:23 AM, Måns Rullgård m...@mansr.com wrote:
Ronald S. Bultje rsbul...@gmail.com writes:
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/dct-test.c |2 +-
libavcodec/x86/dsputilenc_mmx.c | 80
Hi,
On Thu, Jul 26, 2012 at 2:06 AM, Diego Biurrun di...@biurrun.de wrote:
On Thu, Jul 26, 2012 at 05:10:10AM +0200, Luca Barbato wrote:
On 07/26/2012 04:27 AM, Ronald S. Bultje wrote:
From: Ronald S. Bultje rsbul...@gmail.com
---
libswscale/swscale.c |2 +-
1 file changed, 1
Hi,
On Thu, Jul 26, 2012 at 7:30 AM, Martin Storsjö mar...@martin.st wrote:
On Thu, 26 Jul 2012, Ronald S. Bultje wrote:
Hi,
On Thu, Jul 26, 2012 at 2:06 AM, Diego Biurrun di...@biurrun.de wrote:
On Thu, Jul 26, 2012 at 05:10:10AM +0200, Luca Barbato wrote:
On 07/26/2012 04:27 AM, Ronald
Hi,
On Thu, Jul 26, 2012 at 9:05 AM, Måns Rullgård m...@mansr.com wrote:
Ronald S. Bultje rsbul...@gmail.com writes:
On Thu, Jul 26, 2012 at 7:30 AM, Martin Storsjö mar...@martin.st wrote:
On Thu, 26 Jul 2012, Ronald S. Bultje wrote:
On Thu, Jul 26, 2012 at 2:06 AM, Diego Biurrun di
Hi guys,
discussion thread. We currently use HAVE_SSSE3 and related macros to
indicate that we want to compile these and that our compiler tools are
good enough to know what to do with it. As a result, we currently use
HAVE_AVX around all avx code (yasm only - we don't have any avx inline
asm),
Hi,
On Thu, Jul 26, 2012 at 2:39 PM, Diego Biurrun di...@biurrun.de wrote:
On Thu, Jul 26, 2012 at 01:50:17PM -0700, Ronald S. Bultje wrote:
discussion thread. We currently use HAVE_SSSE3 and related macros to
indicate that we want to compile these and that our compiler tools are
good enough
Hi,
On Thu, Jul 26, 2012 at 3:51 PM, Loren Merritt lor...@u.washington.edu wrote:
14% faster on penryn, 2% on sandybridge, 9% on bulldozer
---
libavfilter/vf_hqdn3d.c | 157 +++---
1 files changed, 51 insertions(+), 106 deletions(-)
Looks good.
I am
Hi,
On Thu, Jul 26, 2012 at 3:51 PM, Loren Merritt lor...@u.washington.edu wrote:
11% faster on penryn, 7% on sandybridge, 5% on bulldozer
Negligible change to output.
---
libavfilter/vf_hqdn3d.c | 62 --
1 files changed, 32 insertions(+), 30
Hi,
On Thu, Jul 26, 2012 at 3:51 PM, Loren Merritt lor...@u.washington.edu wrote:
---
libavfilter/vf_hqdn3d.c | 68 +-
1 files changed, 49 insertions(+), 19 deletions(-)
Can you add 9bpp support also? Not that it's used much, but it'll use
the
Hi,
On Thu, Jul 26, 2012 at 3:54 PM, Diego Biurrun di...@biurrun.de wrote:
On Thu, Jul 26, 2012 at 03:42:24PM -0700, Ronald S. Bultje wrote:
On Thu, Jul 26, 2012 at 2:39 PM, Diego Biurrun di...@biurrun.de wrote:
On Thu, Jul 26, 2012 at 01:50:17PM -0700, Ronald S. Bultje wrote:
discussion
Hi,
On Thu, Jul 26, 2012 at 5:15 PM, Diego Biurrun di...@biurrun.de wrote:
---
libavcodec/Makefile |2 +-
tests/fate/dct.mak |2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
This test tests a lot more than just aan dct?
Ronald
From: Ronald S. Bultje rsbul...@gmail.com
---
libavutil/x86/x86inc.asm | 216 ++
1 file changed, 124 insertions(+), 92 deletions(-)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index b76a10c..23d9d57 100644
--- a/libavutil/x86
Hi,
On Thu, Jul 26, 2012 at 6:42 PM, Loren Merritt lor...@u.washington.edu wrote:
---
libavfilter/vf_hqdn3d.c | 72 ++
1 files changed, 53 insertions(+), 19 deletions(-)
OK.
Ronald
___
libav-devel
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/x86/proresdsp.asm | 39 ++-
1 file changed, 18 insertions(+), 21 deletions(-)
diff --git a/libavcodec/x86/proresdsp.asm b/libavcodec/x86/proresdsp.asm
index 9b2e11e..70fd686 100644
--- a/libavcodec
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/x86/dsputil_mmx.c | 16 ++---
libavcodec/x86/h264_chromamc_10bit.asm | 40
2 files changed, 28 insertions(+), 28 deletions(-)
diff --git a/libavcodec/x86/dsputil_mmx.c b/libavcodec
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/x86/h264_idct_10bit.asm | 210 ++--
1 file changed, 105 insertions(+), 105 deletions(-)
diff --git a/libavcodec/x86/h264_idct_10bit.asm
b/libavcodec/x86/h264_idct_10bit.asm
index 934a7ff..fd61c98 100644
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/x86/h264_deblock.asm | 120 ++-
libavcodec/x86/h264dsp_mmx.c| 42 +++---
2 files changed, 88 insertions(+), 74 deletions(-)
diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/x86/h264_deblock.asm | 124 +++--
libavcodec/x86/h264_deblock_10bit.asm | 77 ++--
libavcodec/x86/h264dsp_mmx.c | 60
3 files changed, 139 insertions(+), 122
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/x86/vp56dsp.asm | 34 +++---
1 file changed, 15 insertions(+), 19 deletions(-)
diff --git a/libavcodec/x86/vp56dsp.asm b/libavcodec/x86/vp56dsp.asm
index 66a97f1..27a82bc 100644
--- a/libavcodec/x86
From: Ronald S. Bultje rsbul...@gmail.com
All x86-64 CPUs have SSE2, so the MMX version will never be used. This
leads to smaller binaries.
---
libavcodec/x86/vp56dsp.asm|2 ++
libavcodec/x86/vp56dsp_init.c |2 ++
2 files changed, 4 insertions(+)
diff --git a/libavcodec/x86
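The reasoning in the commit message above can be illustrated with a minimal C sketch. This is not the code from the patch: `ARCH_X86_32`, the `CPU_FLAG_*` values, and the `impl_*` functions are hypothetical stand-ins. The point is only that, since SSE2 is part of the x86-64 baseline, an MMX routine that is always superseded by an SSE2 one can be compiled for 32-bit targets only.

```c
/* Hypothetical config macro: 1 when targeting 32-bit x86, 0 otherwise. */
#if defined(__i386__) || defined(_M_IX86)
#define ARCH_X86_32 1
#else
#define ARCH_X86_32 0
#endif

/* Hypothetical CPU feature flags, for illustration only. */
enum { CPU_FLAG_MMX = 1, CPU_FLAG_SSE2 = 2 };

typedef const char *(*impl_fn)(void);

/* Stand-ins for real routines; each just reports its own name. */
static const char *impl_c(void)    { return "c"; }
#if ARCH_X86_32
/* Only built for 32-bit x86: on x86-64 SSE2 is always present. */
static const char *impl_mmx(void)  { return "mmx"; }
#endif
static const char *impl_sse2(void) { return "sse2"; }

/* Pick the best available implementation for the given CPU flags. */
static impl_fn select_impl(int cpu_flags)
{
    impl_fn fn = impl_c;
#if ARCH_X86_32
    if (cpu_flags & CPU_FLAG_MMX)
        fn = impl_mmx;
#endif
    if (cpu_flags & CPU_FLAG_SSE2)
        fn = impl_sse2;   /* overrides MMX whenever both are present */
    return fn;
}
```

On a 64-bit build the MMX branch disappears entirely, which is where the smaller binary comes from.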
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/x86/vp3dsp.asm | 36 ++--
1 file changed, 22 insertions(+), 14 deletions(-)
diff --git a/libavcodec/x86/vp3dsp.asm b/libavcodec/x86/vp3dsp.asm
index af2f60c..98b1cb5 100644
--- a/libavcodec/x86/vp3dsp.asm
From: Ronald S. Bultje rsbul...@gmail.com
64-bit CPUs always have SSE2, and a SSE2 version exists, thus the MMX
version will never be used.
---
libavcodec/x86/vp3dsp.asm|3 +++
libavcodec/x86/vp3dsp_init.c |2 ++
2 files changed, 5 insertions(+)
diff --git a/libavcodec/x86
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/x86/rv34dsp.asm | 11 ++-
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/libavcodec/x86/rv34dsp.asm b/libavcodec/x86/rv34dsp.asm
index 32bcdce..c43b77a 100644
--- a/libavcodec/x86/rv34dsp.asm
+++ b/libavcodec/x86
Hi,
On Mon, Jul 23, 2012 at 5:30 PM, Derek Buitenhuis
derek.buitenh...@gmail.com wrote:
From: Yang Wang yang.y.w...@intel.com
In ff_put_pixels_clamped_mmx(), there are two assembly code blocks.
In the first block (in the unrolled loop), the instructions
movq 8%3, %%mm1 \n\t, and so forth,
Hi,
On Tue, Jul 24, 2012 at 2:03 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
---
configure|5 +
libavutil/x86/x86inc.asm | 16 +++-
2 files changed, 16 insertions(+), 5 deletions(-)
OK.
Ronald
From: Ronald S. Bultje rsbul...@gmail.com
---
libswscale/swscale.c |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libswscale/swscale.c b/libswscale/swscale.c
index 5cfa7f2..0f8ef2b 100644
--- a/libswscale/swscale.c
+++ b/libswscale/swscale.c
@@ -661,7 +661,7 @@ static int
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/x86/dct32_sse.asm|2 --
libavcodec/x86/dsputil_yasm.asm | 14 --
libavcodec/x86/fft_mmx.asm |6 --
libavresample/x86/audio_convert.asm | 10 --
libavresample/x86/audio_mix.asm
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/dct-test.c |2 +-
libavcodec/x86/dsputilenc_mmx.c | 80 +++
libavcodec/x86/fdct_mmx.c |4 ++
libavcodec/x86/motion_est_mmx.c |6 +++
libavcodec/x86/mpegvideo_mmx.c |6
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/x86/mpegaudiodec_mmx.c |5 +
1 file changed, 5 insertions(+)
diff --git a/libavcodec/x86/mpegaudiodec_mmx.c
b/libavcodec/x86/mpegaudiodec_mmx.c
index f51a06d..88a3477 100644
--- a/libavcodec/x86/mpegaudiodec_mmx.c
+++ b
From: Ronald S. Bultje rsbul...@gmail.com
---
libavutil/eval.c | 35 +++
1 file changed, 35 insertions(+)
diff --git a/libavutil/eval.c b/libavutil/eval.c
index ff3191d..ef37ad8 100644
--- a/libavutil/eval.c
+++ b/libavutil/eval.c
@@ -26,6 +26,7 @@
* see http
From: Ronald S. Bultje rsbul...@gmail.com
This fixes make fate-eval on MSVC builds. Without this, the test outputs
-1.#NaN instead of nan on MSVS 2010.
---
libavutil/eval.c |5 +
1 file changed, 5 insertions(+)
diff --git a/libavutil/eval.c b/libavutil/eval.c
index ef37ad8..6131263
From: Ronald S. Bultje rsbul...@gmail.com
---
libavcodec/x86/dct32_sse.asm|2 --
libavcodec/x86/dsputil_yasm.asm | 14 --
libavcodec/x86/fft_mmx.asm |6 --
libavresample/x86/audio_convert.asm | 10 --
libavresample/x86/audio_mix.asm
From: Ronald S. Bultje rsbul...@gmail.com
---
avconv.c |5 +++--
avprobe.c |5 +++--
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/avconv.c b/avconv.c
index 7142ab4..439672a 100644
--- a/avconv.c
+++ b/avconv.c
@@ -104,7 +104,7 @@ typedef struct MetadataMap {
int
Hi,
On Tue, Jul 24, 2012 at 7:25 AM, Luca Barbato lu_z...@gentoo.org wrote:
On 7/24/12 4:45 AM, Ronald S. Bultje wrote:
Hi,
On Mon, Jul 23, 2012 at 7:45 PM, Ronald S. Bultje rsbul...@gmail.com
wrote:
Hi,
On Mon, Jul 23, 2012 at 5:37 PM, Daniel Kang daniel.d.k...@gmail.com
wrote
From: Ronald S. Bultje rsbul...@gmail.com
Idea stolen from webp (by Pascal Massimino) - because it's Cool.
---
libavcodec/vp8.c | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/libavcodec/vp8.c b/libavcodec/vp8.c
index d0e2a0c..e4cfbcb 100644
--- a/libavcodec
Hi,
On Tue, Jul 24, 2012 at 3:05 PM, Jason Garrett-Glaser ja...@x264.com wrote:
On Tue, Jul 24, 2012 at 9:02 AM, John Stebbins stebb...@jetheaddev.com
wrote:
On 07/24/2012 05:53 PM, Jason Garrett-Glaser wrote:
On Tue, Jul 24, 2012 at 8:34 AM, Måns Rullgård m...@mansr.com wrote:
Jason
Hi,
On Sat, Jul 14, 2012 at 10:33 AM, Justin Ruggles
justin.rugg...@gmail.com wrote:
On 06/26/2012 04:55 PM, Justin Ruggles wrote:
Removes a false dependency on existing contents of the 2nd dst register,
giving better performance for OOE.
---
libavresample/x86/util.asm |3 ++-
1 files
Hi,
On Sat, Jul 14, 2012 at 9:29 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
---
libavresample/x86/audio_convert.asm| 36
libavresample/x86/audio_convert_init.c | 13 +++
2 files changed, 49 insertions(+), 0 deletions(-)
diff --git
Hi,
On Sat, Jul 14, 2012 at 9:29 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
---
libavresample/x86/audio_convert.asm| 62
libavresample/x86/audio_convert_init.c |9 +
2 files changed, 71 insertions(+), 0 deletions(-)
diff --git
Hi,
On Sat, Jul 14, 2012 at 9:29 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
---
libavresample/x86/audio_convert.asm| 49
libavresample/x86/audio_convert_init.c |9 ++
2 files changed, 58 insertions(+), 0 deletions(-)
LGTM.
Ronald
Hi,
On Sat, Jul 21, 2012 at 12:12 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
+%if cpuflag(ssse3)
+pshufb m3, m0, unpack_odd ; m3 = 12, 13, 14, 15
+pshufb m0, unpack_even ; m0 = 0, 1, 2, 3
+pshufb m4, m1, unpack_odd ; m4 =
Hi,
On Sat, Jul 14, 2012 at 9:29 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
---
libavresample/x86/audio_convert.asm| 37
libavresample/x86/audio_convert_init.c |9 +++
2 files changed, 46 insertions(+), 0 deletions(-)
diff --git
Hi,
On Sat, Jul 14, 2012 at 9:29 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
+%else ; sse
+mova xmm0, [srcq ]
+mova xmm1, [srcq+src1q]
+mova xmm2, [srcq+src2q]
+mova xmm3, [srcq+src3q]
+mova xmm4, [srcq+src4q]
+mova xmm5,
Hi,
On Sat, Jul 14, 2012 at 9:29 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
+mova [dstq ], m0
+mova [dstq+1*mmsize], m1
+mova [dstq+2*mmsize], m2
+mova [dstq+3*mmsize], m3
+add srcq, mmsize*2
+add dstq, mmsize*4
+sub lend, mmsize/2
You
Hi,
On Sat, Jul 14, 2012 at 9:29 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
---
libavresample/x86/audio_convert.asm| 38
libavresample/x86/audio_convert_init.c | 11 +
2 files changed, 49 insertions(+), 0 deletions(-)
diff --git
Hi,
On Tue, Jul 24, 2012 at 9:41 PM, Ronald S. Bultje rsbul...@gmail.com wrote:
Hi,
On Sat, Jul 14, 2012 at 9:29 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
---
libavresample/x86/audio_convert.asm| 38
libavresample/x86/audio_convert_init.c
Hi,
On Sat, Jul 14, 2012 at 9:29 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
+mova m0, [srcq ] ; m0 = 0, 1, 2, 3, 4, 5, 6, 7
+mova m2, [srcq+2*mmsize] ; m2 = 16, 17, 18, 19, 20, 21, 22, 23
+movq m3, [srcq+ mmsize+mmsize/2]
+
Hi,
On Sat, Jul 14, 2012 at 9:29 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
---
libavresample/x86/audio_convert.asm| 41
libavresample/x86/audio_convert_init.c | 13 ++
2 files changed, 54 insertions(+), 0 deletions(-)
LGTM.
Ronald
Hi,
On Sat, Jul 14, 2012 at 9:29 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
+mova m0, [srcq ] ; m0 = 0, 1, 2, 3, 4, 5, 6, 7
+mova m1, [srcq+ mmsize] ; m1 = 8, 9, 10, 11, 12, 13, 14, 15
+mova m2, [srcq+2*mmsize] ; m2 = 16, 17, 18, 19,
Hi,
On Sat, Jul 14, 2012 at 9:29 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
---
libavresample/x86/audio_convert.asm| 49
libavresample/x86/audio_convert_init.c |9 ++
2 files changed, 58 insertions(+), 0 deletions(-)
LGTM.
Ronald
Hi,
On Sat, Jul 14, 2012 at 9:29 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
+movhlps m3, m1
+movlhps m3, m2 ; m3 = 12, 13, 14, 15, 16, 17, 18, 19
+movlhps m1, m1
+movhlps m1, m0 ; m1 = 4, 5, 6, 7, 8, 9, 10, 11
+psrldq
Hi,
On Sat, Jul 14, 2012 at 9:29 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
---
libavresample/x86/audio_convert.asm| 34
libavresample/x86/audio_convert_init.c |9
2 files changed, 43 insertions(+), 0 deletions(-)
OK.
(Can this
Hi,
On Tue, Jul 17, 2012 at 6:16 AM, Justin Ruggles
justin.rugg...@gmail.com wrote:
---
libavresample/x86/audio_convert.asm| 63
libavresample/x86/audio_convert_init.c |9 +
2 files changed, 72 insertions(+), 0 deletions(-)
(I'm going to
Hi,
On Tue, Jul 24, 2012 at 7:45 PM, Loren Merritt lor...@u.washington.edu wrote:
-long x, y;
-uint32_t pixel;
+uint32_t tmp;
-for (y = 0; y < h; y++) {
-for (x = 0; x < w; x++) {
-pixel = lowpass(frame_ant[x]<<8, src[x]<<16, temporal);
-
Hi,
On Mon, Jul 23, 2012 at 7:12 AM, Ronald S. Bultje rsbul...@gmail.com wrote:
On Sun, Jul 22, 2012 at 2:38 PM, Ronald S. Bultje rsbul...@gmail.com wrote:
From: Ronald S. Bultje rsbul...@gmail.com
Mixing yasm and inline asm is a bad idea, since if either yasm or inline
asm is not supported
Hi,
On Sun, Jul 22, 2012 at 3:27 PM, Derek Buitenhuis
derek.buitenh...@gmail.com wrote:
On 22/07/2012 6:14 PM, Ronald S. Bultje wrote:
From: Ronald S. Bultje rsbul...@gmail.com
This allows compiling with compilers that don't support gcc-style
inline assembly.
---
I think this looks OK
Hi,
On Mon, Jul 23, 2012 at 2:05 AM, Diego Biurrun di...@biurrun.de wrote:
On Sun, Jul 22, 2012 at 08:46:10PM -0700, Ronald S. Bultje wrote:
From: Ronald S. Bultje rsbul...@gmail.com
Write out the NAL decoding loops in full so that they are easier to
parse for a preprocessor without
Hi,
On Sun, Jul 22, 2012 at 2:38 PM, Ronald S. Bultje rsbul...@gmail.com wrote:
From: Ronald S. Bultje rsbul...@gmail.com
Mixing yasm and inline asm is a bad idea, since if either yasm or inline
asm is not supported by your toolchain, all of the asm stops working.
Thus, better to use either
Hi,
On Sun, Jul 22, 2012 at 1:16 PM, Ronald S. Bultje rsbul...@gmail.com wrote:
From: Ronald S. Bultje rsbul...@gmail.com
This completes the conversion of h264dsp to yasm; note that h264 also
uses some dsputil functions, most notably qpel. Performance-wise, the
yasm-version is ~10 cycles
Hi,
On Mon, Jul 23, 2012 at 7:14 AM, Kostya Shishkov
kostya.shish...@gmail.com wrote:
On Mon, Jul 23, 2012 at 07:11:49AM -0700, Ronald S. Bultje wrote:
Hi,
On Mon, Jul 23, 2012 at 2:05 AM, Diego Biurrun di...@biurrun.de wrote:
On Sun, Jul 22, 2012 at 08:46:10PM -0700, Ronald S. Bultje wrote
Hi,
On Mon, Jul 23, 2012 at 7:29 AM, Luca Barbato lu_z...@gentoo.org wrote:
From: Ronald S. Bultje rsbul...@gmail.com
Write out the NAL decoding loops in full so that they are easier
to parse for a preprocessor without it having to be aware of macros
or other such things in C code
Hi,
On Mon, Jul 23, 2012 at 5:37 PM, Daniel Kang daniel.d.k...@gmail.com wrote:
On Mon, Jul 23, 2012 at 5:21 PM, Diego Biurrun di...@biurrun.de wrote:
On Mon, Jul 23, 2012 at 05:12:23PM -0700, Daniel Kang wrote:
From: Daniel Kang daniel.d.k...@gmail.com
The only CPUs that have 3dnow and
Hi,
On Mon, Jul 23, 2012 at 7:45 PM, Ronald S. Bultje rsbul...@gmail.com wrote:
Hi,
On Mon, Jul 23, 2012 at 5:37 PM, Daniel Kang daniel.d.k...@gmail.com wrote:
On Mon, Jul 23, 2012 at 5:21 PM, Diego Biurrun di...@biurrun.de wrote:
On Mon, Jul 23, 2012 at 05:12:23PM -0700, Daniel Kang wrote
Hi,
On Sat, Jul 21, 2012 at 5:03 PM, Ronald S. Bultje rsbul...@gmail.com wrote:
From: Ronald S. Bultje rsbul...@gmail.com
This allows compiling this code using compilers that do not understand
gcc-style inline assembly.
---
libavfilter/x86/gradfun.c |6 ++
libavfilter/x86/yadif.c
Hi,
On Sat, Jul 21, 2012 at 5:19 PM, Måns Rullgård m...@mansr.com wrote:
Ronald S. Bultje rsbul...@gmail.com writes:
From: Ronald S. Bultje rsbul...@gmail.com
This removes some code duplication between the 3 different versions,
and aligns brackets in such a way that it is now possible
Hi,
On Sun, Jul 22, 2012 at 8:17 AM, Måns Rullgård m...@mansr.com wrote:
Ronald S. Bultje rsbul...@gmail.com writes:
This manner of splitting things is incredibly weird-looking. Instead of
trying to unify these rather different fragments, turning the second
half of the loop into a macro
From: Ronald S. Bultje rsbul...@gmail.com
This completes the conversion of h264dsp to yasm; note that h264 also
uses some dsputil functions, most notably qpel. Performance-wise, the
yasm-version is ~10 cycles faster (182-172) on x86-64, and ~8 cycles
faster (201-193) on x86-32.
---
libavcodec
From: Ronald S. Bultje rsbul...@gmail.com
The function called in this block is under HAVE_INLINE_ASM itself also.
---
libswscale/swscale.c |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libswscale/swscale.c b/libswscale/swscale.c
index 7ae5af3..5cfa7f2 100644
From: Ronald S. Bultje rsbul...@gmail.com
---
libswscale/utils.c | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/libswscale/utils.c b/libswscale/utils.c
index d8fee58..a6b5a18 100644
--- a/libswscale/utils.c
+++ b/libswscale/utils.c
@@ -576,7 +576,7 @@ fail
From: Ronald S. Bultje rsbul...@gmail.com
This allows compiling with compilers that don't support gcc-style
inline assembly.
---
libavcodec/x86/dsputil_mmx.c | 69 --
libavcodec/x86/h264_qpel_mmx.c |4 ++-
libavcodec/x86/idct_mmx.c|4
From: Ronald S. Bultje rsbul...@gmail.com
Mixing yasm and inline asm is a bad idea, since if either yasm or inline
asm is not supported by your toolchain, all of the asm stops working.
Thus, better to use either one or the other alone.
---
libavcodec/x86/vp3dsp.asm | 120
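The argument in this commit message can be sketched in C as follows. Everything here is an assumption made for illustration: the `DSPTable` struct, its fields, and the default values of the two macros are hypothetical, and the strings merely name which backend was chosen. The idea is that when each routine is implemented with exactly one mechanism, each guard stands alone, so a toolchain lacking one mechanism still gets the routines written with the other.

```c
/* Hypothetical build-time switches; a real build system would set these. */
#ifndef HAVE_YASM
#define HAVE_YASM 0
#endif
#ifndef HAVE_INLINE_ASM
#define HAVE_INLINE_ASM 0
#endif

/* Hypothetical dispatch table; each field names the chosen backend. */
typedef struct DSPTable {
    const char *idct;
    const char *put_pixels;
} DSPTable;

static void dsp_table_init(DSPTable *t)
{
    /* Portable C fallbacks are always available. */
    t->idct       = "c";
    t->put_pixels = "c";

#if HAVE_YASM
    /* This routine lives entirely in an external .asm file. */
    t->idct = "yasm";
#endif
#if HAVE_INLINE_ASM
    /* This routine is written entirely as compiler inline asm. */
    t->put_pixels = "inline";
#endif
}
```

If a single routine needed both mechanisms at once, its guard would have to be `HAVE_YASM && HAVE_INLINE_ASM`, and losing either one would disable it.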
From: Ronald S. Bultje rsbul...@gmail.com
This allows compiling with compilers that don't support gcc-style
inline assembly.
---
libavcodec/dct-test.c|2 +-
libavcodec/x86/dsputil_mmx.c | 69 --
libavcodec/x86/h264_qpel_mmx.c |4
From: Ronald S. Bultje rsbul...@gmail.com
---
libswscale/utils.c | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/libswscale/utils.c b/libswscale/utils.c
index d8fee58..a6b5a18 100644
--- a/libswscale/utils.c
+++ b/libswscale/utils.c
@@ -576,7 +576,7 @@ fail
Hi,
On Sun, Jul 22, 2012 at 3:30 PM, Diego Biurrun di...@biurrun.de wrote:
---
libswscale/output.c | 15 ---
libswscale/ppc/swscale_altivec.c |3 ++-
libswscale/ppc/yuv2rgb_altivec.c | 11 +++
libswscale/rgb2rgb.c |3 ++-
Hi,
On Sun, Jul 22, 2012 at 5:35 PM, Måns Rullgård m...@mansr.com wrote:
Diego Biurrun di...@biurrun.de writes:
On Mon, Jul 23, 2012 at 01:16:07AM +0100, Måns Rullgård wrote:
Diego Biurrun di...@biurrun.de writes:
On Mon, Jul 23, 2012 at 12:16:41AM +0100, Mans Rullgard wrote:
This allows