When allocating stack space with an alignment requirement that is larger
than the current stack alignment we need to store a copy of the original
stack pointer in order to be able to restore it later.
If we choose to use another register for this purpose we should not pick
eax/rax since it can be o
On Mon, Dec 26, 2016 at 2:32 AM, Ronald S. Bultje wrote:
> I know I'm terribly nitpicking here for the limited scope of the comment,
> but this only matters for functions that have a return value. Do you think
> it makes sense to allow functions to opt out of this requirement if they
> explicitly
On Mon, Dec 26, 2016 at 2:52 PM, Ronald S. Bultje wrote:
> Hm, OK, I think it affects unix64/x86-32 also when using 32-byte
> alignment. We do use the stack pointer then.
On 32-bit and UNIX64 it simply uses a different caller-saved register
which doesn't require additional instructions.
> I thi
On Sun, Jan 29, 2017 at 8:59 PM, Mark Thompson wrote:
> strncmp
Any particular reason for not just using plain strcmp()?
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel
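The strcmp() question above comes down to whole-string versus prefix matching. A minimal sketch of the difference follows; the helper names and the "auto" option string are made up for illustration and are not taken from the patch under review.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helpers, not from the reviewed patch. */

/* Whole-string match: true only if s is exactly "auto". */
static int is_auto_exact(const char *s)
{
    assert(s != NULL);
    return strcmp(s, "auto") == 0;
}

/* Prefix match: true for "auto", but also for "automatic" or "auto123". */
static int is_auto_prefix(const char *s)
{
    assert(s != NULL);
    return strncmp(s, "auto", 4) == 0;
}
```

If the intent is to accept only the exact option name, the strcmp() form is the simpler and stricter of the two.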
On Fri, Mar 10, 2017 at 3:17 PM, Diego Biurrun wrote:
> +%macro INTERL 5
> +%if cpuflag(avx)
> +vunpckhps %3, %2, %1
> +vunpcklps %2, %2, %1
> +vextractf128 %4(%5), %2, 0
> +vextractf128 %4 %+ H(%5), %3, 0
> +vextractf128 %4(%5 + 1), %2, 1
> +vextractf128 %4
sion from 2014-11-09. I might have some additional
changes locally on top of that, but I'm not at home right now so I
can't check it at this moment.
/Henrik
From 6efacab3472889ba6acb88d60d7ca7413512edb8 Mon Sep 17 00:00:00 2001
From: Henrik Gramner
Date: Sun, 9 Nov 2014 15:43:40 +0100
Su
Attaching updated version with all local changes merged.
I'll support it and make changes if necessary if there's any interest
in getting it committed this time.
/Henrik
From 14e611818921403230c70177fa36e201a9d4d6f6 Mon Sep 17 00:00:00 2001
From: Henrik Gramner
Date: Mon, 6 Jul 201
or you now?
(no other changes compared to the previous version aside from Makefile stuff)
/Henrik
From b685f0a886d8e460735f1d2be9e68678fe7521f2 Mon Sep 17 00:00:00 2001
From: Henrik Gramner
Date: Tue, 7 Jul 2015 13:51:49 +0200
Subject: [PATCH] Checkasm: assembly testing and benchmarking to
Improves the accuracy of measurements, especially in short sections.
To quote the Intel 64 and IA-32 Architectures Software Developer's Manual:
"The RDTSC instruction is not a serializing instruction. It does not necessarily
wait until all previous instructions have been executed before reading th
Updated with some fairly minor changes (mainly using AV_READ_TIME())
after comments on IRC etc.
/Henrik
From a2b30e3e4dbab5912947a820c63fd32baf7b81ab Mon Sep 17 00:00:00 2001
From: Henrik Gramner
Date: Sat, 11 Jul 2015 20:32:11 +0200
Subject: [PATCH] Checkasm: assembly testing and benchmarking
0..06bc6ad
--- /dev/null
+++ b/tests/checkasm/h264qpel.c
@@ -0,0 +1,80 @@
+/*
+ * Copyright (c) 2015 Henrik Gramner
+ *
+ * This file is part of Libav.
+ *
+ * Libav is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+
Attaching a modified version that also verifies that the src buffers
are identical after the calls. Probably overkill, but might as well
check while we're at it.
/Henrik
From 259c3acc59581f13bb0a4b8debe62ea1c65d66f4 Mon Sep 17 00:00:00 2001
From: Henrik Gramner
Date: Mon, 13 Jul 2015 23:
diff --git a/tests/checkasm/bswapdsp.c b/tests/checkasm/bswapdsp.c
new file mode 100644
index 000..7b1566b
--- /dev/null
+++ b/tests/checkasm/bswapdsp.c
@@ -0,0 +1,73 @@
+/*
+ * Copyright (c) 2015 Henrik Gramner
+ *
+ * This file is part of Libav.
+ *
+ * Libav is free software; you can
The upper halves are not guaranteed to be zero in x86-64.
Also use `test` instead of `and` when the result isn't used for anything other
than as a branch condition; this allows some register moves to be eliminated.
---
libavcodec/x86/bswapdsp.asm | 23 ++-
1 file changed, 10 i
lgtm.
---
tests/checkasm/checkasm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 7b1ea8f..0aa3d1c 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -317,7 +317,7 @@ int main(int argc, char *argv[])
From: Michael Niedermayer
Signed-off-by: Michael Niedermayer
---
tests/checkasm/checkasm.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
index 1a46e9b..b54be16 100644
--- a/tests/checkasm/checkasm.h
+++ b/tests/checkasm
On Fri, Jul 17, 2015 at 8:08 PM, Luca Barbato wrote:
> -qpel_mc_func (*tab)[16] = op ? h.avg_h264_qpel_pixels_tab :
> h.put_h264_qpel_pixels_tab;
> +qpel_mc_func(*tab)[16] = op ? h.avg_h264_qpel_pixels_tab :
> h.put_h264_qpel_pixels_tab;
No space between type and identifier? I d
On Mon, Jul 20, 2015 at 11:58 AM, Janne Grunau wrote:
> ---
> tests/checkasm/h264pred.c | 1 +
> 1 file changed, 1 insertion(+)
Shouldn't it be NULL instead of 0 since those are pointers?
Otherwise OK.
On Mon, Jul 20, 2015 at 11:18 PM, Janne Grunau wrote:
> Fixes MSVC compilation.
> ---
> tests/checkasm/h264pred.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
Ok.
On Thu, Jul 23, 2015 at 9:04 AM, Martin Storsjö wrote:
> Why is this suddenly using "command" instead of "which" now? This won't work
> in a linux environment.
Why wouldn't it work in a linux environment? `command` is POSIX.
This stackoverflow post sums it up fairly well:
https://stackoverflow.c
On Thu, Jul 23, 2015 at 7:23 PM, Steve Lhomme wrote:
> On Thu, Jul 23, 2015 at 7:02 PM, Derek Buitenhuis
> wrote:
>> Broken permissions.
>
> Not sure how I can tweak that under Windows.
git update-index --chmod=+x
From: Michael Niedermayer
Fixes alignment issues and bus errors.
---
tests/checkasm/bswapdsp.c | 9 +
tests/checkasm/h264pred.c | 5 +++--
tests/checkasm/h264qpel.c | 9 +
3 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/tests/checkasm/bswapdsp.c b/tests/checkasm/
Makes it a bit more clear where each test belongs.
Suggested by Anton Khirnov.
---
tests/checkasm/checkasm.c | 57 +++
tests/checkasm/checkasm.h | 2 +-
tests/checkasm/h264qpel.c | 2 +-
3 files changed, 30 insertions(+), 31 deletions(-)
diff --git a
On Wed, Jul 29, 2015 at 10:09 PM, Martin Storsjö wrote:
> configure does check for isatty, and checkasm properly checks
> HAVE_ISATTY, but on some platforms (e.g. WinRT), io.h needs to be
> included for isatty to be available.
Ok.
---
libavcodec/x86/dcadsp.asm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavcodec/x86/dcadsp.asm b/libavcodec/x86/dcadsp.asm
index c42ee23..c99df12 100644
--- a/libavcodec/x86/dcadsp.asm
+++ b/libavcodec/x86/dcadsp.asm
@@ -148,7 +148,7 @@ DECODE_HF
addps m4, v
There is an SSE2 implementation so the SSE version is never used. The "SSE"
version also happens to contain SSE2 instructions on x86-64.
---
libavcodec/x86/dct32.asm | 3 +++
libavcodec/x86/dct_init.c | 2 ++
2 files changed, 5 insertions(+)
diff --git a/libavcodec/x86/dct32.asm b/libavcodec/x86
From: Anton Mitrofanov
Emulation requires a temporary register if arguments 1 and 4 are the same; this
doesn't obey the semantics of the original instruction, so we can't emulate
that in x86inc.
Also add pmacsdql emulation.
Signed-off-by: Henrik Gramner
---
libavutil/x86/x86i
Change ALLOC_STACK to always align the stack before allocating stack space for
consistency. Previously alignment would occur either before or after allocating
stack space depending on whether manual alignment was required or not.
---
libavcodec/x86/h264_deblock.asm | 4 +--
libavutil/x86/x86inc.a
rrent cpuflags are
used
Christophe Gisquet (1):
x86inc: Fix instantiation of YMM registers
Henrik Gramner (5):
x86inc: Support arbitrary stack alignments
x86inc: Disable vpbroadcastq workaround in newer yasm versions
x86inc: Drop SECTION_TEXT macro
x86inc: nasm support
x86inc: Va
---
configure | 3 ---
libavutil/x86/x86inc.asm | 42 +-
2 files changed, 29 insertions(+), 16 deletions(-)
diff --git a/configure b/configure
index 482be43..79dd3a5 100755
--- a/configure
+++ b/configure
@@ -1353,7 +1353,6 @@ ARCH_EXT_LIST_
From: Christophe Gisquet
Signed-off-by: Henrik Gramner
---
libavutil/x86/x86inc.asm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index 96ebe37..2844fdf 100644
--- a/libavutil/x86/x86inc.asm
+++ b/libavutil/x86
The .text section is already 16-byte aligned by default on all supported
platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.
---
libavcodec/x86/apedsp.asm | 2 +-
libavcodec/x86/audiodsp.asm | 2 +-
libavcodec/x86/bswapdsp.asm | 2 +-
libavcodec/x86/d
The bug was fixed in 1.3.0, so only perform the workaround in earlier versions.
---
libavutil/x86/x86inc.asm | 20 +++-
1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index 2844fdf..d4ce68f 100644
--- a/libavutil/x
From: Anton Mitrofanov
Signed-off-by: Henrik Gramner
---
libavutil/x86/x86inc.asm | 587 ---
1 file changed, 299 insertions(+), 288 deletions(-)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index ae6813a..96ebe37 100644
--- a
---
libavutil/x86/x86inc.asm | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index d70a5f9..0e2f447 100644
--- a/libavutil/x86/x86inc.asm
+++ b/libavutil/x86/x86inc.asm
@@ -1,7 +1,7 @@
;
On Sat, Aug 1, 2015 at 8:28 PM, Anton Khirnov wrote:
>Any specific reason you use ARCH_X86_64 in one file and
>ARCH_X86_32 in the other?
I missed that there's a define for ARCH_X86_32 in asm (some other code
used %if ARCH_X86_64 == 0 so I assumed it didn't).
Using ARCH_X86_32 in both places is o
On Sat, Aug 1, 2015 at 8:49 PM, James Almer wrote:
> I however think movq/sd should be used here for sse2 and above instead of
> movlps.
That's a moot point in this case since the code in question is SSE
only (and even if it wasn't I'm skeptical of the claim that it would
be measurably slower tha
On Sat, Aug 1, 2015 at 5:27 PM, Henrik Gramner wrote:
> ---
> configure | 3 ---
> libavutil/x86/x86inc.asm | 42 +-
> 2 files changed, 29 insertions(+), 16 deletions(-)
Skip this one for now, nasm seems to have a bug wit
On Sat, Aug 1, 2015 at 9:34 PM, James Almer wrote:
> The same could be done in av_parse_cpu_flags().
> It doesn't affect this patch, and can be done separately. Just throwing
> the idea out there.
Yeah, I guess.
> What about bmi/bmi2, for that matter?
What about them?
---
tests/checkasm/checkasm.c | 4
1 file changed, 4 deletions(-)
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 82c635e..b564e7e 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -33,10 +33,6 @@
#include
#endif
-#if ARCH_X86
-#include "
---
libavutil/x86/x86inc.asm | 32 +---
1 file changed, 21 insertions(+), 11 deletions(-)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index a519fd5..6ad9785 100644
--- a/libavutil/x86/x86inc.asm
+++ b/libavutil/x86/x86inc.asm
@@ -1,7 +1,7 @@
;
Now we no longer have to rely on function pointers intentionally
declared without specified argument types.
This makes it easier to support functions with floating point parameters
or return values as well as functions returning 64-bit values on 32-bit
architectures. It also avoids having to expli
If the return value doesn't fit in a single register rdx/edx can in some
cases be used in addition to rax/eax.
Doesn't affect any of the existing checkasm tests but might be useful later.
Also comment the relevant code a bit better.
---
tests/checkasm/x86/checkasm.asm | 7 +++
1 file changed
On Wed, Aug 19, 2015 at 9:43 PM, Anton Khirnov wrote:
> +const int srcstride = FFALIGN(width, 16) * sizeof(*src0);
> +const int dststride = FFALIGN(width, 16) * PIXEL_SIZE(bit_depth);
Strides, and any other pointer-sized value, should be ptrdiff_t - or
preferably, review/push my che
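The point about strides can be illustrated with a small sketch. ptrdiff_t is the signed type for pointer differences, so it can hold a negative stride and has full pointer width on 64-bit targets. The helper name and the image layout below are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sum the first pixel of each row, stepping between
 * rows by `stride` bytes. A negative stride walks the rows bottom-up. */
static unsigned sum_column(const uint8_t *first_row, ptrdiff_t stride, int height)
{
    unsigned sum = 0;

    assert(first_row != NULL && height >= 0);
    for (int y = 0; y < height; y++)
        sum += first_row[y * stride]; /* int * ptrdiff_t is computed as ptrdiff_t */
    return sum;
}
```

With a 4-byte-wide, 3-row buffer `img`, `sum_column(img, 4, 3)` and `sum_column(img + 8, -4, 3)` visit the same pixels in opposite order.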
Minor nits:
> +#define randomize_buffers(buf, size, depth)
s/buffers/buffer/ since you're only randomizing a single one at a time.
> +static const char *interp_names[2][2] = { { "pixels", "h" }, { "v", "hv" } };
const char * const
Otherwise lgtm.
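To spell out the `const char * const` suggestion: the first const protects the characters, the second protects the pointers stored in the table. A short sketch using the table from the quoted patch; the lookup helper is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Both the strings and the table of pointers are read-only. With only
 * `const char *`, the table entries could still be reassigned at runtime. */
static const char *const interp_names[2][2] = { { "pixels", "h" }, { "v", "hv" } };

/* Hypothetical lookup helper, for illustration only. */
static const char *interp_name(int v, int h)
{
    assert((v == 0 || v == 1) && (h == 0 || h == 1));
    return interp_names[v][h];
}
```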
On Sun, Aug 23, 2015 at 8:27 PM, Anton Khirnov wrote:
> Quoting James Almer (2015-08-22 23:58:41)
>> You need to use the d suffix
>> instead of q on the register names to make sure the high bits are cleared.
>
> Eh? Perhaps I'm misunderstanding something, but I'd expect that using d
> here would do
---
tests/checkasm/x86/checkasm.asm | 10 +++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/tests/checkasm/x86/checkasm.asm b/tests/checkasm/x86/checkasm.asm
index 4948fc9..828352c 100644
--- a/tests/checkasm/x86/checkasm.asm
+++ b/tests/checkasm/x86/checkasm.asm
@@ -103,16
null
+++ b/tests/checkasm/v210enc.c
@@ -0,0 +1,94 @@
+/*
+ * Copyright (c) 2015 Henrik Gramner
+ *
+ * This file is part of Libav.
+ *
+ * Libav is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundati
---
tests/checkasm/v210enc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tests/checkasm/v210enc.c b/tests/checkasm/v210enc.c
index cdb8e76..4f5f6ba 100644
--- a/tests/checkasm/v210enc.c
+++ b/tests/checkasm/v210enc.c
@@ -43,7 +43,7 @@
AV_WN32A(v0 + i, r);
On Tue, Sep 22, 2015 at 9:28 PM, Vittorio Giovara
wrote:
> I am puzzled as well, msdn reports this function available only from
> vs2013, but there is a vs2012 fate instance which seems to compile
> fine with it.
That wouldn't exactly be the first incorrect thing in the MSDN
documentation though.
The System V ABI on x86-64 specifies that the al register contains an upper
bound of the number of arguments passed in vector registers when calling
variadic functions, so we aren't allowed to clobber it.
checkasm_fail_func() is a variadic function so also zero al before calling it.
---
tests/che
Tested functions are internally kept in a binary search tree for efficient
lookups. The downside of the current implementation is that the tree quickly
becomes unbalanced, which causes an unnecessary number of comparisons between
nodes. Improve this by changing the tree into a self-balancing left-l
They're short enough that inlining them actually reduces code size due to
all the overhead associated with making a function call.
---
libavutil/avstring.c | 22 --
libavutil/avstring.h | 22 ++
2 files changed, 18 insertions(+), 26 deletions(-)
diff --git
The previous implementation was behaving incorrectly in some corner cases.
---
tests/checkasm/checkasm.c | 8 ++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 013e197..9219a83 100644
--- a/tests/checkasm/checkasm.c
+
On Mon, Sep 28, 2015 at 9:49 AM, Anton Khirnov wrote:
> But does it actually improve performance measurably? I'd argue that
> those functions are used in places where it doesn't really matter.
I was using some perf tools through checkasm when I noticed an awful
lot of time was spent calling av_is
---
tests/checkasm/checkasm.c | 24 +---
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 9219a83..3ed78b6 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -57,17 +57,19 @@ stati
On Sun, Oct 4, 2015 at 8:39 PM, Luca Barbato wrote:
> Alternatively we might make sure if avcodec is disabled all its
> components are as well.
>
> might simplify a lot the code...
Yes, that's indeed a solid approach as well. Who's volunteering for
that though? I don't really know much about the
The upper halves are not guaranteed to be zero in x86-64.
---
libavcodec/x86/h264_intrapred.asm | 98 +++
1 file changed, 49 insertions(+), 49 deletions(-)
diff --git a/libavcodec/x86/h264_intrapred.asm
b/libavcodec/x86/h264_intrapred.asm
index 4a4fa10..df657a
ff0fd4437d83 Mon Sep 17 00:00:00 2001
From: Henrik Gramner
Date: Fri, 10 Oct 2014 21:53:38 +0200
Subject: [PATCH] Checkasm: assembly testing and benchmarking tool
It provides the following features:
* verify correctness by comparing the output to the C version.
* detect failure to save and rest
On Fri, Oct 10, 2014 at 10:18 PM, Luca Barbato wrote:
> Given that probably all the init functions do not have a cpuflag parameter,
> wouldn't it be better to override the cpu state with av_set_cpu_flags_mask and not
> go over the codebase to add the parameter?
Oh, I totally missed the existence of that
---
libavutil/avstring.h | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/libavutil/avstring.h b/libavutil/avstring.h
index 9a18ddd..7c30ee1 100644
--- a/libavutil/avstring.h
+++ b/libavutil/avstring.h
@@ -154,22 +154,22 @@ char *av_get_token(const char **buf, const
n x86-64
(the upper halves are not guaranteed to be zero - but in practice they
very often are, which makes those bugs hard to spot otherwise).
* easy benchmarking.
From 6efacab3472889ba6acb88d60d7ca7413512edb8 Mon Sep 17 00:00:00 2001
From: Henrik Gramner
Date: Sun, 9 Nov 2014 15:43:40 +0100
Su
On Mon, Nov 10, 2014 at 10:42 PM, Kieran Kunhya wrote:
> +pw_1: times 8 dw 1
cextern pw_1
> +psllw m0, 1
> +psllw m1, 1
paddw mN, mN
Ping.
On Sun, Nov 9, 2014 at 4:35 PM, Henrik Gramner wrote:
> Libav currently doesn't have any good unit tests for assembly code
> which makes it difficult to write new assembly functions and/or
> improve the existing ones.
>
> x264 has the checkasm tool which does the job
On Sat, Nov 15, 2014 at 12:33 AM, Luca Barbato wrote:
> On 09/11/14 16:35, Henrik Gramner wrote:
>> Libav currently doesn't have any good unit tests for assembly code
>> which makes it difficult to write new assembly functions and/or
>> improve the existing ones.
>>
> +mova m2, [r2+r1]
> +mova m3, [r2+r1+mmsize]
> +pxor m2, m6
> +pxor m3, m6
pxor m2, m6, [r2+r1]
pxor m3, m6, [r2+r1+mmsize]
Avoids two moves in AVX, otherwise LGTM.
On Sat, Nov 21, 2015 at 7:53 AM, Hendrik Leppkes wrote:
> msys2 provides various .sh scripts to setup the environment, one for
> msys2 building, and one for mingw32/64 respectively.
> You need to launch it using the appropriate shell script, but just
> running sh.exe.
>
> - Hendrik
Which is not r
On Fri, Dec 11, 2015 at 6:40 PM, Janne Grunau wrote:
> +#define declare_new_emms(cpu_flags, ret, ...) \
> +ret (*checked_call)(void *, int, int, int, int, int, __VA_ARGS__) = \
> +((cpu_flags) & av_get_cpu_flags()) ? (void
> *)checkasm_checked_call_emms : \
> +
On Tue, Dec 22, 2015 at 5:41 PM, Janne Grunau wrote:
> I found HTML copy from 1999 of Intel's manual(1) which says that
> cvtpi2ps with a memory location as source doesn't cause a transition to
> MMX state. The current documentation for cvtpi2pd (packed int to packed
> double conversion) says the
On Tue, Dec 22, 2015 at 10:44 PM, Janne Grunau wrote:
>> Intel's current documentation is very clear on cvtpi2ps: "This
>> instruction causes a transition from x87 FPU to MMX technology
>> operation".
>
> every tested silicon (nothing ancient or SSE only though) and the copy
> of the manual from 1
On Tue, Dec 22, 2015 at 10:59 PM, Janne Grunau wrote:
> This reverts commit 5dfe4edad63971d669ae456b0bc40ef9364cca80.
> ---
> libavcodec/x86/fmtconvert.asm | 5 +
> 1 file changed, 1 insertion(+), 4 deletions(-)
Ok.
On Tue, Dec 22, 2015 at 10:59 PM, Janne Grunau wrote:
> Check the full FPU tag word instead of only the upper half and simplify
> the comparison.
It previously only checked the lower half, not the upper.
> Use upper-case function base name as macro name to instantiate both
> checked_call variant
On Tue, Dec 29, 2015 at 12:32 PM, Janne Grunau wrote:
> Intel's Instruction Set Reference (as of September 2015) clearly states
> that cvtpi2ps switches to MMX state. Actual CPUs do not switch if the
> source is a memory location. The Instruction Set Reference from 1999
> (Order Number 243191) des
On Wed, Dec 30, 2015 at 1:43 PM, Janne Grunau wrote:
> libavcodec/x86/fmtconvert.asm | 9 -
> 1 file changed, 8 insertions(+), 1 deletion(-)
Ok.
Makes it possible to use them in arithmetic expressions.
---
libavutil/x86/x86inc.asm | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index 6ad9785..afcd6b8 100644
--- a/libavutil/x86/x86inc.asm
+++ b/libavutil/x86/x86inc
When allocating stack space with a larger alignment than the known stack
alignment a temporary register is used for storing the stack pointer.
Ensure that this isn't one of the registers used for passing arguments.
---
libavutil/x86/x86inc.asm | 6 --
1 file changed, 4 insertions(+), 2 deletio
---
libavutil/x86/x86inc.asm | 134 +++
1 file changed, 67 insertions(+), 67 deletions(-)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index c355ee7..de20e76 100644
--- a/libavutil/x86/x86inc.asm
+++ b/libavutil/x86/x86inc.asm
@@ -18
cpuflags is never undefined any more, it's set to 0 instead.
Also fix an incorrect comment.
---
libavutil/x86/x86inc.asm | 6 ++
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index de20e76..05a5790 100644
--- a/libavutil/x86/
The following patches were recently pushed to x264.
Geza Lore (1):
x86inc: Add debug symbols indicating sizes of compiled functions
Henrik Gramner (7):
x86inc: Make cpuflag() and notcpuflag() return 0 or 1
x86inc: Be more verbose in assertion failures
x86inc: Improve FMA instruction
The REP_RET workaround is only needed on old AMD cpus, and the labels clutter
up the symbol table and confuse debugging/profiling tools, so use EQU to
create SHN_ABS symbols instead of creating local labels. Furthermore, skip
the workaround completely in functions that definitely won't run on such
From: Geza Lore
Some debuggers/profilers use this metadata to determine which function a
given instruction is in; without it they get can confused by local labels
(if you haven't stripped those). On the other hand, some tools are still
confused even with this metadata. e.g. this fixes `gdb`, but
---
libavutil/x86/x86inc.asm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index afcd6b8..dabb6cc 100644
--- a/libavutil/x86/x86inc.asm
+++ b/libavutil/x86/x86inc.asm
@@ -295,7 +295,7 @@ DECLARE_REG_TMP_SIZE 0,1,2,3,4,5,6,7,
* Correctly handle FMA instructions with memory operands.
* Print a warning if FMA instructions are used without the correct cpuflag.
* Simplify the instantiation code.
* Clarify documentation.
Only the last operand in FMA3 instructions can be a memory operand. When
converting FMA4 instruction
On Mon, Jan 18, 2016 at 2:35 PM, Ronald S. Bultje wrote:
> On Sun, Jan 17, 2016 at 6:21 PM, Henrik Gramner wrote:
>> @@ -386,8 +386,10 @@ DECLARE_REG_TMP_SIZE
>> 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14
>> %if %1 != 0 && required_stack_alignment > STACK_ALIG
When allocating stack space with a larger alignment than the known stack
alignment a temporary register is used for storing the stack pointer.
Ensure that this isn't one of the registers used for passing arguments.
---
libavutil/x86/x86inc.asm | 7 +--
1 file changed, 5 insertions(+), 2 deleti
---
configure | 1 +
1 file changed, 1 insertion(+)
diff --git a/configure b/configure
index c5bcb78..0bf29c2 100755
--- a/configure
+++ b/configure
@@ -2951,6 +2951,7 @@ msvc_common_flags(){
-lz) echo zlib.lib ;;
-lavifil32) echo vfw32.lib ;;
On Wed, Jul 29, 2015 at 10:51 PM, Luca Barbato wrote:
> And restrict the string to ascii text.
Restricting to printable characters would be even better.
On Sat, Feb 6, 2016 at 1:03 PM, Luca Barbato wrote:
> +if (isprint(val))
Shouldn't we use a locale-independent version similar to the other
functions in libavutil/avstring.h?
On Sat, Feb 6, 2016 at 7:34 PM, Luca Barbato wrote:
> Given how this function is used it is not really important; its purpose
> is to not break the terminal printing garbage.
That's true I guess.
> Do you have time to get me a function that is locale-independent?
static inline av_const int av_isp
Anton Mitrofanov (3):
x86inc: Fix AVX emulation of some instructions
x86inc: Improve handling of %ifid with multi-token parameters
x86inc: Enable AVX emulation in additional cases
Henrik Gramner (1):
x86inc: Fix AVX emulation of scalar float instructions
libavutil/x86/x86inc.asm | 95
From: Anton Mitrofanov
---
libavutil/x86/x86inc.asm | 44
1 file changed, 24 insertions(+), 20 deletions(-)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index 10352fc..60aad23 100644
--- a/libavutil/x86/x86inc.asm
+++ b/libavutil/
From: Anton Mitrofanov
The yasm/nasm preprocessor only checks the first token, which means that
parameters such as `dword [rax]` are treated as identifiers, which is
generally not what we want.
---
libavutil/x86/x86inc.asm | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/l
Those instructions are not commutative since they only change the first
element in the vector and leave the rest unmodified.
---
libavutil/x86/x86inc.asm | 28 ++--
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86i
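The non-commutativity of the scalar float instructions described above can be modeled in plain C. This is a sketch of the instruction semantics (element 0 is computed, the remaining elements come from the first source operand), not the x86inc emulation code; the type and function names are assumptions.

```c
#include <assert.h>

typedef struct { float f[4]; } Vec4;

/* Model of a three-operand scalar add such as `vaddss dst, a, b`:
 * element 0 is a + b, elements 1..3 are copied from the first source a. */
static Vec4 addss(Vec4 a, Vec4 b)
{
    Vec4 dst = a;
    dst.f[0] = a.f[0] + b.f[0];
    return dst;
}

/* Packed add for contrast: every element is a + b. */
static Vec4 addps(Vec4 a, Vec4 b)
{
    Vec4 dst;
    for (int i = 0; i < 4; i++)
        dst.f[i] = a.f[i] + b.f[i];
    return dst;
}

static int vec_equal(Vec4 a, Vec4 b)
{
    for (int i = 0; i < 4; i++)
        if (a.f[i] != b.f[i])
            return 0;
    return 1;
}

/* Swapping the sources is harmless for the packed add but changes the
 * upper elements of the scalar add. */
static void self_check(void)
{
    Vec4 a = { { 1, 2, 3, 4 } }, b = { { 10, 20, 30, 40 } };

    assert(vec_equal(addps(a, b), addps(b, a)));
    assert(!vec_equal(addss(a, b), addss(b, a)));
}
```

Swapping the operands of the scalar form keeps element 0 the same but takes the upper three elements from the wrong register, which is why it cannot be treated like `addps` when emulating the three-operand syntax.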
From: Anton Mitrofanov
Allows emulation to work when dst is equal to src2 as long as the
instruction is commutative, e.g. `addps m0, m1, m0`.
---
libavutil/x86/x86inc.asm | 21 +
1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/libavutil/x86/x86inc.asm b/libavut
On Sun, Jul 10, 2016 at 1:10 PM, Alexandra Hájková
wrote:
Some fairly minor nits:
> +++ b/libavcodec/x86/hevc_idct.asm
> +cglobal hevc_idct_%1x%1_dc_%3, 1, 2, 1, coeff, tmp
> +movsx tmpq, word [coeffq]
> +add tmpw, ((1 << 14-%3) + 1)
> +sar tm
On Mon, Jul 18, 2016 at 8:11 PM, Alexandra Hájková
wrote:
> +if (check_func(h.idct_dc[i - 2], "idct_%dx%d_dc_%d", block_size,
> block_size, bit_depth)) {
> +call_ref(coeffs0);
> +call_new(coeffs1);
> +if (memcmp(coeffs0, coeffs1, sizeof(*coeffs0) * size
On Thu, Jul 14, 2016 at 7:25 PM, Josh de Kock wrote:
Some of those functions are several kilobytes large. That's going to
result in a lot of cache misses.
I suggest using loops instead of duplicating the same code over and
over with %reps.
On Thu, Jul 21, 2016 at 2:48 AM, Josh de Kock wrote:
> +cglobal hevc_add_residual_16_8, 3, 5, 7, dst, coeffs, stride
> +pxor m0, m0
> +lea r3, [strideq * 3]
> +RES_ADD_SSE_16_32_8 0, dstq, dstq + strideq
> +RES_ADD_SSE_16_32_8 64, dstq + strideq * 2,