On Sun, Jun 16, 2013 at 12:53:43AM +0000, Loren Merritt wrote:
> 5x-7x faster on sandybridge
>
> FIXME: This adds alignment constraints and changes a struct that used to be
> public. External use of LLS is already deprecated, but I guess we can't
> actually enable this optimization until the next major version bump.
As Anton said, no need to worry, lls was never public.
> --- a/libavutil/lls.h
> +++ b/libavutil/lls.h
> @@ -23,9 +23,10 @@
>
> -#include "version.h"
> +#include "common.h"
version.h ought to stay for FF_API_LLS_PRIVATE;
add mem.h instead of common.h for DECLARE_ALIGNED.
> @@ -33,15 +34,31 @@
>
> void avpriv_init_lls(LLSModel *m, int indep_count);
> +void avpriv_init_lls_x86(LLSModel *m);
> +
> +/**
> + * Take the outer product of param[] with itself, and add to the covariance
> matrix.
> + * @param param training samples, starting with the value to be predicted.
> + * 32-byte aligned, and any padding elements must be initialized
> + * (i.e not denormal/nan).
> + * @param decay if not 1.0, covariance matrix is a rolling average rather
> than a sum.
> + */
> void avpriv_update_lls(LLSModel *m, double *param, double decay);
> +
> +/**
> + * Inner product of param[] and the LPC coefs.
> + * @param param training samples, excluding the value to be predicted.
> unaligned.
> + */
> double avpriv_evaluate_lls(LLSModel *m, double *param, int order);
Please document all parameters to avoid Doxygen warnings.
> --- /dev/null
> +++ b/libavutil/x86/lls_init.c
> @@ -0,0 +1,36 @@
> +
> +#include "libavutil/lls.h"
> +#include "libavutil/x86/cpu.h"
> +
> +void ff_update_lls_sse2(LLSModel *m, double *var, int order);
> +void ff_update_lls_avx(LLSModel *m, double *var, int order);
> +
> +void avpriv_init_lls_x86(LLSModel *m)
Unless I'm missing what other things you have planned for this in the
future the function will only ever be called from avpriv_init_lls()
and should not be visible outside of libavutil. It should therefore
have an ff_, not an avpriv_ name prefix.
> +{
> + int cpu_flags = av_get_cpu_flags();
> + if (EXTERNAL_SSE2(cpu_flags))
> + m->update_lls = ff_update_lls_sse2;
> + if (EXTERNAL_AVX(cpu_flags))
> + m->update_lls = ff_update_lls_avx;
nit: Maybe add full {} to avoid changing lines twice when you insert
more lines in later patches.
Diego
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel