"Ronald S. Bultje" <[email protected]> writes:

> From: "Ronald S. Bultje" <[email protected]>
>
> Based on work (for GCI) by Aneesh Dogra <[email protected]>, and
> inspired by patch in Chromium by Chris Evans <[email protected]>.
>
> When turned on, H264/CAVLC gets ~15% (CVPCMNL1_SVA_C.264) slower for
> ultra-high-bitrate files, or ~2.5% (CVFI1_SVA_C.264) for lower-bitrate
> files. Other codecs are affected to a lesser extent because they are
> less optimized; e.g., VC-1 slows down by less than 1% (all on x86).
> The patch generated 3 extra instructions (cmp, cmovae and mov) per
> call to get_bits().
> ---
>  configure             |    2 ++
>  libavcodec/get_bits.h |   48 ++++++++++++++++++++++++++++++++++++++++++++++--
>  libavcodec/wmavoice.c |    2 ++
>  3 files changed, 50 insertions(+), 2 deletions(-)
>
> diff --git a/configure b/configure
> index b3af2e9..056f409 100755
> --- a/configure
> +++ b/configure
> @@ -113,6 +113,7 @@ Configuration options:
>    --disable-dxva2          disable DXVA2 code
>    --enable-runtime-cpudetect detect cpu capabilities at runtime (bigger 
> binary)
>    --enable-hardcoded-tables use hardcoded tables instead of runtime 
> generation
> +  --disable-safe-bitstream-reader disable buffer boundary checking in 
> bitreaders (faster, but may crash)
>    --enable-memalign-hack   emulate memalign, interferes with memory debuggers
>    --disable-everything     disable all components listed below
>    --disable-encoder=NAME   disable encoder NAME

The help output should list the flag needed to change the default.  As
written, it gives the impression it is on by default.

> @@ -144,7 +164,13 @@ for examples see get_bits, show_bits, skip_bits, get_vlc
>  # endif
>
>  // FIXME name?
> +#if UNCHECKED_BITSTREAM_READER
>  #   define SKIP_COUNTER(name, gb, num) name##_index += (num)
> +#else
> +#   define SKIP_COUNTER(name, gb, num) name##_index = \
> +               (gb)->size_in_bits_plus1 - name##_index < (num) ? \
> +               (gb)->size_in_bits_plus1 : name##_index + (num)
> +#endif

Thinking more carefully about it, this addition is overflow-safe since
the skip amount must be small and we require end padding.  Your old
version with FFMIN(size, index + num) is thus safe here after all,
should it result in better code.

>  #   define SKIP_BITS(name, gb, num) do {        \
>          SKIP_CACHE(name, gb, num);              \
> @@ -171,7 +197,12 @@ static inline int get_bits_count(const GetBitContext *s){
>  }
>
>  static inline void skip_bits_long(GetBitContext *s, int n){
> +#if UNCHECKED_BITSTREAM_READER
>      s->index += n;
> +#else
> +    s->index = s->size_in_bits_plus1 - s->index < n ?
> +               s->size_in_bits_plus1 : s->index + n;
> +#endif
>  }

This one allows an arbitrary n, so it must be done safely, e.g. like this.

>  #elif defined A32_BITSTREAM_READER
> @@ -196,7 +227,9 @@ static inline void skip_bits_long(GetBitContext *s, int 
> n){
>              const uint32_t next = av_be2ne32(*name##_buffer_ptr);       \
>              name##_cache0 |= NEG_USR32(next, name##_bit_count);         \
>              name##_cache1 |= next << name##_bit_count;                  \
> -            name##_buffer_ptr++;                                        \
> +            if (UNCHECKED_BITSTREAM_READER ||                           \
> +                name##_buffer_ptr < (const uint32_t *)(gb)->buffer_end) \
> +                name##_buffer_ptr++;                                    \
>              name##_bit_count -= 32;                                     \
>          }                                                               \
>      } while (0)
> @@ -230,7 +263,12 @@ static inline int get_bits_count(const GetBitContext *s) 
> {
>  static inline void skip_bits_long(GetBitContext *s, int n){
>      OPEN_READER(re, s);
>      re_bit_count += n;
> +#if UNCHECKED_BITSTREAM_READER
>      re_buffer_ptr += re_bit_count>>5;
> +#else
> +    re_buffer_ptr = (re_bit_count >> 5) < (const uint32_t *) s->buffer_end - 
> re_buffer_ptr ?
> +                    re_buffer_ptr + (re_bit_count >> 5) : (const uint32_t *) 
> s->buffer_end;
> +#endif

This is again guaranteed to fall within the padding region, so whatever
variant gives better code is OK to use here.

>      re_bit_count &= 31;
>      re_cache0 = av_be2ne32(re_buffer_ptr[-1]) << re_bit_count;
>      re_cache1 = 0;
> @@ -311,7 +349,10 @@ static inline unsigned int get_bits1(GetBitContext *s){
>      result <<= index & 7;
>      result >>= 8 - 1;
>  #endif
> -    index++;
> +#if !UNCHECKED_BITSTREAM_READER
> +    if (s->index < s->size_in_bits_plus1)
> +#endif
> +        index++;
>      s->index = index;

Why #if rather than if (UNCHECKED_BITSTREAM_READER || ...) here?

-- 
Måns Rullgård
[email protected]
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel

Reply via email to