vc2enc: Use LUT to assemble interleaved golomb code

Lynne Wed, 12 Mar 2025 06:47:02 -0700

On 12/03/2025 06:27, Andreas Rheinhardt wrote:

Lynne:

On 12/03/2025 04:10, Andreas Rheinhardt wrote:

Patches attached.


- Andreas


The first patch is wild; it's surprising that no one has yet considered
inverting the way the decoder parses codes for use in an encoder.


I didn't even look at the decoder.
(It is actually surprising that it took until
512e597932dfe05cf5665192efbe2c93c2e36af2 for the original code to be
improved.)

Rather than ORing and using put_bits63, I think it would make more sense
to write out each chunk using put_bits sequentially. It might be
possible to reverse the lookups such that you get the MSBs first so you
wouldn't need to reverse them out of place in a small array.
But either way, LGTM. Feel free to explore this in a follow-up.


I don't think that writing them sequentially will improve anything: In
order to be able to use a LUT, I would have to shift the bits starting
with the MSBs into position; and then there would be the internal shifts
and checks inside put_bits().
Apart from that: put_bits63() is the same as put_bits() when BUF_BITS is
64 (see ede2b391cc516f4f93621f6a214b3410b231f582).
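
For readers following along, here is a minimal, self-contained sketch of the
idea under discussion. It is not the code from the patch: the table layout and
all names (spread_lut, init_spread_lut, golomb_bitwise, golomb_lut) are
hypothetical, and it assumes the usual interleaved exp-Golomb layout in which
each data bit of val + 1 below its top bit is preceded by a 0 "follow" bit and
the code ends with a single 1. The LUT variant spreads each byte of data bits
into 16 bits, ORs the chunks together and appends the terminator, so the whole
code (at most 63 bits) could be handed to a single put_bits63()-style write.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical table: spread_lut[b] moves bit i of byte b to bit 2*i,
 * leaving the odd bit positions (the "follow" bits) zero. */
static uint16_t spread_lut[256];

static void init_spread_lut(void)
{
    for (int b = 0; b < 256; b++) {
        uint16_t s = 0;
        for (int i = 0; i < 8; i++)
            if (b & (1 << i))
                s |= (uint16_t)(1u << (2 * i));
        spread_lut[b] = s;
    }
}

/* Reference: build the code one data bit at a time.
 * Returns the code right-aligned; *len receives its length in bits. */
static uint64_t golomb_bitwise(uint32_t val, int *len)
{
    uint32_t x = val + 1;
    int nbits = 0;          /* number of data bits below the top bit of x */
    uint64_t code = 0;

    assert(val != UINT32_MAX);          /* val + 1 must not wrap around */
    while ((x >> nbits) > 1)
        nbits++;
    for (int i = nbits - 1; i >= 0; i--)
        code = (code << 2) | ((x >> i) & 1);  /* follow bit 0, then data bit */
    code = (code << 1) | 1;                   /* terminating 1 */
    *len = 2 * nbits + 1;
    return code;
}

/* LUT variant: spread the data bits bytewise and OR the chunks together. */
static uint64_t golomb_lut(uint32_t val, int *len)
{
    uint32_t x = val + 1, data;
    int nbits = 0;
    uint64_t spread;

    assert(val != UINT32_MAX);
    while ((x >> nbits) > 1)
        nbits++;
    data = x ^ (1u << nbits);                 /* drop the top bit of x */
    spread = (uint64_t)spread_lut[data & 0xff]
           | (uint64_t)spread_lut[(data >> 8) & 0xff] << 16
           | (uint64_t)spread_lut[(data >> 16) & 0xff] << 32
           | (uint64_t)spread_lut[data >> 24] << 48;
    *len = 2 * nbits + 1;
    return (spread << 1) | 1;                 /* append the terminating 1 */
}
```

After init_spread_lut() has run once, both functions return the same value and
length for any val below UINT32_MAX; the LUT variant just replaces the per-bit
loop with four table lookups.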


The second patch seems a bit pointless. It's just one single call you're
uninlining? Chasing a few extra bytes of binary size surely doesn't
deserve a wrapper function for uninlining.


I am uninlining all calls besides the hot one. 31 callsites.
For GCC, this reduced codesize from 0x2c36 to 0x25b1 (15% saved), for clang
from 0x4b08 to 0x3338 (32% saved).
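
The pattern described here can be sketched roughly as follows. This is an
illustration only, not the patch itself: the toy bit writer and every name in
it (SketchWriter, sketch_put_inline, sketch_put_cold, write_header,
write_coeffs) are made up, and the attribute macros assume a GCC- or
clang-compatible compiler, with a plain fallback elsewhere. The body is
force-inlined only where it is called directly from the hot loop; every other
caller goes through one out-of-line wrapper, so the body is emitted once
instead of once per cold callsite.

```c
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_ALWAYS_INLINE static inline __attribute__((always_inline))
#define SKETCH_NOINLINE      static __attribute__((noinline))
#else
#define SKETCH_ALWAYS_INLINE static inline
#define SKETCH_NOINLINE      static
#endif

/* Toy bit writer, just enough to give the functions something to do. */
typedef struct SketchWriter {
    uint64_t acc;   /* bits written so far, right-aligned */
    int      count; /* number of bits in acc */
} SketchWriter;

/* The real work; always inlined into whoever calls it directly. */
SKETCH_ALWAYS_INLINE void sketch_put_inline(SketchWriter *w, int n, uint64_t v)
{
    w->acc    = (w->acc << n) | v;
    w->count += n;
}

/* Out-of-line wrapper shared by all cold callsites. */
SKETCH_NOINLINE void sketch_put_cold(SketchWriter *w, int n, uint64_t v)
{
    sketch_put_inline(w, n, v);
}

/* Cold path (e.g. header fields): calls the shared wrapper. */
static void write_header(SketchWriter *w)
{
    sketch_put_cold(w, 8, 0x42);
    sketch_put_cold(w, 4, 0x3);
}

/* Hot path (e.g. a coefficient loop): keeps the inlined version. */
static void write_coeffs(SketchWriter *w, const uint8_t *c, int n)
{
    for (int i = 0; i < n; i++)
        sketch_put_inline(w, 4, c[i] & 0xf);
}
```

Both paths produce identical output; the only difference is how many copies
of the body end up in the binary.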


Oh, it was late and I didn't read carefully.
Both patches LGTM.
