Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal

2019-02-11 Thread Tomas Härdin
Sun 2019-02-10 at 22:18 +, Matthew Fearnley wrote:
> On Thu, 31 Jan 2019 at 15:00, Tomas Härdin  wrote:
> 
> > > 1. The entropy calculation in block_cmp() omits the score of histogram[0]
> > > from the final sum.
> > > It's tempting to do this to bias the scores in favour of 0-bytes, but in
> > > reality, blocks with a majority of 0 (or any other byte) will already be
> > > naturally favoured by the entropy score anyway, and this bias will fail to
> > > penalise blocks with an "average" (i.e. high entropy) number of 0's in them.
> > > The exact implications are difficult to ponder, but in practical terms this
> > > error does tend to produce worse results in the video compression.  Not
> > > massively so, but it's still noticeable.
> > 
> > Did you try combining the statistics of the current MV with the
> > statistics of the previous block, to bias the choice in favor of
> > similar bytes? Could work like a cheap order-1 entropy model.
> > 
> 
> I've had a go at this, but sadly, it seemed to adversely affect
> compression, producing larger files.
> Maybe it can be improved with some tweaking, or maybe there's a bug
> somewhere.
> But it feels to me like this approach on its own isn't enough to improve
> scoring.  Maybe as part of something larger though..
> Anyway, you can see the results of my efforts here:
> https://github.com/countingpine/FFmpeg/commit/prev_histogram.

Darn. Interesting result, at least.

/Tomas


Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal

2019-02-10 Thread Matthew Fearnley
On Thu, 31 Jan 2019 at 15:00, Tomas Härdin  wrote:

>
> > 1. The entropy calculation in block_cmp() omits the score of histogram[0]
> > from the final sum.
> > It's tempting to do this to bias the scores in favour of 0-bytes, but in
> > reality, blocks with a majority of 0 (or any other byte) will already be
> > naturally favoured by the entropy score anyway, and this bias will fail to
> > penalise blocks with an "average" (i.e. high entropy) number of 0's in them.
> > The exact implications are difficult to ponder, but in practical terms this
> > error does tend to produce worse results in the video compression.  Not
> > massively so, but it's still noticeable.
>
> Did you try combining the statistics of the current MV with the
> statistics of the previous block, to bias the choice in favor of
> similar bytes? Could work like a cheap order-1 entropy model.
>

I've had a go at this, but sadly, it seemed to adversely affect
compression, producing larger files.
Maybe it can be improved with some tweaking, or maybe there's a bug
somewhere.
But it feels to me like this approach on its own isn't enough to improve
scoring.  Maybe as part of something larger though..
Anyway, you can see the results of my efforts here:
https://github.com/countingpine/FFmpeg/commit/prev_histogram.

- Matthew


Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal

2019-01-31 Thread Tomas Härdin
Thu 2019-01-31 at 13:31 +, Matthew Fearnley wrote:
> On Sun, 20 Jan 2019 at 15:16, Tomas Härdin  wrote:
> 
> > > Hi.  Just to say, I tried setting up additional score_tabs for the
> > > bottom/right partial blocks.  Unfortunately, after implementing it and
> > > ironing out the bugs, the new score tables actually caused a slight
> > > increase in file size!
> > > After a little informal investigation with a couple of videos, my findings
> > > were that increasing the divisor - '(i / (double)(ZMBV_BLOCK * ZMBV_BLOCK *
> > > some_factor))' - would generally lead to slight compression improvements.
> > > Given the score table is still "valid" for smaller blocks, and given the
> > > extra complexity of adding the score tables, plus the fact that it may
> > > generally hurt the compression, I'm inclined to leave it with the one score
> > > table.  But there may be potential to improve the current compression
> > > method in future, by somehow tuning the divisor for better results
> > > generally.
> > 
> > Huh, that's strange. Sounds like something that warrants a comment in
> > the code. I also see we have an answer to Carl's question: you're still
> > experimenting with this :) I think we can hold off on pushing anything
> > until you feel happy with the result
> > 
> 
> Hi.
> Sorry, I had missed Carl's question.  Regrettably my work on this has been
> slow since my initial patch submissions, but I think I'm close to
> submitting some final changes.  Thanks for your patience so far.
> 
> I have recently made two helpful realisations about the above score table
> problem:
> 
> 1. The entropy calculation in block_cmp() omits the score of histogram[0]
> from the final sum.
> It's tempting to do this to bias the scores in favour of 0-bytes, but in
> reality, blocks with a majority of 0 (or any other byte) will already be
> naturally favoured by the entropy score anyway, and this bias will fail to
> penalise blocks with an "average" (i.e. high entropy) number of 0's in them.
> The exact implications are difficult to ponder, but in practical terms this
> error does tend to produce worse results in the video compression.  Not
> massively so, but it's still noticeable.

Did you try combining the statistics of the current MV with the
statistics of the previous block, to bias the choice in favor of
similar bytes? Could work like a cheap order-1 entropy model.
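
Rough sketch of the kind of combination I mean - untested, and the names
here are made up for illustration rather than taken from zmbvenc:

#define N (16 * 16) /* bytes in a full ZMBV_BLOCK x ZMBV_BLOCK block */

/* Bias the per-block entropy score with the previous block's XOR-byte
 * histogram, so candidate MVs whose residual looks like the neighbourhood
 * score better - a crude stand-in for an order-1 model. */
static int biased_block_score(const int score_tab[N + 1],
                              const int histogram[256],
                              const int prev_histogram[256])
{
    int i, score = 0;
    for (i = 1; i < 256; i++) {
        /* halve the previous block's counts so the current block still
         * dominates, and clamp into the score table's range */
        int n = histogram[i] + prev_histogram[i] / 2;
        if (n > N)
            n = N;
        score += score_tab[n];
    }
    return score;
}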

> 2. As long as the blocks being compared are the same size as each other,
> the entropy-based score comparisons (without the above bias) are unaffected
> by the base of the log, or the divisor used.
> (Mathematically, if 'a+b=c+d', then if 'a*log(a) + b*log(b) < c*log(c) +
> d*log(d)' then it is also true that 'a*log(a/N) + b*log(b/N) < c*log(c/N) +
> d*log(d/N)'.
> If that doesn't make sense, it helps to note that 'log(a/N) = log(a) -
> log(N)', so the log(N)'s cancel out.)

Convenient observation :)

> I am planning to organise my submission into two patches.  I intend for one
> to focus on problems/inefficiencies in block comparisons/score tables.  The
> other will be focused on motion estimation, i.e. the range of MV's that are
> compared, and to a very small extent, the order they're compared in.
> This avoids the overhead of patching each individual issue separately, but
> it will provide a logical split between the two main areas of change.

Sounds good!

/Tomas


Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal

2019-01-31 Thread Matthew Fearnley
On Sun, 20 Jan 2019 at 15:16, Tomas Härdin  wrote:

> > Hi.  Just to say, I tried setting up additional score_tabs for the
> > bottom/right partial blocks.  Unfortunately, after implementing it and
> > ironing out the bugs, the new score tables actually caused a slight
> > increase in file size!
> > After a little informal investigation with a couple of videos, my findings
> > were that increasing the divisor - '(i / (double)(ZMBV_BLOCK * ZMBV_BLOCK *
> > some_factor))' - would generally lead to slight compression improvements.
> > Given the score table is still "valid" for smaller blocks, and given the
> > extra complexity of adding the score tables, plus the fact that it may
> > generally hurt the compression, I'm inclined to leave it with the one score
> > table.  But there may be potential to improve the current compression
> > method in future, by somehow tuning the divisor for better results
> > generally.
>
> Huh, that's strange. Sounds like something that warrants a comment in
> the code. I also see we have an answer to Carl's question: you're still
> experimenting with this :) I think we can hold off on pushing anything
> until you feel happy with the result
>

Hi.
Sorry, I had missed Carl's question.  Regrettably my work on this has been
slow since my initial patch submissions, but I think I'm close to
submitting some final changes.  Thanks for your patience so far.

I have recently made two helpful realisations about the above score table
problem:

1. The entropy calculation in block_cmp() omits the score of histogram[0]
from the final sum.
It's tempting to do this to bias the scores in favour of 0-bytes, but in
reality, blocks with a majority of 0 (or any other byte) will already be
naturally favoured by the entropy score anyway, and this bias will fail to
penalise blocks with an "average" (i.e. high entropy) number of 0's in them.
The exact implications are difficult to ponder, but in practical terms this
error does tend to produce worse results in the video compression.  Not
massively so, but it's still noticeable.
(Math is a harsh mistress, and will often look unkindly on such flagrant
attempts to circumvent her laws.. :)
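
(A quick hypothetical to illustrate, using 4x4 blocks, N=16 and
score(n) = -n*log2(n/16), lower = better: block A's XOR bytes are fourteen
0's plus two distinct non-zero bytes; block B's are four 0's plus twelve
copies of one non-zero byte.  Skipping histogram[0], A scores 2*4 = 8 while
B scores -12*log2(12/16) ~ 4.98, so B is preferred - even though A, with
only two stray bytes, should clearly compress better.  Summing histogram[0]
as well, A ~ 10.70 beats B ~ 12.98, as it should.)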

2. As long as the blocks being compared are the same size as each other,
the entropy-based score comparisons (without the above bias) are unaffected
by the base of the log, or the divisor used.
(Mathematically, if 'a+b=c+d', then if 'a*log(a) + b*log(b) < c*log(c) +
d*log(d)' then it is also true that 'a*log(a/N) + b*log(b/N) < c*log(c/N) +
d*log(d/N)'.
If that doesn't make sense, it helps to note that 'log(a/N) = log(a) -
log(N)', so the log(N)'s cancel out.)
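
(Spelled out: a*log(a/N) + b*log(b/N) = a*log(a) + b*log(b) - (a+b)*log(N),
so both sides of the comparison lose the same constant (a+b)*log(N) =
(c+d)*log(N), and the inequality is unchanged.)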

This means that we can use a single score table for all block sizes!  It
only needs to be big enough for the largest block size, then it produces
optimal scores for all blocks up to that size.
It does mean that partial blocks with uniform bytes won't have a score of
0, but this property of entropy is not actually important when scoring the
blocks.
(Overall, this is a significant relief, because my attempts at multiple
score tables resulted in a lot of tricky decisions about how best to keep
the code DRY and minimise complexity, and I wasn't really happy with my
solution.)
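
In code terms the plan is roughly this - a sketch, not the final patch (the
fixed-point scaling here is arbitrary):

#include <math.h>

#define ZMBV_BLOCK 16

/* One table, sized for the largest (full) block.  score_tab[0] stays 0;
 * a partial bw x bh block simply never indexes entries above bw*bh. */
static int score_tab[ZMBV_BLOCK * ZMBV_BLOCK + 1];

static void init_score_tab(void)
{
    int i;
    for (i = 1; i <= ZMBV_BLOCK * ZMBV_BLOCK; i++)
        score_tab[i] = (int)(-i * log2(i / (double)(ZMBV_BLOCK * ZMBV_BLOCK))
                             * 256.0 + 0.5);
}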


I am planning to organise my submission into two patches.  I intend for one
to focus on problems/inefficiencies in block comparisons/score tables.  The
other will be focused on motion estimation, i.e. the range of MV's that are
compared, and to a very small extent, the order they're compared in.
This avoids the overhead of patching each individual issue separately, but
it will provide a logical split between the two main areas of change.

Kind regards,

Matthew


Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal

2019-01-20 Thread Tomas Härdin
Sat 2019-01-19 at 22:40 +, Matthew Fearnley wrote:
> On Tue, 25 Dec 2018 at 09:35, Tomas Härdin  wrote:
> 
> > Sat 2018-12-22 at 15:32 +, Matthew Fearnley wrote:
> > > > > > 
> > > > > 
> > > > > Note that bw,bh aren't guaranteed to equal ZMBV_BLOCK, so `histogram[0] ==
> > > > > bw*bh` would have to be used to guard against those (literal) edge cases.
> > > > 
> > > > Right, yes. But if we have block sizes other than ZMBV_BLOCK^2 then we
> > > > need score tables for those sizes too.
> > > 
> > > I’ve thought about that a bit. I wondered if it would be worth it given:
> > > - the extra code, memory and logic needed
> > 
> > If you have a huge amount of DOS captures to optimize then it might be
> > worth it, else probably questionable
> > 
> > > - it would only improve the edge blocks
> > 
> > I imagine large blocks would be good for scenes with mostly global
> > motion. You cut down on the number of MVs and thus the amount of data
> > zlib has to compress, if the block size is a good fit.
> > 
> > > - the existing score table isn’t catastrophically bad for short blocks, and
> > > would still favour blocks with more common pixels.
> > > 
> > > It would be better from a correctness perspective though, and effects on
> > > running time should be negligible.
> > 
> > Good point. There's also no telling whether the current model is
> > actually an accurate prediction of how zlib behaves :)
> > 
> > 
> 
> Hi.  Just to say, I tried setting up additional score_tabs for the
> bottom/right partial blocks.  Unfortunately, after implementing it and
> ironing out the bugs, the new score tables actually caused a slight
> increase in file size!
> After a little informal investigation with a couple of videos, my findings
> were that increasing the divisor - '(i / (double)(ZMBV_BLOCK * ZMBV_BLOCK *
> some_factor))' - would generally lead to slight compression improvements.
> Given the score table is still "valid" for smaller blocks, and given the
> extra complexity of adding the score tables, plus the fact that it may
> generally hurt the compression, I'm inclined to leave it with the one score
> table.  But there may be potential to improve the current compression
> method in future, by somehow tuning the divisor for better results
> generally.

Huh, that's strange. Sounds like something that warrants a comment in
the code. I also see we have an answer to Carl's question: you're still
experimenting with this :) I think we can hold off on pushing anything
until you feel happy with the result

/Tomas


Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal

2019-01-19 Thread Matthew Fearnley
On Tue, 25 Dec 2018 at 09:35, Tomas Härdin  wrote:

> Sat 2018-12-22 at 15:32 +, Matthew Fearnley wrote:
> > > > >
> > > > Note that bw,bh aren't guaranteed to equal ZMBV_BLOCK, so `histogram[0] ==
> > > > bw*bh` would have to be used to guard against those (literal) edge cases.
> > >
> > > Right, yes. But if we have block sizes other than ZMBV_BLOCK^2 then we
> > > need score tables for those sizes too.
> >
> > I’ve thought about that a bit. I wondered if it would be worth it given:
> > - the extra code, memory and logic needed
>
> If you have a huge amount of DOS captures to optimize then it might be
> worth it, else probably questionable
>
> > - it would only improve the edge blocks
>
> I imagine large blocks would be good for scenes with mostly global
> motion. You cut down on the number of MVs and thus the amount of data
> zlib has to compress, if the block size is a good fit.
>
> > - the existing score table isn’t catastrophically bad for short blocks,
> > and would still favour blocks with more common pixels.
> >
> > It would be better from a correctness perspective though, and effects on
> > running time should be negligible.
>
> Good point. There's also no telling whether the current model is
> actually an accurate prediction of how zlib behaves :)
>
>
Hi.  Just to say, I tried setting up additional score_tabs for the
bottom/right partial blocks.  Unfortunately, after implementing it and
ironing out the bugs, the new score tables actually caused a slight
increase in file size!
After a little informal investigation with a couple of videos, my findings
were that increasing the divisor - '(i / (double)(ZMBV_BLOCK * ZMBV_BLOCK *
some_factor))' - would generally lead to slight compression improvements.
Given the score table is still "valid" for smaller blocks, and given the
extra complexity of adding the score tables, plus the fact that it may
generally hurt the compression, I'm inclined to leave it with the one score
table.  But there may be potential to improve the current compression
method in future, by somehow tuning the divisor for better results
generally.
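
(For reference, the experiment amounts to varying some_factor in the score
table init, along these lines - a sketch, not actual committed code:

for (i = 1; i <= ZMBV_BLOCK * ZMBV_BLOCK; i++)
    score_tab[i] = -i * log2(i / (double)(ZMBV_BLOCK * ZMBV_BLOCK *
                                          some_factor)) * 256;

Expanding the log, some_factor just adds i*log2(some_factor) to each entry;
since histogram[0] is skipped in the sum, I think that works out to an
extra bonus for blocks with more 0 bytes.)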

Matthew


Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal

2018-12-25 Thread Tomas Härdin
Sat 2018-12-22 at 15:32 +, Matthew Fearnley wrote:
> > > > 
> > > Note that bw,bh aren't guaranteed to equal ZMBV_BLOCK, so `histogram[0] ==
> > > bw*bh` would have to be used to guard against those (literal) edge cases.
> > 
> > Right, yes. But if we have block sizes other than ZMBV_BLOCK^2 then we
> > need score tables for those sizes too.
> 
> I’ve thought about that a bit. I wondered if it would be worth it given:
> - the extra code, memory and logic needed

If you have a huge amount of DOS captures to optimize then it might be
worth it, else probably questionable

> - it would only improve the edge blocks

I imagine large blocks would be good for scenes with mostly global
motion. You cut down on the number of MVs and thus the amount of data
zlib has to compress, if the block size is a good fit.

> - the existing score table isn’t catastrophically bad for short blocks, and 
> would still favour blocks with more common pixels.
> 
> It would be better from a correctness perspective though, and effects on 
> running time should be negligible.

Good point. There's also no telling whether the current model is
actually an accurate prediction of how zlib behaves :)

/Tomas


Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal

2018-12-22 Thread Matthew Fearnley

> On 22 Dec 2018, at 12:11, Tomas Härdin  wrote:
> 
> Thu 2018-12-20 at 17:46 +, Matthew Fearnley wrote:
>> On Thu, 20 Dec 2018 at 16:30, Tomas Härdin  wrote:
>>> 
>>> I have a feeling this could be sped up further by just doing *xored =
>>> histogram[0] == ZMBV_BLOCK*ZMBV_BLOCK after the loops, if [PATCH 3/4]
>>> is applied before this. Computing both histogram and xored in the loop
>>> seems pointless.
>>> 
>> 
>> You're right, that speedup didn't occur to me.  It makes the logic a bit
>> more tenuous, but it would be more efficient.
> 
> Eh, I wouldn't really call "we have xored data to output if and only if
> number of zeroes < number of pixels" fairly easy to grasp. A comment
> might be good tho
Agreed.
Just have to get the patch order sorted out now somehow.
>> Note that bw,bh aren't guaranteed to equal ZMBV_BLOCK, so `histogram[0] ==
>> bw*bh` would have to be used to guard against those (literal) edge cases.
> 
> Right, yes. But if we have block sizes other than ZMBV_BLOCK^2 then we
> need score tables for those sizes too.
I’ve thought about that a bit. I wondered if it would be worth it given:
- the extra code, memory and logic needed
- it would only improve the edge blocks
- the existing score table isn’t catastrophically bad for short blocks, and 
would still favour blocks with more common pixels.

It would be better from a correctness perspective though, and effects on 
running time should be negligible.

I can work on a patch, see how it looks.


Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal

2018-12-22 Thread Tomas Härdin
Thu 2018-12-20 at 17:46 +, Matthew Fearnley wrote:
> > On Thu, 20 Dec 2018 at 16:30, Tomas Härdin  wrote:
> 
> > I have a feeling this could be sped up further by just doing *xored =
> > histogram[0] == ZMBV_BLOCK*ZMBV_BLOCK after the loops, if [PATCH 3/4]
> > is applied before this. Computing both histogram and xored in the loop
> > seems pointless.
> > 
> 
> You're right, that speedup didn't occur to me.  It makes the logic a bit
> more tenuous, but it would be more efficient.

Eh, I wouldn't really call "we have xored data to output if and only if
number of zeroes < number of pixels" fairly easy to grasp. A comment
might be good tho

> Note that bw,bh aren't guaranteed to equal ZMBV_BLOCK, so `histogram[0] ==
> bw*bh` would have to be used to guard against those (literal) edge cases.

Right, yes. But if we have block sizes other than ZMBV_BLOCK^2 then we
need score tables for those sizes too.

/Tomas


Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal

2018-12-20 Thread Matthew Fearnley
On Thu, 20 Dec 2018 at 16:30, Tomas Härdin  wrote:

> I have a feeling this could be sped up further by just doing *xored =
> histogram[0] == ZMBV_BLOCK*ZMBV_BLOCK after the loops, if [PATCH 3/4]
> is applied before this. Computing both histogram and xored in the loop
> seems pointless.
>

You're right, that speedup didn't occur to me.  It makes the logic a bit
more tenuous, but it would be more efficient.
Note that bw,bh aren't guaranteed to equal ZMBV_BLOCK, so `histogram[0] ==
bw*bh` would have to be used to guard against those (literal) edge cases.
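
In code, the guarded version would be something like this after the
histogram loops (sketch; note *xored needs to end up non-zero exactly when
the blocks differ):

/* derive *xored from the histogram instead of tracking it in the loops;
 * compare against bw*bh rather than ZMBV_BLOCK*ZMBV_BLOCK so partial
 * edge blocks are handled too */
*xored = histogram[0] != bw * bh;
if (!*xored)
    return 0; /* blocks identical: nothing to encode */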


Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal

2018-12-20 Thread Tomas Härdin
Wed 2018-12-19 at 22:00 +, matthew.w.fearn...@gmail.com wrote:
> > From: Matthew Fearnley 
> 
> If *xored is 0, then histogram[0]==bw*bh and histogram[1..255]==0.
> 
> Because histogram[0] is skipped over for the entropy calculation, the
> return value is always 0 when *xored==0, so we don't need to waste time
> calculating it.
> 
> This addition both clarifies the behaviour of the code and improves
> the speed when the block matches.
> ---
>  libavcodec/zmbvenc.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/libavcodec/zmbvenc.c b/libavcodec/zmbvenc.c
> index 4d9147657d..2f041dae32 100644
> --- a/libavcodec/zmbvenc.c
> +++ b/libavcodec/zmbvenc.c
> @@ -71,6 +71,7 @@ static inline int block_cmp(ZmbvEncContext *c, uint8_t *src, int stride,
>  int i, j;
>  uint8_t histogram[256] = {0};
>  
> +/* build frequency histogram of byte values for src[] ^ src2[] */
>  *xored = 0;
>  for(j = 0; j < bh; j++){
>  for(i = 0; i < bw; i++){
> @@ -82,6 +83,10 @@ static inline int block_cmp(ZmbvEncContext *c, uint8_t *src, int stride,
>  src2 += stride2;
>  }
>  
> +/* early out if src and src2 are equal */
> +if (!*xored) return 0;

I have a feeling this could be sped up further by just doing *xored =
histogram[0] == ZMBV_BLOCK*ZMBV_BLOCK after the loops, if [PATCH 3/4]
is applied before this. Computing both histogram and xored in the loop
seems pointless.

Beyond that this looks good

/Tomas


[FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal

2018-12-19 Thread matthew.w.fearnley
From: Matthew Fearnley 

If *xored is 0, then histogram[0]==bw*bh and histogram[1..255]==0.

Because histogram[0] is skipped over for the entropy calculation, the
return value is always 0 when *xored==0, so we don't need to waste time
calculating it.

This addition both clarifies the behaviour of the code and improves
the speed when the block matches.
---
 libavcodec/zmbvenc.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libavcodec/zmbvenc.c b/libavcodec/zmbvenc.c
index 4d9147657d..2f041dae32 100644
--- a/libavcodec/zmbvenc.c
+++ b/libavcodec/zmbvenc.c
@@ -71,6 +71,7 @@ static inline int block_cmp(ZmbvEncContext *c, uint8_t *src, int stride,
 int i, j;
 uint8_t histogram[256] = {0};
 
+/* build frequency histogram of byte values for src[] ^ src2[] */
 *xored = 0;
 for(j = 0; j < bh; j++){
 for(i = 0; i < bw; i++){
@@ -82,6 +83,10 @@ static inline int block_cmp(ZmbvEncContext *c, uint8_t *src, int stride,
 src2 += stride2;
 }
 
+/* early out if src and src2 are equal */
+if (!*xored) return 0;
+
+/* sum the entropy of the non-zero values */
 for(i = 1; i < 256; i++)
 sum += c->score_tab[histogram[i]];
 
-- 
2.17.1
