Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal
Sun 2019-02-10 at 22:18 +, Matthew Fearnley wrote:
> On Thu, 31 Jan 2019 at 15:00, Tomas Härdin wrote:
> > > 1. The entropy calculation in block_cmp() omits the score of histogram[0]
> > > from the final sum.
> > > It's tempting to do this to bias the scores in favour of 0-bytes, but in
> > > reality, blocks with a majority of 0 (or any other byte) will already be
> > > naturally favoured by the entropy score anyway, and this bias will fail to
> > > penalise blocks with an "average" (i.e. high entropy) number of 0's in them.
> > > The exact implications are difficult to ponder, but in practical terms this
> > > error does tend to produce worse results in the video compression. Not
> > > massively so, but it's still noticeable.
> >
> > Did you try combining the statistics of the current MV with the
> > statistics of the previous block, to bias the choice in favor of
> > similar bytes? Could work like a cheap order-1 entropy model.
>
> I've had a go at this, but sadly, it seemed to adversely affect
> compression, producing larger files.
> Maybe it can be improved with some tweaking, or maybe there's a bug
> somewhere.
> But it feels to me like this approach on its own isn't enough to improve
> scoring. Maybe as part of something larger though..
> Anyway, you can see the results of my efforts here:
> https://github.com/countingpine/FFmpeg/commit/prev_histogram

Darn. Interesting result at least

/Tomas

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal
On Thu, 31 Jan 2019 at 15:00, Tomas Härdin wrote:
> > 1. The entropy calculation in block_cmp() omits the score of histogram[0]
> > from the final sum.
> > It's tempting to do this to bias the scores in favour of 0-bytes, but in
> > reality, blocks with a majority of 0 (or any other byte) will already be
> > naturally favoured by the entropy score anyway, and this bias will fail to
> > penalise blocks with an "average" (i.e. high entropy) number of 0's in them.
> > The exact implications are difficult to ponder, but in practical terms this
> > error does tend to produce worse results in the video compression. Not
> > massively so, but it's still noticeable.
>
> Did you try combining the statistics of the current MV with the
> statistics of the previous block, to bias the choice in favor of
> similar bytes? Could work like a cheap order-1 entropy model.

I've had a go at this, but sadly, it seemed to adversely affect
compression, producing larger files.
Maybe it can be improved with some tweaking, or maybe there's a bug
somewhere.
But it feels to me like this approach on its own isn't enough to improve
scoring. Maybe as part of something larger though..
Anyway, you can see the results of my efforts here:
https://github.com/countingpine/FFmpeg/commit/prev_histogram

- Matthew
Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal
Thu 2019-01-31 at 13:31 +, Matthew Fearnley wrote:
> > On Sun, 20 Jan 2019 at 15:16, Tomas Härdin wrote:
> > > Hi. Just to say, I tried setting up additional score_tabs for the
> > > bottom/right partial blocks. Unfortunately, after implementing it and
> > > ironing out the bugs, the new score tables actually caused a slight
> > > increase in file size!
> > > After a little informal investigation with a couple of videos, my findings
> > > were that increasing the divisor - '(i / (double)(ZMBV_BLOCK * ZMBV_BLOCK *
> > > some_factor))' - would generally lead to slight compression improvements.
> > > Given the score table is still "valid" for smaller blocks, and given the
> > > extra complexity of adding the score tables, plus the fact that it may
> > > generally hurt the compression, I'm inclined to leave it with the one score
> > > table. But there may be potential to improve the current compression
> > > method in future, by somehow tuning the divisor for better results
> > > generally.
> >
> > Huh, that's strange. Sounds like something that warrants a comment in
> > the code. I also see we have an answer to Carl's question: you're still
> > experimenting with this :) I think we can hold off on pushing anything
> > until you feel happy with the result
>
> Hi.
> Sorry, I had missed Carl's question. Regrettably my work on this has been
> slow since my initial patch submissions, but I think I'm close to
> submitting some final changes. Thanks for your patience so far.
>
> I have recently made two helpful realisations about the above score table
> problem:
>
> 1. The entropy calculation in block_cmp() omits the score of histogram[0]
> from the final sum.
> It's tempting to do this to bias the scores in favour of 0-bytes, but in
> reality, blocks with a majority of 0 (or any other byte) will already be
> naturally favoured by the entropy score anyway, and this bias will fail to
> penalise blocks with an "average" (i.e. high entropy) number of 0's in them.
> The exact implications are difficult to ponder, but in practical terms this
> error does tend to produce worse results in the video compression. Not
> massively so, but it's still noticeable.

Did you try combining the statistics of the current MV with the
statistics of the previous block, to bias the choice in favor of
similar bytes? Could work like a cheap order-1 entropy model.

> 2. As long as the blocks being compared are the same size as each other,
> the entropy-based score comparisons (without the above bias) are unaffected
> by the base of the log, or the divisor used.
> (Mathematically, if 'a+b=c+d', then if 'a*log(a) + b*log(b) < c*log(c) +
> d*log(d)' then it is also true that 'a*log(a/N) + b*log(b/N) < c*log(c/N) +
> d*log(d/N)'.
> If that doesn't make sense, it helps to note that 'log(a/N) = log(a) -
> log(N)', so the log(N)'s cancel out.)

Convenient observation :)

> I am planning to organise my submission into two patches. I intend for one
> to focus on problems/inefficiencies in block comparisons/score tables. The
> other will be focused on motion estimation, i.e. the range of MV's that are
> compared, and to a very small extent, the order they're compared in.
> This avoids the overhead of patching each individual issue separately, but
> it will provide a logical split between the two main areas of change.

Sounds good!

/Tomas
Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal
On Sun, 20 Jan 2019 at 15:16, Tomas Härdin wrote:
> > Hi. Just to say, I tried setting up additional score_tabs for the
> > bottom/right partial blocks. Unfortunately, after implementing it and
> > ironing out the bugs, the new score tables actually caused a slight
> > increase in file size!
> > After a little informal investigation with a couple of videos, my findings
> > were that increasing the divisor - '(i / (double)(ZMBV_BLOCK * ZMBV_BLOCK *
> > some_factor))' - would generally lead to slight compression improvements.
> > Given the score table is still "valid" for smaller blocks, and given the
> > extra complexity of adding the score tables, plus the fact that it may
> > generally hurt the compression, I'm inclined to leave it with the one score
> > table. But there may be potential to improve the current compression
> > method in future, by somehow tuning the divisor for better results
> > generally.
>
> Huh, that's strange. Sounds like something that warrants a comment in
> the code. I also see we have an answer to Carl's question: you're still
> experimenting with this :) I think we can hold off on pushing anything
> until you feel happy with the result

Hi.
Sorry, I had missed Carl's question. Regrettably my work on this has been
slow since my initial patch submissions, but I think I'm close to
submitting some final changes. Thanks for your patience so far.

I have recently made two helpful realisations about the above score table
problem:

1. The entropy calculation in block_cmp() omits the score of histogram[0]
from the final sum.
It's tempting to do this to bias the scores in favour of 0-bytes, but in
reality, blocks with a majority of 0 (or any other byte) will already be
naturally favoured by the entropy score anyway, and this bias will fail to
penalise blocks with an "average" (i.e. high entropy) number of 0's in them.
The exact implications are difficult to ponder, but in practical terms this
error does tend to produce worse results in the video compression. Not
massively so, but it's still noticeable.
(Math is a harsh mistress, and will often look unkindly on such flagrant
attempts to circumvent her laws.. :)

2. As long as the blocks being compared are the same size as each other,
the entropy-based score comparisons (without the above bias) are unaffected
by the base of the log, or the divisor used.
(Mathematically, if 'a+b=c+d', then if 'a*log(a) + b*log(b) < c*log(c) +
d*log(d)' then it is also true that 'a*log(a/N) + b*log(b/N) < c*log(c/N) +
d*log(d/N)'.
If that doesn't make sense, it helps to note that 'log(a/N) = log(a) -
log(N)', so the log(N)'s cancel out.)

This means that we can use a single score table for all block sizes! It
only needs to be big enough for the largest block size, then it produces
optimal scores for all blocks up to that size. It does mean that partial
blocks with uniform bytes won't have a score of 0, but this property of
entropy is not actually important when scoring the blocks.
(Overall, this is a significant relief, because my attempts at multiple
score tables resulted in a lot of tricky decisions about how best to keep
the code DRY and minimise complexity, and I wasn't really happy with my
solution.)

I am planning to organise my submission into two patches. I intend for one
to focus on problems/inefficiencies in block comparisons/score tables. The
other will be focused on motion estimation, i.e. the range of MV's that are
compared, and to a very small extent, the order they're compared in.
This avoids the overhead of patching each individual issue separately, but
it will provide a logical split between the two main areas of change.

Kind regards, Matthew
Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal
Sat 2019-01-19 at 22:40 +, Matthew Fearnley wrote:
> On Tue, 25 Dec 2018 at 09:35, Tomas Härdin wrote:
> > Sat 2018-12-22 at 15:32 +, Matthew Fearnley wrote:
> > > > > Note that bw,bh aren't guaranteed to equal ZMBV_BLOCK, so
> > > > > `histogram[0] == bw*bh` would have to be used to guard against
> > > > > those (literal) edge cases.
> > > >
> > > > Right, yes. But if we have block sizes other than ZMBV_BLOCK^2 then we
> > > > need score tables for those sizes too.
> > >
> > > I've thought about that a bit. I wondered if it would be worth it given:
> > > - the extra code, memory and logic needed
> >
> > If you have a huge amount of DOS captures to optimize then it might be
> > worth it, else probably questionable
> >
> > > - it would only improve the edge blocks
> >
> > I imagine large blocks would be good for scenes with mostly global
> > motion. You cut down on the number of MVs and thus the amount of data
> > zlib has to compress, if the block size is a good fit.
> >
> > > - the existing score table isn't catastrophically bad for short blocks,
> > > and would still favour blocks with more common pixels.
> > >
> > > It would be better from a correctness perspective though, and effects on
> > > running time should be negligible.
> >
> > Good point. There's also no telling whether the current model is
> > actually an accurate prediction of how zlib behaves :)
>
> Hi. Just to say, I tried setting up additional score_tabs for the
> bottom/right partial blocks. Unfortunately, after implementing it and
> ironing out the bugs, the new score tables actually caused a slight
> increase in file size!
> After a little informal investigation with a couple of videos, my findings
> were that increasing the divisor - '(i / (double)(ZMBV_BLOCK * ZMBV_BLOCK *
> some_factor))' - would generally lead to slight compression improvements.
> Given the score table is still "valid" for smaller blocks, and given the
> extra complexity of adding the score tables, plus the fact that it may
> generally hurt the compression, I'm inclined to leave it with the one score
> table. But there may be potential to improve the current compression
> method in future, by somehow tuning the divisor for better results
> generally.

Huh, that's strange. Sounds like something that warrants a comment in
the code. I also see we have an answer to Carl's question: you're still
experimenting with this :) I think we can hold off on pushing anything
until you feel happy with the result

/Tomas
Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal
On Tue, 25 Dec 2018 at 09:35, Tomas Härdin wrote:
> Sat 2018-12-22 at 15:32 +, Matthew Fearnley wrote:
> > > > Note that bw,bh aren't guaranteed to equal ZMBV_BLOCK, so
> > > > `histogram[0] == bw*bh` would have to be used to guard against
> > > > those (literal) edge cases.
> > >
> > > Right, yes. But if we have block sizes other than ZMBV_BLOCK^2 then we
> > > need score tables for those sizes too.
> >
> > I've thought about that a bit. I wondered if it would be worth it given:
> > - the extra code, memory and logic needed
>
> If you have a huge amount of DOS captures to optimize then it might be
> worth it, else probably questionable
>
> > - it would only improve the edge blocks
>
> I imagine large blocks would be good for scenes with mostly global
> motion. You cut down on the number of MVs and thus the amount of data
> zlib has to compress, if the block size is a good fit.
>
> > - the existing score table isn't catastrophically bad for short blocks,
> > and would still favour blocks with more common pixels.
> >
> > It would be better from a correctness perspective though, and effects on
> > running time should be negligible.
>
> Good point. There's also no telling whether the current model is
> actually an accurate prediction of how zlib behaves :)

Hi. Just to say, I tried setting up additional score_tabs for the
bottom/right partial blocks. Unfortunately, after implementing it and
ironing out the bugs, the new score tables actually caused a slight
increase in file size!
After a little informal investigation with a couple of videos, my findings
were that increasing the divisor - '(i / (double)(ZMBV_BLOCK * ZMBV_BLOCK *
some_factor))' - would generally lead to slight compression improvements.
Given the score table is still "valid" for smaller blocks, and given the
extra complexity of adding the score tables, plus the fact that it may
generally hurt the compression, I'm inclined to leave it with the one score
table. But there may be potential to improve the current compression
method in future, by somehow tuning the divisor for better results
generally.

Matthew
Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal
Sat 2018-12-22 at 15:32 +, Matthew Fearnley wrote:
> > > Note that bw,bh aren't guaranteed to equal ZMBV_BLOCK, so `histogram[0] ==
> > > bw*bh` would have to be used to guard against those (literal) edge cases.
> >
> > Right, yes. But if we have block sizes other than ZMBV_BLOCK^2 then we
> > need score tables for those sizes too.
>
> I've thought about that a bit. I wondered if it would be worth it given:
> - the extra code, memory and logic needed

If you have a huge amount of DOS captures to optimize then it might be
worth it, else probably questionable

> - it would only improve the edge blocks

I imagine large blocks would be good for scenes with mostly global
motion. You cut down on the number of MVs and thus the amount of data
zlib has to compress, if the block size is a good fit.

> - the existing score table isn't catastrophically bad for short blocks, and
> would still favour blocks with more common pixels.
>
> It would be better from a correctness perspective though, and effects on
> running time should be negligible.

Good point. There's also no telling whether the current model is
actually an accurate prediction of how zlib behaves :)

/Tomas
Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal
> On 22 Dec 2018, at 12:11, Tomas Härdin wrote:
>
> Thu 2018-12-20 at 17:46 +, Matthew Fearnley wrote:
> > On Thu, 20 Dec 2018 at 16:30, Tomas Härdin wrote:
> > > I have a feeling this could be sped up further by just doing *xored =
> > > histogram[0] == ZMBV_BLOCK*ZMBV_BLOCK after the loops, if [PATCH 3/4]
> > > is applied before this. Computing both histogram and xored in the loop
> > > seems pointless.
> >
> > You're right, that speedup didn't occur to me. It makes the logic a bit
> > more tenuous, but it would be more efficient.
>
> Eh, I wouldn't really call "we have xored data to output if and only if
> number of zeroes < number of pixels" fairly easy to grasp. A comment
> might be good tho

Agreed. Just have to get the patch order sorted out now somehow.

> > Note that bw,bh aren't guaranteed to equal ZMBV_BLOCK, so `histogram[0] ==
> > bw*bh` would have to be used to guard against those (literal) edge cases.
>
> Right, yes. But if we have block sizes other than ZMBV_BLOCK^2 then we
> need score tables for those sizes too.

I've thought about that a bit. I wondered if it would be worth it given:
- the extra code, memory and logic needed
- it would only improve the edge blocks
- the existing score table isn't catastrophically bad for short blocks, and
would still favour blocks with more common pixels.

It would be better from a correctness perspective though, and effects on
running time should be negligible.

I can work on a patch, see how it looks.
Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal
Thu 2018-12-20 at 17:46 +, Matthew Fearnley wrote:
> On Thu, 20 Dec 2018 at 16:30, Tomas Härdin wrote:
> > I have a feeling this could be sped up further by just doing *xored =
> > histogram[0] == ZMBV_BLOCK*ZMBV_BLOCK after the loops, if [PATCH 3/4]
> > is applied before this. Computing both histogram and xored in the loop
> > seems pointless.
>
> You're right, that speedup didn't occur to me. It makes the logic a bit
> more tenuous, but it would be more efficient.

Eh, I wouldn't really call "we have xored data to output if and only if
number of zeroes < number of pixels" fairly easy to grasp. A comment
might be good tho

> Note that bw,bh aren't guaranteed to equal ZMBV_BLOCK, so `histogram[0] ==
> bw*bh` would have to be used to guard against those (literal) edge cases.

Right, yes. But if we have block sizes other than ZMBV_BLOCK^2 then we
need score tables for those sizes too.

/Tomas
Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal
On Thu, 20 Dec 2018 at 16:30, Tomas Härdin wrote:
> I have a feeling this could be sped up further by just doing *xored =
> histogram[0] == ZMBV_BLOCK*ZMBV_BLOCK after the loops, if [PATCH 3/4]
> is applied before this. Computing both histogram and xored in the loop
> seems pointless.

You're right, that speedup didn't occur to me. It makes the logic a bit
more tenuous, but it would be more efficient.

Note that bw,bh aren't guaranteed to equal ZMBV_BLOCK, so `histogram[0] ==
bw*bh` would have to be used to guard against those (literal) edge cases.
Re: [FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal
Wed 2018-12-19 at 22:00 +, matthew.w.fearn...@gmail.com wrote:
> From: Matthew Fearnley
>
> If *xored is 0, then histogram[0]==bw*bh and histogram[1..255]==0.
>
> Because histogram[0] is skipped over for the entropy calculation, the
> return value is always 0 when *xored==0, so we don't need to waste time
> calculating it.
>
> This addition both clarifies the behaviour of the code and improves
> the speed when the block matches.
> ---
>  libavcodec/zmbvenc.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/libavcodec/zmbvenc.c b/libavcodec/zmbvenc.c
> index 4d9147657d..2f041dae32 100644
> --- a/libavcodec/zmbvenc.c
> +++ b/libavcodec/zmbvenc.c
> @@ -71,6 +71,7 @@ static inline int block_cmp(ZmbvEncContext *c, uint8_t *src, int stride,
>      int i, j;
>      uint8_t histogram[256] = {0};
>
> +    /* build frequency histogram of byte values for src[] ^ src2[] */
>      *xored = 0;
>      for(j = 0; j < bh; j++){
>          for(i = 0; i < bw; i++){
> @@ -82,6 +83,10 @@ static inline int block_cmp(ZmbvEncContext *c, uint8_t *src, int stride,
>          src2 += stride2;
>      }
>
> +    /* early out if src and src2 are equal */
> +    if (!*xored) return 0;

I have a feeling this could be sped up further by just doing *xored =
histogram[0] == ZMBV_BLOCK*ZMBV_BLOCK after the loops, if [PATCH 3/4]
is applied before this. Computing both histogram and xored in the loop
seems pointless.

Beyond that this looks good

/Tomas
[FFmpeg-devel] [PATCH 1/4] zmbvenc: don't sum the entropy when blocks are equal
From: Matthew Fearnley

If *xored is 0, then histogram[0]==bw*bh and histogram[1..255]==0.

Because histogram[0] is skipped over for the entropy calculation, the
return value is always 0 when *xored==0, so we don't need to waste time
calculating it.

This addition both clarifies the behaviour of the code and improves
the speed when the block matches.
---
 libavcodec/zmbvenc.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/libavcodec/zmbvenc.c b/libavcodec/zmbvenc.c
index 4d9147657d..2f041dae32 100644
--- a/libavcodec/zmbvenc.c
+++ b/libavcodec/zmbvenc.c
@@ -71,6 +71,7 @@ static inline int block_cmp(ZmbvEncContext *c, uint8_t *src, int stride,
     int i, j;
     uint8_t histogram[256] = {0};
 
+    /* build frequency histogram of byte values for src[] ^ src2[] */
     *xored = 0;
     for(j = 0; j < bh; j++){
         for(i = 0; i < bw; i++){
@@ -82,6 +83,10 @@ static inline int block_cmp(ZmbvEncContext *c, uint8_t *src, int stride,
         src2 += stride2;
     }
 
+    /* early out if src and src2 are equal */
+    if (!*xored) return 0;
+
+    /* sum the entropy of the non-zero values */
     for(i = 1; i < 256; i++)
         sum += c->score_tab[histogram[i]];
-- 
2.17.1