Re: [FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
On Fri, Feb 22, 2019 at 01:09:12PM +0100, Tomas Härdin wrote:
> fre 2019-02-22 klockan 00:11 +0100 skrev Michael Niedermayer:
> > On Thu, Feb 21, 2019 at 09:26:44PM +0000, Matthew Fearnley wrote:
> > > - To clear up the effects: the change from 'i=1' to 'i=0' was the only change that should make any difference in the output.
> > > The rest of the changes were to improve the speed of an inner loop, and to fix two bugs that happily cancelled each other out, but would have broken if the code were changed to emit larger blocks. These should not lead to changes in the blocks chosen or the final compression size.
> > >
> > > - Regarding the worse compression: let me start by stating that scoring on byte frequency alone will not always predict well how Deflate will work, since Deflate uses pattern matching as well.
> > > Frequency-based block scoring will only ever be a heuristic. There may be scope for improving the scoring, but it would add a lot of complexity, and I think it would be easy to make things worse rather than better. (I found this in my brief experiments with including the frequencies from the previous block.)
> >
> > Implementing the exact scoring by using Deflate itself would allow us to understand how much there is between the heuristic and what is achievable. If there is a gap that is significant, then it may be interesting to look at improving the heuristic; if the gap is small, then the whole heuristic improvement path can be discarded.
> > The obvious improvement for the heuristic would be to move to higher order entropy estimation. ATM, IIUC, this uses a 0th order model. 1st order could be tried, or any fast compression algorithm could also be used to estimate entropy.
>
> We tried a sliding window order-0 model, which did worse. An order-1 model might be useful, but would need quite a bit of context for it to be meaningful.
>
> > It also would be interesting to see how a higher order heuristic or some simple fast LZ-like compressor would perform with the previous block. That is, if the issue you saw with the previous block inclusion was because of the very simplistic entropy estimation.
>
> zlib at level 1 maybe? You could also have a multi-pass thing where it first does a coarse pass, then tries to jiggle MVs and greedily accept any change that results in smaller output.

Yes, but I would avoid using external code like zlib for this. It's not designed for fast entropy estimation of many blocks.

[...]

-- Michael
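To make the "fast, in-encoder, higher order" idea above concrete, here is a minimal sketch of an order-1 entropy estimate for one block of xored bytes. It is illustrative only, not part of the patch or of zmbvenc.c: the function name and interface are made up, and the reduced 16-way context (high nibble of the previous byte) is an assumption to keep the per-block statistics from being hopelessly sparse. As Tomas notes, a single 16x16 block may simply not contain enough data for an order-1 model to beat the order-0 score.

#include <math.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: estimate the order-1 entropy
 * (in bits) of one bw x bh block of xored bytes, conditioning each byte on
 * the high nibble of the previous byte (16 contexts). */
static double order1_entropy_bits(const uint8_t *blk, int bw, int bh, int stride)
{
    uint16_t hist[16][256] = {{0}};
    uint32_t ctx_total[16] = {0};
    uint8_t prev = 0;
    double bits = 0.0;

    for (int j = 0; j < bh; j++) {
        for (int i = 0; i < bw; i++) {
            uint8_t v = blk[j * stride + i];
            int ctx = prev >> 4;
            hist[ctx][v]++;
            ctx_total[ctx]++;
            prev = v;
        }
    }
    /* Empirical conditional entropy: sum of -log2 p(v | ctx) over all bytes. */
    for (int c = 0; c < 16; c++)
        for (int v = 0; v < 256; v++)
            if (hist[c][v])
                bits -= hist[c][v] * log2((double)hist[c][v] / ctx_total[c]);
    return bits;
}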
Re: [FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
fre 2019-02-22 klockan 13:09 +0100 skrev Tomas Härdin:
> fre 2019-02-22 klockan 00:11 +0100 skrev Michael Niedermayer:
> > On Thu, Feb 21, 2019 at 09:26:44PM +0000, Matthew Fearnley wrote:
> > > - To clear up the effects: the change from 'i=1' to 'i=0' was the only change that should make any difference in the output.
> > > The rest of the changes were to improve the speed of an inner loop, and to fix two bugs that happily cancelled each other out, but would have broken if the code were changed to emit larger blocks. These should not lead to changes in the blocks chosen or the final compression size.
> > >
> > > - Regarding the worse compression: let me start by stating that scoring on byte frequency alone will not always predict well how Deflate will work, since Deflate uses pattern matching as well.
> > > Frequency-based block scoring will only ever be a heuristic. There may be scope for improving the scoring, but it would add a lot of complexity, and I think it would be easy to make things worse rather than better. (I found this in my brief experiments with including the frequencies from the previous block.)
> >
> > Implementing the exact scoring by using Deflate itself would allow us to understand how much there is between the heuristic and what is achievable. If there is a gap that is significant, then it may be interesting to look at improving the heuristic; if the gap is small, then the whole heuristic improvement path can be discarded.
> > The obvious improvement for the heuristic would be to move to higher order entropy estimation. ATM, IIUC, this uses a 0th order model. 1st order could be tried, or any fast compression algorithm could also be used to estimate entropy.
>
> We tried a sliding window order-0 model, which did worse. An order-1 model might be useful, but would need quite a bit of context for it to be meaningful.

I did some looking into this because I was unsure, and it turns out evaluating the entropy of an order-1 model involves computing the eigenvector of the transfer matrix corresponding to the eigenvalue 1, and computing the scalar product between that and the order-0 entropy of each column of the transfer matrix. A pricey operation.

With some luck, computing the entropy change for a small perturbation (MV change) over the transfer matrix for the entire frame *might* be cheap enough to be useful. Maybe. The resulting matrix difference has a particular structure that might be exploitable..

/Tomas
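For reference, the quantity described here is the entropy rate of a first-order (Markov) source; this is standard information theory, restated only to make the cost concrete, not something asserted in the thread. With P the transfer matrix estimated from the data (here taken row-stochastic, i.e. rows normalised to sum to 1) and pi its stationary distribution:

\[
\pi P = \pi, \qquad \sum_i \pi_i = 1, \qquad
H_1 = -\sum_i \pi_i \sum_j P_{ij} \log_2 P_{ij}
\]

For 8-bit symbols P is 256x256, so solving for pi (the eigenvector for eigenvalue 1) and then weighting the per-row order-0 entropies is far too expensive to redo for every candidate motion vector, which is the point being made about only evaluating small perturbations.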
Re: [FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
fre 2019-02-22 klockan 00:11 +0100 skrev Michael Niedermayer:
> On Thu, Feb 21, 2019 at 09:26:44PM +0000, Matthew Fearnley wrote:
> > - To clear up the effects: the change from 'i=1' to 'i=0' was the only change that should make any difference in the output.
> > The rest of the changes were to improve the speed of an inner loop, and to fix two bugs that happily cancelled each other out, but would have broken if the code were changed to emit larger blocks. These should not lead to changes in the blocks chosen or the final compression size.
> >
> > - Regarding the worse compression: let me start by stating that scoring on byte frequency alone will not always predict well how Deflate will work, since Deflate uses pattern matching as well.
> > Frequency-based block scoring will only ever be a heuristic. There may be scope for improving the scoring, but it would add a lot of complexity, and I think it would be easy to make things worse rather than better. (I found this in my brief experiments with including the frequencies from the previous block.)
>
> Implementing the exact scoring by using Deflate itself would allow us to understand how much there is between the heuristic and what is achievable. If there is a gap that is significant, then it may be interesting to look at improving the heuristic; if the gap is small, then the whole heuristic improvement path can be discarded.
> The obvious improvement for the heuristic would be to move to higher order entropy estimation. ATM, IIUC, this uses a 0th order model. 1st order could be tried, or any fast compression algorithm could also be used to estimate entropy.

We tried a sliding window order-0 model, which did worse. An order-1 model might be useful, but would need quite a bit of context for it to be meaningful.

> It also would be interesting to see how a higher order heuristic or some simple fast LZ-like compressor would perform with the previous block. That is, if the issue you saw with the previous block inclusion was because of the very simplistic entropy estimation.

zlib at level 1 maybe? You could also have a multi-pass thing where it first does a coarse pass, then tries to jiggle MVs and greedily accept any change that results in smaller output.

/Tomas
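The multi-pass idea in the last paragraph could look roughly like the following. This is purely illustrative and not existing zmbvenc code: the MV type and the frame_cost() callback (re-encode the frame with the given vectors and return the compressed size) are hypothetical, and re-running the compressor for every perturbation is exactly what would make such a pass slow.

#include <stddef.h>

typedef struct { int dx, dy; } MV;    /* hypothetical per-block motion vector */

static void refine_mvs(MV *mv, int nblocks, int me_range,
                       size_t (*frame_cost)(const MV *mv, int nblocks, void *ctx),
                       void *ctx)
{
    size_t best = frame_cost(mv, nblocks, ctx);
    int improved = 1;

    while (improved) {
        improved = 0;
        for (int b = 0; b < nblocks; b++) {
            static const int step[4][2] = { {1,0}, {-1,0}, {0,1}, {0,-1} };
            for (int s = 0; s < 4; s++) {
                MV old = mv[b];
                mv[b].dx = old.dx + step[s][0];
                mv[b].dy = old.dy + step[s][1];
                if (mv[b].dx < -me_range || mv[b].dx > me_range ||
                    mv[b].dy < -me_range || mv[b].dy > me_range) {
                    mv[b] = old;
                    continue;
                }
                size_t cost = frame_cost(mv, nblocks, ctx);
                if (cost < best) {        /* keep only strict improvements */
                    best = cost;
                    improved = 1;
                } else {
                    mv[b] = old;          /* revert the perturbation */
                }
            }
        }
    }
}

The repeated frame_cost() calls are why something like this would only make sense as an optional slow mode, along the lines of the "slower mode" mentioned later in the thread.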
Re: [FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
tor 2019-02-21 klockan 15:26 +0100 skrev Carl Eugen Hoyos:
> 2019-02-21 11:26 GMT+01:00, Tomas Härdin :
> > ons 2019-02-20 klockan 23:33 +0100 skrev Carl Eugen Hoyos:
> > > 2019-02-10 16:42 GMT+01:00, Tomas Härdin :
> > > > lör 2019-02-09 klockan 13:10 +0000 skrev Matthew Fearnley:
> > > > > - Improve block choices by counting 0-bytes in the entropy score
> > > > > - Make histogram use uint16_t type, to allow byte counts from 16*16 (current block size) up to 255*255 (maximum allowed 8bpp block size)
> > > > > - Make sure score table is big enough for a full block's worth of bytes
> > > > > - Calculate *xored without using code in inner loop
> > > > > ---
> > > > >  libavcodec/zmbvenc.c | 22 ++++++++++++++++------
> > > > >  1 file changed, 16 insertions(+), 6 deletions(-)
> > > >
> > > > Passes FATE, looks good to me
> > >
> > > I believe you asked on IRC about FATE tests:
> > > Since such a test would depend on the zlib version, you cannot test exact output, you could only test a round-trip (assuming the codec really is lossless).
> >
> > Right, we'd need to embed a specific version of zlib. But we probably don't want to do that, for many reasons.
> >
> > A dummy DEFLATE implementation that just passes the bytes along could be useful. I know there's a mode in the DEFLATE format for blocks that fail to compress. That way we could test many encoders that depend on zlib, and their output should decode just fine.
>
> What's wrong with a round-trip test?

Nothing I suppose. But it might be nice to catch unintended changes to the encoder's output.

/Tomas
Re: [FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
On Thu, Feb 21, 2019 at 09:26:44PM +0000, Matthew Fearnley wrote:
> On Thu, 21 Feb 2019 at 19:28, Michael Niedermayer wrote:
> > On Thu, Feb 21, 2019 at 11:19:08AM +0100, Tomas Härdin wrote:
> > > ons 2019-02-20 klockan 22:12 +0100 skrev Michael Niedermayer:
> > > > On Sat, Feb 09, 2019 at 01:10:20PM +0000, Matthew Fearnley wrote:
> > > > > - Improve block choices by counting 0-bytes in the entropy score
> > > > > - Make histogram use uint16_t type, to allow byte counts from 16*16 (current block size) up to 255*255 (maximum allowed 8bpp block size)
> > > > > - Make sure score table is big enough for a full block's worth of bytes
> > > > > - Calculate *xored without using code in inner loop
> > > >
> > > > This should have been split into multiple changes
> > >
> > > Alas
> > >
> > > > compression seems to become slightly worse from this change
> > > >
> > > > ./ffmpeg -i matrixbench_mpeg2.mpg -vframes 30 -vcodec zmbv -an -y test-old.avi
> > > > ./ffmpeg -i matrixbench_mpeg2.mpg -vframes 30 -vcodec zmbv -an -y test-new.avi
> > > >
> > > > -rw-r- 1 michael michael 1175466 Feb 20 22:06 test-new.avi
> > > > -rw-r- 1 michael michael 1174832 Feb 20 22:07 test-old.avi
> > >
> > > A whopping 0.05% change,
> >
> > No, it's just a tiny difference, but I think it still would make sense to understand it.
> > Generally a patch doing 4 improvements would be expected to improve things.
> > This was not a hand-picked case. I mean, it's not as if I saw 10 cases improving things and I picked the 11th. I looked at one case and that was worse.
> >
> > Also, either way the patch should have been split, don't you agree?
> > It also would make it much easier to identify within seconds which change actually caused the 0.05% difference. (Maybe avoiding this discussion, as it might be obvious from that whether this is an omission in the commit message, a bug, a rare outlier ...)
>
> Hi Michael. First, let me say thanks for your feedback on this patch.
> Let me address some of this..
>
> - Yes: in retrospect, I think I should have split the patch.
> I struggled here to work out how to split up my changes, and spent too long staring too closely at the code to see this.
> I still perhaps have some way to go before I would be able to get a proper feel for the general FFmpeg development ecosystem.

If something is unclear, ask. Yes, I know this is often not so simple, but especially in the case of, for example, splitting patches, if you don't know how, it's easy to ask.

> - To clear up the effects: the change from 'i=1' to 'i=0' was the only change that should make any difference in the output.
> The rest of the changes were to improve the speed of an inner loop, and to fix two bugs that happily cancelled each other out, but would have broken if the code were changed to emit larger blocks. These should not lead to changes in the blocks chosen or the final compression size.
>
> - Regarding the worse compression: let me start by stating that scoring on byte frequency alone will not always predict well how Deflate will work, since Deflate uses pattern matching as well.
> Frequency-based block scoring will only ever be a heuristic. There may be scope for improving the scoring, but it would add a lot of complexity, and I think it would be easy to make things worse rather than better. (I found this in my brief experiments with including the frequencies from the previous block.)

Implementing the exact scoring by using Deflate itself would allow us to understand how much there is between the heuristic and what is achievable. If there is a gap that is significant, then it may be interesting to look at improving the heuristic; if the gap is small, then the whole heuristic improvement path can be discarded.
The obvious improvement for the heuristic would be to move to higher order entropy estimation. ATM, IIUC, this uses a 0th order model. 1st order could be tried, or any fast compression algorithm could also be used to estimate entropy.

It also would be interesting to see how a higher order heuristic or some simple fast LZ-like compressor would perform with the previous block. That is, if the issue you saw with the previous block inclusion was because of the very simplistic entropy estimation.

> I think the main thing that persuaded me to make this particular change from 1 to 0 was that it gave a pure entropy-based score, and allowed me, mathematically, to use the same scoring table on both full and partial blocks, without having to worry about the effects of omitting the 0-bytes in the score in each case.
>
> I focused my testing on the available ZMBV samples, which were likely all generated by DOSBox, and my findings were that while not all movies were smaller, overall the gains outweighed the losses.
> I'm slightly sad to hear it didn't improve this real-world movie sample.
> But I think this level in this case is acceptable, and not outside the expected range of size difference that I've seen.
Re: [FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
On Thu, 21 Feb 2019 at 19:28, Michael Niedermayer wrote:
> On Thu, Feb 21, 2019 at 11:19:08AM +0100, Tomas Härdin wrote:
> > ons 2019-02-20 klockan 22:12 +0100 skrev Michael Niedermayer:
> > > On Sat, Feb 09, 2019 at 01:10:20PM +0000, Matthew Fearnley wrote:
> > > > - Improve block choices by counting 0-bytes in the entropy score
> > > > - Make histogram use uint16_t type, to allow byte counts from 16*16 (current block size) up to 255*255 (maximum allowed 8bpp block size)
> > > > - Make sure score table is big enough for a full block's worth of bytes
> > > > - Calculate *xored without using code in inner loop
> > >
> > > This should have been split into multiple changes
> >
> > Alas
> >
> > > compression seems to become slightly worse from this change
> > >
> > > ./ffmpeg -i matrixbench_mpeg2.mpg -vframes 30 -vcodec zmbv -an -y test-old.avi
> > > ./ffmpeg -i matrixbench_mpeg2.mpg -vframes 30 -vcodec zmbv -an -y test-new.avi
> > >
> > > -rw-r- 1 michael michael 1175466 Feb 20 22:06 test-new.avi
> > > -rw-r- 1 michael michael 1174832 Feb 20 22:07 test-old.avi
> >
> > A whopping 0.05% change,
>
> No, it's just a tiny difference, but I think it still would make sense to understand it.
> Generally a patch doing 4 improvements would be expected to improve things.
> This was not a hand-picked case. I mean, it's not as if I saw 10 cases improving things and I picked the 11th. I looked at one case and that was worse.
>
> Also, either way the patch should have been split, don't you agree?
> It also would make it much easier to identify within seconds which change actually caused the 0.05% difference. (Maybe avoiding this discussion, as it might be obvious from that whether this is an omission in the commit message, a bug, a rare outlier ...)

Hi Michael. First, let me say thanks for your feedback on this patch.
Let me address some of this..

- Yes: in retrospect, I think I should have split the patch.
I struggled here to work out how to split up my changes, and spent too long staring too closely at the code to see this.
I still perhaps have some way to go before I would be able to get a proper feel for the general FFmpeg development ecosystem.

- To clear up the effects: the change from 'i=1' to 'i=0' was the only change that should make any difference in the output.
The rest of the changes were to improve the speed of an inner loop, and to fix two bugs that happily cancelled each other out, but would have broken if the code were changed to emit larger blocks. These should not lead to changes in the blocks chosen or the final compression size.

- Regarding the worse compression: let me start by stating that scoring on byte frequency alone will not always predict well how Deflate will work, since Deflate uses pattern matching as well.
Frequency-based block scoring will only ever be a heuristic. There may be scope for improving the scoring, but it would add a lot of complexity, and I think it would be easy to make things worse rather than better. (I found this in my brief experiments with including the frequencies from the previous block.)

I think the main thing that persuaded me to make this particular change from 1 to 0 was that it gave a pure entropy-based score, and allowed me, mathematically, to use the same scoring table on both full and partial blocks, without having to worry about the effects of omitting the 0-bytes in the score in each case.
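For readers following along, the score being discussed works out as below; this only restates the code in the patch at the bottom of the thread, it is not an additional claim made in this message. With n_v the count of xored byte value v in a block and B = ZMBV_BLOCK = 16 (terms with n_v = 0 contribute nothing),

\[
\mathrm{score} \;=\; \sum_{v=0}^{255} \mathrm{score\_tab}[n_v]
\;=\; 256 \sum_{v=0}^{255} n_v \log_2\!\frac{B^2}{n_v},
\]

so for a full B x B block, where the counts sum to B^2, the score equals 256 * B^2 * H, with H the empirical (order-0) entropy of the xored bytes in bits per byte. Starting the summation at v = 1, as the old code did, drops the n_0 term and breaks that interpretation.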
I focused my testing on the available ZMBV samples, which were likely all generated by DOSBox, and my findings were that while not all movies were smaller, overall the gains outweighed the losses.
I'm slightly sad to hear it didn't improve this real-world movie sample.
But I think this level in this case is acceptable, and not outside the expected range of size difference that I've seen.
And the improvements to the motion estimation range (from the other patch) should tend to lead to a compression improvement overall.

Maybe I should have been clearer though, that the scoring changes won't bring universal improvements.
And if someone looks into this and finds on average results are worsened by the change, I won't complain if it gets reverted.

Thanks for your feedback, I appreciate it.

Matthew

> > with an MPEG source rather than the intended PAL8..
>
> The input to the encoder is always PAL8, so I assume you mean "old computer game material" vs. "realistic video".
>
> > The flip side here is that we can now use larger block sizes and higher me_range without crashing the encoder. We can add a slower mode that tests a few different block sizes to see which one results in the smallest output.
> >
> > /Tomas
Re: [FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
On Thu, Feb 21, 2019 at 11:19:08AM +0100, Tomas Härdin wrote:
> ons 2019-02-20 klockan 22:12 +0100 skrev Michael Niedermayer:
> > On Sat, Feb 09, 2019 at 01:10:20PM +0000, Matthew Fearnley wrote:
> > > - Improve block choices by counting 0-bytes in the entropy score
> > > - Make histogram use uint16_t type, to allow byte counts from 16*16 (current block size) up to 255*255 (maximum allowed 8bpp block size)
> > > - Make sure score table is big enough for a full block's worth of bytes
> > > - Calculate *xored without using code in inner loop
> >
> > This should have been split into multiple changes
>
> Alas
>
> > compression seems to become slightly worse from this change
> >
> > ./ffmpeg -i matrixbench_mpeg2.mpg -vframes 30 -vcodec zmbv -an -y test-old.avi
> > ./ffmpeg -i matrixbench_mpeg2.mpg -vframes 30 -vcodec zmbv -an -y test-new.avi
> >
> > -rw-r- 1 michael michael 1175466 Feb 20 22:06 test-new.avi
> > -rw-r- 1 michael michael 1174832 Feb 20 22:07 test-old.avi
>
> A whopping 0.05% change,

No, it's just a tiny difference, but I think it still would make sense to understand it.
Generally a patch doing 4 improvements would be expected to improve things.
This was not a hand-picked case. I mean, it's not as if I saw 10 cases improving things and I picked the 11th. I looked at one case and that was worse.

Also, either way the patch should have been split, don't you agree?
It also would make it much easier to identify within seconds which change actually caused the 0.05% difference. (Maybe avoiding this discussion, as it might be obvious from that whether this is an omission in the commit message, a bug, a rare outlier ...)

> with an MPEG source rather than the intended PAL8..

The input to the encoder is always PAL8, so I assume you mean "old computer game material" vs. "realistic video".

> The flip side here is that we can now use larger block sizes and higher me_range without crashing the encoder. We can add a slower mode that tests a few different block sizes to see which one results in the smallest output.
>
> /Tomas

-- Michael
Re: [FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
2019-02-21 15:08 GMT+01:00, Matthew Fearnley :
> The question would still remain, though, of the best way to incorporate it into FFmpeg. That's probably beyond my current knowledge..

It belongs in libavutil, see also the unfinished inflate implementation:
https://github.com/michaelni/FFmpeg/tree/inflate

Carl Eugen
Re: [FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
2019-02-21 11:26 GMT+01:00, Tomas Härdin :
> ons 2019-02-20 klockan 23:33 +0100 skrev Carl Eugen Hoyos:
> > 2019-02-10 16:42 GMT+01:00, Tomas Härdin :
> > > lör 2019-02-09 klockan 13:10 +0000 skrev Matthew Fearnley:
> > > > - Improve block choices by counting 0-bytes in the entropy score
> > > > - Make histogram use uint16_t type, to allow byte counts from 16*16 (current block size) up to 255*255 (maximum allowed 8bpp block size)
> > > > - Make sure score table is big enough for a full block's worth of bytes
> > > > - Calculate *xored without using code in inner loop
> > > > ---
> > > >  libavcodec/zmbvenc.c | 22 ++++++++++++++++------
> > > >  1 file changed, 16 insertions(+), 6 deletions(-)
> > >
> > > Passes FATE, looks good to me
> >
> > I believe you asked on IRC about FATE tests:
> > Since such a test would depend on the zlib version, you cannot test exact output, you could only test a round-trip (assuming the codec really is lossless).
>
> Right, we'd need to embed a specific version of zlib. But we probably don't want to do that, for many reasons.
>
> A dummy DEFLATE implementation that just passes the bytes along could be useful. I know there's a mode in the DEFLATE format for blocks that fail to compress. That way we could test many encoders that depend on zlib, and their output should decode just fine.

What's wrong with a round-trip test?

Carl Eugen
Re: [FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
On Thu, 21 Feb 2019 at 10:26, Tomas Härdin wrote:
> ons 2019-02-20 klockan 23:33 +0100 skrev Carl Eugen Hoyos:
> > 2019-02-10 16:42 GMT+01:00, Tomas Härdin :
> > > lör 2019-02-09 klockan 13:10 +0000 skrev Matthew Fearnley:
> > > > - Improve block choices by counting 0-bytes in the entropy score
> > > > - Make histogram use uint16_t type, to allow byte counts from 16*16 (current block size) up to 255*255 (maximum allowed 8bpp block size)
> > > > - Make sure score table is big enough for a full block's worth of bytes
> > > > - Calculate *xored without using code in inner loop
> > > > ---
> > > >  libavcodec/zmbvenc.c | 22 ++++++++++++++++------
> > > >  1 file changed, 16 insertions(+), 6 deletions(-)
> > >
> > > Passes FATE, looks good to me
> >
> > I believe you asked on IRC about FATE tests:
> > Since such a test would depend on the zlib version, you cannot test exact output, you could only test a round-trip (assuming the codec really is lossless).
>
> Right, we'd need to embed a specific version of zlib. But we probably don't want to do that, for many reasons.
>
> A dummy DEFLATE implementation that just passes the bytes along could be useful. I know there's a mode in the DEFLATE format for blocks that fail to compress. That way we could test many encoders that depend on zlib, and their output should decode just fine.

I know the DEFLATE format fairly well, and a dummy DEFLATE implementation would certainly be possible.
An uncompressed block can be up to 65535 bytes, and has a simple five-byte header (subject to some bit alignment issues, but they only come into play when other block types are also used).

Here's a simple code example, the guts of which fit into <70 lines:
https://gist.github.com/countingpine/602453d330ea055a4bcab90ca2a7ed02

It allows data to be compressed in chunks, allowing a stream to be split across frames, like in ZMBV. The first chunk needs the START flag, and the last chunk (if any) needs the FINISH flag.
It also allows adding an Adler32 checksum - although ZMBV itself doesn't need this, because it never "finishes" the stream.

The question would still remain, though, of the best way to incorporate it into FFmpeg. That's probably beyond my current knowledge..

Matthew

> /Tomas
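To illustrate the "stored block" mechanism being referred to, here is a minimal sketch based on RFC 1951; it is not FFmpeg code and not the linked gist. A stored block is a one-byte header carrying BFINAL and BTYPE=00, a little-endian LEN of up to 65535, its one's complement NLEN, and then the raw bytes - five bytes of overhead per block, as long as the output stays byte-aligned, which it does if only stored blocks are ever emitted.

#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: append one DEFLATE "stored" (uncompressed) block.
 * Returns bytes written, or 0 if dst is too small. */
static size_t put_stored_block(uint8_t *dst, size_t dst_size,
                               const uint8_t *src, uint16_t len, int final)
{
    if (dst_size < 5 + (size_t)len)
        return 0;
    dst[0] = final ? 1 : 0;   /* bit 0: BFINAL; bits 1-2: BTYPE = 00 (stored) */
    dst[1] = len & 0xff;      /* LEN, little-endian */
    dst[2] = len >> 8;
    dst[3] = ~dst[1];         /* NLEN = one's complement of LEN */
    dst[4] = ~dst[2];
    memcpy(dst + 5, src, len);
    return 5 + (size_t)len;
}

A complete zlib-format stream, which is what zlib's inflate() normally expects, would additionally need the two-byte zlib header at the start and an Adler-32 trailer if the stream is ever finished - which matches the START/FINISH/Adler32 handling described for the gist above.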
Re: [FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
tor 2019-02-21 klockan 11:26 +0100 skrev Tomas Härdin:
> ons 2019-02-20 klockan 23:33 +0100 skrev Carl Eugen Hoyos:
> > 2019-02-10 16:42 GMT+01:00, Tomas Härdin :
> > > lör 2019-02-09 klockan 13:10 +0000 skrev Matthew Fearnley:
> > > > - Improve block choices by counting 0-bytes in the entropy score
> > > > - Make histogram use uint16_t type, to allow byte counts from 16*16 (current block size) up to 255*255 (maximum allowed 8bpp block size)
> > > > - Make sure score table is big enough for a full block's worth of bytes
> > > > - Calculate *xored without using code in inner loop
> > > > ---
> > > >  libavcodec/zmbvenc.c | 22 ++++++++++++++++------
> > > >  1 file changed, 16 insertions(+), 6 deletions(-)
> > >
> > > Passes FATE, looks good to me
> >
> > I believe you asked on IRC about FATE tests:
> > Since such a test would depend on the zlib version, you cannot test exact output, you could only test a round-trip (assuming the codec really is lossless).
>
> Right, we'd need to embed a specific version of zlib. But we probably don't want to do that, for many reasons.
>
> A dummy DEFLATE implementation that just passes the bytes along could be useful. I know there's a mode in the DEFLATE format for blocks that fail to compress. That way we could test many encoders that depend on zlib, and their output should decode just fine.

.. or just -compression_level 0, upon further inspection. Not likely to change between zlib versions, but of course not guaranteed not to.

/Tomas
Re: [FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
ons 2019-02-20 klockan 23:33 +0100 skrev Carl Eugen Hoyos:
> 2019-02-10 16:42 GMT+01:00, Tomas Härdin :
> > lör 2019-02-09 klockan 13:10 +0000 skrev Matthew Fearnley:
> > > - Improve block choices by counting 0-bytes in the entropy score
> > > - Make histogram use uint16_t type, to allow byte counts from 16*16 (current block size) up to 255*255 (maximum allowed 8bpp block size)
> > > - Make sure score table is big enough for a full block's worth of bytes
> > > - Calculate *xored without using code in inner loop
> > > ---
> > >  libavcodec/zmbvenc.c | 22 ++++++++++++++++------
> > >  1 file changed, 16 insertions(+), 6 deletions(-)
> >
> > Passes FATE, looks good to me
>
> I believe you asked on IRC about FATE tests:
> Since such a test would depend on the zlib version, you cannot test exact output, you could only test a round-trip (assuming the codec really is lossless).

Right, we'd need to embed a specific version of zlib. But we probably don't want to do that, for many reasons.

A dummy DEFLATE implementation that just passes the bytes along could be useful. I know there's a mode in the DEFLATE format for blocks that fail to compress. That way we could test many encoders that depend on zlib, and their output should decode just fine.

/Tomas
Re: [FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
ons 2019-02-20 klockan 22:12 +0100 skrev Michael Niedermayer:
> On Sat, Feb 09, 2019 at 01:10:20PM +0000, Matthew Fearnley wrote:
> > - Improve block choices by counting 0-bytes in the entropy score
> > - Make histogram use uint16_t type, to allow byte counts from 16*16 (current block size) up to 255*255 (maximum allowed 8bpp block size)
> > - Make sure score table is big enough for a full block's worth of bytes
> > - Calculate *xored without using code in inner loop
>
> This should have been split into multiple changes

Alas

> compression seems to become slightly worse from this change
>
> ./ffmpeg -i matrixbench_mpeg2.mpg -vframes 30 -vcodec zmbv -an -y test-old.avi
> ./ffmpeg -i matrixbench_mpeg2.mpg -vframes 30 -vcodec zmbv -an -y test-new.avi
>
> -rw-r- 1 michael michael 1175466 Feb 20 22:06 test-new.avi
> -rw-r- 1 michael michael 1174832 Feb 20 22:07 test-old.avi

A whopping 0.05% change, with an MPEG source rather than the intended PAL8..

The flip side here is that we can now use larger block sizes and higher me_range without crashing the encoder. We can add a slower mode that tests a few different block sizes to see which one results in the smallest output.

/Tomas
Re: [FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
2019-02-10 16:42 GMT+01:00, Tomas Härdin :
> lör 2019-02-09 klockan 13:10 +0000 skrev Matthew Fearnley:
> > - Improve block choices by counting 0-bytes in the entropy score
> > - Make histogram use uint16_t type, to allow byte counts from 16*16 (current block size) up to 255*255 (maximum allowed 8bpp block size)
> > - Make sure score table is big enough for a full block's worth of bytes
> > - Calculate *xored without using code in inner loop
> > ---
> >  libavcodec/zmbvenc.c | 22 ++++++++++++++++------
> >  1 file changed, 16 insertions(+), 6 deletions(-)
>
> Passes FATE, looks good to me

I believe you asked on IRC about FATE tests:
Since such a test would depend on the zlib version, you cannot test exact output, you could only test a round-trip (assuming the codec really is lossless).

Carl Eugen
Re: [FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
On Sat, Feb 09, 2019 at 01:10:20PM +0000, Matthew Fearnley wrote:
> - Improve block choices by counting 0-bytes in the entropy score
> - Make histogram use uint16_t type, to allow byte counts from 16*16 (current block size) up to 255*255 (maximum allowed 8bpp block size)
> - Make sure score table is big enough for a full block's worth of bytes
> - Calculate *xored without using code in inner loop

This should have been split into multiple changes

compression seems to become slightly worse from this change

./ffmpeg -i matrixbench_mpeg2.mpg -vframes 30 -vcodec zmbv -an -y test-old.avi
./ffmpeg -i matrixbench_mpeg2.mpg -vframes 30 -vcodec zmbv -an -y test-new.avi

-rw-r- 1 michael michael 1175466 Feb 20 22:06 test-new.avi
-rw-r- 1 michael michael 1174832 Feb 20 22:07 test-old.avi

[...]

-- Michael
Re: [FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
lör 2019-02-09 klockan 13:10 +0000 skrev Matthew Fearnley:
> - Improve block choices by counting 0-bytes in the entropy score
> - Make histogram use uint16_t type, to allow byte counts from 16*16 (current block size) up to 255*255 (maximum allowed 8bpp block size)
> - Make sure score table is big enough for a full block's worth of bytes
> - Calculate *xored without using code in inner loop
> ---
>  libavcodec/zmbvenc.c | 22 ++++++++++++++++------
>  1 file changed, 16 insertions(+), 6 deletions(-)

Passes FATE, looks good to me

/Tomas
[FFmpeg-devel] [PATCH 1/2] libavcodec/zmbvenc: block scoring improvements/bug fixes
- Improve block choices by counting 0-bytes in the entropy score
- Make histogram use uint16_t type, to allow byte counts from 16*16
  (current block size) up to 255*255 (maximum allowed 8bpp block size)
- Make sure score table is big enough for a full block's worth of bytes
- Calculate *xored without using code in inner loop
---
 libavcodec/zmbvenc.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/libavcodec/zmbvenc.c b/libavcodec/zmbvenc.c
index 4d9147657d..3df6e724c8 100644
--- a/libavcodec/zmbvenc.c
+++ b/libavcodec/zmbvenc.c
@@ -55,7 +55,7 @@ typedef struct ZmbvEncContext {
     int keyint, curfrm;
     z_stream zstream;
 
-    int score_tab[256];
+    int score_tab[ZMBV_BLOCK * ZMBV_BLOCK + 1];
 } ZmbvEncContext;
 
 
@@ -69,20 +69,26 @@ static inline int block_cmp(ZmbvEncContext *c, uint8_t *src, int stride,
 {
     int sum = 0;
     int i, j;
-    uint8_t histogram[256] = {0};
+    uint16_t histogram[256] = {0};
 
-    *xored = 0;
+    /* Build frequency histogram of byte values for src[] ^ src2[] */
     for(j = 0; j < bh; j++){
         for(i = 0; i < bw; i++){
             int t = src[i] ^ src2[i];
             histogram[t]++;
-            *xored |= t;
         }
         src += stride;
         src2 += stride2;
     }
 
-    for(i = 1; i < 256; i++)
+    /* If not all the xored values were 0, then the blocks are different */
+    *xored = (histogram[0] < bw * bh);
+
+    /* Exit early if blocks are equal */
+    if (!*xored) return 0;
+
+    /* Sum the entropy of all values */
+    for(i = 0; i < 256; i++)
         sum += c->score_tab[histogram[i]];
 
     return sum;
@@ -278,7 +284,11 @@ static av_cold int encode_init(AVCodecContext *avctx)
     int i;
     int lvl = 9;
 
-    for(i=1; i<256; i++)
+    /* Entropy-based score tables for comparing blocks.
+     * Suitable for blocks up to (ZMBV_BLOCK * ZMBV_BLOCK) bytes.
+     * Scores are nonnegative, lower is better.
+     */
+    for(i = 1; i <= ZMBV_BLOCK * ZMBV_BLOCK; i++)
         c->score_tab[i] = -i * log2(i / (double)(ZMBV_BLOCK * ZMBV_BLOCK)) * 256;
 
     c->avctx = avctx;
--
2.17.1
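As a quick numeric sanity check of the table above (an illustration, not part of the patch): for a full 16x16 block, score_tab[128] = -128 * log2(128/256) * 256 = 32768, so a block whose 256 xored bytes split evenly between two values scores 2 * 32768 = 65536 = 256 * 16^2 * 1, i.e. 256 times the block size times one bit per byte, while a block whose xored bytes are all zero takes the new early-exit path and scores 0. Since log2(i / (16*16)) <= 0 for i <= 256, every table entry is nonnegative, matching the "lower is better" comment.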