On May 21, 2014, at 5:15 PM, Laurent Birtz <[email protected]> wrote:
> Hi,
>
> included is a design document for the analysis code refactoring.
>
> Laurent
> <an_refactor.txt>

Hi,

Good read. Here are a couple of comments.

1) “The RDM costs are not well correlated with the RDO costs.”

This may be true, but many published papers report strong Pearson correlation scores between Rough Mode Decision (RMD) and Rate-Distortion Optimization (RDO) costs. It might be a good idea to switch to RMD instead, at least for intra. We could follow your proposed logic: RMD only for low quality, hybrid RMD-RDO for medium quality, and RDO only for high quality.

I’ve tested a few sequences with the suggested QPs of 22, 27, 32 and 37. The best mode frequently differs when using RMD instead of RDO. The Spearman rank-order correlation score was typically less than 0.8. That’s not high: it means that the modes are sorted in only a coarsely similar order, which explains why the winning mode differs. The Pearson correlation score, however, was above 0.90. This indicates a rather linear relationship between the RMD and RDO costs.

The trailing modes are more interesting here. Modes that perform poorly in RMD are also expected to perform poorly in RDO. Thus RMD is a good tool to discard ill-suited modes, not to select the best one. This is the assumption used in the reference software.

The reference software’s algorithm is a combination of Piao’s and Zhao’s work. It uses RMD to weed out the trailing modes. The problem I see is that it simply selects the top N modes, with N based on the size of the block. I like your idea of excluding modes whose cost is greater than 4/3. In some cases, this would shrink the subset of candidates competing in RDO even further. In other cases, it would enlarge it.

2) “The experiments to estimate the transform tree layout using RDM have failed thus far.”

True. The only thing I can think of would be to tailor one algorithm for intra coding and a second one for inter coding.
The assumption here is that intra and inter residual coefficients aren’t distributed in a similar fashion. If this could be demonstrated, then we could exploit the coding depths and favour same-size PB-TBs for intra, and smaller TBs for inter coding (a more complex layout with few blocks that actually have coefficients).

3) “The optimal CB size is often 16x16 or below.”

What settings did you use? I’ve seen papers show frequent usage of bigger blocks when dealing with 1080p sequences. The standard clearly states that the bigger block sizes will be better for bigger resolutions (4K and up, I guess). The QP also plays a role in this, AFAIK. Increasing the QP kills the high frequencies, resulting in blurry images. As the blocks become more and more uniform, using bigger blocks starts to become efficient.

4) About Intra 64x64 RDO. “Another way to proceed is to do the RDO search on the modes selected by the 32x32 CBs (up to four different modes).”

I don’t think this is a good idea. Intra 64x64 is pretty much there to save on signalling costs. Consider the case where 3 of the 4 32x32 CBs selected the same mode. The 4th CB opted for a different mode because its cost was lower. For the 64x64 CB to win, the signalling savings need to outweigh the added distortion in the 4th CB. This may still be possible, but as soon as you are faced with 3+ different modes, I have little faith that exploring all these modes will result in the 64x64 winning the RDO contest.

What we could do instead is check the modes and, if they all belong to a similar direction, run 64x64 intra RDO. However, if all the modes differ (one vertical, one horizontal, one DC and one angular, for instance), I suggest we simply skip the 64x64 case.

François
--
To unsubscribe visit http://f265.org or send a mail to [email protected].
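
To illustrate point 1, here is a small Python sketch of how a high Pearson score can coexist with a mediocre Spearman score. The cost values are made up for the illustration (they are not measurements from my tests), and the function names are mine. The top four modes are shuffled between RMD and RDO, while the trailing four stay clearly at the back in both.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation: measures how linear the relationship is."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical costs for 8 intra modes (index = mode slot), illustration only.
rmd_costs = [100, 104, 108, 112, 300, 320, 500, 520]
rdo_costs = [125, 110, 130, 105, 310, 305, 540, 515]

best_rmd = min(range(8), key=lambda m: rmd_costs[m])
best_rdo = min(range(8), key=lambda m: rdo_costs[m])

# The four worst modes according to each cost.
worst_rmd = set(sorted(range(8), key=lambda m: rmd_costs[m])[4:])
worst_rdo = set(sorted(range(8), key=lambda m: rdo_costs[m])[4:])

print("Pearson :", round(pearson(rmd_costs, rdo_costs), 3))
print("Spearman:", round(spearman(rmd_costs, rdo_costs), 3))
print("Best mode differs:", best_rmd != best_rdo)
print("Same trailing modes:", worst_rmd == worst_rdo)
```

With this toy data the winning mode differs between RMD and RDO, yet both agree on which modes are the trailing ones, which is the property the pruning relies on.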

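
For point 4, the gating check could look roughly like the Python sketch below. It only uses the HEVC intra mode numbering (0 = planar, 1 = DC, 2 to 34 = angular, with 10 horizontal and 26 vertical). The function name and the `max_angle_spread` tolerance are assumptions of mine for the sake of the example, not something taken from the design document or the reference software, and the tolerance would need tuning.

```python
# HEVC intra mode numbering.
PLANAR, DC = 0, 1
FIRST_ANGULAR, LAST_ANGULAR = 2, 34
HORIZONTAL, VERTICAL = 10, 26

def should_try_intra_64x64(modes_32x32, max_angle_spread=4):
    """Decide whether 64x64 intra RDO is worth running.

    modes_32x32: best intra modes of the four 32x32 CBs.
    max_angle_spread: hypothetical tolerance, in angular mode indices,
    for considering the four directions "similar".
    """
    distinct = set(modes_32x32)
    if len(distinct) == 1:
        return True   # unanimous choice: 64x64 only saves signalling
    if any(mode < FIRST_ANGULAR for mode in distinct):
        return False  # planar/DC mixed with other modes: no common direction
    # All modes are angular: accept only if they point roughly the same way.
    return max(distinct) - min(distinct) <= max_angle_spread

print(should_try_intra_64x64([26, 26, 26, 26]))                  # all vertical
print(should_try_intra_64x64([26, 27, 25, 26]))                  # near vertical
print(should_try_intra_64x64([VERTICAL, HORIZONTAL, DC, 18]))    # all differ
```

When the check passes, the 64x64 RDO search would be limited to the distinct modes found in the four 32x32 CBs; when it fails, the 64x64 case is skipped entirely.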