Re: [f265 dev team] Analysis refactoring

Francois Caron Fri, 23 May 2014 12:39:55 -0700

On May 21, 2014, at 5:15 PM, Laurent Birtz <[email protected]> wrote:


> Hi,
> 
> included is a design document for the analysis code refactoring.
> 
> Laurent
> <an_refactor.txt>

Hi,

Good read. Here are a couple of comments.

1) “The RDM costs are not well correlated with the RDO costs.”
This may be true, but many published papers report strong Pearson correlation 
scores between Rough-Mode Decision (RMD) and Rate-Distortion Optimization 
(RDO). It might be a good idea to switch to RMD instead, at least for intra. We 
could follow your proposed logic: RMD only for low quality, hybrid RMD-RDO for 
medium quality, and RDO only for high quality.

I’ve tested a few sequences with the suggested QPs of 22, 27, 32 and 37. The 
best mode frequently differs when using RMD instead of RDO. The Spearman 
rank-order correlation score was typically less than 0.8. That’s not high. It 
means that the two methods sort the modes in only a roughly similar order. This 
explains why the winning mode differs.

The Pearson correlation score, however, was above 0.90. This indicates a fairly 
linear relationship between RMD and RDO costs. The trailing modes are more 
interesting here. Modes that perform poorly in RMD are also expected to perform 
poorly in RDO. Thus RMD is a good tool to discard ill-suited modes, not to 
select the best one.
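
To make the difference between the two scores concrete, here is a minimal 
sketch in plain Python. The cost values are made up for illustration and are 
not measurements from f265 or the reference software; the function names are 
likewise hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation: measures how linear the relationship is."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Made-up costs for six modes: four close contenders and two poor modes.
rmd_costs = [100, 105, 110, 120, 400, 800]
rdo_costs = [112, 104, 118, 109, 420, 790]

print("Pearson :", round(pearson(rmd_costs, rdo_costs), 3))
print("Spearman:", round(spearman(rmd_costs, rdo_costs), 3))
print("best RMD mode index:", rmd_costs.index(min(rmd_costs)))
print("best RDO mode index:", rdo_costs.index(min(rdo_costs)))
```

In this toy data the two poor modes dominate the Pearson score, while the 
shuffled order of the four close contenders pulls the Spearman score down and 
changes the winner.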

This is the assumption used in the reference software. Its algorithm combines 
Piao’s and Zhao’s work and uses RMD to weed out the trailing modes. The problem 
I see is that it simply selects the top N modes, with N depending on the size 
of the block. I like your idea of excluding modes whose cost is greater than 
4/3. In some cases, this would shrink the subset of candidates competing in RDO 
even further. In other cases, it would enlarge it. 
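
The two pruning strategies can be compared with a small Python sketch. It 
assumes the 4/3 threshold is relative to the lowest RMD cost, which the text 
above does not state explicitly; the function names and cost values are 
hypothetical.

```python
def top_n(costs, n):
    """Fixed-size pruning: keep the n modes with the lowest RMD cost."""
    return sorted(costs, key=costs.get)[:n]

def within_ratio(costs, ratio=4 / 3):
    """Threshold pruning: keep modes whose RMD cost is at most
    ratio * (lowest cost). The reference point is an assumption."""
    best = min(costs.values())
    kept = [mode for mode, cost in costs.items() if cost <= ratio * best]
    return sorted(kept, key=costs.get)

# Made-up RMD costs, keyed by intra mode number.
peaked = {26: 100, 25: 150, 27: 160, 0: 300, 1: 320}          # one clear winner
flat = {26: 100, 25: 105, 27: 110, 0: 120, 1: 125, 10: 400}   # many contenders

for name, costs in (("peaked", peaked), ("flat", flat)):
    print(name, "top-3:", top_n(costs, 3), "4/3 rule:", within_ratio(costs))
```

With the peaked distribution the threshold keeps fewer candidates than a fixed 
top 3; with the flat one it keeps more.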

2) “The experiments to estimate the transform tree layout using RDM have failed 
thus far.”
True. The only thing I can think of would be to tailor one algorithm for intra 
coding and a second one for inter coding. The assumption here is that intra 
and inter residual coefficients aren’t distributed in a similar fashion. If 
this could be demonstrated, then we could exploit the coding depths and favour 
same-size PB-TBs for intra, and smaller TBs for inter coding (a more complex 
layout with few blocks that actually have coefficients).

3) “The optimal CB size is often 16x16 or below.”
What settings did you use? I’ve seen papers show frequent usage of bigger 
blocks when dealing with 1080p sequences. The standard clearly states that the 
bigger block sizes will be better for bigger resolutions (4K and up I guess). 
The QP also plays a role in this AFAIK. Increasing the QP kills the high 
frequencies, resulting in blurry images. As the blocks become more and more 
uniform, using bigger blocks becomes more efficient.

4) About Intra 64x64 RDO.
“Another way to proceed is to do the RDO search on the modes selected by the 
32x32 CBs (up to four different modes).”
I don’t think this is a good idea. Intra 64x64 is pretty much there to save on 
signalling costs. Consider the case where 3 of the 4 32x32 CBs selected the 
same mode. The 4th CB opted for a different mode because its cost was lower. 
For the 64x64 CB to win, the signalling savings need to outweigh the added 
distortion in the 4th CB. This may still be possible, but as soon as you are 
faced with 3+ different modes, I have little faith that exploring all these 
modes will result in the 64x64 winning the RDO contest.

What we could also do is check the modes and if they all belong to a similar 
direction, then run 64x64 intra RDO. However, if all the modes differ (one 
vertical, one horizontal, one DC and one angular for instance), I suggest we 
simply skip the 64x64 case.
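
A minimal Python sketch of such a gate follows. It relies on the HEVC intra 
mode numbering (0 = planar, 1 = DC, 2 to 34 = angular, with 10 horizontal and 
26 vertical). The function name, the `max_spread` parameter and its default 
value are assumptions made for illustration, not something proposed in the 
design document.

```python
PLANAR, DC = 0, 1  # HEVC non-angular intra modes; 2..34 are angular

def should_try_intra_64(modes_32, max_spread=4):
    """Decide whether 64x64 intra RDO is worth running, given the
    modes chosen by the four 32x32 CBs.

    Run it when all four CBs agree on one mode, or when all modes are
    angular and lie within max_spread mode indices of each other.
    max_spread is a hypothetical tuning knob."""
    if len(set(modes_32)) == 1:
        return True
    if any(mode in (PLANAR, DC) for mode in modes_32):
        return False
    return max(modes_32) - min(modes_32) <= max_spread

print(should_try_intra_64([26, 25, 27, 26]))  # all near vertical
print(should_try_intra_64([26, 10, 1, 18]))   # vertical, horizontal, DC, diagonal
```

Treating a planar/DC mix as dissimilar is a deliberately conservative choice 
in this sketch; both are smooth predictors, so a real implementation might 
group them together.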

François
--
To unsubscribe visit http://f265.org
or send a mail to [email protected].
