On Jul 15, 2014, at 1:03 PM, Laurent Birtz <[email protected]> wrote:

> Recent HM versions now use assert() to verify the conformance of a
> stream with the level specified in the VPS/SPS. We're getting bug
> reports e.g. http://f265.org/bugs/ticket/24
> 
> The level (and tier) affects the following constraints:
> - Maximum picture size and aspect ratio.
> - Maximum frame rate.
> - Maximum bit rate.
> - Maximum DPB size.
> - Maximum number of slices.
> - Maximum number of tiles (rows and columns).
> - Minimum size of the tiles (rows and columns).
> - Minimum compression ratio.
> - Minimum CTB size (32x32 for level 5 and above).
> 
> The first thing to note is that the whole thing is a royal pain in the
> ass. It's like the MaxMvsPer2Mb constraint in H.264. Moving on, the
> question is what we do about it. We have several constraints on our side.
> 
> 1) Our regressions must keep working with our legacy HM version. That
> version is hardcoded to use level 5.1. Some of our regression tests use
> 16x16 CTBs, which is illegal with level 5.1.
> 
> 2) We must be able to encode large videos (4k and beyond) at very low
> quality, for which the analysis of 32x32 CTBs is not an option.
> 
> 3) Letting HM fail to decode our streams is not an option. People rely
> on it (including us).
> 
> 
> Let's assume for a moment that we genuinely want to support the level
> indicator in good faith as the spec people defined it. That means using
> the minimum level that still respects all the constraints, so that (in
> theory) a decoder doesn't refuse to decode the stream because the level
> is higher than what it can safely support.
> 
> Now, f265 is a general encoder. We don't control all the use cases that
> people can potentially use it for. Constant QP and dynamic bit rate or
> frame rate adaptation prevent us from knowing the effective bit rate and
> thus the minimum level we should use. What do we do then?
> 
> We can put the burden on the user. "Read the spec, figure out the
> constraints, figure out the worst case for your current video, then
> specify the level on the command line". That's obviously not a practical
> solution.
> 
> We can guarantee spec conformance. We just use the maximum level defined
> by the spec. In the real world, most sane decoders will try to decode
> the stream regardless of the level indicator. The key word being "most".
> 
> We can make a guess. We define the level using some heuristics so that
> the stream is conformant most of the time. The key word being "most".
> 
> And that's it. It's a catch-22 situation. There doesn't seem to be a
> sane way out of this. It's worth noting that it's not just f265 that is
> facing this problem; everyone making a practical encoder is. My
> reasoning is that practical decoders (aside from HM) will have no choice
> but to decode streams correctly even if the level is incorrect (either
> too low because the user messed up or the encoder heuristics failed, or
> set way too high because the encoder is playing it safe for conformance).
> 
> I'm inclined to always use the maximum level to reduce the number of bug
> reports of the type "HM refuses to decode my stream" and "lambda
> analyzer/decoder says constraint X busted". That would completely
> subvert the intended usage of the level indicator. That's unfortunate,
> but we didn't create this mess. The HM people seem happy to break the
> decoding of streams that used to decode correctly, while defining a set
> of constraints that we can't always respect in practice. It seems the
> safest and easiest option out of the three.
> 
> I would provide a "level" parameter on the command line that the user
> can set explicitly. If left at 0, the encoder will set it to level 6.2.
> We fix the analysis to always use 32x32 CTBs (we ignore the 32x32 CB if
> not desired), unless the maximum CTB size is explicitly signalled on the
> command line. The key points are that we're conformant by default and
> that the level is decoupled from the parameters used. If that causes the
> stream to become non-conformant, so be it. We can add an option later to
> auto-guess the level (off by default).
> 
> What do you guys think?
> 
> Laurent
> --
> To unsubscribe visit http://f265.org
> or send a mail to [email protected].
> 

The way I see it, the root problem is that f265 is a general encoder. There is 
no way to control the user input. Hence, a user may give incompatible 
command-line parameters such as a very big resolution that requires a certain 
level and a CTB size too small for that level.

When such a situation occurs, there are two available courses of action. 1) We 
normalize the parameters and issue a warning about what has been modified. 2) 
We simply issue an error message indicating what parameters clash and abort the 
encoding. Neither of these solutions is sexy.

Normalizing the parameters requires a smart rule engine. Moreover, such a rule
engine is not necessarily trivial to get right. We know in advance that the
picture size is non-negotiable. We can play with the other parameters, but
there is no guarantee that the normalized parameters are what the user would
have chosen. For instance, maybe the user really wants 16x16 blocks; in that
case we can simply signal 32x32 CTBs that are always split. In another case,
maybe the user simply did not know about that constraint and does not care
about the block size. It will be impossible to please everyone. Furthermore,
we might receive many questions such as: “Why am I getting 32x32 CTBs when I
clearly asked for 16x16 CTBs?”

Issuing an error message and aborting the encoding is not better. This clearly
shifts the responsibility onto the user. Not everybody will want to take
the time to learn the HEVC standard and all its rules/constraints. We could 
print out very detailed messages to help them out.
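
To make the two courses of action concrete, here is a minimal sketch of the
CTB size check, not actual f265 code. The function name check_ctb_size, the
normalize flag and the return shape are made up for illustration. It assumes
the level is passed as level_idc (30 times the level number, as I understand
the VPS/SPS syntax) and that the only rule checked is the one Laurent listed:
32x32 minimum CTB size for level 5 and above.

```python
def check_ctb_size(level_idc, ctb_size, normalize):
    """Check the requested CTB size against the level.

    Hypothetical helper. Returns (ctb_size, force_split, warnings).
    force_split means the larger CTBs must always be split so that the
    user still gets blocks of the size originally requested.
    Raises ValueError when the parameters clash and normalize is False.
    """
    # Assumption: 32x32 minimum for level 5 (level_idc 150) and above,
    # 16x16 otherwise. Other level constraints are not checked here.
    required = 32 if level_idc >= 150 else 16
    if ctb_size >= required:
        return ctb_size, False, []

    msg = "CTB size %dx%d is below the %dx%d minimum for level %.1f" % (
        ctb_size, ctb_size, required, required, level_idc / 30)
    if not normalize:
        # Course of action 2: report the clash and abort.
        raise ValueError(msg)
    # Course of action 1: normalize and tell the user what changed.
    note = "; using %dx%d CTBs that are always split" % (required, required)
    return required, True, [msg + note]
```

With normalize set, check_ctb_size(153, 16, True) bumps the CTB size to 32
with forced splitting and returns one warning; without it, the same call
raises an error naming the two clashing parameters.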

As for specifying the level directly on the command-line, this also opens the 
door to users asking for an incorrect level given the rest of the parameters. 
In that case, what would be the appropriate behaviour? Change the level to fit 
the parameters, assuming a level exists for that set of parameters?
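
As a rough illustration of "changing the level to fit the parameters", the
sketch below picks the lowest level whose picture size and luma sample rate
limits hold for a given resolution and frame rate. The limits are the
MaxLumaPs and MaxLumaSr values of the general level tables in Annex A of the
spec as I remember them, so they should be checked against the spec before
anyone relies on them. The name min_level is made up, and the bit rate, DPB,
slice and tile limits are deliberately left out (the bit rate is precisely
what we cannot know in advance).

```python
import math

# (level, MaxLumaPs, MaxLumaSr). Values quoted from memory of H.265
# Annex A; verify against the spec before use.
LEVEL_LIMITS = [
    ("1",   36864,    552960),
    ("2",   122880,   3686400),
    ("2.1", 245760,   7372800),
    ("3",   552960,   16588800),
    ("3.1", 983040,   33177600),
    ("4",   2228224,  66846720),
    ("4.1", 2228224,  133693440),
    ("5",   8912896,  267386880),
    ("5.1", 8912896,  534773760),
    ("5.2", 8912896,  1069547520),
    ("6",   35651584, 1069547520),
    ("6.1", 35651584, 2139095040),
    ("6.2", 35651584, 4278190080),
]

def min_level(width, height, fps):
    """Return the lowest level that fits, or None if no level does.

    Hypothetical helper: only the picture size, the width/height bound
    (the aspect ratio constraint) and the luma sample rate are checked.
    Width and height are used as given, without rounding up to the
    minimum CB size.
    """
    luma_ps = width * height
    luma_sr = luma_ps * fps
    for level, max_ps, max_sr in LEVEL_LIMITS:
        max_dim = math.sqrt(8 * max_ps)
        if (luma_ps <= max_ps and width <= max_dim
                and height <= max_dim and luma_sr <= max_sr):
            return level
    return None
```

Under these table values, min_level(3840, 2160, 30) gives "5", so a 4k
encode with 16x16 CTBs already clashes with the lowest level that fits the
picture. A None result answers the "assuming a level exists" part: there is
nothing to change the level to.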

All in all, the second course of action looks easier to do. I’m not sure this 
is the best behaviour from a user point of view…

My two cents.
François

