On 7/31/2021 2:46 PM, Michael Niedermayer wrote:
On Sat, Jul 31, 2021 at 01:10:27PM -0300, James Almer wrote:
On 7/31/2021 6:41 AM, Michael Niedermayer wrote:
On Fri, Jul 30, 2021 at 10:31:53AM -0300, James Almer wrote:
On 7/30/2021 8:33 AM, Michael Niedermayer wrote:
On Thu, Jul 29, 2021 at 03:09:03PM -0300, James Almer wrote:
On 7/29/2021 2:58 PM, Michael Niedermayer wrote:
On Tue, Jul 27, 2021 at 10:08:20AM -0300, James Almer wrote:
Since we can't blindly trust the keyframe flag in packets and assume its
contents are a valid Sync Sample, do some basic bitstream parsing to build the
Sync Sample table in addition to a Random Access Recovery Point table.

Suggested-by: ffm...@fb.com
Signed-off-by: James Almer <jamr...@gmail.com>
---
     libavformat/movenc.c         | 125 +++++++++++++++++++++++++++++++++--
     libavformat/movenc.h         |   1 +
     tests/ref/lavf-fate/h264.mp4 |   6 +-
     3 files changed, 123 insertions(+), 9 deletions(-)


A.
Will this allow a user to mark a subset of keyframes as random
access points ?
A user may want seeking to preferably happen to a scene start and not a
frame before


B.
Will this allow a user to mark a damaged keyframe as such ?
A frame might in fact have been a keyframe and maybe the only in the file
but maybe was damaged so a parser might fail to detect it.

This code will ensure all packets with an IDR picture will be listed in the
Sync Sample table, and all packets with a Recovery Point SEI message will be
listed in the Random Access Recovery Point table.
Whatever is signaled in packets (like the keyframe flag) is ignored to
prevent creating non-spec compliant output.

The file in case B will be non compliant, it is damaged data, it cannot
be muxed in a compliant way. If we support muxing damaged data then
parsing that data as a way to unconditionally override the user is
problematic.
If you try to interpret damaged data the result is undefined, the spec
does not tell you how to interpret it and 2 parsers can disagree
if a damaged frame is a keyframe.

If you don't mark a packet as a Sync Sample, even if it contains damaged
data, the file is still compliant. The point here is to avoid filling the
Sync Sample table with bogus entries.

IMHO, damaged data is not always a compliant video stream, otherwise in the 
most extreem
example /dev/random would have to be a compliant h264 stream.




Lets just as 2 examples consider multiple slices some being parsed as IDR
some as non IDR, so what is that ? or what if some packet reodering or
duplication causes parts of a IDR and non IDR frame to be mixed.
where is the IDR frame in that case ? 2 parsers can disagree

The AVCodecParserContext will tag a packet with an IDR slice as keyframe,
and then disregard anything else it may find. I made the code i wrote here
do the same.

If thats the case, we could take the example of a frame lets say it has
100 slices, and in one of these 1 byte is changed so it becomes a IDR slice
That is not really a keyframe or sync symple yet it will
be marked as one IIUC.

Unless it's the first slice NALU out of those 100 in the AU, it will not be
marked as a keyframe by the AVCodecParserContext, nor as a Sync Sample by
this code.

Now to continue with this example, which way will it work better
X. With this marked not a keyframe or
Y. With this marked as keyframe as the parser detects ?

It should not be marked as a Sync Sample.



Its not containing a single valid IDR slice just 1 of the non IDR slices
have been damaged and are misparsed. That will not work well as a sync sample
yet as a normal sample the 1% of damage might not be noticed at all as its 
copied
from the last frame

How about i look at every slice NALU in the AU and ensure there's only IDR
slices before marking it as a Sync Sample, instead of only looking at the
very first slice and decide based on it?
Same thing could be done in the AVCodecParserContext, for that matter.

Then you get a IDR frame wrong if it has a single damaged slice where its
parsed as non IDR.

Same situation: It does not get added to the Sync Sample table.


What you could do and i do not suggest this actually, just hypothetically
is take the majority >50% being IDR as probably IDR. That will probably
require extra care when there are 2 fields or frames in one packet

The problem is if you unconditionally overreide the user you really need
to be perfect and this is not easy. More so as being done by default it
too should be low overhead not slowing things down







still a decoder can with some luck start decoding from such a
point i would guess. But does the user want this marked as a keyframe ?
maybe yes, maybe no. I think taking the users only way to set the keyframe
flag away is a step backward

Look at MPEG2 parsing and others in this container. It's the same scenario.
Did any of these concerns show up back then?
What I'm doing here for h264 is the same thing we did for those, to ensure
we don't create non spec compliant files because of damaged or mistagged
input, or a careless user.

Letting the user mark whatever they want as a Sync Sample on a container
where it's explicit what should and should not be such is not good practice.
If you want that, you can use Matroska and other formats where keyframe
tagging is apparently not strict.

Now, i want to point out that I wrote this parsing code here because my
previous attempt at stopping the AVCodecParserContext from being too liberal
marking things as keyframes was rejected *precisely* because it would let
the user create non compliant output in mp4. And now you don't want this one
either because you want to let the user create non complaint output.
I'm running in circles here and it's getting tiresome.

I see the problem and i agree with you that it is a problem that needs
a solution. Its very bad if the users sets bad flags and it results in
non-compliant files. What my concern is, is that theres no way for the
user to override this when its the parsing code thats wrong and previously
the user could override this.
It may be very rare that the parser is wrong and a user would bother to
override it, i do not know, then again some people are quite crazy with
the level of perfection they try to achieve in their videos

The only reason the user may have wanted to override the packet's keyframe
flag was precisely because the AVCodecParserContext was tagging way too many
packets as such. This would no longer be necessary after this change, and
you can trust the muxer to do the right thing.


I can however accept a compromise here, and take into account the packet's
keyframe flag if and only if invalid or unexpected data is found in the
bitstream. If the parsing succeeds, then whatever the packet metadata may
signal should be outright ignored.

how do you detect invalid or unexpected data ?

Any parsing error for which we already abort, and if we find a mix of IDR and non IDR slices within a given AU. Instead of deciding to not add them to the Sync Sample table like i mentioned above, we look at the packet's keyframe flag and use it as last resort.





Theres also the 2nd situation where all the information the parser
extracts is already known to the lib user and going over the whole
bitstream is wasting cpu cycles. This can be a situation where for
example all input files where just a moment ago created by FFmpeg
and we assume these would then be as compliant as redoing the parsing
would make them

Other information, like Recovery Point SEI messages, can't be properly
signaled by encoders within packet metadata. Muxer can and should, for now,
look at the bitstream for this purpose.

yes

thx

[...]


_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Reply via email to