Re: [FFmpeg-devel] [PATCH 06/20] avformat/matroskaenc: Make output more deterministic
Steve Lhomme: > Not sure how FATE works but I suppose the pseudo-random becomes deterministic > ? In that case the size should always be the same. And if not having a fixed > size will make no difference. > > If FATE can skip random parts of a file (which your patch would solve) it > will miss inconsitencies of UIDs that should be the same in two places of the > file. So IMO it's not good. I suggest to have a flag in the muxer to force > the seed to a fixed value when using FATE. > The output of FATE was always deterministic. Here is the relevant snippet from patch #3: +if (s->flags & AVFMT_FLAG_BITEXACT) { +track->uid = i + 1; +} else { +track->uid = mkv_get_uid(mkv->tracks, i, ); +} The rest of the code does simply refer to track->uid and does not care about whether these values have been set randomly or not. Patch #6 has been made to address concerns [1] that adding random values all over the place could inhibit catching bugs. We can already check the UIDs for consistency: By muxing a file and then checking whether the metadata is read correctly. I have recently applied a patch [2] that allows to do exactly that and I intend to add more tests using this capability. - Andreas [1]: https://ffmpeg.org/pipermail/ffmpeg-devel/2019-November/253377.html [2]: https://ffmpeg.org/pipermail/ffmpeg-devel/2020-April/260592.html ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 06/20] avformat/matroskaenc: Make output more deterministic
Not sure how FATE works but I suppose the pseudo-random becomes deterministic ? In that case the size should always be the same. And if not having a fixed size will make no difference. If FATE can skip random parts of a file (which your patch would solve) it will miss inconsitencies of UIDs that should be the same in two places of the file. So IMO it's not good. I suggest to have a flag in the muxer to force the seed to a fixed value when using FATE. > On April 5, 2020 5:59 PM Andreas Rheinhardt > wrote: > > > Using random values for TrackUID and FileUID (as happens when the > AVFMT_FLAG_BITEXACT flag is not set) has the obvious downside of making > the output indeterministic. This commit mitigates this by writing the > potentially random values with a fixed size of eight byte, even if their > actual values would fit into less than eight bytes. This ensures that > even in non-bitexact mode, the differences between two files generated > with the same settings are restricted to a few bytes in the header. > (Namely the SegmentUID, the TrackUIDs (in Tracks as well as when > referencing them via TagTrackUID), the FileUIDs (in Attachments as > well as in TagAttachmentUID) as well as the CRC-32 checksums of the > Info, Tracks, Attachments and Tags level-1-elements.) Without this > patch, there might be an offset/a size difference between two such > files. > > The FATE-tests had to be updated because the fixed-sized UIDs are also > used in bitexact mode. > > Signed-off-by: Andreas Rheinhardt > --- > libavformat/matroskaenc.c | 16 +-- > tests/fate/matroska.mak | 2 +- > tests/fate/wavpack.mak| 4 +- > tests/ref/fate/aac-autobsf-adtstoasc | 4 +- > tests/ref/fate/binsub-mksenc | 2 +- > tests/ref/fate/matroska-flac-extradata-update | 4 +- > tests/ref/fate/rgb24-mkv | 4 +- > tests/ref/lavf-fate/av1.mkv | 4 +- > tests/ref/lavf/mka| 4 +- > tests/ref/lavf/mkv| 4 +- > tests/ref/lavf/mkv_attachment | 4 +- > tests/ref/seek/lavf-mkv | 44 +-- > 12 files changed, 53 insertions(+), 43 deletions(-) > > diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c > index 135d5e3abd..cf2bb7b933 100644 > --- a/libavformat/matroskaenc.c > +++ b/libavformat/matroskaenc.c > @@ -226,6 +226,16 @@ static void put_ebml_num(AVIOContext *pb, uint64_t num, > int bytes) > avio_w8(pb, (uint8_t)(num >> i * 8)); > } > > +/** > + * Write a (random) UID with fixed size to make the output more deterministic > + */ > +static void put_ebml_uid(AVIOContext *pb, uint32_t elementid, uint64_t uid) > +{ > +put_ebml_id(pb, elementid); > +put_ebml_num(pb, 8, 0); > +avio_wb64(pb, uid); > +} > + > static void put_ebml_uint(AVIOContext *pb, uint32_t elementid, uint64_t val) > { > int i, bytes = 1; > @@ -1104,7 +1114,7 @@ static int mkv_write_track(AVFormatContext *s, > MatroskaMuxContext *mkv, > track = start_ebml_master(pb, MATROSKA_ID_TRACKENTRY, 0); > put_ebml_uint (pb, MATROSKA_ID_TRACKNUMBER, > mkv->is_dash ? mkv->dash_track_number : i + 1); > -put_ebml_uint (pb, MATROSKA_ID_TRACKUID, mkv->tracks[i].uid); > +put_ebml_uid (pb, MATROSKA_ID_TRACKUID, mkv->tracks[i].uid); > put_ebml_uint (pb, MATROSKA_ID_TRACKFLAGLACING , 0);// no lacing > (yet) > > if ((tag = av_dict_get(st->metadata, "title", NULL, 0))) > @@ -1485,7 +1495,7 @@ static int mkv_write_tag_targets(AVFormatContext *s, > uint32_t elementid, > *tag= start_ebml_master(pb, MATROSKA_ID_TAG,0); > targets = start_ebml_master(pb, MATROSKA_ID_TAGTARGETS, 0); > if (elementid) > -put_ebml_uint(pb, elementid, uid); > +put_ebml_uid(pb, elementid, uid); > end_ebml_master(pb, targets); > return 0; > } > @@ -1684,7 +1694,7 @@ static int mkv_write_attachments(AVFormatContext *s) > > put_ebml_string(dyn_cp, MATROSKA_ID_FILEMIMETYPE, mimetype); > put_ebml_binary(dyn_cp, MATROSKA_ID_FILEDATA, > st->codecpar->extradata, st->codecpar->extradata_size); > -put_ebml_uint(dyn_cp, MATROSKA_ID_FILEUID, track->uid); > +put_ebml_uid(dyn_cp, MATROSKA_ID_FILEUID, track->uid); > end_ebml_master(dyn_cp, attached_file); > } > end_ebml_master_crc32(pb, _cp, mkv, MATROSKA_ID_ATTACHMENTS, 0, 0); > diff --git a/tests/fate/matroska.mak b/tests/fate/matroska.mak > index 4b9ee7a872..8cc35d52da 100644 > --- a/tests/fate/matroska.mak > +++ b/tests/fate/matroska.mak > @@ -12,7 +12,7 @@ fate-matroska-prores-header-insertion-bz2: CMD = framecrc > -i $(TARGET_SAMPLES)/m > FATE_MATROSKA-$(call DEMMUX, MATROSKA, MATROSKA) += fate-matroska-remux > fate-matroska-remux: CMD = md5pipe -i > $(TARGET_SAMPLES)/vp9-test-vectors/vp90-2-2pass-akiyo.webm -color_trc 4 -c:v
[FFmpeg-devel] [PATCH 06/20] avformat/matroskaenc: Make output more deterministic
Using random values for TrackUID and FileUID (as happens when the AVFMT_FLAG_BITEXACT flag is not set) has the obvious downside of making the output indeterministic. This commit mitigates this by writing the potentially random values with a fixed size of eight byte, even if their actual values would fit into less than eight bytes. This ensures that even in non-bitexact mode, the differences between two files generated with the same settings are restricted to a few bytes in the header. (Namely the SegmentUID, the TrackUIDs (in Tracks as well as when referencing them via TagTrackUID), the FileUIDs (in Attachments as well as in TagAttachmentUID) as well as the CRC-32 checksums of the Info, Tracks, Attachments and Tags level-1-elements.) Without this patch, there might be an offset/a size difference between two such files. The FATE-tests had to be updated because the fixed-sized UIDs are also used in bitexact mode. Signed-off-by: Andreas Rheinhardt --- libavformat/matroskaenc.c | 16 +-- tests/fate/matroska.mak | 2 +- tests/fate/wavpack.mak| 4 +- tests/ref/fate/aac-autobsf-adtstoasc | 4 +- tests/ref/fate/binsub-mksenc | 2 +- tests/ref/fate/matroska-flac-extradata-update | 4 +- tests/ref/fate/rgb24-mkv | 4 +- tests/ref/lavf-fate/av1.mkv | 4 +- tests/ref/lavf/mka| 4 +- tests/ref/lavf/mkv| 4 +- tests/ref/lavf/mkv_attachment | 4 +- tests/ref/seek/lavf-mkv | 44 +-- 12 files changed, 53 insertions(+), 43 deletions(-) diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c index 135d5e3abd..cf2bb7b933 100644 --- a/libavformat/matroskaenc.c +++ b/libavformat/matroskaenc.c @@ -226,6 +226,16 @@ static void put_ebml_num(AVIOContext *pb, uint64_t num, int bytes) avio_w8(pb, (uint8_t)(num >> i * 8)); } +/** + * Write a (random) UID with fixed size to make the output more deterministic + */ +static void put_ebml_uid(AVIOContext *pb, uint32_t elementid, uint64_t uid) +{ +put_ebml_id(pb, elementid); +put_ebml_num(pb, 8, 0); +avio_wb64(pb, uid); +} + static void put_ebml_uint(AVIOContext *pb, uint32_t elementid, uint64_t val) { int i, bytes = 1; @@ -1104,7 +1114,7 @@ static int mkv_write_track(AVFormatContext *s, MatroskaMuxContext *mkv, track = start_ebml_master(pb, MATROSKA_ID_TRACKENTRY, 0); put_ebml_uint (pb, MATROSKA_ID_TRACKNUMBER, mkv->is_dash ? mkv->dash_track_number : i + 1); -put_ebml_uint (pb, MATROSKA_ID_TRACKUID, mkv->tracks[i].uid); +put_ebml_uid (pb, MATROSKA_ID_TRACKUID, mkv->tracks[i].uid); put_ebml_uint (pb, MATROSKA_ID_TRACKFLAGLACING , 0);// no lacing (yet) if ((tag = av_dict_get(st->metadata, "title", NULL, 0))) @@ -1485,7 +1495,7 @@ static int mkv_write_tag_targets(AVFormatContext *s, uint32_t elementid, *tag= start_ebml_master(pb, MATROSKA_ID_TAG,0); targets = start_ebml_master(pb, MATROSKA_ID_TAGTARGETS, 0); if (elementid) -put_ebml_uint(pb, elementid, uid); +put_ebml_uid(pb, elementid, uid); end_ebml_master(pb, targets); return 0; } @@ -1684,7 +1694,7 @@ static int mkv_write_attachments(AVFormatContext *s) put_ebml_string(dyn_cp, MATROSKA_ID_FILEMIMETYPE, mimetype); put_ebml_binary(dyn_cp, MATROSKA_ID_FILEDATA, st->codecpar->extradata, st->codecpar->extradata_size); -put_ebml_uint(dyn_cp, MATROSKA_ID_FILEUID, track->uid); +put_ebml_uid(dyn_cp, MATROSKA_ID_FILEUID, track->uid); end_ebml_master(dyn_cp, attached_file); } end_ebml_master_crc32(pb, _cp, mkv, MATROSKA_ID_ATTACHMENTS, 0, 0); diff --git a/tests/fate/matroska.mak b/tests/fate/matroska.mak index 4b9ee7a872..8cc35d52da 100644 --- a/tests/fate/matroska.mak +++ b/tests/fate/matroska.mak @@ -12,7 +12,7 @@ fate-matroska-prores-header-insertion-bz2: CMD = framecrc -i $(TARGET_SAMPLES)/m FATE_MATROSKA-$(call DEMMUX, MATROSKA, MATROSKA) += fate-matroska-remux fate-matroska-remux: CMD = md5pipe -i $(TARGET_SAMPLES)/vp9-test-vectors/vp90-2-2pass-akiyo.webm -color_trc 4 -c:v copy -fflags +bitexact -strict -2 -f matroska fate-matroska-remux: CMP = oneline -fate-matroska-remux: REF = 49a60ef91cf7302ab7276f9373f8a429 +fate-matroska-remux: REF = 8369f24de64aaa52cf57a699dcdc7d58 FATE_MATROSKA-$(call ALLYES, MATROSKA_DEMUXER VORBIS_PARSER) += fate-matroska-xiph-lacing fate-matroska-xiph-lacing: CMD = framecrc -i $(TARGET_SAMPLES)/mkv/xiph_lacing.mka -c:a copy diff --git a/tests/fate/wavpack.mak b/tests/fate/wavpack.mak index bc7a5cc92f..e3cf4ec632 100644 --- a/tests/fate/wavpack.mak +++ b/tests/fate/wavpack.mak @@ -91,12 +91,12 @@ fate-wavpack-matroskamode: CMD = md5 -i $(TARGET_SAMPLES)/wavpack/special/matros FATE_WAVPACK-$(call DEMMUX, WV, MATROSKA) +=