Re: [FFmpeg-devel] [PATCH 06/20] avformat/matroskaenc: Make output more deterministic

2020-04-19 Thread Andreas Rheinhardt
Steve Lhomme:
> Not sure how FATE works but I suppose the pseudo-random becomes deterministic 
> ? In that case the size should always be the same. And if not having a fixed 
> size will make no difference.
> 
> If FATE can skip random parts of a file (which your patch would solve) it 
> will miss inconsitencies of UIDs that should be the same in two places of the 
> file. So IMO it's not good. I suggest to have a flag in the muxer to force 
> the seed to a fixed value when using FATE.
> 

The output of FATE was always deterministic. Here is the relevant
snippet from patch #3:

+if (s->flags & AVFMT_FLAG_BITEXACT) {
+track->uid = i + 1;
+} else {
+track->uid = mkv_get_uid(mkv->tracks, i, );
+}

The rest of the code does simply refer to track->uid and does not care
about whether these values have been set randomly or not.

Patch #6 has been made to address concerns [1] that adding random values
all over the place could inhibit catching bugs.

We can already check the UIDs for consistency: By muxing a file and then
checking whether the metadata is read correctly. I have recently applied
a patch [2] that allows to do exactly that and I intend to add more
tests using this capability.

- Andreas

[1]: https://ffmpeg.org/pipermail/ffmpeg-devel/2019-November/253377.html
[2]: https://ffmpeg.org/pipermail/ffmpeg-devel/2020-April/260592.html
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH 06/20] avformat/matroskaenc: Make output more deterministic

2020-04-19 Thread Steve Lhomme
Not sure how FATE works but I suppose the pseudo-random becomes deterministic ? 
In that case the size should always be the same. And if not having a fixed size 
will make no difference.

If FATE can skip random parts of a file (which your patch would solve) it will 
miss inconsitencies of UIDs that should be the same in two places of the file. 
So IMO it's not good. I suggest to have a flag in the muxer to force the seed 
to a fixed value when using FATE.

> On April 5, 2020 5:59 PM Andreas Rheinhardt  
> wrote:
> 
>  
> Using random values for TrackUID and FileUID (as happens when the
> AVFMT_FLAG_BITEXACT flag is not set) has the obvious downside of making
> the output indeterministic. This commit mitigates this by writing the
> potentially random values with a fixed size of eight byte, even if their
> actual values would fit into less than eight bytes. This ensures that
> even in non-bitexact mode, the differences between two files generated
> with the same settings are restricted to a few bytes in the header.
> (Namely the SegmentUID, the TrackUIDs (in Tracks as well as when
> referencing them via TagTrackUID), the FileUIDs (in Attachments as
> well as in TagAttachmentUID) as well as the CRC-32 checksums of the
> Info, Tracks, Attachments and Tags level-1-elements.) Without this
> patch, there might be an offset/a size difference between two such
> files.
> 
> The FATE-tests had to be updated because the fixed-sized UIDs are also
> used in bitexact mode.
> 
> Signed-off-by: Andreas Rheinhardt 
> ---
>  libavformat/matroskaenc.c | 16 +--
>  tests/fate/matroska.mak   |  2 +-
>  tests/fate/wavpack.mak|  4 +-
>  tests/ref/fate/aac-autobsf-adtstoasc  |  4 +-
>  tests/ref/fate/binsub-mksenc  |  2 +-
>  tests/ref/fate/matroska-flac-extradata-update |  4 +-
>  tests/ref/fate/rgb24-mkv  |  4 +-
>  tests/ref/lavf-fate/av1.mkv   |  4 +-
>  tests/ref/lavf/mka|  4 +-
>  tests/ref/lavf/mkv|  4 +-
>  tests/ref/lavf/mkv_attachment |  4 +-
>  tests/ref/seek/lavf-mkv   | 44 +--
>  12 files changed, 53 insertions(+), 43 deletions(-)
> 
> diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c
> index 135d5e3abd..cf2bb7b933 100644
> --- a/libavformat/matroskaenc.c
> +++ b/libavformat/matroskaenc.c
> @@ -226,6 +226,16 @@ static void put_ebml_num(AVIOContext *pb, uint64_t num, 
> int bytes)
>  avio_w8(pb, (uint8_t)(num >> i * 8));
>  }
>  
> +/**
> + * Write a (random) UID with fixed size to make the output more deterministic
> + */
> +static void put_ebml_uid(AVIOContext *pb, uint32_t elementid, uint64_t uid)
> +{
> +put_ebml_id(pb, elementid);
> +put_ebml_num(pb, 8, 0);
> +avio_wb64(pb, uid);
> +}
> +
>  static void put_ebml_uint(AVIOContext *pb, uint32_t elementid, uint64_t val)
>  {
>  int i, bytes = 1;
> @@ -1104,7 +1114,7 @@ static int mkv_write_track(AVFormatContext *s, 
> MatroskaMuxContext *mkv,
>  track = start_ebml_master(pb, MATROSKA_ID_TRACKENTRY, 0);
>  put_ebml_uint (pb, MATROSKA_ID_TRACKNUMBER,
> mkv->is_dash ? mkv->dash_track_number : i + 1);
> -put_ebml_uint (pb, MATROSKA_ID_TRACKUID, mkv->tracks[i].uid);
> +put_ebml_uid  (pb, MATROSKA_ID_TRACKUID, mkv->tracks[i].uid);
>  put_ebml_uint (pb, MATROSKA_ID_TRACKFLAGLACING , 0);// no lacing 
> (yet)
>  
>  if ((tag = av_dict_get(st->metadata, "title", NULL, 0)))
> @@ -1485,7 +1495,7 @@ static int mkv_write_tag_targets(AVFormatContext *s, 
> uint32_t elementid,
>  *tag= start_ebml_master(pb, MATROSKA_ID_TAG,0);
>  targets = start_ebml_master(pb, MATROSKA_ID_TAGTARGETS, 0);
>  if (elementid)
> -put_ebml_uint(pb, elementid, uid);
> +put_ebml_uid(pb, elementid, uid);
>  end_ebml_master(pb, targets);
>  return 0;
>  }
> @@ -1684,7 +1694,7 @@ static int mkv_write_attachments(AVFormatContext *s)
>  
>  put_ebml_string(dyn_cp, MATROSKA_ID_FILEMIMETYPE, mimetype);
>  put_ebml_binary(dyn_cp, MATROSKA_ID_FILEDATA, 
> st->codecpar->extradata, st->codecpar->extradata_size);
> -put_ebml_uint(dyn_cp, MATROSKA_ID_FILEUID, track->uid);
> +put_ebml_uid(dyn_cp, MATROSKA_ID_FILEUID, track->uid);
>  end_ebml_master(dyn_cp, attached_file);
>  }
>  end_ebml_master_crc32(pb, _cp, mkv, MATROSKA_ID_ATTACHMENTS, 0, 0);
> diff --git a/tests/fate/matroska.mak b/tests/fate/matroska.mak
> index 4b9ee7a872..8cc35d52da 100644
> --- a/tests/fate/matroska.mak
> +++ b/tests/fate/matroska.mak
> @@ -12,7 +12,7 @@ fate-matroska-prores-header-insertion-bz2: CMD = framecrc 
> -i $(TARGET_SAMPLES)/m
>  FATE_MATROSKA-$(call DEMMUX, MATROSKA, MATROSKA) += fate-matroska-remux
>  fate-matroska-remux: CMD = md5pipe -i 
> $(TARGET_SAMPLES)/vp9-test-vectors/vp90-2-2pass-akiyo.webm -color_trc 4 -c:v 

[FFmpeg-devel] [PATCH 06/20] avformat/matroskaenc: Make output more deterministic

2020-04-05 Thread Andreas Rheinhardt
Using random values for TrackUID and FileUID (as happens when the
AVFMT_FLAG_BITEXACT flag is not set) has the obvious downside of making
the output indeterministic. This commit mitigates this by writing the
potentially random values with a fixed size of eight byte, even if their
actual values would fit into less than eight bytes. This ensures that
even in non-bitexact mode, the differences between two files generated
with the same settings are restricted to a few bytes in the header.
(Namely the SegmentUID, the TrackUIDs (in Tracks as well as when
referencing them via TagTrackUID), the FileUIDs (in Attachments as
well as in TagAttachmentUID) as well as the CRC-32 checksums of the
Info, Tracks, Attachments and Tags level-1-elements.) Without this
patch, there might be an offset/a size difference between two such
files.

The FATE-tests had to be updated because the fixed-sized UIDs are also
used in bitexact mode.

Signed-off-by: Andreas Rheinhardt 
---
 libavformat/matroskaenc.c | 16 +--
 tests/fate/matroska.mak   |  2 +-
 tests/fate/wavpack.mak|  4 +-
 tests/ref/fate/aac-autobsf-adtstoasc  |  4 +-
 tests/ref/fate/binsub-mksenc  |  2 +-
 tests/ref/fate/matroska-flac-extradata-update |  4 +-
 tests/ref/fate/rgb24-mkv  |  4 +-
 tests/ref/lavf-fate/av1.mkv   |  4 +-
 tests/ref/lavf/mka|  4 +-
 tests/ref/lavf/mkv|  4 +-
 tests/ref/lavf/mkv_attachment |  4 +-
 tests/ref/seek/lavf-mkv   | 44 +--
 12 files changed, 53 insertions(+), 43 deletions(-)

diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c
index 135d5e3abd..cf2bb7b933 100644
--- a/libavformat/matroskaenc.c
+++ b/libavformat/matroskaenc.c
@@ -226,6 +226,16 @@ static void put_ebml_num(AVIOContext *pb, uint64_t num, 
int bytes)
 avio_w8(pb, (uint8_t)(num >> i * 8));
 }
 
+/**
+ * Write a (random) UID with fixed size to make the output more deterministic
+ */
+static void put_ebml_uid(AVIOContext *pb, uint32_t elementid, uint64_t uid)
+{
+put_ebml_id(pb, elementid);
+put_ebml_num(pb, 8, 0);
+avio_wb64(pb, uid);
+}
+
 static void put_ebml_uint(AVIOContext *pb, uint32_t elementid, uint64_t val)
 {
 int i, bytes = 1;
@@ -1104,7 +1114,7 @@ static int mkv_write_track(AVFormatContext *s, 
MatroskaMuxContext *mkv,
 track = start_ebml_master(pb, MATROSKA_ID_TRACKENTRY, 0);
 put_ebml_uint (pb, MATROSKA_ID_TRACKNUMBER,
mkv->is_dash ? mkv->dash_track_number : i + 1);
-put_ebml_uint (pb, MATROSKA_ID_TRACKUID, mkv->tracks[i].uid);
+put_ebml_uid  (pb, MATROSKA_ID_TRACKUID, mkv->tracks[i].uid);
 put_ebml_uint (pb, MATROSKA_ID_TRACKFLAGLACING , 0);// no lacing (yet)
 
 if ((tag = av_dict_get(st->metadata, "title", NULL, 0)))
@@ -1485,7 +1495,7 @@ static int mkv_write_tag_targets(AVFormatContext *s, 
uint32_t elementid,
 *tag= start_ebml_master(pb, MATROSKA_ID_TAG,0);
 targets = start_ebml_master(pb, MATROSKA_ID_TAGTARGETS, 0);
 if (elementid)
-put_ebml_uint(pb, elementid, uid);
+put_ebml_uid(pb, elementid, uid);
 end_ebml_master(pb, targets);
 return 0;
 }
@@ -1684,7 +1694,7 @@ static int mkv_write_attachments(AVFormatContext *s)
 
 put_ebml_string(dyn_cp, MATROSKA_ID_FILEMIMETYPE, mimetype);
 put_ebml_binary(dyn_cp, MATROSKA_ID_FILEDATA, st->codecpar->extradata, 
st->codecpar->extradata_size);
-put_ebml_uint(dyn_cp, MATROSKA_ID_FILEUID, track->uid);
+put_ebml_uid(dyn_cp, MATROSKA_ID_FILEUID, track->uid);
 end_ebml_master(dyn_cp, attached_file);
 }
 end_ebml_master_crc32(pb, _cp, mkv, MATROSKA_ID_ATTACHMENTS, 0, 0);
diff --git a/tests/fate/matroska.mak b/tests/fate/matroska.mak
index 4b9ee7a872..8cc35d52da 100644
--- a/tests/fate/matroska.mak
+++ b/tests/fate/matroska.mak
@@ -12,7 +12,7 @@ fate-matroska-prores-header-insertion-bz2: CMD = framecrc -i 
$(TARGET_SAMPLES)/m
 FATE_MATROSKA-$(call DEMMUX, MATROSKA, MATROSKA) += fate-matroska-remux
 fate-matroska-remux: CMD = md5pipe -i 
$(TARGET_SAMPLES)/vp9-test-vectors/vp90-2-2pass-akiyo.webm -color_trc 4 -c:v 
copy -fflags +bitexact -strict -2 -f matroska
 fate-matroska-remux: CMP = oneline
-fate-matroska-remux: REF = 49a60ef91cf7302ab7276f9373f8a429
+fate-matroska-remux: REF = 8369f24de64aaa52cf57a699dcdc7d58
 
 FATE_MATROSKA-$(call ALLYES, MATROSKA_DEMUXER VORBIS_PARSER) += 
fate-matroska-xiph-lacing
 fate-matroska-xiph-lacing: CMD = framecrc -i 
$(TARGET_SAMPLES)/mkv/xiph_lacing.mka -c:a copy
diff --git a/tests/fate/wavpack.mak b/tests/fate/wavpack.mak
index bc7a5cc92f..e3cf4ec632 100644
--- a/tests/fate/wavpack.mak
+++ b/tests/fate/wavpack.mak
@@ -91,12 +91,12 @@ fate-wavpack-matroskamode: CMD = md5 -i 
$(TARGET_SAMPLES)/wavpack/special/matros
 FATE_WAVPACK-$(call DEMMUX, WV, MATROSKA) +=