Hello, I'm trying to convert some AVI files that originally came from a Sony Handycam to mp4, and for most of them I managed to get that to work. However, a couple of files are giving me trouble and I've spent days trying to figure it out.
I'm new here but have done my homework as best I could:

* I downloaded the latest ffmpeg build I could find.
* Searched the forums for solutions/switches (-af aresample=async=1, -fflags +igndts, -fflags +sortdts).
* Tried other tools to extract the audio.
* Pasted the full command and output for your reference.
* Will try not to "top post". That seems to be a thing here.

If I run the following command, the audio is processed correctly up to about the 43s mark, then becomes slower than expected (i.e. voices are deep, audio in "slow mo", but the video plays normally). The original file plays correctly with VLC and Video on Linux. It does not play correctly with mplayer (same slowed-down audio issue past 43 seconds).

ffmpeg -i input.avi -c:v libx264 -preset fast -crf 21 output.mp4

ffmpeg version N-55863-g9f38fac053-static https://johnvansickle.com/ffmpeg/ Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 8 (Debian 8.3.0-6)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-debug --disable-ffplay --disable-indev=sndio --disable-outdev=sndio --cc=gcc --enable-fontconfig --enable-frei0r --enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom --enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librubberband --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libdav1d --enable-libxvid --enable-libzvbi --enable-libzimg
  libavutil      56. 64.100 / 56. 64.100
  libavcodec     58.119.100 / 58.119.100
  libavformat    58. 65.101 / 58. 65.101
  libavdevice    58. 11.103 / 58. 11.103
  libavfilter     7.100.100 /  7.100.100
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
  libpostproc    55.  8.100 / 55.  8.100
[avi @ 0x620c340] Switching to NI mode, due to poor interleaving
Input #0, avi, from 'input.avi':
  Duration: 00:07:00.00, start: 0.000000, bitrate: 28806 kb/s
    Stream #0:0: Video: dvvideo, yuv420p, 720x576 [SAR 16:15 DAR 4:3], 25000 kb/s, 25 fps, 25 tbr, 25 tbn, 25 tbc
    Stream #0:1: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
    Stream #0:2: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (dvvideo (native) -> h264 (libx264))
  Stream #0:1 -> #0:1 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
[libx264 @ 0x6233580] using SAR=16/15
[libx264 @ 0x6233580] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x6233580] profile High, level 3.0, 4:2:0, 8-bit
[libx264 @ 0x6233580] 264 - core 161 r3040 35417dc - H.264/MPEG-4 AVC codec - Copyleft 2003-2021 - http://www.videolan.org/x264.html - options: cabac=1 ref=2 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=6 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=1 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=30 rc=crf mbtree=1 crf=21.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'output.mp4':
  Metadata:
    encoder         : Lavf58.65.101
    Stream #0:0: Video: h264 (avc1 / 0x31637661), yuv420p(bottom coded first (swapped)), 720x576 [SAR 16:15 DAR 4:3], q=2-31, 25 fps, 12800 tbn
    Metadata:
      encoder         : Lavc58.119.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
    Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 32000 Hz, stereo, fltp, 128 kb/s
    Metadata:
      encoder         : Lavc58.119.100 aac
frame=10500 fps=116 q=-1.0 Lsize=  168876kB time=00:07:00.02 bitrate=3293.7kbits/s speed=4.63x
video:158898kB audio:9545kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.257381%
[libx264 @ 0x6233580] frame I:93 Avg QP:20.65 size: 54098
[libx264 @ 0x6233580] frame P:3607 Avg QP:23.10 size: 22302
[libx264 @ 0x6233580] frame B:6800 Avg QP:24.65 size: 11358
[libx264 @ 0x6233580] consecutive B-frames: 13.4% 0.6% 0.7% 85.3%
[libx264 @ 0x6233580] mb I I16..4: 1.2% 98.2% 0.6%
[libx264 @ 0x6233580] mb P I16..4: 0.5% 32.9% 0.4% P16..4: 35.4% 17.7% 9.8% 0.0% 0.0% skip: 3.2%
[libx264 @ 0x6233580] mb B I16..4: 2.2% 25.7% 0.2% B16..8: 22.8% 8.7% 0.7% direct:28.9% skip:10.8% L0:40.2% L1:32.6% BI:27.2%
[libx264 @ 0x6233580] 8x8 transform intra:93.9% inter:82.3%
[libx264 @ 0x6233580] coded y,uvDC,uvAC intra: 82.4% 88.7% 30.9% inter: 40.1% 65.9% 1.2%
[libx264 @ 0x6233580] i16 v,h,dc,p: 21% 19% 34% 26%
[libx264 @ 0x6233580] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 15% 14% 40% 5% 5% 5% 5% 6% 6%
[libx264 @ 0x6233580] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 9% 54% 15% 4% 4% 3% 4% 3% 4%
[libx264 @ 0x6233580] i8c dc,h,v,p: 59% 19% 19% 3%
[libx264 @ 0x6233580] Weighted P-Frames: Y:8.2% UV:1.4%
[libx264 @ 0x6233580] ref P L0: 56.9% 43.1%
[libx264 @ 0x6233580] ref B L0: 74.0% 26.0%
[libx264 @ 0x6233580] ref B L1: 95.3% 4.7%
[libx264 @ 0x6233580] kb/s:3099.25
[aac @ 0x6234fc0] Qavg: 189.772

Trying to narrow down the problem area, I did the following: encode just up to the 43s mark and dump the audio. If I run it for the whole file, there are hundreds, if not thousands, of entries like "Non-monotonous DTS...".

ffmpeg -t 00:00:43 -i input.avi -map 0:a:0 -c:a aac output.avi

ffmpeg version N-55863-g9f38fac053-static https://johnvansickle.com/ffmpeg/ Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 8 (Debian 8.3.0-6)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-debug --disable-ffplay --disable-indev=sndio --disable-outdev=sndio --cc=gcc --enable-fontconfig --enable-frei0r --enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom --enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librubberband --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libdav1d --enable-libxvid --enable-libzvbi --enable-libzimg
  libavutil      56. 64.100 / 56. 64.100
  libavcodec     58.119.100 / 58.119.100
  libavformat    58. 65.101 / 58. 65.101
  libavdevice    58. 11.103 / 58. 11.103
  libavfilter     7.100.100 /  7.100.100
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
  libpostproc    55.  8.100 / 55.  8.100
[avi @ 0x5b922c0] Switching to NI mode, due to poor interleaving
Input #0, avi, from 'input.avi':
  Duration: 00:07:00.00, start: 0.000000, bitrate: 28806 kb/s
    Stream #0:0: Video: dvvideo, yuv420p, 720x576 [SAR 16:15 DAR 4:3], 25000 kb/s, 25 fps, 25 tbr, 25 tbn, 25 tbc
    Stream #0:1: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
    Stream #0:2: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
File 'output.avi' already exists. Overwrite? [y/N] y
Stream mapping:
  Stream #0:1 -> #0:0 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
Output #0, avi, to 'output.avi':
  Metadata:
    ISFT            : Lavf58.65.101
    Stream #0:0: Audio: aac (LC) ([255][0][0][0] / 0x00FF), 32000 Hz, stereo, fltp, 128 kb/s
    Metadata:
      encoder         : Lavc58.119.100 aac
[avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1341, current: 1339; changing to 1342. This may result in incorrect timestamps in the output file.
[avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1342, current: 1340; changing to 1343. This may result in incorrect timestamps in the output file.
[avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1343, current: 1340; changing to 1344. This may result in incorrect timestamps in the output file.
[avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1344, current: 1341; changing to 1345. This may result in incorrect timestamps in the output file.
[avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1345, current: 1342; changing to 1346. This may result in incorrect timestamps in the output file.
[avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1346, current: 1342; changing to 1347. This may result in incorrect timestamps in the output file.
[avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1347, current: 1343; changing to 1348. This may result in incorrect timestamps in the output file.
[avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1348, current: 1343; changing to 1349. This may result in incorrect timestamps in the output file.
size=     714kB time=00:00:43.16 bitrate= 135.6kbits/s speed=96.2x
video:0kB audio:676kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 5.592729%
[aac @ 0x5bb9980] Qavg: 225.541

The only solution I could find so far is to:

a. extract all the audio using ffmpeg (total audio file length is 10m09s)
b. use Audacity to carefully select the audio from 0m43s to the end and "shrink" it so that the total comes down to 7m00s (the original file length)
c. create a new video file from the processed audio stream

Is there a way to troubleshoot, ignore, or correct the Non-monotonous DTS error?

Thanks for your help.
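
For reference, the arithmetic behind step b can be sketched in a few lines of Python. This is only a rough check, assuming the first 43s of audio are correct and that only the remainder has to be squeezed so the total ends at 7m00s. The resulting speed-up is close to 48000/32000 = 1.5, which might mean the later audio is really 48000 Hz material being read as 32000 Hz, but that is a guess and not a diagnosis. The file names audio.wav and tail.wav are made up for illustration, and the command is only printed, not run; I have not tested it on this file.

```python
# Durations quoted in the post (seconds). Assumption: the audio is correct
# for the first 43 s and only the remainder is stretched.
GOOD_PREFIX_S = 43
VIDEO_LEN_S = 7 * 60        # 7m00s, length of the original file
AUDIO_LEN_S = 10 * 60 + 9   # 10m09s, length of the extracted audio


def stretch_factor(audio_len: float, video_len: float, good_prefix: float) -> float:
    """Speed-up that makes the audio after good_prefix end with the video."""
    return (audio_len - good_prefix) / (video_len - good_prefix)


factor = stretch_factor(AUDIO_LEN_S, VIDEO_LEN_S, GOOD_PREFIX_S)
print(f"speed-up needed after {GOOD_PREFIX_S}s: {factor:.4f}")

# Guess, not a diagnosis: 48000 / 32000 = 1.5, so the tail may be 48 kHz
# audio labelled as 32 kHz.
rate_ratio = 48000 / 32000
print(f"48000/32000 = {rate_ratio}, difference {abs(factor - rate_ratio):.4f}")

# Hypothetical, untested command for step b. "audio.wav" and "tail.wav" are
# made-up file names. asetrate relabels the sample rate (so playback runs
# 1.5x faster) and aresample converts the result back to 32000 Hz.
tail_cmd = [
    "ffmpeg", "-ss", str(GOOD_PREFIX_S), "-i", "audio.wav",
    "-af", "asetrate=48000,aresample=32000",
    "tail.wav",
]
print(" ".join(tail_cmd))
```

If the factor had come out far from 1.5, a plain tempo change (as done in Audacity) would be the only thing these numbers support; the sample-rate idea depends entirely on that near match.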