ffmpeg amerge and amix filter delay

I need to take audio streams from several IP cameras and merge them into one file, so that they sound simultaneously.
I tried the "amix" filter (for testing purposes I take the audio stream twice from the same camera; yes, I also tried 2 different cameras, the result is the same):

    ffmpeg -i rtsp://user:[email protected] -i rtsp://user:[email protected] -map 0:a -map 1:a -filter_complex amix=inputs=2:duration=first:dropout_transition=3 -ar 22050 -vn -f flv rtmp://172.22.45.38:1935/live/stream1

Result: I say "hello" and hear the first "hello" in the speakers, then the second "hello" one second later, instead of hearing both "hello"s simultaneously.

Then I tried the "amerge" filter:

    ffmpeg -i rtsp://user:[email protected] -i rtsp://user:[email protected] -map 0:a -map 1:a -filter_complex amerge -ar 22050 -vn -f flv rtmp://172.22.45.38:1935/live/stream1

Result: the same as in the first example, except that now I hear the first "hello" in the left speaker and, one second later, the second "hello" in the right speaker, instead of hearing both "hello"s in both speakers simultaneously.

So the question is: how can I make them sound simultaneously? Maybe you know some parameter, or some other command?

P.S. Here is the full command-line output for both variants.

amix:

    # ffmpeg -i rtsp://admin:[email protected] -i rtsp://admin:[email protected] -map 0:a -map 1:a -filter_complex amix=inputs=2:duration=longest:dropout_transition=0 -vn -ar 22050 -f flv rtmp://172.22.45.38:1935/live/stream1
    ffmpeg version N-76031-g9099079 Copyright (c) 2000-2015 the FFmpeg developers
      built with gcc 4.4.7 (GCC) 20120313 (Red Hat 4.4.7-16)
      configuration: --enable-gpl --enable-libx264 --enable-libmp3lame --enable-nonfree --enable-version3
      libavutil      55.  4.100 / 55.  4.100
      libavcodec     57.  6.100 / 57.  6.100
      libavformat    57.  4.100 / 57.  4.100
      libavdevice    57.  0.100 / 57.  0.100
      libavfilter     6. 11.100 /  6. 11.100
      libswscale      4.  0.100 /  4.  0.100
      libswresample   2.  0.100 /  2.  0.100
      libpostproc    54.  0.100 / 54.  0.100
    Input #0, rtsp, from 'rtsp://admin:[email protected]':
      Metadata:
        title           : Media Presentation
      Duration: N/A, start: 0.032000, bitrate: N/A
        Stream #0:0: Video: h264 (Baseline), yuv420p, 1280x720, 20 fps, 25 tbr, 90k tbn, 40 tbc
        Stream #0:1: Audio: adpcm_g726, 8000 Hz, mono, s16, 16 kb/s
        Stream #0:2: Data: none
    Input #1, rtsp, from 'rtsp://admin:[email protected]':
      Metadata:
        title           : Media Presentation
      Duration: N/A, start: 0.032000, bitrate: N/A
        Stream #1:0: Video: h264 (Baseline), yuv420p, 1280x720, 20 fps, 25 tbr, 90k tbn, 40 tbc
        Stream #1:1: Audio: adpcm_g726, 8000 Hz, mono, s16, 16 kb/s
        Stream #1:2: Data: none
    Output #0, flv, to 'rtmp://172.22.45.38:1935/live/stream1':
      Metadata:
        title           : Media Presentation
        encoder         : Lavf57.4.100
        Stream #0:0: Audio: mp3 (libmp3lame) ([2][0][0][0] / 0x0002), 22050 Hz, mono, fltp (default)
        Metadata:
          encoder         : Lavc57.6.100 libmp3lame
    Stream mapping:
      Stream #0:1 (g726) -> amix:input0
      Stream #1:1 (g726) -> amix:input1
      amix -> Stream #0:0 (libmp3lame)
    Press [q] to stop, [?] for help
    [rtsp @ 0x2689600] Thread message queue blocking; consider raising the thread_queue_size option (current value: 8)
    [rtsp @ 0x2727c60] Thread message queue blocking; consider raising the thread_queue_size option (current value: 8)
    [rtsp @ 0x2689600] max delay reached. need to consume packet
    [NULL @ 0x268c500] RTP: missed 38 packets
    [rtsp @ 0x2689600] max delay reached. need to consume packet
    [NULL @ 0x268d460] RTP: missed 4 packets
    [flv @ 0x2958360] Failed to update header with correct duration.
    [flv @ 0x2958360] Failed to update header with correct filesize.
    size=      28kB time=00:00:06.18 bitrate=  36.7kbits/s
    video:0kB audio:24kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 16.331224%

and amerge:

    # ffmpeg -i rtsp://admin:[email protected] -i rtsp://admin:[email protected] -map 0:a -map 1:a -filter_complex amerge -vn -ar 22050 -f flv rtmp://172.22.45.38:1935/live/stream1
    ffmpeg version N-76031-g9099079 Copyright (c) 2000-2015 the FFmpeg developers
      built with gcc 4.4.7 (GCC) 20120313 (Red Hat 4.4.7-16)
      configuration: --enable-gpl --enable-libx264 --enable-libmp3lame --enable-nonfree --enable-version3
      libavutil      55.  4.100 / 55.  4.100
      libavcodec     57.  6.100 / 57.  6.100
      libavformat    57.  4.100 / 57.  4.100
      libavdevice    57.  0.100 / 57.  0.100
      libavfilter     6. 11.100 /  6. 11.100
      libswscale      4.  0.100 /  4.  0.100
      libswresample   2.  0.100 /  2.  0.100
      libpostproc    54.  0.100 / 54.  0.100
    Input #0, rtsp, from 'rtsp://admin:[email protected]':
      Metadata:
        title           : Media Presentation
      Duration: N/A, start: 0.064000, bitrate: N/A
        Stream #0:0: Video: h264 (Baseline), yuv420p, 1280x720, 20 fps, 25 tbr, 90k tbn, 40 tbc
        Stream #0:1: Audio: adpcm_g726, 8000 Hz, mono, s16, 16 kb/s
        Stream #0:2: Data: none
    Input #1, rtsp, from 'rtsp://admin:[email protected]':
      Metadata:
        title           : Media Presentation
      Duration: N/A, start: 0.032000, bitrate: N/A
        Stream #1:0: Video: h264 (Baseline), yuv420p, 1280x720, 20 fps, 25 tbr, 90k tbn, 40 tbc
        Stream #1:1: Audio: adpcm_g726, 8000 Hz, mono, s16, 16 kb/s
        Stream #1:2: Data: none
    [Parsed_amerge_0 @ 0x3069cc0] No channel layout for input 1
    [Parsed_amerge_0 @ 0x3069cc0] Input channel layouts overlap: output layout will be determined by the number of distinct input channels
    Output #0, flv, to 'rtmp://172.22.45.38:1935/live/stream1':
      Metadata:
        title           : Media Presentation
        encoder         : Lavf57.4.100
        Stream #0:0: Audio: mp3 (libmp3lame) ([2][0][0][0] / 0x0002), 22050 Hz, stereo, s16p (default)
        Metadata:
          encoder         : Lavc57.6.100 libmp3lame
    Stream mapping:
      Stream #0:1 (g726) -> amerge:in0
      Stream #1:1 (g726) -> amerge:in1
      amerge -> Stream #0:0 (libmp3lame)
    Press [q] to stop, [?] for help
    [rtsp @ 0x2f71640] Thread message queue blocking; consider raising the thread_queue_size option (current value: 8)
    [rtsp @ 0x300fb40] Thread message queue blocking; consider raising the thread_queue_size option (current value: 8)
    [rtsp @ 0x2f71640] max delay reached. need to consume packet
    [NULL @ 0x2f744a0] RTP: missed 18 packets
    [flv @ 0x3058b00] Failed to update header with correct duration.
    [flv @ 0x3058b00] Failed to update header with correct filesize.
    size=      39kB time=00:00:04.54 bitrate=  70.2kbits/s
    video:0kB audio:36kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 8.330614%

Thanks.

UPDATE 30 Oct 2015:

I found an interesting detail when connecting 2 different cameras (they have different microphones, so I can hear the difference between them): the order of the "hello"s from the different cams depends on the ORDER OF INPUTS.

With the command

    ffmpeg -i rtsp://cam2 -i rtsp://cam1 -map 0:a -map 1:a -filter_complex amix=inputs=2:duration=longest:dropout_transition=0 -vn -ar 22050 -f flv rtmp://172.22.45.38:1935/live/stream1

I hear "hello" from the 1st cam and then, one second later, "hello" from the 2nd cam. With the command

    ffmpeg -i rtsp://cam1 -i rtsp://cam2 -map 0:a -map 1:a -filter_complex amix=inputs=2:duration=longest:dropout_transition=0 -vn -ar 22050 -f flv rtmp://172.22.45.38:1935/live/stream1

I hear "hello" from the 2nd cam and then, one second later, "hello" from the 1st cam.

So, as I understand it, ffmpeg does not read the inputs simultaneously but in the order they are given. Question: how do I tell ffmpeg to read the inputs simultaneously?

_______________________________________________
ffmpeg-user mailing list
[email protected]
http://ffmpeg.org/mailman/listinfo/ffmpeg-user
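[Editor's note] One commonly suggested direction for the amix delay described above, sketched here and not verified against these cameras: since ffmpeg opens the RTSP inputs one after another, each input carries a different start timestamp, and amix honors those timestamps. Resetting every input's timestamps to zero with asetpts=PTS-STARTPTS before mixing removes the per-input offset. The two lavfi sine sources below are stand-ins for the RTSP inputs so the command is runnable without cameras; with real cameras each would be replaced by something like -thread_queue_size 512 -i rtsp://admin:pass@camN (raising thread_queue_size also addresses the "Thread message queue blocking" warnings in the logs above).

```shell
# Sketch: reset each input's timestamps to zero before amix, so an input that
# was opened later is not shifted relative to the other.
# The sine sources are hypothetical stand-ins for the RTSP camera inputs.
ffmpeg -y \
  -f lavfi -i "sine=frequency=440:duration=2" \
  -f lavfi -i "sine=frequency=880:duration=2" \
  -filter_complex "[0:a]asetpts=PTS-STARTPTS[a0];[1:a]asetpts=PTS-STARTPTS[a1];[a0][a1]amix=inputs=2:duration=longest:dropout_transition=0" \
  -ar 22050 mixed.wav
```

For live RTSP sources the residual network jitter may still differ between cameras, so this aligns the streams' nominal start times rather than guaranteeing sample-exact sync.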
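[Editor's note] For the amerge half of the question (each source audible in only one speaker): amerge places input 0 in the left channel and input 1 in the right. A pan filter after amerge can sum both channels into each output channel so both sources are heard in both speakers. This is a sketch with lavfi sine sources standing in for the RTSP inputs; it does not by itself fix the one-second offset, which is a timestamp issue rather than a channel-layout issue.

```shell
# Sketch: merge two mono inputs into stereo, then downmix with pan so each
# output channel contains half of each source.
# The sine sources are hypothetical stand-ins for the RTSP camera inputs.
ffmpeg -y \
  -f lavfi -i "sine=frequency=440:duration=2" \
  -f lavfi -i "sine=frequency=880:duration=2" \
  -filter_complex "[0:a][1:a]amerge,pan=stereo|c0=0.5*c0+0.5*c1|c1=0.5*c0+0.5*c1" \
  -ar 22050 merged.wav
```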
