On Tue, 19 Nov 2024, 01:20 Shane Warren, <sha...@innovsys.com> wrote:
>> On Mon, 18 Nov 2024, 11:33 pm Shane Warren, <sha...@innovsys.com> wrote:
>>
>> >> I have been trying to track down why, when transcoding using xstack
>> >> with NVIDIA decoding and encoding, I get strange decoding issues in
>> >> ffmpeg.
>> >>
>> >> Note: I use two 1-minute-long .ts files for this example. If you want
>> >> my inputs, they are available here (as input1.ts and input2.ts):
>> >>
>> >> https://drive.google.com/drive/folders/1mZ8xiNvz5ez1ULlNsy5a3KhnhaqQ2Hgo?usp=drive_link
>> >>
>> >> I got the latest ffmpeg and tried this command (xstacking 2 videos
>> >> into 1 output):
>> >>
>> >> ffmpeg -y -threads 2 -nostats -loglevel verbose -probesize 5M -filter_threads 4 -threads 2 -re -fflags +genpts -fflags discardcorrupt \
>> >> -extra_hw_frames 2 -hwaccel cuda -hwaccel_output_format cuda -threads 2 -thread_queue_size 4096 -heavy_compr 1 -thread_queue_size 4096 -re -i input1.ts \
>> >> -extra_hw_frames 2 -hwaccel cuda -hwaccel_output_format cuda -threads 2 -thread_queue_size 4096 -heavy_compr 1 -thread_queue_size 4096 -re -i input2.ts \
>> >> -filter_complex "\
>> >> [0:v:0]yadif_cuda=deint=interlaced,scale_cuda=768:432,hwdownload,format=nv12,fps=60000/1001[v0]; \
>> >> [1:v:0]yadif_cuda=deint=interlaced,scale_cuda=768:432,hwdownload,format=nv12,fps=60000/1001[v1]; \
>> >> [v0][v1] xstack=inputs=2:layout=0_0|0_h0[mosaic];\
>> >> [mosaic]hwupload_cuda,scale_cuda=w=1280:h=720:format=yuv420p:force_original_aspect_ratio=decrease,hwdownload,format=yuv420p,pad=1280:720:(ow-iw)/2:(oh-ih)/2,hwupload_cuda[out0]" \
>> >> -filter:a:0 "aresample=async=10000,volume=1.00" -c:a:0 ac3 -threads 2 -ac:a:0 6 -ar:a:0 48000 -b:a:0 384k \
>> >> -filter:a:1 "aresample=async=10000,volume=1.00" -c:a:1 ac3 -threads 2 -ac:a:1 6 -ar:a:1 48000 -b:a:1 384k \
>> >> -map "[out0]" -map "0:a:0" -map "1:a:0" \
>> >> -c:v h264_nvenc -b:v 6000k -minrate:v 6000k -maxrate:v 6000k -bufsize:v 12000k -a53cc 1 -tune ll -zerolatency 1 -cbr 1 -forced-idr 1 -strict_gop 1 -threads 2 -profile:v high -level:v 4.2 -bf:v 0 -g:v 30 \
>> >> -f mpegts -muxrate 8238520 -pes_payload_size 1528 "udp://@225.105.0.37:10102?pkt_size=1316&bitrate=8238520&burst_bits=10528&ttl=64"
>> >>
>> >> If you run that command in Ubuntu 22.04 it works 100% fine and
>> >> transcodes till the end of the input file(s).
>> >>
>> >> What doesn't work is if you start that process under systemd
>> >> non-interactively like so:
>> >>
>> >> systemd-run -S
>> >>
>> >> Then run that same command, and it will now fail in a strange way.
>> >>
>> >> Note: It's important that you try to output to multicast. If I try
>> >> the same command outputting to a file, it works fine (my guess is
>> >> any network-based output exhibits this behavior).
>> >>
>> >> You will see logs like this:
>> >>
>> >> [Parsed_scale_cuda_1 @ 0x55da86a03340] w:1920 h:1080 fmt:nv12 -> w:768 h:432 fmt:nv12
>> >>
>> >> And then about 1-2 seconds pass before another log comes out.
>> >>
>> >> Eventually (after many stalls and logs) this log comes out and the
>> >> transcode stops:
>> >>
>> >> [vost#0:0/h264_nvenc @ 0x55da86a3f780] Error submitting a packet to the muxer: Cannot allocate memory
>> >>
>> >> I attached GDB to ffmpeg while it was stalled, and it's inside code
>> >> that is trying to compile a CUDA script.
>> >>
>> >> If I'm not doing xstack (I'm pretty sure this has to do with
>> >> multiple inputs) NVIDIA does not stall.
>> >>
>> >> Does anyone have any idea what is happening here? I launch ffmpeg
>> >> from a C++ wrapper daemon. If that daemon is started via systemd,
>> >> then NVIDIA transcodes with multiple inputs fail. However, if I
>> >> launch my daemon by hand at a terminal, it works fine.
>> >>
>> >> Thanks
>> >>
>> > Paste the content of the systemd unit file here.
>> > Logs from the same (systemctl status unit-name.service) will also assist.
>> > That might help in understanding how and why the systemd unit is failing.
>>
>> systemd service file:
>>
>> [Unit]
>> Description=Transcoder Service
>> After=default.target
>> StartLimitInterval=0
>>
>> [Service]
>> Type=forking
>> ExecStart=/opt/bin/videotranscoder
>> Restart=always
>> RestartSec=15
>> TasksMax=infinity
>> LimitCORE=infinity
>>
>> [Install]
>> WantedBy=default.target
>>
>> Logs:
>>
>> * videotranscoder.service - Innovative Video Transcoder
>>      Loaded: loaded (/lib/systemd/system/videotranscoder.service; disabled; vendor preset: enabled)
>>      Active: active (running) since Mon 2024-11-18 16:11:09 CST; 3min 19s ago
>>     Process: 50296 ExecStart=/opt/bin/videotranscoder (code=exited, status=0/SUCCESS)
>>    Main PID: 50298 (videotranscoder)
>>       Tasks: 81
>>      Memory: 915.1M
>>         CPU: 1min 1.870s
>>      CGroup: /system.slice/videotranscoder.service
>>              |-50298 /opt/bin/videotranscoder
>>              |-50320 /bin/sh -c "/opt/bin/ffmpeg -y -threads 2 -nostats -nostdin -loglevel verbose -progress pipe:1 -probesize 5M -filter_threads 4 -threads 2 -re -fflags +genpts -fflags
>>              discardcorrupt -hwaccel_device 3 -extra_hw_frames 2 -hwaccel cuda -h>
>>              `-50322 /opt/bin/ffmpeg -y -threads 2 -nostats -nostdin -loglevel verbose -progress pipe:1 -probesize 5M -filter_threads 4 -threads 2 -re -fflags +genpts -fflags discardcorrupt -hwaccel_device 3 -extra_hw_frames 2 -hwaccel cuda -hwaccel_outpu>
>>
>> Nov 18 16:14:26 encoder10029unit4 videotranscoder:50296[50298]: FileTranscoder: [u:4,t:1,f:9f201704-a501-4e94-bce7-f3ac8e83a519.ts] Adding audio output: ac3, 6 channels, 384 kbps.
>> Nov 18 16:14:26 encoder10029unit4 videotranscoder:50296[50298]: FileTranscoder: [u:4,t:1,f:9f201704-a501-4e94-bce7-f3ac8e83a519.ts] Audio bitrate is 0, defaulting audio bitrate to 128k for aac.
>> Nov 18 16:14:26 encoder10029unit4 videotranscoder:50296[50298]: FileTranscoder: [u:4,t:1,f:9f201704-a501-4e94-bce7-f3ac8e83a519.ts] Adding audio output: aac, 2 channels, 128 kbps.
>> Nov 18 16:14:26 encoder10029unit4 videotranscoder:50296[50298]: FileTranscoder: transcode ffmpeg cmd (starting): ffmpeg -hide_banner -y -nostats -hwaccel_device 1 -hwaccel cuvid -i /video/vod/in/9f201704-a501-4e94-bce7-f3ac8e83a519.ts -filter_complex "hw>
>> Nov 18 16:14:28 encoder10029unit4 videotranscoder:50296[50298]: VideoTranscodeApp: [u:4,t:3,p:1: 225.105.0.56:10102] [fifo @ 0x55c5facc4840] Recovery attempt #1
>> Nov 18 16:14:28 encoder10029unit4 videotranscoder:50296[50298]: VideoTranscodeApp: [u:4,t:3,p:1: 225.105.0.56:10102] [mpegts @ 0x55c5f6bd2900] service 1 using PCR in pid=256, pcr_period=20ms
>> [mpegts @ 0x55c5f6bd2900] muxrate 8238520,
>> Nov 18 16:14:28 encoder10029unit4 videotranscoder:50296[50298]: VideoTranscodeApp: [u:4,t:3,p:1: 225.105.0.56:10102] sdt every 500 ms, pat/pmt every 100 ms
>> Nov 18 16:14:28 encoder10029unit4 videotranscoder:50296[50298]: VideoTranscodeApp: [u:4,t:3,p:1: 225.105.0.56:10102] [fifo @ 0x55c5facc4840] Recovery successful
>> Nov 18 16:14:28 encoder10029unit4
>> videotranscoder:50296[50298]: VideoTranscodeApp: [u:4,t:3,p:1: 225.105.0.56:10102] [fifo @ 0x55c5facc4840] FIFO queue flushed
>> Nov 18 16:14:28 encoder10029unit4 videotranscoder:50296[50298]: VideoTranscodeApp: [u:4,t:3,p:1: 225.105.0.56:10102] [AVIOContext @ 0x7fa8b4014300] Statistics: 5395788 bytes written, 0 seeks, 4657 writeouts
>> Nov 18 16:14:29 encoder10029unit4 videotranscoder:50296[50298]: VideoTranscodeApp: [u:4,t:3,p:1: 225.105.0.56:10102] [fifo @ 0x55c5facc4840] FIFO queue full
>>
> I see the problem.
>
> Your output is emulating CBR over mpegts, but it's overshooting.
> Lower your buffersize to about 5*(bitrate/fps). Assuming a frame rate of 30
> fps, use -bufsize:v 1000k or thereabouts.

First, thanks for that info; I was never quite sure what buffer size was
correct. However, after changing to that buffer size I get the same
behavior.

The key thing is that I ran this under systemd-run for a reason: I was
trying to show the simplest way to make this happen. I'm running stock
Ubuntu 22.04 with CUDA 12.4 and the latest stable NVIDIA driver. If anyone
has an NVIDIA-enabled ffmpeg build on Ubuntu 22.04 with an NVIDIA card, I
expect this will fail for them too.

I'm baffled: when I start it from an interactive terminal (ssh, or
directly on a connected keyboard/monitor) it works fine, but when I start
it from systemd-run, or it's started by a systemd unit (like on a reboot
or package install), it exhibits this behavior.

_______________________________________________
ffmpeg-user mailing list
ffmpeg-user@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-user

To unsubscribe, visit link above, or email
ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
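
For anyone following the buffer-size suggestion in this thread, here is a small worked version of the 5*(bitrate/fps) rule of thumb. It is only a sketch: the helper name vbv_bufsize_bits is made up for illustration, the 6000k bitrate is taken from -b:v in the command above, and the two frame rates are the 30 fps assumed in the reply and the 60000/1001 set by the fps filter in the posted filter graph. It does not address the systemd stall itself.

```python
def vbv_bufsize_bits(bitrate_bps: int, fps: float, frames: int = 5) -> int:
    """Rule-of-thumb buffer size: `frames` frames' worth of bits at the target bitrate.

    Hypothetical helper for illustration; it is not part of ffmpeg.
    """
    return round(frames * bitrate_bps / fps)


bitrate = 6_000_000  # -b:v 6000k from the command in this thread

# 30 fps is the assumption made in the reply; 60000/1001 is what the
# fps filter in the posted filter graph actually produces.
for fps in (30, 60000 / 1001):
    bits = vbv_bufsize_bits(bitrate, fps)
    print(f"fps={fps:.3f}: about {bits} bits for -bufsize:v")
```

At 30 fps this works out to 1000k, which is where the number in the reply comes from. The original command used -bufsize:v 12000k, i.e. two full seconds of video at 6000k.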