Re: [FFmpeg-user] High CPU usage during scale_npp to low resolutions with multiple instances
On Wed, 8 Apr 2020, 15:23 Valentin Schweitzer wrote:

> > Set this environment variable: CUDA_DEVICE_MAX_CONNECTIONS=2
> > Then retest and report back.
> >
> > One more thing: Could you show us the output of:
> > numactl --hardware
>
> Thanks for your reply. We should have clarified that we are on Windows.
> Unfortunately, setting the environment variable CUDA_DEVICE_MAX_CONNECTIONS
> to 2 does not make a difference. The closest we got to a numactl equivalent
> on Windows is the NUMA view in the Task Manager, which shows four NUMA nodes
> on our 24-core processor. Given this information, is it possible that either
> the Windows scheduler or the NVIDIA driver is having trouble with different
> ffmpeg instances being distributed to different NUMA nodes, so that a lot of
> data has to be transferred between NUMA nodes, limiting the CPU? Are there
> any mitigations to this, or is there anything else that we can analyze to
> clarify why different resolutions behave so differently on our machine?
>
> Greetings,
> Valentin

Hey there,

In your BIOS, disable the following options and retest:

1. SMT support (toggle to disabled).
2. X2APIC (toggle to disabled).

This will be the equivalent of setting "gaming mode" on the Ryzen consumer processors.

Apply these changes, retest and report back.

___
ffmpeg-user mailing list
ffmpeg-user@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-user

To unsubscribe, visit link above, or email
ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] High CPU usage during scale_npp to low resolutions with multiple instances
> Set this environment variable: CUDA_DEVICE_MAX_CONNECTIONS=2
> Then retest and report back.
>
> One more thing: Could you show us the output of:
> numactl --hardware

Thanks for your reply. We should have clarified that we are on Windows. Unfortunately, setting the environment variable CUDA_DEVICE_MAX_CONNECTIONS to 2 does not make a difference. The closest we got to a numactl equivalent on Windows is the NUMA view in the Task Manager, which shows four NUMA nodes on our 24-core processor.

Given this information, is it possible that either the Windows scheduler or the NVIDIA driver is having trouble with different ffmpeg instances being distributed to different NUMA nodes, so that a lot of data has to be transferred between NUMA nodes, limiting the CPU? Are there any mitigations to this, or is there anything else that we can analyze to clarify why different resolutions behave so differently on our machine?

Greetings,
Valentin
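One way to probe the NUMA question above would be to pin each ffmpeg instance to a single NUMA node and compare CPU usage against the unpinned run. The following is only a minimal sketch under stated assumptions: it relies on the `/NODE` switch of the Windows `start` command, takes the node count of four from the Task Manager observation in the message, and uses the hypothetical helper names `assign_nodes` and `pinned_cmd`. Nothing in the thread confirms that pinning changes the behaviour.

```python
def assign_nodes(instances, node_count=4):
    """Spread instance indices round-robin over the NUMA nodes.

    node_count=4 matches the four nodes reported in the message;
    adjust it for other machines.
    """
    return [i % node_count for i in range(instances)]


def pinned_cmd(cmd, node):
    """Wrap a command so that cmd.exe starts it on one NUMA node.

    Windows only. The empty string is the window title that `start`
    expects as its first quoted argument; /B avoids opening a new
    window and /WAIT makes the wrapper block until the child exits.
    """
    return ["cmd", "/c", "start", "", "/NODE", str(node), "/B", "/WAIT", *cmd]


# Example: plan 29 instances, as in the original report.
plan = assign_nodes(29)
first = pinned_cmd(["ffmpeg", "-version"], plan[0])
```

Each wrapped command could then be passed to `subprocess.Popen` in place of the plain ffmpeg command. If CPU usage per instance drops noticeably when instances are pinned, that would support the cross-node transfer hypothesis; if it stays the same, the cause is more likely elsewhere.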
Re: [FFmpeg-user] High CPU usage during scale_npp to low resolutions with multiple instances
On Mon, 30 Mar 2020, 15:31 Dennis Mungai wrote:

> On Mon, 30 Mar 2020, 15:22 Valentin Schweitzer wrote:
>
>> Hi,
>>
>> when using scale_npp to scale a test video down from 1920x1080 to
>> 1024x576 or lower with multiple processes in parallel, CPU usage is
>> unusually high. For context, when scaling the same video down to
>> 1280x720, CPU usage stays at about 0.5% per FFmpeg instance. When
>> scaling down to 1024x576 or lower, CPU usage per FFmpeg process rises
>> to about 3.0%. The values listed here appear when starting 29
>> instances of FFmpeg in parallel. The effect is less pronounced but
>> still visible at 10 instances in parallel. Hardware used for this
>> is an AMD EPYC 7401P 24 Core + NVIDIA Quadro RTX 4000.
>>
>> To generate 100 seconds of random noise in 1080p (which will be the test video):
>>
>>     ffmpeg -y -hide_banner -f lavfi -i nullsrc=s=1920x1080 -filter_complex "geq=random(1)*255:128:128;aevalsrc=-2+random(0)" -vcodec rawvideo -acodec pcm_s16le -t 100 noise.mkv
>>
>> Now rescale the test video to 720p:
>>
>>     ffmpeg -hide_banner -y -i noise.mkv -vf hwupload_cuda,scale_npp=w=1280:h=720:format=nv12 -vcodec h264_nvenc -an -f null NUL
>>
>> This should not cause very high CPU usage. Now rescale the same video to
>> 576p:
>>
>>     ffmpeg -hide_banner -y -i noise.mkv -vf hwupload_cuda,scale_npp=w=1024:h=576:format=nv12 -vcodec h264_nvenc -an -f null NUL
>>
>> This should cause about 5 or 6 times as much CPU usage.
>>
>> This might be caused by some NVIDIA optimizations, but it does not
>> seem to be documented and I have yet to find a good place to ask
>
> Set this environment variable: CUDA_DEVICE_MAX_CONNECTIONS=2
>
> Then retest and report back.

One more thing: Could you show us the output of:

numactl --hardware
Re: [FFmpeg-user] High CPU usage during scale_npp to low resolutions with multiple instances
On Mon, 30 Mar 2020, 15:22 Valentin Schweitzer wrote:

> Hi,
>
> when using scale_npp to scale a test video down from 1920x1080 to
> 1024x576 or lower with multiple processes in parallel, CPU usage is
> unusually high. For context, when scaling the same video down to
> 1280x720, CPU usage stays at about 0.5% per FFmpeg instance. When
> scaling down to 1024x576 or lower, CPU usage per FFmpeg process rises
> to about 3.0%. The values listed here appear when starting 29
> instances of FFmpeg in parallel. The effect is less pronounced but
> still visible at 10 instances in parallel. Hardware used for this
> is an AMD EPYC 7401P 24 Core + NVIDIA Quadro RTX 4000.
>
> To generate 100 seconds of random noise in 1080p (which will be the test video):
>
>     ffmpeg -y -hide_banner -f lavfi -i nullsrc=s=1920x1080 -filter_complex "geq=random(1)*255:128:128;aevalsrc=-2+random(0)" -vcodec rawvideo -acodec pcm_s16le -t 100 noise.mkv
>
> Now rescale the test video to 720p:
>
>     ffmpeg -hide_banner -y -i noise.mkv -vf hwupload_cuda,scale_npp=w=1280:h=720:format=nv12 -vcodec h264_nvenc -an -f null NUL
>
> This should not cause very high CPU usage. Now rescale the same video to
> 576p:
>
>     ffmpeg -hide_banner -y -i noise.mkv -vf hwupload_cuda,scale_npp=w=1024:h=576:format=nv12 -vcodec h264_nvenc -an -f null NUL
>
> This should cause about 5 or 6 times as much CPU usage.
>
> This might be caused by some NVIDIA optimizations, but it does not
> seem to be documented and I have yet to find a good place to ask

Set this environment variable: CUDA_DEVICE_MAX_CONNECTIONS=2

Then retest and report back.
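In a Windows command prompt the variable can be set for the current session with `set CUDA_DEVICE_MAX_CONNECTIONS=2` before starting ffmpeg. When the instances are started from a script, it may be cleaner to set it only for the child processes. A minimal sketch, assuming Python is used as the launcher; the helper names `cuda_env` and `run_with_env` are illustrative and not part of FFmpeg or CUDA:

```python
import os
import subprocess


def cuda_env(max_connections=2, base=None):
    """Return a copy of the environment with CUDA_DEVICE_MAX_CONNECTIONS set.

    The caller's own environment (os.environ) is left untouched.
    """
    env = dict(os.environ if base is None else base)
    env["CUDA_DEVICE_MAX_CONNECTIONS"] = str(max_connections)
    return env


def run_with_env(cmd, env):
    """Run one command with the given environment and capture its output."""
    return subprocess.run(cmd, env=env, capture_output=True, text=True)
```

For example, `run_with_env(["ffmpeg", "-version"], cuda_env())` starts ffmpeg with the variable set to 2 without changing the environment of the calling shell.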
[FFmpeg-user] High CPU usage during scale_npp to low resolutions with multiple instances
Hi,

when using scale_npp to scale a test video down from 1920x1080 to 1024x576 or lower with multiple processes in parallel, CPU usage is unusually high. For context, when scaling the same video down to 1280x720, CPU usage stays at about 0.5% per FFmpeg instance. When scaling down to 1024x576 or lower, CPU usage per FFmpeg process rises to about 3.0%. The values listed here appear when starting 29 instances of FFmpeg in parallel. The effect is less pronounced but still visible at 10 instances in parallel. Hardware used for this is an AMD EPYC 7401P 24 Core + NVIDIA Quadro RTX 4000.

To generate 100 seconds of random noise in 1080p (which will be the test video):

    ffmpeg -y -hide_banner -f lavfi -i nullsrc=s=1920x1080 -filter_complex "geq=random(1)*255:128:128;aevalsrc=-2+random(0)" -vcodec rawvideo -acodec pcm_s16le -t 100 noise.mkv

Now rescale the test video to 720p:

    ffmpeg -hide_banner -y -i noise.mkv -vf hwupload_cuda,scale_npp=w=1280:h=720:format=nv12 -vcodec h264_nvenc -an -f null NUL

This should not cause very high CPU usage. Now rescale the same video to 576p:

    ffmpeg -hide_banner -y -i noise.mkv -vf hwupload_cuda,scale_npp=w=1024:h=576:format=nv12 -vcodec h264_nvenc -an -f null NUL

This should cause about 5 or 6 times as much CPU usage.

This might be caused by some NVIDIA optimizations, but it does not seem to be documented and I have yet to find a good place to ask more in-depth questions about NVIDIA encoding hardware. So, if anyone has encountered a similar issue or knows why this issue might occur, I would be grateful for any advice.

Greetings,
Valentin
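The parallel test described above can be scripted so that both resolutions are run the same way. This is a minimal sketch, assuming Python as the launcher; `build_scale_cmd` and `launch_parallel` are hypothetical helper names, and the argument list simply mirrors the rescale commands from the message. CPU usage itself still has to be read from Task Manager or a similar tool while the instances run.

```python
import os
import subprocess
import sys


def build_scale_cmd(width, height, source="noise.mkv"):
    """Return the scale_npp test command from the message as an argument list."""
    vf = f"hwupload_cuda,scale_npp=w={width}:h={height}:format=nv12"
    return [
        "ffmpeg", "-hide_banner", "-y", "-i", source,
        "-vf", vf,
        "-vcodec", "h264_nvenc", "-an",
        # os.devnull is "nul" on Windows, matching the NUL in the message.
        "-f", "null", os.devnull,
    ]


def launch_parallel(cmd, instances, env=None):
    """Start the same command several times at once and wait for all of them.

    Returns the exit codes in start order. Child output is discarded so
    that many instances do not flood the console.
    """
    procs = [
        subprocess.Popen(
            cmd, env=env,
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
        )
        for _ in range(instances)
    ]
    return [p.wait() for p in procs]
```

Calling `launch_parallel(build_scale_cmd(1280, 720), 29)` and then `launch_parallel(build_scale_cmd(1024, 576), 29)` would reproduce the two cases with 29 instances each, as in the report.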