I’ve got two AMD systems that are similar in configuration: an HP Z230 running 
NetBSD 9.3 and an HP-6200 running NetBSD 10.0_BETA.  On the 9.3 system I 
converted/encoded my video library containing some old movies and TV shows 
using ffmpeg4 with libx264 compression.  It all converted just fine.  Then I 
discovered that if I encoded the videos with libx265 I’d get much higher 
compression with little loss in quality, so I set up the HP-6200 to do this 
under NetBSD 10.0.  This is where it gets interesting!

On NetBSD-10.0 I was seeing about a 3-5% failure rate with seg faults using 
ffmpeg5 doing libx265 encoding.  On retry using ffmpeg with libx264 encoding the 
video always encoded to successful completion.  And many times just retrying 
encoding with libx265 would complete successfully too.

So I tried doing the same thing on NetBSD-9.3 using ffmpeg4 with libx265 
encoding and I was seeing a higher failure rate, somewhere around 7-8%, usually 
on many of the same videos.  Here again, encoding with libx264 always 
succeeded, but retries with libx265 never seemed to succeed on a video that had 
previously failed.
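
A minimal sketch of how the failure and retry rates described above could be 
measured more precisely.  It assumes Python 3 and an `ffmpeg` binary on the 
PATH; the function names, the retry count, and the audio-copy setting are 
illustrative choices, not the commands actually used for the runs above.  It 
relies on the POSIX convention that `subprocess` reports a process killed by a 
signal as a negative return code, so a seg fault can be told apart from an 
ordinary ffmpeg error.

```python
import signal
import subprocess


def classify(returncode):
    """Map a return code to 'ok', 'crash:<SIGNAL>' or 'error:<code>'."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports death-by-signal as -<signal number>.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = str(-returncode)
        return "crash:" + name
    return "error:%d" % returncode


def encode(src, dst, codec="libx265", run=subprocess.run):
    """Run one encode attempt and return its classified outcome."""
    # Hypothetical command line: re-encode video with `codec`, copy audio.
    cmd = ["ffmpeg", "-nostdin", "-y", "-i", src,
           "-c:v", codec, "-c:a", "copy", dst]
    return classify(run(cmd).returncode)


def encode_with_retries(src, dst, attempts=3, codec="libx265",
                        run=subprocess.run):
    """Return the outcome of each attempt, stopping at the first success."""
    outcomes = []
    for _ in range(attempts):
        outcomes.append(encode(src, dst, codec, run))
        if outcomes[-1] == "ok":
            break
    return outcomes
```

Calling `encode_with_retries("in.mkv", "out.mkv")` for each file on both 
systems and logging the returned list would show which videos crash, with 
which signal, and whether a retry succeeds.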

Naturally I assumed that the problem is with libx265, and over the years many 
people have reported the same sorts of problems, but the maintainers have never 
been able to reliably reproduce the failures. Digging a bit deeper I see that 
the same version of libx265 (3.5) is used in both my ffmpeg4 on 9.3 and ffmpeg5 
on 10.0.  BTW, I do have ffmpeg4 built on my 10.0 system and it produces the 
same results as ffmpeg5.

What’s interesting is that the failure rate for encoding is about twice as high 
on the 9.3 system as it is on the 10.0 system, which leads me to believe it 
_could_ be an OS issue and not a libx265 one.  That and the fact that in many 
cases retries work on 10.0 but not on 9.3.

I then noticed PR 57188 and the suggested fix for 10.0.  I applied that fix and 
my failure rate on encoding seems to have dropped to about half of what I was 
seeing previously.  (Although this may be a red herring since retries seem to 
succeed about 60% of the time.)  I tried the same fix on my 9.3 system and the 
failure rate there seems unchanged, and retries don’t help.  I’m wondering if 
there’s a bug in 9.3 that got partially but not completely squashed in 10.0 
which could explain this.

I wonder if anyone has some suggestions to help narrow down what’s happening 
and which pieces of software deserve the blame?
