I am using the nvidia packages as an example because I was personally
affected, but the problem may be more general.
Splitting nvidia-driver into kernel and userland pieces was the right thing to
do, since the kernel driver is obviously coupled to the kernel version.
But it seems that the nvidia userland and the kernel module are also coupled to
each other.
The main package repos and the kmod repos are updated at independent cadences.
Build failures can also introduce further unpredictability into package update
cycles.
So it can happen, and recently did happen, that the nvidia packages were
updated in one repository but remained at an older version in the other.
I overlooked that when doing a package upgrade and ended up with a newer kernel
module and an older userland. The nvidia driver didn't like that at all and Xorg
could not start properly (it did start, but all I saw was a black screen and a
cursor).
There were also these messages in the log:
kernel: NVRM: API mismatch: the client 'Xorg' (pid 3447)
kernel: NVRM: has the version 580.95.05, but this kernel module has
kernel: NVRM: the version 580.105.08. Please make sure that this
kernel: NVRM: kernel module and all NVIDIA driver components
kernel: NVRM: have the same version.
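
For anyone who wants to check for this condition automatically, here is a
minimal sketch that scans log text for the message above and extracts the two
versions. The function name and the regular expressions are my own; they are
derived only from the wording of the lines quoted above, so other driver
releases may format the message differently.

```python
import re

# Drops everything up to and including the "NVRM:" tag on a log line.
PREFIX = re.compile(r"^.*?NVRM:\s*")

# Pattern derived from the message quoted above (an assumption, not an
# official format description).
MISMATCH = re.compile(
    r"API mismatch: the client '(?P<client>[^']+)' \(pid (?P<pid>\d+)\) "
    r"has the version (?P<userland>\d+(?:\.\d+)*), "
    r"but this kernel module has the version (?P<kmod>\d+(?:\.\d+)*)"
)


def find_api_mismatch(log_text):
    """Return the client, pid and both versions, or None if not found."""
    # The message is wrapped over several log lines, so rejoin the NVRM
    # lines into one string before matching.
    parts = [
        PREFIX.sub("", line).strip()
        for line in log_text.splitlines()
        if "NVRM:" in line
    ]
    match = MISMATCH.search(" ".join(parts))
    return match.groupdict() if match else None


SAMPLE_LOG = """\
kernel: NVRM: API mismatch: the client 'Xorg' (pid 3447)
kernel: NVRM: has the version 580.95.05, but this kernel module has
kernel: NVRM: the version 580.105.08. Please make sure that this
kernel: NVRM: kernel module and all NVIDIA driver components
kernel: NVRM: have the same version.
"""

if __name__ == "__main__":
    print(find_api_mismatch(SAMPLE_LOG))
```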
It's impossible to prevent such a mismatch across different repositories.
So maybe it would be possible to prevent it at upgrade / installation time
instead?
E.g., maybe there could be a stricter version dependency between nvidia-driver
and nvidia-kmod, such that pkg would refuse to upgrade nvidia-kmod if a matching
nvidia-driver is not available (and vice versa?).
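
Until something like that exists, a user-side check before upgrading could look
roughly like the sketch below. It is not a pkg feature, and all function names
are hypothetical. It assumes that both packages carry the same upstream version
string once the port revision ("_N") and epoch (",N") suffixes are stripped;
if the kmod repository adds further components to the version, this comparison
would need adjusting. The helper uses "pkg rquery %v", which prints the version
offered by each configured repository.

```python
import re
import subprocess


def upstream_version(pkg_version):
    """Strip the port revision (_N) and epoch (,N) suffixes, if present."""
    return re.sub(r"[_,].*$", "", pkg_version)


def versions_match(driver_version, kmod_version):
    """True if userland and kernel module are the same upstream release."""
    return upstream_version(driver_version) == upstream_version(kmod_version)


def offered_versions(name):
    """Versions of a package offered by the configured repositories.

    Runs `pkg rquery %v <name>`; there may be one line per repository,
    so a list is returned and the caller decides which entry applies.
    """
    result = subprocess.run(
        ["pkg", "rquery", "%v", name],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.split()


if __name__ == "__main__":
    drivers = offered_versions("nvidia-driver")
    kmods = offered_versions("nvidia-kmod")
    # Only proceed if some offered pair is the same upstream release.
    ok = any(versions_match(d, k) for d in drivers for k in kmods)
    print("safe to upgrade" if ok else "version mismatch, hold off")
```

Running it before "pkg upgrade" would at least flag the situation I ran into,
although a real fix belongs in the package dependencies themselves.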
As I said in the beginning, there could be other instances of such
userland-kernel coupling.
--
Andriy Gapon