On 4/10/21 10:38 PM, chris hermansen wrote:
> Hello again everyone,
>
> Hoping to beg a bit of advice. I have a brand new computer with an NVidida
> RTX 3060 and I'm running the daily 21.04 on it.  Install goes fine but
> "after awhile" I seem to end up with a manually installed driver for the
> RTX and "after a bit more while" all of a sudden I'm looking at a 1320 X
> 768 display.
>
> I've made this problem go away temporarily with
>
> sudo apt purge nvidia-\*
>
> and poof I'm back to a reasonable 1920 x 1080 display.
>
> As far as I know I did not "manually install" anything. I'm a bit lost as
> to how to report this as a bug, given that it's an NVidia thing and also
> this weird "manually installed" package.
>
> If anyone has any suggestions I'd be most grateful.
>



Greetings,

It's been a long time since I've used Nvidia on Ubuntu. So I'm going to start with a package I know. Hopefully I don't simplify my explanation to the point of being incorrect. :-D

The linux-generic package is a "package" that really isn't a package. It's a metapackage, more like a pointer to the latest version of the kernel. If you never flush out old kernels when you update, eventually (I think it's after three old versions??) the repo no longer lists those kernels as available, and apt then basically says "I can't find this package in the source list anymore, so it must have been installed manually." That's how you can end up with "manually" installed packages that were actually installed by Ubuntu. It happens all the time.

My *guess* is that you have an nvidia pointer package that is installing drivers for you. Eventually the version you are running falls out of sync with the upstream source, and your specific version lands in the "I can't find it anymore so it must be a manual install" bucket.
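
If you want to check that guess, apt can tell you which packages it currently has marked as manually installed. Here's a rough sketch in Python, assuming `apt-mark showmanual` is on your path (it ships with apt); the function names are just ones I made up for illustration:

```python
import subprocess


def showmanual():
    """Run `apt-mark showmanual` and return its raw output (one name per line)."""
    result = subprocess.run(
        ["apt-mark", "showmanual"], capture_output=True, text=True, check=True
    )
    return result.stdout


def manual_nvidia_packages(showmanual_output):
    """Return the sorted nvidia-related names found in `apt-mark showmanual` output."""
    names = (line.strip() for line in showmanual_output.splitlines())
    return sorted(name for name in names if "nvidia" in name.lower())
```

Calling `manual_nvidia_packages(showmanual())` right after a fresh install, and again once the display breaks, should show whether an nvidia package turned up in the manual list in between.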

Also, *IF* I remember correctly, NVidia drivers are built against a specific version of the kernel but can be "loosely" linked. Thus, an NVidia driver built for kernel 5.4.0.66.68 could work on 5.4.0.66.69 without a recompile, provided that nothing major changed in the kernel interfaces it hooks into. But eventually you will update to a kernel version that isn't compatible, and if the nvidia driver isn't automatically rebuilt with the update, the driver is going to work poorly or not at all.

Here's my suspicion. If removing the NVidia drivers solves your problem, then you probably want to stay on the nouveau drivers. Some update somewhere is telling your system to switch to the NVidia drivers. They get out of sync and those drivers are then listed as "manual". Eventually those drivers clash with the latest kernel update and you end up with a bad display. I most often see this kind of thing happen with auto-updates in the background that pick the "best" option for you.

[quick side rant]
Telling someone to disable auto-updates is **TERRIBLE** advice. But this nonsense about drivers getting out of line is _precisely_ why I disable auto-update. I _loathe_ it when I leave a perfectly functional system one night and log in the next morning to a busted upgrade... But I'm also paranoid and responsible enough that I subscribe to all the RSS feeds for all the security notices on my systems and I have planned times when I apply updates... But because the vast majority of users can't be bothered to actually do regular updates, updates have to be forced on them automatically to prevent massive security issues and evil botnets, which unfortunately means user systems "break unexpectedly".
*sigh*
[end rant]

So what should you do? If you really don't want the NVidia drivers, then open up "software sources" and look for the "additional drivers" tab. Make sure that NVidia is disabled here. If you *DO* want the NVidia drivers, then make sure the appropriate driver is selected (if I recall correctly, there's like a stable, beta, and maybe something else?). And if you can't find "software sources", drop to the command line and try typing `software-properties` then hit the tab key to see if there are auto-completes for qt or gtk or whatever. It *should* be installed already if you did a default Ubuntu install though.

From that same interface, on a different tab, you can also check which updates happen and at what interval updates are checked for and applied. I recommend at least knowing how often this is occurring on your system.
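
The same settings can usually be read without the GUI. This is a sketch that assumes the intervals are stored as one-line `APT::Periodic::...` entries in files under /etc/apt/apt.conf.d (the usual place on a stock Ubuntu install, but check your own system); `parse_periodic` and `read_periodic` are hypothetical helper names:

```python
import re
from pathlib import Path

# Matches lines such as: APT::Periodic::Unattended-Upgrade "1";
# Lines commented out with // do not match because of the leading anchor.
PERIODIC_RE = re.compile(r'^\s*APT::Periodic::([\w-]+)\s+"([^"]*)"\s*;')


def parse_periodic(text):
    """Return {setting: value} for every one-line APT::Periodic entry in text."""
    settings = {}
    for line in text.splitlines():
        match = PERIODIC_RE.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def read_periodic(conf_dir="/etc/apt/apt.conf.d"):
    """Merge APT::Periodic settings from all files in conf_dir, in name order."""
    merged = {}
    for path in sorted(Path(conf_dir).glob("*")):
        if path.is_file():
            # Later files override earlier ones, mirroring apt's name ordering.
            merged.update(parse_periodic(path.read_text(errors="replace")))
    return merged
```

The values are intervals in days, so "1" means daily and "0" means that step is switched off. The sketch does not handle the braced `APT::Periodic { ... }` form.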

Also, Ubuntu can churn through kernels. I recommend staying on top of any automated process that is updating behind the scenes. Knowing when and how often your computer updates the kernel, and then verifying whether it installs the nvidia drivers alongside it, might help you narrow down who is responsible. Once you find the who, you can file a bug report.

It's been too long since I've really poked at the auto-update process for Ubuntu (besides just flat out turning it off!). I honestly don't remember the package/process name that does the auto update. Maybe someone else can tell you exactly the package/process causing a problem. But I can tell you how I'd start looking for it.

Start with /var/log/dpkg.log and look for clues as to when and what packages are being installed. For example, let's say you notice a line that looks something like this (this is fabricated so it may not match exactly - in fact I'm making up version numbers! :-D ):

2021-04-11 20:40:57 status installed nvidia-driver:all 5.1.2.23

Great! You know that the package was installed at that time. Next take a look at the file /var/log/syslog to find out what was going on at that time. Perhaps you might see something that looks similar to this (again, making up a few numbers but it should be close):

Apr 11 20:34:56 cohen systemd[1]: Started Run anacron jobs.
Apr 11 20:34:56 cohen anacron[1854232]: Anacron 2.3 started on 2021-04-11
Apr 11 20:47:12 cohen systemd[1]: anacron.service: Succeeded.
Apr 11 20:47:12 cohen anacron[1854232]: Normal exit (1 jobs run)

Great! That tells you that anacron had some job that ran for ~12 minutes, which is a good indicator that it was busy for a while, and that lines up well with a package upgrade. Maybe it isn't anacron. Maybe it is /etc/cron.hourly or /etc/cron.daily. But it is probably some cron service running the script to auto update. At that point, you just track down what application is installing the nvidia drivers. Take a look at the cron jobs to see which one looks closest to updating packages.
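
To take that look without opening every script by hand, something like the following could flag cron scripts that mention package tools. It's only a sketch: the directory list and the keyword list are my assumptions, and the helper names are made up:

```python
from pathlib import Path

CRON_DIRS = ("/etc/cron.hourly", "/etc/cron.daily", "/etc/cron.weekly")
KEYWORDS = ("apt", "unattended-upgrade", "dpkg")


def mentions_updates(script_text, keywords=KEYWORDS):
    """Return the keywords that appear anywhere in script_text (case-insensitive)."""
    lowered = script_text.lower()
    return [word for word in keywords if word in lowered]


def find_update_scripts(cron_dirs=CRON_DIRS, keywords=KEYWORDS):
    """Return {script path: [keywords found]} for cron scripts mentioning updates."""
    hits = {}
    for directory in cron_dirs:
        root = Path(directory)
        if not root.is_dir():
            continue  # not every system has every cron directory
        for script in sorted(root.iterdir()):
            if not script.is_file():
                continue
            try:
                text = script.read_text(errors="replace")
            except OSError:
                continue  # unreadable without root; skip rather than fail
            found = mentions_updates(text, keywords)
            if found:
                hits[str(script)] = found
    return hits
```

This is a plain substring match, so expect a few false positives. If nothing turns up, the trigger may be a systemd timer rather than cron; `systemctl list-timers` lists those.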

Bonus! If you can actually catch all the logs of this happening then it helps the bug report. It is much more helpful if you can say "This system was working perfectly, then this update with these syslog files installed these nvidia packages from dpkg.log which then broke my system. I then removed nvidia to make my system work again."
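
Pulling those two logs together for a report can be scripted. A minimal sketch, assuming dpkg.log lines have the shape `DATE TIME status installed PACKAGE:ARCH VERSION` and syslog lines start with the classic `Mon DD HH:MM:SS` stamp (a system configured for ISO timestamps would need a different parse); the function names and the 15 minute window are my own choices:

```python
from datetime import datetime, timedelta


def nvidia_installs(dpkg_lines):
    """Yield (timestamp, package, version) for 'status installed' nvidia lines."""
    for line in dpkg_lines:
        parts = line.split()
        # Expected shape: DATE TIME status installed PACKAGE:ARCH VERSION
        if len(parts) < 6 or parts[2:4] != ["status", "installed"]:
            continue
        if "nvidia" not in parts[4]:
            continue
        when = datetime.strptime(f"{parts[0]} {parts[1]}", "%Y-%m-%d %H:%M:%S")
        yield when, parts[4], parts[5]


def syslog_near(syslog_lines, when, window_minutes=15):
    """Return syslog lines stamped within window_minutes of `when`."""
    window = timedelta(minutes=window_minutes)
    nearby = []
    for line in syslog_lines:
        try:
            # Classic syslog stamps omit the year, so borrow it from `when`.
            # %b expects English month abbreviations under the default locale.
            stamp = datetime.strptime(f"{when.year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # continuation line or a different timestamp format
        if abs(stamp - when) <= window:
            nearby.append(line.rstrip("\n"))
    return nearby
```

Feeding it `open("/var/log/dpkg.log")` and `open("/var/log/syslog")` (reading syslog may need sudo) gives you each nvidia install plus the syslog lines around it, ready to paste into the report.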

Hope that helps some. And I hope you find the package that is causing problems so that someone who knows what they are talking about can fix it! (because if it is causing you problems it's probably causing someone else problems too!)
:-D

Ed



--
Ubuntu-quality mailing list
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-quality