I had this nicely formatted when I sent it, but it seems to have been reformatted somewhere in transit. Hopefully this helps; if not, I will leave it be.

On 2/14/21 6:27 PM, Judah Kocher wrote:
Thanks to each of you for your replies,

> Lesson 1: always get machines with remote console access. It will save the day someday and help in diagnosing issues.
Having remote console access would be sweet, but unfortunately that goes far beyond the hobbyist price point I currently have to work with.
> On the system that succeeded when you were watching on the console, did automatic sysupgrades start to work after that?
It did not. This unit still fails the weekly upgrade.
> In general, my guess would be boot.conf contents that prevent the automatic upgrade from working. Or maybe you have very old boot loaders on the failing machines.
I do not have an /etc/boot.conf file on any of these systems. Both of the units having issues are less than 12 months old, so I wouldn't think bootloader age would be an issue. I have both older and newer units that are working correctly. Are there major recent changes you are aware of that might lead to something like this?
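For what it's worth, a quick way to audit both of those suspects on a failing box might look like the sketch below. The helper name is mine and the script is an assumption, not something from this thread; the paths are the standard OpenBSD ones.

```shell
#!/bin/sh
# Audit sketch: check boot.conf contents and offer a loader refresh.

boot_conf_status() {
    # $1 = path to a boot.conf; print its contents, or note its absence
    if [ -f "$1" ]; then cat "$1"; else echo "no $1"; fi
}

boot_conf_status /etc/boot.conf

# An old boot loader can be refreshed in place; replace sd0 with your
# root disk (see sysctl hw.disknames). Left commented out: it writes
# to the disk.
# doas installboot -v sd0
```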
> BTW, kernel # cannot be used to identify a kernel.
Interesting. When I realized two of my machines were still on old snapshots, it seemed like a simple metric to use for tracking this. The actual number isn't really as relevant to me at this point as the fact that it does or does not change. Could the update be partially or completely succeeding while the kernel # stays the same? In my limited experience following -current for the last 4+ years, every snapshot has come with a different #.
> -Otto
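Since the "#NNN" build number alone can be ambiguous, comparing the full kern.version string (which includes the build date) is a safer before/after check. A sketch; the helper name and the sample string are mine:

```shell
#!/bin/sh
# Strip everything up to and including "#NNN: " from a kern.version
# line, leaving the build date, which is unambiguous per snapshot.
version_date() {
    echo "$1" | sed 's/.*#[0-9]*: //'
}

# On OpenBSD this would feed the live value instead:
#   version_date "$(sysctl -n kern.version | head -1)"
version_date 'OpenBSD 6.8-current (GENERIC.MP) #302: Sun Feb 14 12:00:00 MST 2021'
# prints "Sun Feb 14 12:00:00 MST 2021"
```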

> Care to show a script? Otherwise it looks like a rather lengthy mathematical problem with quite some variables.
The script is very basic. I didn't show it originally because I know from reading various threads in the past that many of the folks on this list find a "non-default" install abhorrent, and I wasn't interested in dragging that into this. However, here are the complete script contents:

#!/bin/sh

# Record the running kernel version, e.g. "GENERIC.MP#302"
CURRENT=$(uname -a | awk '{print $4}')

# Download the latest snapshot but do not reboot into it yet
doas sysupgrade -n -s

# Delete unwanted sets before the install kernel runs
doas rm /home/_sysupgrade/x*
doas rm /home/_sysupgrade/g*
doas rm /home/_sysupgrade/c*

# Log the attempt; a single date call keeps the timestamp consistent
STAMP=$(date "+%H:%M:%S on %m/%d/%Y")
echo "System Snapshot upgrade started from $CURRENT at $STAMP" >> "/home/$USER/systemLog"

doas reboot


I have a separate script that runs on each boot, checks whether an upgrade attempt was the last logged item, and fetches the current system version to compare. If it is the same, it logs a failure; if it is different, it logs the new version #. Either way, it emails me the results. This script is running on all 6 systems.
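For reference, that boot-time check could be sketched roughly as follows. The file name, log-line format, and the classify helper are my assumptions here, not the actual script:

```shell
#!/bin/sh
# Sketch of the post-boot verification pass described above.

LOG="/home/$USER/systemLog"

classify() {
    # $1 = version logged before reboot, $2 = running version
    if [ "$1" = "$2" ]; then
        echo "FAILURE: kernel unchanged ($2)"
    else
        echo "SUCCESS: now running $2"
    fi
}

# Only act when the last log line records an upgrade attempt.
if [ -f "$LOG" ] && tail -1 "$LOG" | grep -q 'upgrade started from'; then
    OLD=$(tail -1 "$LOG" | sed 's/.*started from \([^ ]*\) .*/\1/')
    NEW=$(uname -a | awk '{print $4}')
    RESULT=$(classify "$OLD" "$NEW")
    echo "$RESULT" >> "$LOG"
    echo "$RESULT" | mail -s "sysupgrade: $(hostname)" "$USER"
fi
```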


> What are the permissions on the bsd.upgrade that's left behind? If they are still +x, then your issue is with the boot loader, maybe that boot.conf Otto suggested. If they are -x, then the boot loader started the install kernel but something went wrong.
The permissions on the left-behind bsd.upgrade are: -rw------- 1 root
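A small sketch of that check (the helper name is mine). Per the logic in the question above, the boot loader clears the execute bit on /bsd.upgrade after booting it once, so the permission string tells you how far things got:

```shell
#!/bin/sh
# Interpret the mode of a leftover /bsd.upgrade.
check_xbit() {
    # $1 = permission string from ls -l, e.g. "-rwx------" or "-rw-------"
    case "$1" in
    ???x*) echo "still executable: boot loader never booted it" ;;
    *)     echo "execute bit cleared: install kernel was started" ;;
    esac
}

if [ -f /bsd.upgrade ]; then
    check_xbit "$(ls -l /bsd.upgrade | cut -d' ' -f1)"
fi
```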

On 14 February 2021 18:02:07 CET, Judah Kocher <koche...@hotmail.com> wrote:
Hello folks,

I am having an issue with sysupgrade and I have had trouble finding the source of the problem, so I hope someone here might be able and willing to point me in the right direction.

I have 6 small systems running OpenBSD -current and a basic script which upgrades to the latest snapshot weekly. The systems are all relatively similar: three are the exact same piece of hardware, two are slightly different, and one is a VM configured to match the first three as closely as possible with virtual hardware.

The script checks the current kernel version (e.g. "GENERIC.MP#302"), logs it, runs sysupgrade, and after the reboot it checks the kernel version again. If it is different, it logs it as a "success"; if it is still the same, it logs it as a failure.

All 6 systems were configured using the same autoinstall configuration and the upgrade script is identical on each unit. However, two of the three identical units always fail. When I remote into either system and manually run the upgrade script, it also fails. I was able to get onsite with one of them, where I connected a monitor and keyboard and manually ran the script to observe the results, but oddly enough it succeeded, so I learned nothing actionable. However, it continues to fail the weekly upgrade. I have confirmed that the script permissions are identical on the working and nonworking units.

The 4 units that successfully upgrade leave a mail message with a log of the upgrade process. However, I have been unable to find any record or log on the failing systems to help me figure out why this isn't working. The only difference I can identify between the systems is that "auto_upgrade.conf" and "bsd.upgrade" are both present in "/" on the two systems that fail, but are properly removed on the 4 that succeed.

I would appreciate any suggestions of what else I can try or check to
figure out what is causing this issue.

Thanks

Judah
