Hi Adrian,

Thanks for the response!

First things first: I got GRUB to work yesterday :-)
The last missing piece of the puzzle was what you expected: the blocklist was "shifted" because grub-install didn't correctly work out how block positions on the physical disk relate to block positions on the mdraid volume.
I'm not sure whether it gets this right when installing with blocklists on x86.
I found it hard to find usable documentation on the actual physical on-disk layout of mdraid RAID-1. Superblocks are well documented, and the "complex" RAID levels reasonably well too. But I found no mention that building the volume with metadata version 1.2 actually shifts the start of the data in the mdraid to the end of the first 1MB block, so blocklist numbers are off by 1MB. (Version 1.2 puts the metadata 4k from the beginning of the disk, which shouldn't be a problem in itself, as only the first two 512-byte sectors (~1k) are in use for the label & GRUB boot.img.)

I noticed this by trial & error: I opened a "normally installed" disk and the RAID member disk I was trying to build in a hex editor ;-) metadata=0.90 does not do this. I had thought the metadata setting only affected the position and "type" of the mdraid metadata blocks, not the actual on-disk layout of the mirror.
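
The shift described above can be illustrated with a little arithmetic. This is only a sketch, assuming 512-byte sectors and a data offset of exactly 1MB (2048 sectors); the function name is made up for illustration, and the real offset of a member device should be checked with mdadm --examine rather than assumed:

```shell
# Sketch only: translate a sector number inside the md volume into the
# sector number on the underlying member partition.
# Assumption: the data offset is given in 512-byte sectors (2048 = 1MB).
md_to_member_sector() {
    md_sector=$1
    data_offset=${2:-2048}   # assumed 1MB shift, as observed with metadata 1.2
    echo $((md_sector + data_offset))
}

# With no data offset (the behaviour described for the old metadata format)
# both numbers agree; with a 2048-sector offset they differ by 1MB.
md_to_member_sector 100 0
md_to_member_sector 100 2048
```

A blocklist computed against the md volume would therefore point 2048 sectors too early on the physical partition unless the offset is added.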

So, long story short:
My old setup with SILO was exactly like this: metadata=0.90 for the /boot partition. With this, the ext3 blocks are at the same position on the physical disk AND on the md volume. I might have set it up like this for the same reason back then, but forgot... It's been quite a while.
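
For reference, a minimal sketch of how such a mirror could be created. The helper name and the device names are placeholders, not the ones from this setup, and the helper only prints the mdadm command instead of running it. Note that mdadm spells the old metadata version as 0.90:

```shell
# Hypothetical helper: print (not run) the mdadm command for a RAID-1
# with the old metadata format. Device names are placeholders.
boot_mirror_cmd() {
    md_dev=$1; shift
    # $# is now the number of member devices, $* the members themselves
    echo "mdadm --create $md_dev --level=1 --raid-devices=$# --metadata=0.90 $*"
}

boot_mirror_cmd /dev/md0 /dev/sda1 /dev/sdb1
```

Printing the command first makes it easy to double-check the member devices before anything is written to disk.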

The second (although minor) problem was that grub-mkconfig generates an unusable grub.cfg for this setup. It refuses to set the "root=" variable to (mduuid/UUID), which was in turn necessary to install the boot blocks. Instead it uses settings that leave GRUB unable to open the partition label, so it falls back to OBP.

The best solution I found was to edit the /usr/lib/grub/grub-mkconfig_lib shell script so that it does not set root at all. In my case that works flawlessly, as GRUB starts with "root=" already set to the disk that loaded it. So it even works with one disk pulled from the server to simulate a failure, which was exactly what I wanted.


To summarize, the steps were:
- Patch util/setup.c to correctly handle (virtual) block devices without partition tables.
- Use metadata=0.90 to build the mirror (!)
- Add a device.map entry for the mdraid device to work around the "diskfilter writes not supported" issue (UUID taken from grub-probe -t ieee1275_hints -d /dev/md0):
(mduuid/66bf8873932144cf2d6a74e4a05e67d3) /dev/md0
- Strip the lines in /usr/lib/grub/grub-mkconfig_lib between
  # otherwise set root as per value in device.map.
and
  IFS="$old_ifs"
to get boot entries that do not try to re-set "root".
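
The last step can be sketched as a small filter. This is an illustration rather than the exact edit that was made: it assumes "between" means that the two marker lines themselves stay in place, and the function name is made up. It prints the modified file to stdout, so it is best tried on a copy first:

```shell
# Sketch: print FILE with the lines strictly between the two marker lines
# removed; the marker lines themselves are kept.
strip_set_root() {
    awk -v start='# otherwise set root as per value in device.map.' \
        -v end='IFS="$old_ifs"' '
        index($0, end)   { skip = 0 }   # closing marker: stop skipping, keep it
        !skip            { print }
        index($0, start) { skip = 1 }   # opening marker: skip what follows
    ' "$1"
}

# Usage (work on a copy, then compare before replacing the original):
#   cp /usr/lib/grub/grub-mkconfig_lib /tmp/grub-mkconfig_lib.orig
#   strip_set_root /tmp/grub-mkconfig_lib.orig > /tmp/grub-mkconfig_lib.new
#   diff -u /tmp/grub-mkconfig_lib.orig /tmp/grub-mkconfig_lib.new
```

Keep in mind that a package upgrade of GRUB may overwrite the edited file, so the change would have to be reapplied.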


After this fight, I achieved my boot mirror setup on GRUB/SPARC :-)

I'll respond to the kernel-stuff separately.

Thanks!

- Robin


On 17.05.2021 at 10:23, John Paul Adrian Glaubitz wrote:
Hi Robin!

On 5/15/21 7:25 PM, Robin Cremer wrote:
7. Report back to the list and include your hardware and partition setup
A bit late to the party, as SILO already appears to be gone (including the repos) and all install images use GRUB now, but I'm having trouble and wanted to report this - and maybe get some ideas, in case this is the best address to do so:
You can still install SILO from snapshot.debian.org. However, I would recommend building the latest version from source as there have been some bugfixes in the meantime.

I'm in the process of migrating most of our SPARC servers running Solaris 10 & the old Debian with 32bit SPARC userland to the SPARC64 debport. Some servers running Solaris 11 will follow.
Good to hear.

Installing on two SunFire v215 went reasonably well

- (apart from recurring Kernel Panics with "Unable to handle kernel paging request in mna handler", most often triggered on boot immediately after the systemd binfmt service tries to start. This seems to have been mentioned in /2020/04/msg00020.html but never pinpointed and fixed?) -
What kernel version are you running? There have actually been some fixes in this regard, in particular this fix:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/sparc?id=e5e8b80d352ec999d2bba3ea584f541c83f4ca3f
but I can't seem to be able to configure GRUB on these servers as I did in the past with SILO (a 2-disk mdraid with mirrored /boot, / and swap). I'm currently stuck with /boot on only one disk and the rest of the system mirrored as I can't figure out how to install grub for a mirrored /boot partition:
Please keep in mind that GRUB is installed using blocklists on these older machines, which means it's not aware of the filesystem being used. The bootloader will just remember the location of the data blocks and the physical disk. So it has no means to deal with something as sophisticated as a software RAID.

Not sure how it worked with SILO which didn't use anything else than blocklists either (which is why the /boot partition couldn't be too large and the filesystem used couldn't be too fancy).

1) Installing to the mirror device always yields a Segmentation Fault. I was unable to get any clue with my limited gdb experience as to why - (with loaded debug symbols etc.: "Backtrace stopped: previous frame identical to this frame (corrupt stack?)"):
# grub-install --skip-fs-probe --force --debug /dev/md0
[...]
grub-install: info: setting the root device to `mduuid/1ae243c1e2445aef777f4d32b671f41c'.
grub-install: warning: File system `ext2' doesn't support embedding.
grub-install: warning: Embedding is not possible.  GRUB can only be installed in this setup by using blocklists.  However, blocklists are UNRELIABLE and their use is discouraged..
grub-install: info: will leave the core image on the filesystem.
Segmentation fault
As I said above, I don't expect this to work, really. That doesn't mean that grub-install should crash here. I will try to reproduce the issue when I find some time. Ideally, grub-install should just abort the installation in this case.

But we could also find out how SILO worked in this case.

2) Trying to install to the individual disk partitions or the raw disk itself:
grub-install: warning: File system `ext2' doesn't support embedding.
grub-install: error: embedding is not possible, but this is required for RAID and LVM install.
[...]
grub-install: warning: Partition style `sun' doesn't support embedding.
grub-install: error: embedding is not possible, but this is required for RAID and LVM install.
Neither different filesystems (ext2, xfs, ...) nor different mdraid metadata formats made any difference.
I can't test other disk labels, as the old OBP doesn't handle GPT AFAIR.
Also, GRUB built from the most recent official sources from their git segfaults as well.
Thanks for testing the git version, I was about to ask that.

Any pointers how to achieve this setup? What can I test or does someone else have a similar setup working? Am I doing something horribly wrong? I don't think mdraid-mirrored bootdisks should be too uncommon on this hardware.
From my statements above, I wouldn't expect GRUB with blocklists to work on a software RAID, so I think you probably have no choice but to use a single disk for booting. In any case, I think the GRUB-specific discussion should be moved to the GRUB mailing list as this really concerns the low-level functionality of GRUB.

Thanks and cheers to the community keeping SPARC alive :-)
Sure. Glad it's being useful.

Adrian

