Hi, Michael

Thanks for your reply. I am trying out a few things you mentioned and will send you a few log files privately.


Michael Tautschnig wrote:
Just a note: it shouldn't be necessary to add the preserve_lazy:1,2 and
preserve_lazy:0,1 lines - setup-storage should definitely be able to figure this
out itself, just from the preserve_lazy:vg0-home. In fact I'd even be interested
in a try that shows that this effectively works.
Ok, I've removed these lines and tried to install again with a rebuilt nfsroot, to get rid of all local changes under that directory. Again I get the blinking cursor in the lower-left corner. I think something is done to the beginning of the disk, since the machine will actually boot when loading a remote kernel via the PXE boot menu, though not without errors.
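For what it's worth, one way to pin down whether the boot code at the start of the disk is being rewritten would be to compare the first 446 bytes (the MBR boot code, excluding the partition table) before and after a run; just a sketch, with the device name assumed:

# Save the boot-code part of the MBR before the install ...
dd if=/dev/sda of=/tmp/mbr.before bs=446 count=1
# ... and compare after the install; any output means it was touched:
dd if=/dev/sda bs=446 count=1 2>/dev/null | cmp /tmp/mbr.before -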

+system "/sbin/mdadm-startall";
# see whether there are any existing LVMs
&FAI::get_current_lvm;
+# and stop mdadm
+system "/etc/init.d/mdadm-raid stop";


Hmm, I wonder whether the kernel module md-mod is loaded before setup-storage
starts. Could you please try to figure that out? I had hoped that if it is
loaded, the array would be detected without such explicit starting of mdadm.
I followed your lead and, instead of the earlier changes, just added an lsmod line before get_current_lvm, like this:

system "/sbin/lsmod";
&FAI::get_current_lvm;


So I got a printout of all the loaded modules, with

md_mod                 73824  0

included, followed by

    Finding all volume groups
  No volume groups found
(CMD) mdadm --examine --scan --verbose -c partitions 1> /tmp/9zLFVW5FMG 2> /tmp/VtouZRXPYf
Executing: mdadm --examine --scan --verbose -c partitions

etc and later still

Current LVM layout
$VAR1 = {};

which is not empty if we run mdadm-startall before get_current_lvm. (fai.log.lsmod in private mail)
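So md_mod is loaded, but the arrays apparently aren't assembled at that point. A minimal sketch of the ordering that seems to be required (and which mdadm-startall effectively provides), assuming the member partitions show up in /proc/partitions:

# Assemble all arrays found by scanning the superblocks; only after
# this do /dev/md0 and /dev/md1 exist:
mdadm --assemble --scan
# ... and only then can the LVM scan find the PV on md1:
vgscan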
# see whether there are any existing RAID devices
&FAI::get_current_raid;
@@ -177,7 +181,11 @@
$FAI::debug and print Dumper \%FAI::configs;

# generate the command script
-&FAI::build_disk_commands;
+# build_disk_commands won't leave our partitions alone
+#&FAI::build_disk_commands;
+&FAI::push_command( "true", "" , "pt_complete_/dev/sda");
+&FAI::push_command( "true", "" , "pt_complete_/dev/sdb");
+
&FAI::build_raid_commands;
&FAI::build_lvm_commands;
&FAI::build_cryptsetup_commands;

In what sense is it not leaving your partitions alone, or, rather, what does it
break? Could you please give it another try with only the above mdadm-changes?
With build_disk_commands and without our two push_command lines, or vice versa, it does find the correct Current LVM layout, but exits with

Error in task partition. Traceback: task_error task_partition task task_install task task_action task main
$LOGUSER is undefined. Not saving log files to remote.
FATAL ERROR. Installation stopped.

(fai.log.stopped in private mail)
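(The $LOGUSER line is incidental, I think; it only means the logs aren't copied to the install server. If I understand the standard knob correctly, it is set in fai.conf:)

# In /etc/fai/fai.conf (unrelated to the partitioning failure itself):
LOGUSER=fai    # account on the install server that receives the log files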
All of the above is with this disk_config:

disk_config disk1 bootable:1
primary  -              512     -       -
primary  -              0-      -       -

disk_config disk2 bootable:1
primary  -              512     -       -
primary  -              0-      -       -

disk_config raid
raid1        /boot   sda1,sdb1  ext4    rw,errors=remount-ro
raid1        -       sda2,sdb2  -       -

disk_config lvm preserve_lazy:vg0-home
vg              vg0                     md1
vg0-swap                swap                    4096        swap        rw
vg0-local               /usr/local              10240       ext4        rw
vg0-src                 /usr/src                10240       ext4        rw
vg0-usr                 /usr                    10240       ext4        rw
vg0-var                 /var                    10240       ext4        rw
vg0-tmp                 /tmp                    2048        ext4        rw
vg0-home                /home                   97280       ext4        rw
vg0-root                /                       4096        ext4        rw,errors=remount-ro

If I use this disk_config line instead

disk_config raid preserve_lazy:0,1 always_format:0

together with the two push_command lines and without build_disk_commands, the installation goes through and vg0-home is preserved. (fai.log.vg0-home in private mail)
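After that run one can check that the preserved LV and the partition tables really were left alone; a quick sketch, assuming the names from the disk_config above:

# The UUID of vg0-home should match the one from before the reinstall:
lvs vg0
blkid /dev/mapper/vg0-home
# And the partition tables should be unchanged:
sfdisk -d /dev/sda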

What is it that gives these messages, is it grub? I think it would help me a lot
if you could send along the full fai.log; you might want to send that in private
mail, though.
This is after grub, since the kernel and the initial ramdisk are loaded prior to that message; here is a more detailed transcript of what is printed after grub:

Loading Linux 2.6.32-5-amd64 ...
Loading initial ramdisk ...
[ 0.973192] pci 0000:01:00.0: BAR 6: no parent found for of device [0xfff80000-0xffffffff]
[ 0.973302] pci 0000:03:00.0: BAR 6: no parent found for of device [0xfffc0000-0xffffffff]
Loading, please wait...
  Volume group "vg0" not found
  Skipping volume group vg0
Unable to find LVM volume vg0/root
Gave up waiting for root device. Common problems:
 - Boot args (cat /proc/cmdline)
   - Check rootdelay= (did the system wait long enough?)
   - Check root= (did the system wait for the right device?)
 - Missing modules (cat /proc/modules; ls /dev)
ALERT!  /dev/mapper/vg0-root does not exist.  Dropping to a shell!
(initramfs)
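From that prompt the system can usually be brought up by hand, which also suggests that only the early assembly fails; a sketch, assuming mdadm and the lvm tool are available in the initramfs:

# At the (initramfs) prompt:
mdadm --assemble --scan    # bring up md0 and md1
lvm vgchange -ay vg0       # activate the volume group
exit                       # resume booting with /dev/mapper/vg0-root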

To clarify: this is with the local changes to setup-storage mentioned here and in earlier mail, but with grub preserved from the previous install. That is probably due to the preserve_lazy on the disk_config raid line: if the boot code is broken from a previous installation, it stays broken (the blinking lower-left cursor). If we use our rescue mode to load a kernel and an initrd image remotely, we get some problems with md0, but the system is quite bootable and vg0-home is preserved. (I'll supply /var/log/dmesg in private mail; there might be something useful in it, along with fai.log.437 from an install with FAI 4.3.7, for reference.)
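If the stale boot code really is the culprit, my guess (hedged) is that grub needs to be reinstalled into the MBR of both RAID1 members, since either disk may be the one the BIOS boots from:

# From the installed system or a rescue shell:
grub-install /dev/sda
grub-install /dev/sdb
update-grub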


Thank you, I'll be back on Monday.
--
Fredrik
