On Tuesday 26 August 2014 14:21:19 Kerin Millar wrote:
> On 26/08/2014 10:38, Peter Humphrey wrote:
> > On Monday 25 August 2014 18:46:23 Kerin Millar wrote:
> >> On 25/08/2014 17:51, Peter Humphrey wrote:
> >>> On Monday 25 August 2014 13:35:11 Kerin Millar wrote:

--->8

> Again, can you find out what the exit status is under the circumstances that
> mdadm produces a blank error? I am hoping it is something other than 1.

I've remerged mdadm to run this test. I'll report the result in a moment.
[...] In fact it returned status 1. Sorry to disappoint :)

> >>> Here's the position:
> >>> 1. I've left /etc/init.d/mdraid out of all run levels. I have nothing
> >>>    but comments in mdadm.conf, but then it's not likely to be read anyway
> >>>    if the init script isn't running.
> >>> 2. I have empty /etc/udev rules files as above.
> >>> 3. I have kernel auto-assembly of raid enabled.
> >>> 4. I don't use an init ram disk.
> >>> 5. The root partition is on /dev/md5 (0.90 metadata)
> >>> 6. All other partitions except /boot are under /dev/vg7 which is built
> >>>    on top of /dev/md7 (1.x metadata).
> >>> 7. The system boots normally.
> >>
> >> I must confess that this boggles my mind. Under these circumstances, I
> >> cannot fathom how - or when - the 1.x arrays are being assembled.
> >> Something has to be executing mdadm at some point.
> >
> > I think it's udev. I had a look at the rules, but I no grok. I do see
> > references to mdadm though.

> So would I, only you said in step 2 that you have "empty" rules, which I
> take to mean that you had overridden the mdadm-provided udev rules with
> empty files.

Correct; that's what I did, but since removing mdadm I've also removed the
corresponding, empty /etc/udev files.

I don't think it's udev any more; I now think the kernel is cleverer than we
gave it credit for (see below and attached dmesg).

> If all of the conditions you describe were true, you would have eliminated
> all three of the aforementioned contexts in which mdadm can be invoked. Given
> that mdadm is needed to assemble your 1.x arrays (see below), I would expect
> such conditions to result in mount errors on account of the missing arrays.

--->8

> Again, 1.x arrays must be assembled in userspace. The kernel cannot
> assemble them by itself as it can with 0.9x arrays. If you uninstall
> mdadm, you will be removing the very userspace tool that is employed for
> assembly. Neither udev nor mdraid will be able to execute it, which
> cannot end well.

I had done that, with no ill effect. I've just booted the box with no mdadm
present. It seems the kernel can after all assemble the arrays (see attached
dmesg.txt, edited). Or maybe I was wrong about the metadata and they're all
0.90. In the course of checking this I tried a couple of things:

# lvm pvck /dev/md7
  Found label on /dev/md7, sector 1, type=LVM2 001
  Found text metadata area: offset=4096, size=1044480
# lvm vgdisplay
  --- Volume group ---
  VG Name               vg7
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  14
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                13
  Open LV               13
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               500.00 GiB
  PE Size               4.00 MiB
  Total PE              127999
  Alloc PE / Size       108800 / 425.00 GiB
  Free  PE / Size       19199 / 75.00 GiB
  VG UUID               ll8OHc-if2H-DVTf-AxrQ-5EW0-FOLM-Z73y0z

Can you tell from that which metadata version I used when I created vg7? It
looks like 1.x to me, since man lvm refers to formats (=metadata types) lvm1
and lvm2 - or am I reading too much into that?

See here what the postinst message said when I remerged
sys-fs/mdadm-3.3.1-r2 for the return-code test you asked for:

 * If you're not relying on kernel auto-detect of your RAID
 * devices, you need to add 'mdraid' to your 'boot' runlevel:
 *      rc-update add mdraid boot

Could be thought ambiguous.

Is nobody else experiencing this behaviour?

-- 
Regards
Peter

I seem to have a BIOS problem here. I switched DMA relocation off in the
kernel config when I found this error the first time, but it still appears,
as you see.

[ 0.000000] ------------[ cut here ]------------
[ 0.000000] WARNING: CPU: 0 PID: 0 at drivers/iommu/dmar.c:503 warn_invalid_dmar+0x7c/0x8e()
[ 0.000000] Your BIOS is broken; DMAR reported at address fed90000 returns all ones!
BIOS vendor: American Megatrends Inc.; Ver: 1102 ; Product Version: System Version
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.14-gentoo #2
[ 0.000000] Hardware name: System manufacturer System Product Name/P7P55D, BIOS 1102 11/23/2009
[ 0.000000] 000000000000000b ffffffff81801e08 ffffffff814ba5c6 00000000000000a8
[ 0.000000] ffffffff81801e58 ffffffff81801e48 ffffffff81041887 ffffffff81801e68
[ 0.000000] ffffffff81af001c ffffffff81af0058 00000000fed90000 0000000000000000
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff814ba5c6>] dump_stack+0x46/0x58
[ 0.000000] [<ffffffff81041887>] warn_slowpath_common+0x87/0xb0
[ 0.000000] [<ffffffff8104190a>] warn_slowpath_fmt_taint+0x3a/0x40
[ 0.000000] [<ffffffff812e06bf>] ? acpi_tb_verify_checksum+0x20/0x55
[ 0.000000] [<ffffffff814bb97e>] warn_invalid_dmar+0x7c/0x8e
[ 0.000000] [<ffffffff818c81e9>] detect_intel_iommu+0xd4/0x13e
[ 0.000000] [<ffffffff81897e07>] pci_iommu_alloc+0x4a/0x72
[ 0.000000] [<ffffffff818a43b6>] mem_init+0x9/0x3e
[ 0.000000] [<ffffffff81893ca2>] start_kernel+0x1f8/0x35f
[ 0.000000] [<ffffffff818938a9>] ? repair_env_string+0x5e/0x5e
[ 0.000000] [<ffffffff818935af>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff818936a9>] x86_64_start_kernel+0xf8/0xfc
[ 0.000000] ---[ end trace 492cc958e666c6fa ]---

Nevertheless, I still get this:

[ 0.108051] Last level iTLB entries: 4KB 512, 2MB 7, 4MB 7
Last level dTLB entries: 4KB 512, 2MB 32, 4MB 32, 1GB 0
tlb_flushall_shift: 6
[ 0.108330] Freeing SMP alternatives memory: 20K (ffffffff81951000 - ffffffff81956000)
[ 0.108515] dmar: Host address width 36
[ 0.108605] dmar: DRHD base: 0x000000fed90000 flags: 0x1
[ 0.108704] dmar: IOMMU: failed to map dmar0
[ 0.108794] dmar: parse DMAR table failure.
[ 0.109268] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.119369] smpboot: CPU0: Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz (fam: 06, model: 1e, stepping: 05)

Here's where disk detection starts:

[ 0.340082] ahci 0000:00:1f.2: version 3.0
[ 0.340170] ahci 0000:00:1f.2: irq 44 for MSI/MSI-X
[ 0.340192] ahci 0000:00:1f.2: SSS flag set, parallel bus scan disabled
[ 0.351054] ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 3 Gbps 0x3f impl SATA mode
[ 0.351196] ahci 0000:00:1f.2: flags: 64bit ncq sntf stag pm led clo pmp pio slum part ems sxs apst
[ 0.362882] scsi0 : ahci
[ 0.363489] scsi1 : ahci
[ 0.364060] scsi2 : ahci
[ 0.364673] scsi3 : ahci
[ 0.365266] scsi4 : ahci
[ 0.365881] scsi5 : ahci
[ 0.366044] ata1: SATA max UDMA/133 abar m2048@0xf7ff7000 port 0xf7ff7100 irq 44
[ 0.366185] ata2: SATA max UDMA/133 abar m2048@0xf7ff7000 port 0xf7ff7180 irq 44
[ 0.366326] ata3: SATA max UDMA/133 abar m2048@0xf7ff7000 port 0xf7ff7200 irq 44
[ 0.366466] ata4: SATA max UDMA/133 abar m2048@0xf7ff7000 port 0xf7ff7280 irq 44
[ 0.366606] ata5: SATA max UDMA/133 abar m2048@0xf7ff7000 port 0xf7ff7300 irq 44
[ 0.366747] ata6: SATA max UDMA/133 abar m2048@0xf7ff7000 port 0xf7ff7380 irq 44
[ 0.366966] i8042: PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
[ 0.370095] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 0.370192] serio: i8042 AUX port at 0x60,0x64 irq 12
[ 0.370478] mousedev: PS/2 mouse device common for all mice
[ 0.370602] md: raid1 personality registered for level 1
[ 0.370958] device-mapper: ioctl: 4.27.0-ioctl (2013-10-30) initialised: dm-devel@redhat.com
[ 0.371119] hidraw: raw HID events driver (C) Jiri Kosina
[ 0.372090] snd_hda_intel 0000:00:1b.0: irq 45 for MSI/MSI-X
[ 0.372208] NET: Registered protocol family 17
[ 0.372304] NET: Registered protocol family 15
[ 0.372400] Key type dns_resolver registered
[ 0.372866] registered taskstats version 1
[ 0.373630] ALSA device list:
[ 0.373728] No soundcards found.
[ 0.396880] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
[ 0.672182] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 0.678134] ata1.00: ATA-8: SAMSUNG HD103SJ, 1AJ100E4, max UDMA/133
[ 0.678242] ata1.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
[ 0.684315] ata1.00: configured for UDMA/133
[ 0.684729] scsi 0:0:0:0: Direct-Access ATA SAMSUNG HD103SJ 1AJ1 PQ: 0 ANSI: 5
[ 0.685048] sd 0:0:0:0: [sda] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[ 0.685234] sd 0:0:0:0: [sda] Write Protect is off
[ 0.685326] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 0.685337] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 0.747681] sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 sda9 >
[ 0.749559] sd 0:0:0:0: [sda] Attached SCSI disk
[ 0.990341] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 0.996335] ata2.00: ATA-8: SAMSUNG HD103SJ, 1AJ100E4, max UDMA/133
[ 0.996443] ata2.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
[ 1.002515] ata2.00: configured for UDMA/133
[ 1.002917] scsi 1:0:0:0: Direct-Access ATA SAMSUNG HD103SJ 1AJ1 PQ: 0 ANSI: 5
[ 1.003243] sd 1:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[ 1.003418] sd 1:0:0:0: [sdb] Write Protect is off
[ 1.003511] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 1.003522] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1.043495] sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 sdb7 sdb8 sdb9 >
[ 1.045159] sd 1:0:0:0: [sdb] Attached SCSI disk
[ 1.308476] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 1.310264] ata3.00: ATAPI: Optiarc DVD RW AD-7240S, 1.02, max UDMA/100
[ 1.312470] ata3.00: configured for UDMA/100
[ 1.314570] scsi 2:0:0:0: CD-ROM Optiarc DVD RW AD-7240S 1.02 PQ: 0 ANSI: 5
[ 1.337522] tsc: Refined TSC clocksource calibration: 2674.966 MHz
[ 1.619654] ata4: SATA link down (SStatus 0 SControl 300)
[ 1.924871] ata5: SATA link down (SStatus 0 SControl 300)
[ 2.229959] ata6: SATA link down (SStatus 0 SControl 300)
[ 2.230247] md: Waiting for all devices to be available before autodetect
[ 2.230354] md: If you don't use raid, use raid=noautodetect
[ 2.230706] md: Autodetecting RAID arrays.
[ 2.291257] md: Scanned 6 and added 6 devices.
[ 2.291359] md: autorun ...
[ 2.291451] md: considering sdb9 ...
[ 2.291542] md: adding sdb9 ...
[ 2.291631] md: sdb7 has different UUID to sdb9
[ 2.291722] md: sdb5 has different UUID to sdb9
[ 2.291814] md: adding sda9 ...
[ 2.291911] md: sda7 has different UUID to sdb9
[ 2.292003] md: sda5 has different UUID to sdb9
[ 2.292381] md: created md9
[ 2.292479] md: bind<sda9>
[ 2.292576] md: bind<sdb9>
[ 2.292670] md: running: <sdb9><sda9>
[ 2.293140] kworker/u8:5 (67) used greatest stack depth: 6728 bytes left
[ 2.293351] kworker/u8:5 (66) used greatest stack depth: 6688 bytes left
[ 2.293837] md/raid1:md9: active with 2 out of 2 mirrors
[ 2.293982] md9: detected capacity change from 0 to 405345533952
[ 2.294087] md: considering sdb7 ...
[ 2.294178] md: adding sdb7 ...
[ 2.294267] md: sdb5 has different UUID to sdb7
[ 2.294359] md: adding sda7 ...
[ 2.294448] md: sda5 has different UUID to sdb7
[ 2.294634] md: created md7
[ 2.294723] md: bind<sda7>
[ 2.294820] md: bind<sdb7>
[ 2.294921] md: running: <sdb7><sda7>
[ 2.295504] md/raid1:md7: active with 2 out of 2 mirrors
[ 2.295632] md7: detected capacity change from 0 to 536870846464
[ 2.295738] md: considering sdb5 ...
[ 2.295828] md: adding sdb5 ...
[ 2.295948] md: adding sda5 ...
[ 2.296248] md: created md5
[ 2.296345] md: bind<sda5>
[ 2.296440] md: bind<sdb5>
[ 2.296534] md: running: <sdb5><sda5>
[ 2.297089] md/raid1:md5: active with 2 out of 2 mirrors
[ 2.297219] md5: detected capacity change from 0 to 21474770944
[ 2.297316] md: ... autorun DONE.
[ 2.319988] md5: unknown partition table
[ 2.338158] Switched to clocksource tsc
[ 2.346224] EXT4-fs (md5): mounted filesystem with ordered data mode. Opts: (null)
[ 2.346370] VFS: Mounted root (ext4 filesystem) readonly on device 9:5.
[ 2.365114] devtmpfs: mounted
[ 2.366034] Freeing unused kernel memory: 840K (ffffffff8187f000 - ffffffff81951000)
[ 2.366174] Write protecting the kernel read-only data: 8192k
[ 2.369403] Freeing unused kernel memory: 1252K (ffff8800014c7000 - ffff880001600000)
[ 2.370560] Freeing unused kernel memory: 424K (ffff880001796000 - ffff880001800000)
[ 2.416614] random: nonblocking pool is initialized
[ 3.003048] setfont (84) used greatest stack depth: 4176 bytes left

Here's where udev starts (it's the next entry in dmesg - I haven't cut
anything here):

[ 4.831342] systemd-udevd[446]: starting version 215
[ 5.218493] input: Power Button as /devices/LNXSYSTM:00/device:00/PNP0C0C:00/input/input2
[ 5.218498] ACPI: Power Button [PWRB]
[ 5.218550] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input3
[ 5.218552] ACPI: Power Button [PWRF]
[ 5.219707] rtc_cmos 00:02: RTC can wake from S4
[ 5.219819] rtc_cmos 00:02: rtc core: registered rtc_cmos as rtc0
[ 5.219845] rtc_cmos 00:02: alarms up to one month, y3k, 114 bytes nvram, hpet irqs

And now the file-systems:

[ 8.847508] EXT4-fs (md5): re-mounted. Opts: (null)
[ 9.099859] mount (867) used greatest stack depth: 4136 bytes left
[ 9.203031] Adding 2097148k swap on /dev/sda3. Priority:10 extents:1 across:2097148k FS
[ 9.207315] Adding 2097148k swap on /dev/sdb3. Priority:10 extents:1 across:2097148k FS
[ 9.216701] Adding 20971516k swap on /dev/sda6. Priority:1 extents:1 across:20971516k FS
[ 9.224521] Adding 20971516k swap on /dev/sdb6. Priority:1 extents:1 across:20971516k FS
[ 9.292744] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null)
[ 9.362160] EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: (null)
[ 9.386163] EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null)
[ 9.419314] EXT4-fs (dm-9): mounted filesystem with ordered data mode. Opts: (null)
[ 9.461981] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[ 9.511962] EXT4-fs (dm-4): mounted filesystem with ordered data mode. Opts: (null)
[ 9.590775] EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: (null)
[ 9.622095] EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
[ 9.643743] EXT4-fs (dm-7): mounted filesystem with ordered data mode. Opts: (null)
[ 9.674996] EXT4-fs (dm-8): mounted filesystem with ordered data mode. Opts: (null)
[ 9.699230] EXT4-fs (dm-12): mounted filesystem with ordered data mode. Opts: (null)
[ 13.602089] r8169 0000:02:00.0 eth0: link down
[ 13.602108] r8169 0000:02:00.0 eth0: link down
[ 13.602150] ip (1728) used greatest stack depth: 2520 bytes left
[ 14.281876] NET: Registered protocol family 10
[ 14.282472] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 15.205672] w83627ehf: Found W83667HG-B chip at 0x290
[ 15.989801] r8169 0000:02:00.0 eth0: link up
[ 15.989812] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 18.788649] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 20.034863] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[ 31.426556] EXT4-fs (md5): re-mounted. Opts: commit=0
[ 31.627416] EXT4-fs (dm-0): re-mounted. Opts: commit=0
[ 31.685620] EXT4-fs (dm-2): re-mounted. Opts: commit=0
[ 31.939499] EXT4-fs (dm-1): re-mounted. Opts: commit=0
[ 32.010016] EXT4-fs (dm-9): re-mounted. Opts: commit=0
[ 32.033846] EXT4-fs (dm-3): re-mounted. Opts: commit=0
[ 32.075817] EXT4-fs (dm-4): re-mounted. Opts: commit=0
[ 32.117996] EXT4-fs (dm-5): re-mounted. Opts: commit=0
[ 32.251796] EXT4-fs (dm-6): re-mounted. Opts: commit=0
[ 32.293969] EXT4-fs (dm-7): re-mounted. Opts: commit=0
[ 32.308344] EXT4-fs (dm-8): re-mounted. Opts: commit=0
[ 32.351373] EXT4-fs (dm-12): re-mounted. Opts: commit=0

Remaining entries snipped.
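
For anyone wanting to repeat the check on their own log, here is a rough Python sketch that lists the arrays the kernel created before udev started, which is the evidence the dmesg excerpt above is being read for. The function name autodetected_arrays and the three regular expressions are my own, written against the message wording in this particular 3.14 log; other kernel versions may phrase these lines differently. If, as Kerin says, the kernel can only auto-assemble 0.9x arrays, then anything this reports was presumably created with 0.90 metadata, but that is an inference and not something the script verifies.

```python
import re

# Patterns taken from the wording of this log; treat them as assumptions.
CREATED = re.compile(r"md: created (md\d+)")
BIND = re.compile(r"md: bind<(\w+)>")
UDEV_START = re.compile(r"systemd-udevd\[\d+\]: starting")


def autodetected_arrays(dmesg_text):
    """Return {array: [member devices]} for arrays created before udev starts."""
    arrays = {}
    current = None
    for line in dmesg_text.splitlines():
        if UDEV_START.search(line):
            break  # anything later could have been assembled from userspace
        created = CREATED.search(line)
        if created:
            current = created.group(1)
            arrays[current] = []
            continue
        bound = BIND.search(line)
        if bound and current is not None:
            arrays[current].append(bound.group(1))
    return arrays


if __name__ == "__main__":
    sample = "\n".join([
        "[ 2.294634] md: created md7",
        "[ 2.294723] md: bind<sda7>",
        "[ 2.294820] md: bind<sdb7>",
        "[ 2.296248] md: created md5",
        "[ 2.296345] md: bind<sda5>",
        "[ 2.296440] md: bind<sdb5>",
        "[ 4.831342] systemd-udevd[446]: starting version 215",
    ])
    print(autodetected_arrays(sample))
```

On a running system the md sysfs attribute /sys/block/md7/md/metadata_version should settle the question more directly, without needing mdadm installed.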

