Re: 7.1 sparc64 softraid0 1.5TB/2TB partition limit of RAID 5 + c
On 9/18/22 03:42, Klemens Nanni wrote:
> On Fri, Sep 16, 2022 at 05:59:20PM -0700, Michael Truog wrote:
> > Hi,
> >
> > I was attempting to have a RAID 5 softraid0 setup on a sparc64 machine (boot
> > log output below) but ran into problems when attempting to create a single
> > partition with the size 5.5TB (RAID 5 with 4 x 2TB hard drives). I found an
> > interesting problem when using disklabel on the softraid0 hard drive device,
> > when attempting to make this 5.5TB partition. The partition "a" would only
> > be allowed as 1.5TB and any partition >= "d" would only be allowed as 2TB,
> > however the limit occurred silently after disklabel had exited. When inside
> > disklabel, I could allocate a single "a" partition to be 5.5TB successfully
> > and was able to write the partition successfully. However, when the
> > disklabel process exited, either with the q command or a kill signal 9, the
> > partition would be shrunk to the limit described above. If the disklabel
> > process was suspended (ctrl-Z), this wouldn't happen and newfs would see the
> > 5.5TB partition, though usage of the partition wouldn't work. The partition
> > would have inaccessible blocks that fsck showed extreme anger at, when it
> > saw it at boot time.
>
> It would really help to showcase your issue with commands/output.
>
> This issue is not related to softraid(4), it is most probably an old
> sparc(64) quirk:
>
> 1. create big dummy disk for a single filesystem:
>
>     $ ldomctl create-vdisk -s 10T sparse-10T.img
>
> 2. pass it to guest domain in order to have a "real" 10T sized sd(4):
>
>     # dmesg | grep ^sd2
>     sd2 at scsibus3 targ 0 lun 0:
>     sd2: 10485760MB, 512 bytes/sector, 21474836480 sectors

I attempted doing this. The ldomctl manpage provided a good path for getting that working. However, at least for my Sun SPARC Enterprise T5220 (T2) I needed to explicitly declare the "primary" domain because otherwise the configuration would not be used after an ILOM reset occurred.
I had a working ldom.conf configuration, but after I ran "ldomctl delete openbsd" and "ldomctl download openbsd", followed by an ILOM "reset /SYS", the machine failed to boot. It then refused an ILOM "stop /SYS". I removed both power plugs, inserted them again, and updated the firmware without retaining the configuration, after which all was well again.

My configuration was:

    domain "primary" {
        vcpu 56
        memory 31G
    }
    domain "test" {
        vcpu 8
        memory 32G
        vdisk "/home/sparse-10T.img"
        vnet
    }

I didn't initialize the vnet before the delete/download, which generated WARNINGs on the console, so that may have led to the problem.

I was using the 7.2 snapshot from 2022-09-19 for everything described in this email reply. I tried to replicate the kernel panic from 7.1 with the 7.2 snapshot, using softraid0 levels 5 and c.

The limit on sparc64 partition sizes doesn't appear to be clearly defined; it changes with the situation. In your situation below, partition a was resized to 2 TB on the ldomctl vdisk. When I used softraid0 with level c, partition a was allowed to be 1.6 TB. When I used softraid0 with level 5, partition a was allowed to be 1.5 TB. With the 7.2 snapshot it appears that this limit applies to the first partition, not necessarily partition a. With 7.1, I always saw it as an issue with partition a. If partition d is created after a shrinks to 1.5 TB, with softraid0 level 5, partition d is allowed to be 2 TB. However, if an attempt is made to create a partition e after that, e shrinks to match partition d (and so becomes an invalid partition). I have output that helps to show this.

I was unable to replicate the 7.1 kernel panic when using the 7.2 snapshot, so there isn't a clear failure tied to the partition being resized when disklabel exits.
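As a side note on the sizes reported above, one way to reason about them is to ask what a 32-bit sector count would do to a large partition. The sketch below is only an illustration under stated assumptions: sectors are 512 bytes (as in the dmesg output quoted in this thread), the RAID 5 volume offers roughly three nominal 2 TB (2 * 10^12 byte) disks of usable space, and a size field might either wrap or clamp at 32 bits. Nothing in this thread confirms that either mechanism is what the kernel or disklabel actually does, and the function names are hypothetical.

```python
SECTOR_BYTES = 512        # matches "512 bytes/sector" in the quoted dmesg
TIB = 2 ** 40
U32_MAX = 2 ** 32 - 1


def wrap32(sectors: int) -> int:
    """Keep only the low 32 bits of a sector count (hypothetical wrap-around)."""
    return sectors & U32_MAX


def clamp32(sectors: int) -> int:
    """Cap a sector count at the largest 32-bit value (hypothetical clamp)."""
    return min(sectors, U32_MAX)


def tib(sectors: int) -> float:
    """Convert a sector count to TiB."""
    return sectors * SECTOR_BYTES / TIB


# ASSUMPTION: nominal usable size of RAID 5 over 4 x 2 TB disks.
# The real softraid volume will differ (metadata, actual drive sizes).
raid5_sectors = 3 * 2 * 10 ** 12 // SECTOR_BYTES

for name, fn in (("wrap", wrap32), ("clamp", clamp32)):
    print(f"{name}: {fn(raid5_sectors)} sectors, {tib(fn(raid5_sectors)):.2f} TiB")
```

With these nominal numbers the wrapped value lands near the 1.5 TB seen for partition a and the clamped value near the 2 TB seen for partition d. That is only a consistency check worth repeating against the real sector counts, not a diagnosis.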
With the 7.2 snapshot, I am able to suspend the disklabel process (after the write, but before the quit), do newfs, kill -9 on the disklabel process and mount/write to the new filesystem without any kernel panic. The new filesystem will still fail when fsck is attempted, but the first write of a file is able to succeed (tried the disklabel/suspend/newfs/kill/write-file sequence with both softraid0 level 5 and c).

>     # echo '/ 1M-* 100%' | disklabel -wAT/dev/stdin sd2
>     # disklabel -h sd2
>     # /dev/rsd2c:
>     type: SCSI
>     disk: SCSI disk
>     label: Virtual Disk
>     duid: c4befc09bf56efed
>     flags: vendor
>     bytes/sector: 512
>     sectors/track: 255
>     tracks/cylinder: 511
>     sectors/cylinder: 130305
>     cylinders: 164804
>     total sectors: 21474836480 # total bytes: 10.0T
>     boundstart: 0
>     boundend: 21474836480
>
>     16 partitions:
>     #                size           offset  fstype [fsize bsize   cpg]
>       a:             2.0T                0  4.2BSD   8192 65536     1
>       c:            10.0T
Re: 7.1 sparc64 softraid0 1.5TB/2TB partition limit of RAID 5 + c
On Fri, Sep 16, 2022 at 05:59:20PM -0700, Michael Truog wrote:
> Hi,
>
> I was attempting to have a RAID 5 softraid0 setup on a sparc64 machine (boot
> log output below) but ran into problems when attempting to create a single
> partition with the size 5.5TB (RAID 5 with 4 x 2TB hard drives). I found an
> interesting problem when using disklabel on the softraid0 hard drive device,
> when attempting to make this 5.5TB partition. The partition "a" would only
> be allowed as 1.5TB and any partition >= "d" would only be allowed as 2TB,
> however the limit occurred silently after disklabel had exited. When inside
> disklabel, I could allocate a single "a" partition to be 5.5TB successfully
> and was able to write the partition successfully. However, when the
> disklabel process exited, either with the q command or a kill signal 9, the
> partition would be shrunk to the limit described above. If the disklabel
> process was suspended (ctrl-Z), this wouldn't happen and newfs would see the
> 5.5TB partition, though usage of the partition wouldn't work. The partition
> would have inaccessible blocks that fsck showed extreme anger at, when it
> saw it at boot time.

It would really help to showcase your issue with commands/output.

This issue is not related to softraid(4), it is most probably an old
sparc(64) quirk:

1. create big dummy disk for a single filesystem:

    $ ldomctl create-vdisk -s 10T sparse-10T.img

2. pass it to guest domain in order to have a "real" 10T sized sd(4):

    # dmesg | grep ^sd2
    sd2 at scsibus3 targ 0 lun 0:
    sd2: 10485760MB, 512 bytes/sector, 21474836480 sectors

    # echo '/ 1M-* 100%' | disklabel -wAT/dev/stdin sd2
    # disklabel -h sd2
    # /dev/rsd2c:
    type: SCSI
    disk: SCSI disk
    label: Virtual Disk
    duid: c4befc09bf56efed
    flags: vendor
    bytes/sector: 512
    sectors/track: 255
    tracks/cylinder: 511
    sectors/cylinder: 130305
    cylinders: 164804
    total sectors: 21474836480 # total bytes: 10.0T
    boundstart: 0
    boundend: 21474836480

    16 partitions:
    #                size           offset  fstype [fsize bsize   cpg]
      a:             2.0T                0  4.2BSD   8192 65536     1
      c:            10.0T                0  unused
    disklabel: warning, partition a: size % cylinder-size != 0

3. compare against amd64/vmm:

    $ vmctl create -s 10T 10T-sparse.img
    vmctl: create imagefile operation failed: File too large
    $ vmctl create -s 7T 7T-sparse.img
    vmctl: raw imagefile created

   (Not quite sure why 7T is the maximum here... 8T wouldn't work, either)

    # vmctl start -c -b /bsd.rd -d 7T-sparse.img t
    ...
    sd0 at scsibus0 targ 0 lun 0:
    sd0: 7340032MB, 512 bytes/sector, 15032385536 sectors
    ...
    (I)nstall, (U)pgrade, (A)utoinstall or (S)hell? s
    # cd /dev ; MAKEDEV sd0
    sh: MAKEDEV: not found
    # cd /dev ; sh MAKEDEV sd0
    # echo '/ 1M-* 100%' | disklabel -wAT/dev/stdin sd0
    # disklabel -h sd0
    # /dev/rsd0c:
    type: SCSI
    disk: SCSI disk
    label: Block Device
    duid: 24ff0fe5062adbdc
    flags:
    bytes/sector: 512
    sectors/track: 255
    tracks/cylinder: 511
    sectors/cylinder: 130305
    cylinders: 115363
    total sectors: 15032385536 # total bytes: 7.0T
    boundstart: 0
    boundend: 15032385536

    16 partitions:
    #                size           offset  fstype [fsize bsize   cpg]
      a:             7.0T                0  4.2BSD   8192 65536     1
      c:             7.0T                0  unused

So that makes it look like a purely sparc64 related issue.
I don't *see* silent truncation on amd64.
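The figures in the two labels above can be cross-checked with a little arithmetic. The sketch below is only a sanity check, assuming that "MB" in the dmesg lines means MiB and that the cylinder count is the integer quotient of total sectors by sectors per cylinder; the helper names are made up for illustration.

```python
SECTOR_BYTES = 512


def mib_to_sectors(mib: int) -> int:
    """Sectors in a disk of the given size in MiB (assumes 512-byte sectors)."""
    return mib * 1024 * 1024 // SECTOR_BYTES


def cylinders(total_sectors: int, sectors_per_track: int, tracks_per_cylinder: int) -> int:
    """Whole cylinders that fit in the disk for the given geometry."""
    return total_sectors // (sectors_per_track * tracks_per_cylinder)


# (size in MiB from dmesg, total sectors, cylinders) as quoted in this thread
disks = {
    "sparc64 sd2": (10485760, 21474836480, 164804),
    "amd64 sd0": (7340032, 15032385536, 115363),
}

for name, (mib, sectors, cyls) in disks.items():
    size_ok = mib_to_sectors(mib) == sectors
    cyl_ok = cylinders(sectors, 255, 511) == cyls
    ratio = sectors / 2 ** 32  # how far past a 32-bit sector count the disk is
    print(f"{name}: size_ok={size_ok} cyl_ok={cyl_ok} sectors/2^32={ratio}")
```

Both disks are well past 2^32 sectors, and 2^32 sectors of 512 bytes is exactly 2 TiB, which is the size partition a ends up with on sparc64 while amd64 keeps the full 7.0T. Whether a 32-bit field is really involved is an assumption that this thread does not establish.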
>
> I did bump into a kernel panic when doing the sequence (kernel panic output
> is below the boot log): disklabel single partition 5.5TB written, suspend
> disklabel process, newfs on partition, kill -9 disklabel process, write a
> single file to the filesystem ("the_first_file" in the command line output
> below).

Same as above; clear steps to reproduce would be helpful.

>
> The 1.5TB/2TB partition limit is known and expected on sparc64, isn't it? I
> didn't see the limit mentioned in documentation, though the disklabel
> manpage does say "On some machines, such as Sparc64, partition tables may
> not exhibit the full functionality described above.". I bumped into the
> same limit when attempting to use softraid0 RAID c too.

This disklabel(8) CAVEATS is pretty vague; CVS log shows it originally
mentioned amiga3 and sparc, with minor tweaks arriving at sparc64.

> OpenBSD 7.1 (GENERIC.MP) #1269: Mon Apr 11 22:05:10 MDT 2022
> dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/GENERIC.MP

Can you try with a snapshot, please?

> mpi0 at pci8 dev
7.1 sparc64 softraid0 1.5TB/2TB partition limit of RAID 5 + c
Hi,

I was attempting to have a RAID 5 softraid0 setup on a sparc64 machine (boot
log output below) but ran into problems when attempting to create a single
partition with the size 5.5TB (RAID 5 with 4 x 2TB hard drives). I found an
interesting problem when using disklabel on the softraid0 hard drive device,
when attempting to make this 5.5TB partition. The partition "a" would only
be allowed as 1.5TB and any partition >= "d" would only be allowed as 2TB,
however the limit occurred silently after disklabel had exited. When inside
disklabel, I could allocate a single "a" partition to be 5.5TB successfully
and was able to write the partition successfully. However, when the
disklabel process exited, either with the q command or a kill signal 9, the
partition would be shrunk to the limit described above. If the disklabel
process was suspended (ctrl-Z), this wouldn't happen and newfs would see the
5.5TB partition, though usage of the partition wouldn't work. The partition
would have inaccessible blocks that fsck showed extreme anger at, when it
saw it at boot time.

I did bump into a kernel panic when doing the sequence (kernel panic output
is below the boot log): disklabel single partition 5.5TB written, suspend
disklabel process, newfs on partition, kill -9 disklabel process, write a
single file to the filesystem ("the_first_file" in the command line output
below).

The 1.5TB/2TB partition limit is known and expected on sparc64, isn't it? I
didn't see the limit mentioned in documentation, though the disklabel
manpage does say "On some machines, such as Sparc64, partition tables may
not exhibit the full functionality described above.". I bumped into the
same limit when attempting to use softraid0 RAID c too.

Tell me if you need more information.

Best Regards,
Michael

SPARC Enterprise T5220, No Keyboard
Copyright (c) 1998, 2017, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.33.6.h, 65408 MB memory available, Serial #89737534.
Ethernet address 0:21:28:59:49:3e, Host ID: 8559493e.

Boot device: disk0  File and args: sr0a:/bsd
OpenBSD IEEE 1275 Bootblock 2.1
>> OpenBSD BOOT 1.22
ERROR: /iscsi-hba: No iscsi-network-bootpath property
sr0*
Booting sr0:a/bsd
10060280@0x100+7688@0x19981f8+170388@0x1c0+4023916@0x1c29994
symbols @ 0xfe93a400 484557+165+654072+452627 start=0x100
[ using 1592456 bytes of bsd ELF symbol table ]
console is /virtual-devices@100/console@1
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California. All rights reserved.
Copyright (c) 1995-2022 OpenBSD. All rights reserved. https://www.OpenBSD.org

OpenBSD 7.1 (GENERIC.MP) #1269: Mon Apr 11 22:05:10 MDT 2022
    dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/GENERIC.MP
real mem = 68585259008 (65408MB)
avail mem = 67369304064 (64248MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root: SPARC Enterprise T5220
cpu0 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1415.103 MHz
cpu1 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1415.103 MHz
cpu2 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1415.103 MHz
cpu3 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1415.103 MHz
cpu4 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1415.103 MHz
cpu5 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1415.103 MHz
cpu6 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1415.103 MHz
cpu7 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1415.103 MHz
cpu8 at mainbus0: