So I took the time to re-test this.
My z/VM guest has 4 CPUs (with SMT on) and 4 DASD FBA devices that split a
64GB zFCP/SCSI LUN equally into four 16GB FBA chunks.

I've tested (in comment #8) with 2GB RAM where things worked and I wasn't able 
to recreate the error situation.
I then moved to 6GB RAM and things still worked for me.
Then 8GB - where everything was still fine.
And finally 10GB - still don't see the issue.

$ grep -i 'error\|crash\|crit\|panic\|I\/O\|erp\|sense\|fba' /var/log/syslog
Jul 28 10:05:23 hwe0005 systemd[1]: Stopping LSB: automatic crash report 
generation...
Jul 28 10:05:23 hwe0005 systemd[1]: Stopping Configure dump on panic for System 
z...
Jul 28 10:07:36 hwe0005 systemd-udevd[514]: dasd-fba: 
/etc/udev/rules.d/41-generic-ccw-0.0.0009.rules:7 Failed to write 
ATTR{/sys/devices/css0/0.0.0007/0.0.0009/online}, ignoring: Invalid argument
Jul 28 10:07:36 hwe0005 systemd-udevd[511]: 0.0.0102: 
/etc/udev/rules.d/41-dasd-fba-0.0.0102.rules:7 Failed to write 
ATTR{/sys/devices/css0/0.0.0001/0.0.0102/online}, ignoring: Invalid argument
Jul 28 10:07:36 hwe0005 systemd-udevd[522]: 0.0.0101: 
/etc/udev/rules.d/41-dasd-fba-0.0.0101.rules:7 Failed to write 
ATTR{/sys/devices/css0/0.0.0000/0.0.0101/online}, ignoring: Invalid argument
Jul 28 10:07:36 hwe0005 systemd-udevd[522]: 0.0.0103: 
/etc/udev/rules.d/41-dasd-fba-0.0.0103.rules:7 Failed to write 
ATTR{/sys/devices/css0/0.0.0002/0.0.0103/online}, ignoring: Invalid argument
Jul 28 10:07:36 hwe0005 systemd-udevd[505]: 0.0.0104: 
/etc/udev/rules.d/41-dasd-fba-0.0.0104.rules:7 Failed to write 
ATTR{/sys/devices/css0/0.0.0003/0.0.0104/online}, ignoring: Invalid argument
Jul 28 10:07:36 hwe0005 kernel: [    4.983272] dasd-fba.f36f2f: 0.0.0101: New 
FBA DASD 9336/10 (CU 6310/80) with 16383 MB and 512 B/blk
Jul 28 10:07:36 hwe0005 kernel: [    4.988020] dasd-fba.f36f2f: 0.0.0102: New 
FBA DASD 9336/10 (CU 6310/80) with 16383 MB and 512 B/blk
Jul 28 10:07:36 hwe0005 kernel: [    4.990317] dasd-fba.f36f2f: 0.0.0103: New 
FBA DASD 9336/10 (CU 6310/80) with 16383 MB and 512 B/blk
Jul 28 10:07:36 hwe0005 kernel: [    4.992370] dasd-fba.f36f2f: 0.0.0104: New 
FBA DASD 9336/10 (CU 6310/80) with 16384 MB and 512 B/blk
Jul 28 10:07:36 hwe0005 systemd[1]: Condition check resulted in Process error 
reports when automatic reporting is enabled (file watch) being skipped.
Jul 28 10:07:36 hwe0005 systemd[1]: Condition check resulted in Unix socket for 
apport crash forwarding being skipped.
Jul 28 10:07:36 hwe0005 systemd[1]: Starting LSB: automatic crash report 
generation...
Jul 28 10:07:36 hwe0005 systemd[1]: Starting Configure dump on panic for System 
z...
Jul 28 10:07:36 hwe0005 apport[764]:  * Starting automatic crash report 
generation: apport
Jul 28 10:07:36 hwe0005 dumpconf[770]: stop on panic configured.
Jul 28 10:07:36 hwe0005 systemd[1]: Finished Configure dump on panic for System 
z.
Jul 28 10:07:36 hwe0005 systemd[1]: Started LSB: automatic crash report 
generation.

I'm wondering a bit about the systemd msgs and the sysfs device tree.
But other than that no ERP, sense, or panics so far ...
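
To look at the sysfs side of the udev "Failed to write ATTR{.../online}"
messages, a small helper along these lines could print the current online
state of each device. This is only a sketch: the function name
ccw_online_state and the SYSFS_CCW override are hypothetical, and it assumes
the devices show up under /sys/bus/ccw/devices/<busid>/online. The bus IDs
are the ones from the log above.

```shell
# Print the online state of each CCW bus ID given as an argument.
# SYSFS_CCW is a hypothetical override of the sysfs base directory,
# only there so the helper can be tried against a scratch directory.
ccw_online_state() {
    base="${SYSFS_CCW:-/sys/bus/ccw/devices}"
    for id in "$@"; do
        f="$base/$id/online"
        if [ -r "$f" ]; then
            # The attribute holds 1 when the device is online, 0 otherwise.
            printf '%s online=%s\n' "$id" "$(cat "$f")"
        else
            printf '%s missing\n' "$id"
        fi
    done
}

# Bus IDs taken from the syslog excerpt above.
ccw_online_state 0.0.0101 0.0.0102 0.0.0103 0.0.0104
```

On a system without these devices every ID is simply reported as missing.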

$ dmesg | grep -i 'error\|fail\|crash\|warn\|crit\|panic\|erp\|fba'
[    4.983272] dasd-fba.f36f2f: 0.0.0101: New FBA DASD 9336/10 (CU 6310/80) 
with 16383 MB and 512 B/blk
[    4.988020] dasd-fba.f36f2f: 0.0.0102: New FBA DASD 9336/10 (CU 6310/80) 
with 16383 MB and 512 B/blk
[    4.990317] dasd-fba.f36f2f: 0.0.0103: New FBA DASD 9336/10 (CU 6310/80) 
with 16383 MB and 512 B/blk
[    4.992370] dasd-fba.f36f2f: 0.0.0104: New FBA DASD 9336/10 (CU 6310/80) 
with 16384 MB and 512 B/blk
[    5.075981] random: 7 urandom warning(s) missed due to ratelimiting
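
Since the same check gets repeated after every RAM change, the grep could be
wrapped in a small function. A minimal sketch, assuming GNU grep (for the \|
alternation); the names DASD_PATTERN and scan_dasd_log are hypothetical, and
the pattern is the one used on syslog above, just without the unneeded \/
escape.

```shell
# Pattern from the syslog grep above (basic regex, GNU grep alternation).
DASD_PATTERN='error\|crash\|crit\|panic\|I/O\|erp\|sense\|fba'

# Print all lines of the given log file that match the pattern,
# ignoring case. Returns grep's exit status (1 if nothing matched).
scan_dasd_log() {
    grep -i "$DASD_PATTERN" "$1"
}

# Example call (not run here, the path depends on the system):
#   scan_dasd_log /var/log/syslog
```

Note that the pattern also matches the harmless "New FBA DASD" lines, so the
output still has to be read by hand.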

I always did a quick check of the partition data:

ubuntu@hwe0005:~$ sudo fdisk -l /dev/dasde1
Disk /dev/dasde1: 15.102 GiB, 17178902528 bytes, 33552544 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

And then created an ext3 file system using -F on all 4 FBA devices, one
after the other:

ubuntu@hwe0005:~$ sudo mkfs.ext3 -F /dev/dasde1
mke2fs 1.45.5 (07-Jan-2020)
/dev/dasde1 contains a ext3 file system
        created on Tue Jul 28 09:45:37 2020
Discarding device blocks: done                            
Creating filesystem with 4194068 4k blocks and 1048576 inodes
Filesystem UUID: c34e7583-1dc9-4b8a-8494-7a100338a7e6
Superblock backups stored on blocks: 
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
        4096000

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done   
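
The "one after the other" run over all four devices could be scripted roughly
like this. It is a sketch only: the function name mkfs_all and the DRY_RUN
switch are hypothetical, and only /dev/dasde1 is known from the output above,
so the other three partition names are placeholders that need to be replaced
with the real ones.

```shell
# Run mkfs.ext3 -F on each given partition, sequentially.
# With DRY_RUN=1 (hypothetical switch) the commands are only printed,
# which is the safer default while the device names are being checked.
mkfs_all() {
    for dev in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "sudo mkfs.ext3 -F $dev"
        else
            # Stop at the first failure so the failing device is obvious.
            sudo mkfs.ext3 -F "$dev" || return 1
        fi
    done
}

# Only /dev/dasde1 appears above; dasdX1, dasdY1 and dasdZ1 are placeholders.
DRY_RUN=1 mkfs_all /dev/dasde1 /dev/dasdX1 /dev/dasdY1 /dev/dasdZ1
```

Dropping DRY_RUN=1 runs the real mkfs calls, which destroys any data on the
listed partitions.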

Does it depend on a certain z/VM version?

I'm running this z/VM version:
00: CP Q CPLEVEL
00: z/VM Version 6 Release 4.0, service level 1901 (64-bit)
00: Generated at 2019-06-14 14:15:49 UTC

Do the FBA devices always have to be re-enabled before retrying?

Right now I'm a bit lost trying to re-create this.

@Jan, what did your system and FBAs look like? And which z/VM version
are you using?

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1879707

Title:
  [UBUNTU 20.04] mke2fs dasd(fba),Failing CCW,default ERP has run out of
  retries and failed
