Re: [CentOS] Problem with mdadm, raid1 and automatically adds any disk to raid
On Mon, Feb 25, 2019 at 06:50:11AM +0100, Simon Matter via CentOS (centos@centos.org) wrote:

> > Hi.
> >
> > dd if=/dev/zero of=/dev/sdX bs=512 seek=$(($(blockdev --getsz /dev/sdX)-1024)) count=1024
>
> I didn't check but are you really sure you're cleaning up the end of the
> drive? Maybe you should clean the end of every partition first because
> metadata may be written there.

Mmmmhhh, not sure. I ran fdisk on it, basically re-creating everything from
the start. The "trying to re-create the mdX's" happens when I use "w" in
fdisk. As soon as I hit "w" it starts re-creating the mdX! That's the
annoying part.

[snip]

> > No matter what I do, as soon as I hit "w" in fdisk systemd tries to
> > assemble the array again without letting me decide what to do.

I am not ;-), it's @ work.

Jobst

--
You seem (in my (humble) opinion (which doesn't mean much)) to be (or
possibly could be) more of a Lisp programmer (but I could be (and probably
am) wrong)

 | |0| |   Jobst Schmalenbach, General Manager
 | | |0|   Barrett & Sales Essentials
 |0|0|0|   +61 3 9533 , POBox 277, Caulfield South, 3162, Australia

_______________________________________________
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos
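[Editorial sketch, not from the thread: Simon's point about metadata at the end of every partition can be handled without hand-computing dd offsets, since mdadm --zero-superblock and wipefs both know where the signatures live. This is dry-run by default and the partition names are examples only; set RUN="" and run as root to actually wipe.]

```shell
#!/bin/sh
# Dry-run by default: prints each command instead of running it.
# Set RUN="" (as root) to actually wipe. Partition names are examples only.
RUN="${RUN:-echo}"

wipe_md_members() {
    for part in "$@"; do
        # removes the md superblock wherever the metadata version placed it
        $RUN mdadm --zero-superblock "$part"
        # clears any remaining raid/filesystem signatures on the partition
        $RUN wipefs --all "$part"
    done
}

wipe_md_members /dev/sdb1 /dev/sdb2 /dev/sdb3 /dev/sdc1 /dev/sdc2 /dev/sdc3
```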
Re: [CentOS] Problem with mdadm, raid1 and automatically adds any disk to raid
> Hi.
>
> CENTOS 7.6.1810, fresh install - I use this as a base to create/upgrade
> new/old machines.
>
> I was trying to set up two disks as a RAID1 array, using these lines:
>
> mdadm --create --verbose /dev/md0 --level=0 --raid-devices=2 /dev/sdb1 /dev/sdc1
> mdadm --create --verbose /dev/md1 --level=0 --raid-devices=2 /dev/sdb2 /dev/sdc2
> mdadm --create --verbose /dev/md2 --level=0 --raid-devices=2 /dev/sdb3 /dev/sdc3
>
> Then I did an lsblk and realized that I had used --level=0 instead of
> --level=1 (typing mistake). The SIZE was reported as double because I had
> created a striped set by mistake, yet I wanted a mirror.
>
> Here starts my problem: I cannot get rid of the /dev/mdX no matter what I
> do (try to do).
>
> I tried to delete the mdX: I removed the disks by failing them, then
> removed each array md0, md1 and md2. I also did
>
> dd if=/dev/zero of=/dev/sdX bs=512 seek=$(($(blockdev --getsz /dev/sdX)-1024)) count=1024

I didn't check but are you really sure you're cleaning up the end of the
drive? Maybe you should clean the end of every partition first because
metadata may be written there.

> dd if=/dev/zero of=/dev/sdX bs=512 count=1024
> mdadm --zero-superblock /dev/sdX
>
> Then I wiped each partition of the drives using fdisk.
>
> Now every time I start fdisk to set up a new set of partitions, I see this
> in /var/log/messages as soon as I hit "w" in fdisk:
>
> Feb 25 15:38:32 webber systemd: Started Timer to wait for more drives before activating degraded array md2..
> Feb 25 15:38:32 webber systemd: Started Timer to wait for more drives before activating degraded array md1..
> Feb 25 15:38:32 webber systemd: Started Timer to wait for more drives before activating degraded array md0..
> Feb 25 15:38:32 webber kernel: md/raid1:md0: active with 1 out of 2 mirrors
> Feb 25 15:38:32 webber kernel: md0: detected capacity change from 0 to 5363466240
> Feb 25 15:39:02 webber systemd: Created slice system-mdadm\x2dlast\x2dresort.slice.
> Feb 25 15:39:02 webber systemd: Starting Activate md array md1 even though degraded...
> Feb 25 15:39:02 webber systemd: Starting Activate md array md2 even though degraded...
> Feb 25 15:39:02 webber kernel: md/raid1:md1: active with 0 out of 2 mirrors
> Feb 25 15:39:02 webber kernel: md1: failed to create bitmap (-5)
> Feb 25 15:39:02 webber mdadm: mdadm: failed to start array /dev/md/1: Input/output error
> Feb 25 15:39:02 webber systemd: mdadm-last-resort@md1.service: main process exited, code=exited, status=1/FAILURE
>
> I check /proc/mdstat and sure enough, there it is, trying to assemble an
> array I did not tell it to create.
>
> I do NOT WANT this to happen; it creates the same "SHIT" (the incorrect
> array) over and over again (systemd frustration).

Noo, you're wiping it wrong :-)

> So I tried to delete them again, wiped them again, killed processes, wiped
> disks.
>
> No matter what I do, as soon as I hit "w" in fdisk systemd tries to
> assemble the array again without letting me decide what to do.

Nothing easier than that: just terminate systemd while doing the disk
management and restart it after you're done. BTW, its PID is 1.

Seriously, there is certainly some systemd unit you may be able to
deactivate before doing such things. However, I don't know which one it is.

I've been fighting similar crap: on HPE servers, when cciss_vol_status is
run through the disk monitoring system and reports the hardware RAID
status, systemd scans all partition tables and tries to detect LVM2
devices and whatever else. The kernel log is just filled with useless
scans and I have no idea how to get rid of it. Nice new systemd world.

Regards,
Simon
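[Editorial sketch, not from the thread: one way to get a window without udev-triggered assembly is to pause udev's event queue while repartitioning, so the partition-table rescan that fdisk's "w" triggers cannot fire the md incremental-assembly rules. Untested here; dry-run by default, set RUN="" and run as root to apply for real.]

```shell
#!/bin/sh
# Sketch: keep udev from acting on partition-table rescans while we work.
# Dry-run by default: prints the commands. Set RUN="" (as root) to apply.
RUN="${RUN:-echo}"

$RUN udevadm control --stop-exec-queue   # udev queues events instead of processing them
# ... run fdisk / wipefs / mdadm --zero-superblock here ...
$RUN udevadm control --start-exec-queue  # resume normal event handling

# A persistent alternative is to disable md auto-assembly in /etc/mdadm.conf:
#   AUTO -all
```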
[CentOS] Problem with mdadm, raid1 and automatically adds any disk to raid
Hi.

CENTOS 7.6.1810, fresh install - I use this as a base to create/upgrade
new/old machines.

I was trying to set up two disks as a RAID1 array, using these lines:

mdadm --create --verbose /dev/md0 --level=0 --raid-devices=2 /dev/sdb1 /dev/sdc1
mdadm --create --verbose /dev/md1 --level=0 --raid-devices=2 /dev/sdb2 /dev/sdc2
mdadm --create --verbose /dev/md2 --level=0 --raid-devices=2 /dev/sdb3 /dev/sdc3

Then I did an lsblk and realized that I had used --level=0 instead of
--level=1 (typing mistake). The SIZE was reported as double because I had
created a striped set by mistake, yet I wanted a mirror.

Here starts my problem: I cannot get rid of the /dev/mdX no matter what I
do (try to do).

I tried to delete the mdX: I removed the disks by failing them, then
removed each array md0, md1 and md2. I also did

dd if=/dev/zero of=/dev/sdX bs=512 seek=$(($(blockdev --getsz /dev/sdX)-1024)) count=1024
dd if=/dev/zero of=/dev/sdX bs=512 count=1024
mdadm --zero-superblock /dev/sdX

Then I wiped each partition of the drives using fdisk.

Now every time I start fdisk to set up a new set of partitions, I see this
in /var/log/messages as soon as I hit "w" in fdisk:

Feb 25 15:38:32 webber systemd: Started Timer to wait for more drives before activating degraded array md2..
Feb 25 15:38:32 webber systemd: Started Timer to wait for more drives before activating degraded array md1..
Feb 25 15:38:32 webber systemd: Started Timer to wait for more drives before activating degraded array md0..
Feb 25 15:38:32 webber kernel: md/raid1:md0: active with 1 out of 2 mirrors
Feb 25 15:38:32 webber kernel: md0: detected capacity change from 0 to 5363466240
Feb 25 15:39:02 webber systemd: Created slice system-mdadm\x2dlast\x2dresort.slice.
Feb 25 15:39:02 webber systemd: Starting Activate md array md1 even though degraded...
Feb 25 15:39:02 webber systemd: Starting Activate md array md2 even though degraded...
Feb 25 15:39:02 webber kernel: md/raid1:md1: active with 0 out of 2 mirrors
Feb 25 15:39:02 webber kernel: md1: failed to create bitmap (-5)
Feb 25 15:39:02 webber mdadm: mdadm: failed to start array /dev/md/1: Input/output error
Feb 25 15:39:02 webber systemd: mdadm-last-resort@md1.service: main process exited, code=exited, status=1/FAILURE

I check /proc/mdstat and sure enough, there it is, trying to assemble an
array I did not tell it to create.

I do NOT WANT this to happen; it creates the same "SHIT" (the incorrect
array) over and over again (systemd frustration).

So I tried to delete them again, wiped them again, killed processes, wiped
disks.

No matter what I do, as soon as I hit "w" in fdisk systemd tries to
assemble the array again without letting me decide what to do.

Help!

Jobst

--
windoze 98: useless extension to a minor patch release for 32-bit
extensions and a graphical shell for a 16-bit patch to an 8-bit operating
system originally coded for a 4-bit microprocessor, written by a 2-bit
company that can't stand for 1 bit of competition!

 | |0| |   Jobst Schmalenbach, General Manager
 | | |0|   Barrett & Sales Essentials
 |0|0|0|   +61 3 9533 , POBox 277, Caulfield South, 3162, Australia
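[Editorial sketch, not from the thread: a teardown-and-retry sequence along the lines the poster describes, with the one-character fix applied. Device names are taken from the post; --level=1 is the mirror that was intended. Dry-run by default: set RUN="" and run as root to execute for real.]

```shell
#!/bin/sh
# Dry-run by default: prints each command instead of executing it.
# Set RUN="" (as root) to apply. Device names are from the original post.
RUN="${RUN:-echo}"

# 1. Stop the accidentally created striped arrays.
for md in /dev/md0 /dev/md1 /dev/md2; do
    $RUN mdadm --stop "$md"
done

# 2. Zero the md superblock on every member partition, not just /dev/sdX.
for part in /dev/sdb1 /dev/sdb2 /dev/sdb3 /dev/sdc1 /dev/sdc2 /dev/sdc3; do
    $RUN mdadm --zero-superblock "$part"
done

# 3. Recreate as mirrors: --level=1, everything else as before.
$RUN mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
$RUN mdadm --create --verbose /dev/md1 --level=1 --raid-devices=2 /dev/sdb2 /dev/sdc2
$RUN mdadm --create --verbose /dev/md2 --level=1 --raid-devices=2 /dev/sdb3 /dev/sdc3
```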
[CentOS] Nvme m.2 disk problem
Hi list,

I'm running CentOS 7.6 on a Corsair Force MP500 120 GB. The root fs is
ext4 and the drive is ~1 year old. The system works very well except on
boot: during the boot process I always get a file system check on the
nvme drive.

Running smartctl on this drive I got this:

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0x1)
Critical Warning:                   0x00
Temperature:                        41 Celsius
Available Spare:                    100%
Available Spare Threshold:          1%
Percentage Used:                    1%
Data Units Read:                    5,355,595 [2.74 TB]
Data Units Written:                 5,826,517 [2.98 TB]
Host Read Commands:                 67,978,550
Host Write Commands:                75,422,898
Controller Busy Time:               32,863
Power Cycles:                       811
Power On Hours:                     2,813
Unsafe Shutdowns:                   317
Media and Data Integrity Errors:    0
Error Information Log Entries:      177
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 2:               77 Celsius

Error Information (NVMe Log 0x01, max 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc            LBA  NSID    VS
  0        177     0  0x0014  0x4004      -  8796109799680     1     -
  1        176     0  0x0019  0x4004      -  8796109799680     1     -
  2        175     0  0x001a  0x4004      -  8796109799680     1     -
  3        174     0  0x0005  0x4004      -  8796109799680     1     -
  4        173     0  0x000c  0x4004      -  8796109799680     1     -
  5        172     0  0x0019  0x4004      -  8796109799680     1     -
  6        171     0  0x001d  0x4004      -  8796109799680     1     -
  7        170     0  0x0014  0x4004      -  8796109799680     1     -
  8        169     0  0x0011  0x4004      -  8796109799680     1     -
  9        168     0  0x000f  0x4004      -  8796109799680     1     -
 10        167     0  0x      0x4004      -  8796109799680     1     -
 11        166     0  0x0006  0x4004      -  8796109799680     1     -
 12        165     0  0x0008  0x4004      -  8796109799680     1     -
 13        164     0  0x000e  0x4004      -  8796109799680     1     -
 14        163     0  0x0008  0x4004      -  8796109799680     1     -
 15        162     0  0x0006  0x4004      -  8796109799680     1     -
... (48 entries not shown)

I noticed that the Unsafe Shutdowns counter increases rapidly and I don't
know why: every third or fourth boot this value is incremented by 1. I
can't find any errors in the system logs. Can someone point me in the
right direction?

Thanks in advance.

Alessandro.
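[Editorial sketch, not from the thread: to see exactly when the counter jumps, one could log it at every boot and compare timestamps against shutdown events. The device name /dev/nvme0 is an assumption; the awk helper just pulls the number out of smartctl's health section.]

```shell
#!/bin/sh
# Extract the "Unsafe Shutdowns" counter from smartctl NVMe health output.
unsafe_shutdowns() {
    # -F: splits "Unsafe Shutdowns:   317" at the colon; gsub strips
    # spaces and thousands separators, leaving the bare number.
    awk -F: '/Unsafe Shutdowns/ { gsub(/[ ,]/, "", $2); print $2 }'
}

# Usage (as root), e.g. from a boot-time unit or rc.local:
#   smartctl -A /dev/nvme0 | unsafe_shutdowns >> /var/log/unsafe-shutdowns.log
```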