Re: [Users] Issues after updating to 7.0.14 (136)
Hello, thanks for the advice. I disabled onboot (yes => no) so that at the next reboot my CTs don't start automatically and enter a deadlock/loop; that way I can restart my CTs manually.

In fact I discussed this with the devs: it seems I hit a deadlock when my CentOS 8 CTs (which use nft/netfilter) are stopped/suspended while the hardware node reboots. They are looking at this potential issue with netfilter and the latest updates.

To be continued... Thanks.

From: "Oleksiy Tkachenko"
To: "OpenVZ users"
Sent: Wednesday, 8 July 2020 23:49:51
Subject: Re: [Users] Issues after updating to 7.0.14 (136)

>> ...
>> Error in ploop_check (check.c:663): Dirty flag is set
>> ...
>> # ploop mount /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/DiskDescriptor.xml
>> Error in ploop_mount_image (ploop.c:2495): Image /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds already used by device /dev/ploop11432
>> ...
>>
>> I am lost, any help appreciated.

I heard about two possible solutions:

1. Reboot the HW node and stop the CT. Then "ploop mount" the CT's DiskDescriptor.xml for e2fsck. Unmount and restart the CT.
2. If that doesn't help, create a fresh new CT and move the "broken" root.hds there.

--
Oleksiy
Re: [Users] Issues after updating to 7.0.14 (136)
>> ...
>> Error in ploop_check (check.c:663): Dirty flag is set
>> ...
>> # ploop mount /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/DiskDescriptor.xml
>> Error in ploop_mount_image (ploop.c:2495): Image /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds already used by device /dev/ploop11432
>> ...
>>
>> I am lost, any help appreciated.

I heard about two possible solutions:

1. Reboot the HW node and stop the CT. Then "ploop mount" the CT's DiskDescriptor.xml for e2fsck. Unmount and restart the CT.
2. If that doesn't help, create a fresh new CT and move the "broken" root.hds there.

--
Oleksiy
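In practice, option 1 comes down to roughly the following sequence. This is only a sketch, assuming the CT is stopped and nothing else is holding the image; the UUID is taken from the examples above, and ploopNNNNN is a placeholder for whatever device "ploop mount" actually reports on your node:

Stop the container so nothing holds its disk image:
# prlctl stop 144dc737-b4e3-4c03-852c-25a6df06cee4
Mount the ploop image without starting the CT (ploop prints the device node it creates):
# ploop mount /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/DiskDescriptor.xml
Check the filesystem on the first partition of the device reported above:
# e2fsck -f /dev/ploopNNNNNp1
Unmount the image and start the CT again:
# ploop umount /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/DiskDescriptor.xml
# prlctl start 144dc737-b4e3-4c03-852c-25a6df06cee4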
Re: [Users] Issues after updating to 7.0.14 (136)
Hello,

Thanks to all who have replied to this thread so far - my apologies for taking so long to get back to you all.

In terms of where I'm seeing the EXT4 errors, they are showing up in the kernel log on the node itself, so the output of 'dmesg' regularly shows entries such as these:

[375095.199203] EXT4-fs (ploop43209p1): Remounting filesystem read-only
[375095.199267] EXT4-fs error (device ploop43209p1) in ext4_ext_remove_space:3073: IO failure
[375095.199400] EXT4-fs error (device ploop43209p1) in ext4_ext_truncate:4692: IO failure
[375095.199517] EXT4-fs error (device ploop43209p1) in ext4_reserve_inode_write:5358: Journal has aborted
[375095.199637] EXT4-fs error (device ploop43209p1) in ext4_truncate:4145: Journal has aborted
[375095.199779] EXT4-fs error (device ploop43209p1) in ext4_reserve_inode_write:5358: Journal has aborted
[375095.199957] EXT4-fs error (device ploop43209p1) in ext4_orphan_del:2731: Journal has aborted
[375095.200138] EXT4-fs error (device ploop43209p1) in ext4_reserve_inode_write:5358: Journal has aborted
[461642.709690] EXT4-fs (ploop43209p1): error count since last fsck: 8
[461642.709702] EXT4-fs (ploop43209p1): initial error at time 1593576601: ext4_ext_remove_space:3000: inode 136354
[461642.709708] EXT4-fs (ploop43209p1): last error at time 1593576601: ext4_reserve_inode_write:5358: inode 136354

Inside the container itself, not much is being logged, since the affected container in this particular instance is indeed (as per the errors above) mounted read-only due to the errors its root.hdd filesystem is experiencing.

Having dug a bit more into what happened here, I suspect that this corruption may have come about when the containers were being moved either to or from the standby node and the live node, but I can't be 100% sure of that. The picture is further muddied in that the standby node (the node that we used for evacuating containers from the node to be updated) was itself initially updated to 7.0.14 (135). However, the live node (which was updated a short time after the standby node) appears to have got 7.0.14 (136). So I don't know if the issue was in fact with 7.0.14 (135) (which was on the standby node, where the containers would have been moved to, and moved back from), or with 7.0.14 (136) on the live node.

Were there any known issues with 7.0.14 (135) that might correlate with what I'm seeing above?

Anyway, once again, thanks to everyone who has replied so far. If anyone has any further questions or would like any further information, please let me know and I will be happy to assist.

Thank you,
Kevin Drysdale.

On Thu, 2 Jul 2020, Jehan PROCACCIA wrote:

Yes, you are right, I do get the same virtuozzo-release as mentioned in the initial subject, sorry for the noise.

# cat /etc/virtuozzo-release
OpenVZ release 7.0.14 (136)

But anyway, I don't see any ploop / fsck error in the host /var/log/vzctl.log or inside the CT; where did you see those errors?

Jehan.

From: "jjs - mainphrame"
To: "OpenVZ users"
Sent: Thursday, 2 July 2020 19:33:23
Subject: Re: [Users] Issues after updating to 7.0.14 (136)

Thanks for that sanity check, the conundrum is resolved. vzlinux-release and virtuozzo-release are indeed different things.

Jake

On Thu, Jul 2, 2020 at 10:27 AM Jonathan Wright wrote:

/etc/redhat-release and /etc/virtuozzo-release are two different things.

On 7/2/20 12:16 PM, jjs - mainphrame wrote:

Jehan - I get the same output here -

[root@annie ~]# yum repolist |grep virt
virtuozzolinux-base      VirtuozzoLinux Base      15,415+189
virtuozzolinux-updates   VirtuozzoLinux Updates   0

I'm baffled as to how you're on 7.8.0 while I'm at 7.0.15 even though I'm fully up to date.

# uname -a
Linux annie.ufcfan.org 3.10.0-1127.8.2.vz7.151.10 #1 SMP Mon Jun 1 19:05:52 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux

Jake

On Thu, Jul 2, 2020 at 10:08 AM Jehan PROCACCIA wrote:

no factory , just repos virtuozzolinux-base and openvz-os

# yum repolist |grep virt
virtuozzolinux-base      VirtuozzoLinux Base      15 415+189
virtuozzolinux-updates   VirtuozzoLinux Updates   0

Jehan.

From: "jjs - mainphrame"
To: "OpenVZ users"
Cc: "Kevin Drysdale"
Sent: Thursday, 2 July 2020 18:22:33
Subject: Re: [Users] Issues after updating to 7.0.14 (136)

Jehan, are you running factory?

My ovz hosts are up to date, and I see:

[root@annie ~]# cat /etc/virtuozzo-release
OpenVZ release 7.0.15 (222)

Jake

On Thu,
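A side note, not from the thread itself: when an EXT4 error in dmesg only names a ploop device, one way to see which container it belongs to is to read the image path behind that device from sysfs. This is a rough sketch, assuming the /sys/block/ploopNNNNN/pdelta/0/image attribute exposed by the Virtuozzo 7 ploop driver; the device name is taken from the dmesg lines above:

Print the delta image backing the device named in the dmesg error:
# cat /sys/block/ploop43209/pdelta/0/image
The printed path (typically /vz/private/<CT UUID or CTID>/root.hdd/root.hds) identifies the container, which can then be matched against the output of "prlctl list -a" or "vzlist -a".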
Re: [Users] Issues after updating to 7.0.14 (136)
Hello,

If it can help, here is what I have done so far to try to re-enable the dead CTs:

# prlctl stop ldap2
Stopping the CT...
Failed to stop the CT: PRL_ERR_VZCTL_OPERATION_FAILED (Details: Cannot lock the Container)

# cat /vz/lock/144dc737-b4e3-4c03-852c-25a6df06cee4.lck
6227
resuming

# ps auwx | grep 6227
root 6227 0.0 0.0 92140 6984 ? S 15:10 0:00 /usr/sbin/vzctl resume 144dc737-b4e3-4c03-852c-25a6df06cee4

# kill -9 6227

Still cannot stop the CT (Cannot lock the Container...)

# df | grep 144dc737-b4e3-4c03-852c-25a6df06cee4
/dev/ploop11432p1 10188052 2546636 7100848 27% /vz/root/144dc737-b4e3-4c03-852c-25a6df06cee4
none 1048576 0 1048576 0% /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/dump/Dump/.criu.cgyard.56I2ls

# umount /dev/ploop11432p1

# ploop check -F /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds
Reopen rw /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds
Error in ploop_check (check.c:663): Dirty flag is set

# ploop mount /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/DiskDescriptor.xml
Error in ploop_mount_image (ploop.c:2495): Image /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds already used by device /dev/ploop11432

# df -H | grep ploop11432
=> nothing

I am lost, any help appreciated. Thanks.

On 06/07/2020 at 15:37, Jehan Procaccia IMT wrote:

Hello,

I am back to the initial problem related to this post: since I updated to OpenVZ release 7.0.14 (136) / Virtuozzo Linux release 7.8.0 (609), I am also facing corrupted CT status. I don't see the exact same error as mentioned by Kevin Drysdale below (ploop/fsck), but I am not able to enter certain CTs, nor can I stop them:

[root@olb ~]# prlctl stop trans8
Stopping the CT...
Failed to stop the CT: PRL_ERR_VZCTL_OPERATION_FAILED (Details: Cannot lock the Container)

[root@olb ~]# prlctl enter trans8
Unable to get init pid
enter into CT failed

exited from CT 02faecdd-ddb6-42eb-8103-202508f18256

For those CTs that fail to enter or stop, I noticed that there is a second device mounted with a name ending in /dump/Dump/.criu.cgyard.4EJB8c:

[root@olb ~]# df -H | grep 02faecdd-ddb6-42eb-8103-202508f18256
/dev/ploop53152p1 11G 2,2G 7,7G 23% /vz/root/02faecdd-ddb6-42eb-8103-202508f18256
none 537M 0 537M 0% /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/dump/Dump/.criu.cgyard.4EJB8c

[root@olb ~]# prlctl list | grep 02faecdd-ddb6-42eb-8103-202508f18256
{02faecdd-ddb6-42eb-8103-202508f18256} running 157.159.196.17 CT isptrans8

I rebooted the whole hardware node, and since the reboot here is the related vzctl.log:

2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Removing the stale lock file /vz/lock/02faecdd-ddb6-42eb-8103-202508f18256.lck
2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Restoring the Container ...
2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Mount image: /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd
2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
2020-07-06T15:10:38+0200 : Adding delta dev=/dev/ploop53152 img=/vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds (rw)
2020-07-06T15:10:39+0200 : Mounted /dev/ploop53152p1 at /vz/root/02faecdd-ddb6-42eb-8103-202508f18256 fstype=ext4 data=',balloon_ino=12'
2020-07-06T15:10:39+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Container is mounted
2020-07-06T15:10:40+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Setting permissions for image=/vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd
2020-07-06T15:10:40+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Configure memguarantee: 0%
2020-07-06T15:18:12+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Unable to get init pid
2020-07-06T15:18:12+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : enter into CT failed
2020-07-06T15:19:49+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Cannot lock the Container
2020-07-06T15:25:33+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Unable to get init pid
2020-07-06T15:25:33+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : enter into CT failed

On another CT that fails to enter/stop, I see the same kind of logs plus an "Error (criu" message:

2020-07-06T15:10:38+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Restoring the Container ...
2020-07-06T15:10:38+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Mount
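Not from the thread, but regarding the "already used by device /dev/ploop11432" error shown above, one possible next step is to release the stale device and retry. This is only a sketch, assuming the standard Virtuozzo 7 ploop tooling and its sysfs layout; verify that the device really belongs to this CT before detaching anything:

Confirm which image the leftover device still holds:
# cat /sys/block/ploop11432/pdelta/0/image
If it is this CT's root.hds, detach the stale device:
# ploop umount -d /dev/ploop11432
Then retry the consistency check and the mount via the disk descriptor:
# ploop check -F /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds
# ploop mount /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/DiskDescriptor.xml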
Re: [Users] Issues after updating to 7.0.14 (136)
Hello,

I am back to the initial problem related to this post: since I updated to OpenVZ release 7.0.14 (136) / Virtuozzo Linux release 7.8.0 (609), I am also facing corrupted CT status. I don't see the exact same error as mentioned by Kevin Drysdale below (ploop/fsck), but I am not able to enter certain CTs, nor can I stop them:

[root@olb ~]# prlctl stop trans8
Stopping the CT...
Failed to stop the CT: PRL_ERR_VZCTL_OPERATION_FAILED (Details: Cannot lock the Container)

[root@olb ~]# prlctl enter trans8
Unable to get init pid
enter into CT failed

exited from CT 02faecdd-ddb6-42eb-8103-202508f18256

For those CTs that fail to enter or stop, I noticed that there is a second device mounted with a name ending in /dump/Dump/.criu.cgyard.4EJB8c:

[root@olb ~]# df -H | grep 02faecdd-ddb6-42eb-8103-202508f18256
/dev/ploop53152p1 11G 2,2G 7,7G 23% /vz/root/02faecdd-ddb6-42eb-8103-202508f18256
none 537M 0 537M 0% /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/dump/Dump/.criu.cgyard.4EJB8c

[root@olb ~]# prlctl list | grep 02faecdd-ddb6-42eb-8103-202508f18256
{02faecdd-ddb6-42eb-8103-202508f18256} running 157.159.196.17 CT isptrans8

I rebooted the whole hardware node, and since the reboot here is the related vzctl.log:

2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Removing the stale lock file /vz/lock/02faecdd-ddb6-42eb-8103-202508f18256.lck
2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Restoring the Container ...
2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Mount image: /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd
2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
2020-07-06T15:10:38+0200 : Adding delta dev=/dev/ploop53152 img=/vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds (rw)
2020-07-06T15:10:39+0200 : Mounted /dev/ploop53152p1 at /vz/root/02faecdd-ddb6-42eb-8103-202508f18256 fstype=ext4 data=',balloon_ino=12'
2020-07-06T15:10:39+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Container is mounted
2020-07-06T15:10:40+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Setting permissions for image=/vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd
2020-07-06T15:10:40+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Configure memguarantee: 0%
2020-07-06T15:18:12+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Unable to get init pid
2020-07-06T15:18:12+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : enter into CT failed
2020-07-06T15:19:49+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Cannot lock the Container
2020-07-06T15:25:33+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Unable to get init pid
2020-07-06T15:25:33+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : enter into CT failed

On another CT that fails to enter/stop, I see the same kind of logs plus an "Error (criu" message:

2020-07-06T15:10:38+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Restoring the Container ...
2020-07-06T15:10:38+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Mount image: /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd
2020-07-06T15:10:38+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
2020-07-06T15:10:39+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
2020-07-06T15:10:39+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
2020-07-06T15:10:39+0200 : Adding delta dev=/dev/ploop36049 img=/vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds (rw)
2020-07-06T15:10:41+0200 : Mounted /dev/ploop36049p1 at /vz/root/4ae48335-5b63-475d-8629-c8d742cb0ba0 fstype=ext4 data=',balloon_ino=12'
2020-07-06T15:10:41+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Container is mounted
2020-07-06T15:10:41+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Setting permissions for image=/vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd
2020-07-06T15:10:41+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Configure memguarantee: 0%
2020-07-06T15:10:57+0200 vzeventd : Run: /etc/vz/vzevent.d/ve-stop id=4ae48335-5b63-475d-8629-c8d742cb0ba0
2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : (03.038774) Error (criu/util.c:666): exited, status=4
2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : (14.446513) 1: Error
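A note not in the original message: when vzctl relays CRIU errors like the "Error (criu/util.c:666): exited, status=4" lines above, the full CRIU log is usually more informative than the single line quoted by vzctl. A small sketch, assuming the per-CT dump directory layout visible in the df output earlier in the thread (the exact log file names may differ between releases):

List any CRIU/restore logs left under the CT's dump directory:
# find /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/dump -name '*.log'
Then inspect the most recent one for the full error context around the failed restore.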
Re: [Users] Issues after updating to 7.0.14 (136)
Yes, you are right, I do get the same virtuozzo-release as mentioned in the initial subject, sorry for the noise.

# cat /etc/virtuozzo-release
OpenVZ release 7.0.14 (136)

But anyway, I don't see any ploop / fsck error in the host /var/log/vzctl.log or inside the CT; where did you see those errors?

Jehan.

From: "jjs - mainphrame"
To: "OpenVZ users"
Sent: Thursday, 2 July 2020 19:33:23
Subject: Re: [Users] Issues after updating to 7.0.14 (136)

Thanks for that sanity check, the conundrum is resolved. vzlinux-release and virtuozzo-release are indeed different things.

Jake

On Thu, Jul 2, 2020 at 10:27 AM Jonathan Wright <jonat...@knownhost.com> wrote:

/etc/redhat-release and /etc/virtuozzo-release are two different things.

On 7/2/20 12:16 PM, jjs - mainphrame wrote:

Jehan - I get the same output here -

[root@annie ~]# yum repolist |grep virt
virtuozzolinux-base      VirtuozzoLinux Base      15,415+189
virtuozzolinux-updates   VirtuozzoLinux Updates   0

I'm baffled as to how you're on 7.8.0 while I'm at 7.0.15 even though I'm fully up to date.

# uname -a
Linux annie.ufcfan.org 3.10.0-1127.8.2.vz7.151.10 #1 SMP Mon Jun 1 19:05:52 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux

Jake

On Thu, Jul 2, 2020 at 10:08 AM Jehan PROCACCIA <jehan.procac...@imtbs-tsp.eu> wrote:

no factory , just repos virtuozzolinux-base and openvz-os

# yum repolist |grep virt
virtuozzolinux-base      VirtuozzoLinux Base      15 415+189
virtuozzolinux-updates   VirtuozzoLinux Updates   0

Jehan.

From: "jjs - mainphrame" <j...@mainphrame.com>
To: "OpenVZ users" <users@openvz.org>
Cc: "Kevin Drysdale" <kevin.drysd...@iomart.com>
Sent: Thursday, 2 July 2020 18:22:33
Subject: Re: [Users] Issues after updating to 7.0.14 (136)

Jehan, are you running factory?

My ovz hosts are up to date, and I see:

[root@annie ~]# cat /etc/virtuozzo-release
OpenVZ release 7.0.15 (222)

Jake

On Thu, Jul 2, 2020 at 9:08 AM Jehan Procaccia IMT <jehan.procac...@imtbs-tsp.eu> wrote:

"updating to 7.0.14 (136)" !?

I did an update yesterday, I am far behind that version:

# cat /etc/vzlinux-release
Virtuozzo Linux release 7.8.0 (609)

# uname -a
Linux localhost 3.10.0-1127.8.2.vz7.151.14 #1 SMP Tue Jun 9 12:58:54 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux

Why don't you try to update to the latest version?

On 29/06/2020 at 12:30, Kevin Drysdale wrote:

Hello,

After updating one of our OpenVZ VPS hosting nodes at the end of last week, we've started to have issues with corruption apparently occurring inside containers. Issues of this nature have never affected the node previously, and there do not appear to be any hardware issues that could explain this.

Specifically, a few hours after updating, we began to see containers experiencing errors such as this in the logs:

[90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25
[90471.679022] EXT4-fs (ploop35454p1): initial error at time 1593205255: ext4_ext_find_extent:904: inode 136399
[90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922: ext4_ext_find_extent:904: inode 136399
[95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67
[95189.954582] EXT4-fs (ploop42983p1): initial error at time 1593210174: htree_dirblock_to_tree:918: inode 926441: block 3683060
[95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902: ext4_iget:4435: inode 1849777
[95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42
[95714.207447] EXT4-fs (ploop60706p1): initial error at time 1593210489: ext4_ext_find_extent:904: inode 136272
[95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063: ext4_ext_find_extent:904: inode 136272

Shutting the containers down and manually mounting and e2fsck'ing their filesystems did clear these errors, but each of the containers (which were mostly used for running Plesk) had widespread issues with corrupt or missing files after the fscks completed, necessitating their being restored from backup.

Concurrently, we also began to see messages like this appearing in /var/log/vzctl.log, which again have never appeared at any point prior to this update being installed:

/var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288448/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288450/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in fill_hole (check.c:240): Warning: ploop imag
Re: [Users] Issues after updating to 7.0.14 (136)
Thanks for that sanity check, the conundrum is resolved. vzlinux-release and virtuozzo-release are indeed different things.

Jake

On Thu, Jul 2, 2020 at 10:27 AM Jonathan Wright wrote:

> /etc/redhat-release and /etc/virtuozzo-release are two different things.
>
> On 7/2/20 12:16 PM, jjs - mainphrame wrote:
>
> Jehan -
>
> I get the same output here -
>
> [root@annie ~]# yum repolist |grep virt
> virtuozzolinux-base      VirtuozzoLinux Base      15,415+189
> virtuozzolinux-updates   VirtuozzoLinux Updates   0
>
> I'm baffled as to how you're on 7.8.0 while I'm at 7.0.15 even though I'm fully up to date.
>
> # uname -a
> Linux annie.ufcfan.org 3.10.0-1127.8.2.vz7.151.10 #1 SMP Mon Jun 1 19:05:52 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux
>
> Jake
>
> On Thu, Jul 2, 2020 at 10:08 AM Jehan PROCACCIA <jehan.procac...@imtbs-tsp.eu> wrote:
>
>> no factory , just repos virtuozzolinux-base and openvz-os
>>
>> # yum repolist |grep virt
>> virtuozzolinux-base      VirtuozzoLinux Base      15 415+189
>> virtuozzolinux-updates   VirtuozzoLinux Updates   0
>>
>> Jehan.
>>
>> From: "jjs - mainphrame"
>> To: "OpenVZ users"
>> Cc: "Kevin Drysdale"
>> Sent: Thursday, 2 July 2020 18:22:33
>> Subject: Re: [Users] Issues after updating to 7.0.14 (136)
>>
>> Jehan, are you running factory?
>>
>> My ovz hosts are up to date, and I see:
>>
>> [root@annie ~]# cat /etc/virtuozzo-release
>> OpenVZ release 7.0.15 (222)
>>
>> Jake
>>
>> On Thu, Jul 2, 2020 at 9:08 AM Jehan Procaccia IMT <jehan.procac...@imtbs-tsp.eu> wrote:
>>
>>> "updating to 7.0.14 (136)" !?
>>>
>>> I did an update yesterday, I am far behind that version
>>>
>>> # cat /etc/vzlinux-release
>>> Virtuozzo Linux release 7.8.0 (609)
>>>
>>> # uname -a
>>> Linux localhost 3.10.0-1127.8.2.vz7.151.14 #1 SMP Tue Jun 9 12:58:54 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> Why don't you try to update to the latest version?
>>>
>>> On 29/06/2020 at 12:30, Kevin Drysdale wrote:
>>>
>>> Hello,
>>>
>>> After updating one of our OpenVZ VPS hosting nodes at the end of last week, we've started to have issues with corruption apparently occurring inside containers. Issues of this nature have never affected the node previously, and there do not appear to be any hardware issues that could explain this.
>>>
>>> Specifically, a few hours after updating, we began to see containers experiencing errors such as this in the logs:
>>>
>>> [90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25
>>> [90471.679022] EXT4-fs (ploop35454p1): initial error at time 1593205255: ext4_ext_find_extent:904: inode 136399
>>> [90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922: ext4_ext_find_extent:904: inode 136399
>>> [95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67
>>> [95189.954582] EXT4-fs (ploop42983p1): initial error at time 1593210174: htree_dirblock_to_tree:918: inode 926441: block 3683060
>>> [95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902: ext4_iget:4435: inode 1849777
>>> [95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42
>>> [95714.207447] EXT4-fs (ploop60706p1): initial error at time 1593210489: ext4_ext_find_extent:904: inode 136272
>>> [95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063: ext4_ext_find_extent:904: inode 136272
>>>
>>> Shutting the containers down and manually mounting and e2fsck'ing their filesystems did clear these errors, but each of the containers (which were mostly used for running Plesk) had widespread issues with corrupt or missing files after the fscks completed, necessitating their being restored from backup.
>>>
>>> Concurrently, we also began to see messages like this appearing in /var/log/vzctl.log, which again have never appeared at any point prior to this update being installed:
>>>
>>> /var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288448/root.hdd/root.hds' is sparse
>>> /var/l
Re: [Users] Issues after updating to 7.0.14 (136)
/etc/redhat-release and /etc/virtuozzo-release are two different things.

On 7/2/20 12:16 PM, jjs - mainphrame wrote:

Jehan - I get the same output here -

[root@annie ~]# yum repolist |grep virt
virtuozzolinux-base      VirtuozzoLinux Base      15,415+189
virtuozzolinux-updates   VirtuozzoLinux Updates   0

I'm baffled as to how you're on 7.8.0 while I'm at 7.0.15 even though I'm fully up to date.

# uname -a
Linux annie.ufcfan.org 3.10.0-1127.8.2.vz7.151.10 #1 SMP Mon Jun 1 19:05:52 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux

Jake

On Thu, Jul 2, 2020 at 10:08 AM Jehan PROCACCIA <jehan.procac...@imtbs-tsp.eu> wrote:

no factory , just repos virtuozzolinux-base and openvz-os

# yum repolist |grep virt
virtuozzolinux-base      VirtuozzoLinux Base      15 415+189
virtuozzolinux-updates   VirtuozzoLinux Updates   0

Jehan.

From: "jjs - mainphrame" <j...@mainphrame.com>
To: "OpenVZ users" <users@openvz.org>
Cc: "Kevin Drysdale" <kevin.drysd...@iomart.com>
Sent: Thursday, 2 July 2020 18:22:33
Subject: Re: [Users] Issues after updating to 7.0.14 (136)

Jehan, are you running factory?

My ovz hosts are up to date, and I see:

[root@annie ~]# cat /etc/virtuozzo-release
OpenVZ release 7.0.15 (222)

Jake

On Thu, Jul 2, 2020 at 9:08 AM Jehan Procaccia IMT <jehan.procac...@imtbs-tsp.eu> wrote:

"updating to 7.0.14 (136)" !?

I did an update yesterday, I am far behind that version

# cat /etc/vzlinux-release
Virtuozzo Linux release 7.8.0 (609)

# uname -a
Linux localhost 3.10.0-1127.8.2.vz7.151.14 #1 SMP Tue Jun 9 12:58:54 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux

Why don't you try to update to the latest version?

On 29/06/2020 at 12:30, Kevin Drysdale wrote:

Hello,

After updating one of our OpenVZ VPS hosting nodes at the end of last week, we've started to have issues with corruption apparently occurring inside containers. Issues of this nature have never affected the node previously, and there do not appear to be any hardware issues that could explain this.

Specifically, a few hours after updating, we began to see containers experiencing errors such as this in the logs:

[90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25
[90471.679022] EXT4-fs (ploop35454p1): initial error at time 1593205255: ext4_ext_find_extent:904: inode 136399
[90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922: ext4_ext_find_extent:904: inode 136399
[95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67
[95189.954582] EXT4-fs (ploop42983p1): initial error at time 1593210174: htree_dirblock_to_tree:918: inode 926441: block 3683060
[95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902: ext4_iget:4435: inode 1849777
[95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42
[95714.207447] EXT4-fs (ploop60706p1): initial error at time 1593210489: ext4_ext_find_extent:904: inode 136272
[95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063: ext4_ext_find_extent:904: inode 136272

Shutting the containers down and manually mounting and e2fsck'ing their filesystems did clear these errors, but each of the containers (which were mostly used for running Plesk) had widespread issues with corrupt or missing files after the fscks completed, necessitating their being restored from backup.

Concurrently, we also began to see messages like this appearing in /var/log/vzctl.log, which again have never appeared at any point prior to this update being installed:

/var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288448/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288450/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288451/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020
Re: [Users] Issues after updating to 7.0.14 (136)
Jehan -

I get the same output here -

[root@annie ~]# yum repolist |grep virt
virtuozzolinux-base      VirtuozzoLinux Base      15,415+189
virtuozzolinux-updates   VirtuozzoLinux Updates   0

I'm baffled as to how you're on 7.8.0 while I'm at 7.0.15 even though I'm fully up to date.

# uname -a
Linux annie.ufcfan.org 3.10.0-1127.8.2.vz7.151.10 #1 SMP Mon Jun 1 19:05:52 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux

Jake

On Thu, Jul 2, 2020 at 10:08 AM Jehan PROCACCIA <jehan.procac...@imtbs-tsp.eu> wrote:

> no factory , just repos virtuozzolinux-base and openvz-os
>
> # yum repolist |grep virt
> virtuozzolinux-base      VirtuozzoLinux Base      15 415+189
> virtuozzolinux-updates   VirtuozzoLinux Updates   0
>
> Jehan.
>
> From: "jjs - mainphrame"
> To: "OpenVZ users"
> Cc: "Kevin Drysdale"
> Sent: Thursday, 2 July 2020 18:22:33
> Subject: Re: [Users] Issues after updating to 7.0.14 (136)
>
> Jehan, are you running factory?
>
> My ovz hosts are up to date, and I see:
>
> [root@annie ~]# cat /etc/virtuozzo-release
> OpenVZ release 7.0.15 (222)
>
> Jake
>
> On Thu, Jul 2, 2020 at 9:08 AM Jehan Procaccia IMT <jehan.procac...@imtbs-tsp.eu> wrote:
>
>> "updating to 7.0.14 (136)" !?
>>
>> I did an update yesterday, I am far behind that version
>>
>> # cat /etc/vzlinux-release
>> Virtuozzo Linux release 7.8.0 (609)
>>
>> # uname -a
>> Linux localhost 3.10.0-1127.8.2.vz7.151.14 #1 SMP Tue Jun 9 12:58:54 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux
>>
>> Why don't you try to update to the latest version?
>>
>> On 29/06/2020 at 12:30, Kevin Drysdale wrote:
>>
>> Hello,
>>
>> After updating one of our OpenVZ VPS hosting nodes at the end of last week, we've started to have issues with corruption apparently occurring inside containers. Issues of this nature have never affected the node previously, and there do not appear to be any hardware issues that could explain this.
>>
>> Specifically, a few hours after updating, we began to see containers experiencing errors such as this in the logs:
>>
>> [90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25
>> [90471.679022] EXT4-fs (ploop35454p1): initial error at time 1593205255: ext4_ext_find_extent:904: inode 136399
>> [90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922: ext4_ext_find_extent:904: inode 136399
>> [95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67
>> [95189.954582] EXT4-fs (ploop42983p1): initial error at time 1593210174: htree_dirblock_to_tree:918: inode 926441: block 3683060
>> [95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902: ext4_iget:4435: inode 1849777
>> [95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42
>> [95714.207447] EXT4-fs (ploop60706p1): initial error at time 1593210489: ext4_ext_find_extent:904: inode 136272
>> [95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063: ext4_ext_find_extent:904: inode 136272
>>
>> Shutting the containers down and manually mounting and e2fsck'ing their filesystems did clear these errors, but each of the containers (which were mostly used for running Plesk) had widespread issues with corrupt or missing files after the fscks completed, necessitating their being restored from backup.
>>
>> Concurrently, we also began to see messages like this appearing in /var/log/vzctl.log, which again have never appeared at any point prior to this update being installed:
>>
>> /var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288448/root.hdd/root.hds' is sparse
>> /var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288450/root.hdd/root.hds' is sparse
>> /var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288451/root.hdd/root.hds' is sparse
>> /var/log/vzctl.log:2020-06-26T21:19:57+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288452/root.hdd/root.hds' is sparse
>>
>> The basic procedure we follow when updating our nodes is as follows:
>>
>> 1. Update the standby node we keep spare for this process
>> 2. vzmigrate all containers from the live node being updated to the stan
Re: [Users] Issues after updating to 7.0.14 (136)
No factory, just the virtuozzolinux-base and openvz-os repos.

# yum repolist |grep virt
virtuozzolinux-base      VirtuozzoLinux Base      15 415+189
virtuozzolinux-updates   VirtuozzoLinux Updates   0

Jehan.

From: "jjs - mainphrame"
To: "OpenVZ users"
Cc: "Kevin Drysdale"
Sent: Thursday, 2 July 2020 18:22:33
Subject: Re: [Users] Issues after updating to 7.0.14 (136)

Jehan, are you running factory?

My ovz hosts are up to date, and I see:

[root@annie ~]# cat /etc/virtuozzo-release
OpenVZ release 7.0.15 (222)

Jake

On Thu, Jul 2, 2020 at 9:08 AM Jehan Procaccia IMT <jehan.procac...@imtbs-tsp.eu> wrote:

"updating to 7.0.14 (136)" !?

I did an update yesterday, I am far behind that version

# cat /etc/vzlinux-release
Virtuozzo Linux release 7.8.0 (609)

# uname -a
Linux localhost 3.10.0-1127.8.2.vz7.151.14 #1 SMP Tue Jun 9 12:58:54 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux

Why don't you try to update to the latest version?

On 29/06/2020 at 12:30, Kevin Drysdale wrote:

Hello,

After updating one of our OpenVZ VPS hosting nodes at the end of last week, we've started to have issues with corruption apparently occurring inside containers. Issues of this nature have never affected the node previously, and there do not appear to be any hardware issues that could explain this.

Specifically, a few hours after updating, we began to see containers experiencing errors such as this in the logs:

[90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25
[90471.679022] EXT4-fs (ploop35454p1): initial error at time 1593205255: ext4_ext_find_extent:904: inode 136399
[90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922: ext4_ext_find_extent:904: inode 136399
[95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67
[95189.954582] EXT4-fs (ploop42983p1): initial error at time 1593210174: htree_dirblock_to_tree:918: inode 926441: block 3683060
[95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902: ext4_iget:4435: inode 1849777
[95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42
[95714.207447] EXT4-fs (ploop60706p1): initial error at time 1593210489: ext4_ext_find_extent:904: inode 136272
[95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063: ext4_ext_find_extent:904: inode 136272

Shutting the containers down and manually mounting and e2fsck'ing their filesystems did clear these errors, but each of the containers (which were mostly used for running Plesk) had widespread issues with corrupt or missing files after the fscks completed, necessitating their being restored from backup.

Concurrently, we also began to see messages like this appearing in /var/log/vzctl.log, which again have never appeared at any point prior to this update being installed:

/var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288448/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288450/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288451/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:19:57+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288452/root.hdd/root.hds' is sparse

The basic procedure we follow when updating our nodes is as follows:

1. Update the standby node we keep spare for this process
2. vzmigrate all containers from the live node being updated to the standby node
3. Update the live node
4. Reboot the live node
5. vzmigrate the containers from the standby node back to the live node they originally came from

So the only tool which has been used to affect these containers is 'vzmigrate' itself, so I'm at something of a loss as to how to explain the root.hdd images for these containers containing sparse gaps. This is something we have never done, as we have always been aware that OpenVZ does not support their use inside a container's hard drive image. And the fact that these images have suddenly become sparse at the same time they have started to exhibit filesystem corruption is somewhat concerning.

We can restore all affected containers from backups, but I wanted to get in touch with the list to see if anyone else at any other site has experienced these or similar issues after applying the 7.0.14 (136) update.

Thank you,
Kevin Drysdale.
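For completeness, the migration procedure quoted above maps to roughly the following commands. This is only a sketch: "standby.example.com" and "live.example.com" are placeholder hostnames, the CTID is taken from the log excerpts above, and the exact vzmigrate options (e.g. --online for a live migration) depend on how the containers are normally moved:

Evacuate a container from the live node to the standby node:
# vzmigrate --online standby.example.com 8288448
(update and reboot the live node)
Move the container back from the standby node to the live node:
# vzmigrate --online live.example.com 8288448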
Re: [Users] Issues after updating to 7.0.14 (136)
Hello Kevin,

What was the OpenVZ version *before* the update to 7.0.14-136? Sparse files for CTs have been around for at least two years.

Best regards,
Konstantin

-----Original Message-----
From: users-boun...@openvz.org On Behalf Of Kevin Drysdale
Sent: Monday, June 29, 2020 1:30 PM
To: users@openvz.org
Subject: [Users] Issues after updating to 7.0.14 (136)

Hello,

After updating one of our OpenVZ VPS hosting nodes at the end of last week, we've started to have issues with corruption apparently occurring inside containers. Issues of this nature have never affected the node previously, and there do not appear to be any hardware issues that could explain this.

Specifically, a few hours after updating, we began to see containers experiencing errors such as this in the logs:

[90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25
[90471.679022] EXT4-fs (ploop35454p1): initial error at time 1593205255: ext4_ext_find_extent:904: inode 136399
[90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922: ext4_ext_find_extent:904: inode 136399
[95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67
[95189.954582] EXT4-fs (ploop42983p1): initial error at time 1593210174: htree_dirblock_to_tree:918: inode 926441: block 3683060
[95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902: ext4_iget:4435: inode 1849777
[95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42
[95714.207447] EXT4-fs (ploop60706p1): initial error at time 1593210489: ext4_ext_find_extent:904: inode 136272
[95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063: ext4_ext_find_extent:904: inode 136272

Shutting the containers down and manually mounting and e2fsck'ing their filesystems did clear these errors, but each of the containers (which were mostly used for running Plesk) had widespread issues with corrupt or missing files after the fscks completed, necessitating their being restored from backup.

Concurrently, we also began to see messages like this appearing in /var/log/vzctl.log, which again have never appeared at any point prior to this update being installed:

/var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288448/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288450/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288451/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:19:57+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288452/root.hdd/root.hds' is sparse

The basic procedure we follow when updating our nodes is as follows:

1. Update the standby node we keep spare for this process
2. vzmigrate all containers from the live node being updated to the standby node
3. Update the live node
4. Reboot the live node
5. vzmigrate the containers from the standby node back to the live node they originally came from

So the only tool which has been used to affect these containers is 'vzmigrate' itself, so I'm at something of a loss as to how to explain the root.hdd images for these containers containing sparse gaps. This is something we have never done, as we have always been aware that OpenVZ does not support their use inside a container's hard drive image. And the fact that these images have suddenly become sparse at the same time they have started to exhibit filesystem corruption is somewhat concerning.

We can restore all affected containers from backups, but I wanted to get in touch with the list to see if anyone else at any other site has experienced these or similar issues after applying the 7.0.14 (136) update.

Thank you,
Kevin Drysdale.
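As an aside, not from the original mail: whether a ploop image has actually become sparse can be checked by comparing its apparent size with the space it really occupies on disk. For example, using one of the image paths from the warnings above:

# du -h --apparent-size /vz/private/8288448/root.hdd/root.hds
# du -h /vz/private/8288448/root.hdd/root.hds
If the second figure (allocated space) is noticeably smaller than the first (apparent size), the image contains holes, i.e. it is sparse. The same information is available in one line via "stat -c '%s bytes apparent, %b blocks of %B bytes allocated' <file>".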
Re: [Users] Issues after updating to 7.0.14 (136)
Jehan, are you running factory?

My ovz hosts are up to date, and I see:

[root@annie ~]# cat /etc/virtuozzo-release
OpenVZ release 7.0.15 (222)

Jake

On Thu, Jul 2, 2020 at 9:08 AM Jehan Procaccia IMT <jehan.procac...@imtbs-tsp.eu> wrote:

> "updating to 7.0.14 (136)" !?
>
> I did an update yesterday, I am far behind that version
>
> # cat /etc/vzlinux-release
> Virtuozzo Linux release 7.8.0 (609)
>
> # uname -a
> Linux localhost 3.10.0-1127.8.2.vz7.151.14 #1 SMP Tue Jun 9 12:58:54 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux
>
> Why don't you try to update to the latest version?
>
> On 29/06/2020 at 12:30, Kevin Drysdale wrote:
>
> Hello,
>
> After updating one of our OpenVZ VPS hosting nodes at the end of last week, we've started to have issues with corruption apparently occurring inside containers. Issues of this nature have never affected the node previously, and there do not appear to be any hardware issues that could explain this.
>
> Specifically, a few hours after updating, we began to see containers experiencing errors such as this in the logs:
>
> [90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25
> [90471.679022] EXT4-fs (ploop35454p1): initial error at time 1593205255: ext4_ext_find_extent:904: inode 136399
> [90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922: ext4_ext_find_extent:904: inode 136399
> [95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67
> [95189.954582] EXT4-fs (ploop42983p1): initial error at time 1593210174: htree_dirblock_to_tree:918: inode 926441: block 3683060
> [95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902: ext4_iget:4435: inode 1849777
> [95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42
> [95714.207447] EXT4-fs (ploop60706p1): initial error at time 1593210489: ext4_ext_find_extent:904: inode 136272
> [95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063: ext4_ext_find_extent:904: inode 136272
>
> Shutting the containers down and manually mounting and e2fsck'ing their filesystems did clear these errors, but each of the containers (which were mostly used for running Plesk) had widespread issues with corrupt or missing files after the fscks completed, necessitating their being restored from backup.
>
> Concurrently, we also began to see messages like this appearing in /var/log/vzctl.log, which again have never appeared at any point prior to this update being installed:
>
> /var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288448/root.hdd/root.hds' is sparse
> /var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288450/root.hdd/root.hds' is sparse
> /var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288451/root.hdd/root.hds' is sparse
> /var/log/vzctl.log:2020-06-26T21:19:57+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288452/root.hdd/root.hds' is sparse
>
> The basic procedure we follow when updating our nodes is as follows:
>
> 1. Update the standby node we keep spare for this process
> 2. vzmigrate all containers from the live node being updated to the standby node
> 3. Update the live node
> 4. Reboot the live node
> 5. vzmigrate the containers from the standby node back to the live node they originally came from
>
> So the only tool which has been used to affect these containers is 'vzmigrate' itself, so I'm at something of a loss as to how to explain the root.hdd images for these containers containing sparse gaps. This is something we have never done, as we have always been aware that OpenVZ does not support their use inside a container's hard drive image. And the fact that these images have suddenly become sparse at the same time they have started to exhibit filesystem corruption is somewhat concerning.
>
> We can restore all affected containers from backups, but I wanted to get in touch with the list to see if anyone else at any other site has experienced these or similar issues after applying the 7.0.14 (136) update.
>
> Thank you,
> Kevin Drysdale.
Re: [Users] Issues after updating to 7.0.14 (136)
"updating to 7.0.14 (136)" !? I did an update yesterday , I am far behind that version /# cat /etc/vzlinux-release// / /Virtuozzo Linux release 7.8.0 (609)/ / / /# uname -a // //Linux localhost 3.10.0-1127.8.2.vz7.151.14 #1 SMP Tue Jun 9 12:58:54 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux// / why don't you try to update to latest version ? Le 29/06/2020 à 12:30, Kevin Drysdale a écrit : Hello, After updating one of our OpenVZ VPS hosting nodes at the end of last week, we've started to have issues with corruption apparently occurring inside containers. Issues of this nature have never affected the node previously, and there do not appear to be any hardware issues that could explain this. Specifically, a few hours after updating, we began to see containers experiencing errors such as this in the logs: [90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25 [90471.679022] EXT4-fs (ploop35454p1): initial error at time 1593205255: ext4_ext_find_extent:904: inode 136399 [90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922: ext4_ext_find_extent:904: inode 136399 [95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67 [95189.954582] EXT4-fs (ploop42983p1): initial error at time 1593210174: htree_dirblock_to_tree:918: inode 926441: block 3683060 [95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902: ext4_iget:4435: inode 1849777 [95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42 [95714.207447] EXT4-fs (ploop60706p1): initial error at time 1593210489: ext4_ext_find_extent:904: inode 136272 [95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063: ext4_ext_find_extent:904: inode 136272 Shutting the containers down and manually mounting and e2fsck'ing their filesystems did clear these errors, but each of the containers (which were mostly used for running Plesk) had widespread issues with corrupt or missing files after the fsck's completed, necessitating their being restored from backup. Concurrently, we also began to see messages like this appearing in /var/log/vzctl.log, which again have never appeared at any point prior to this update being installed: /var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288448/root.hdd/root.hds' is sparse /var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288450/root.hdd/root.hds' is sparse /var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288451/root.hdd/root.hds' is sparse /var/log/vzctl.log:2020-06-26T21:19:57+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288452/root.hdd/root.hds' is sparse The basic procedure we follow when updating our nodes is as follows: 1, Update the standby node we keep spare for this process 2. vzmigrate all containers from the live node being updated to the standby node 3. Update the live node 4. Reboot the live node 5. vzmigrate the containers from the standby node back to the live node they originally came from So the only tool which has been used to affect these containers is 'vzmigrate' itself, so I'm at something of a loss as to how to explain the root.hdd images for these containers containing sparse gaps. This is something we have never done, as we have always been aware that OpenVZ does not support their use inside a container's hard drive image. 
And the fact that these images have suddenly become sparse at the same time they have started to exhibit filesystem corruption is somewhat concerning. We can restore all affected containers from backups, but I wanted to get in touch with the list to see if anyone else at any other site has experienced these or similar issues after applying the 7.0.14 (136) update. Thank you, Kevin Drysdale. ___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users ___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users