Re: [Users] Issues after updating to 7.0.14 (136)

2020-07-09 Thread Jehan PROCACCIA
Hello,
thanks for the advice. I changed onboot=yes to onboot=no, so that at the next reboot
my CTs don't start automatically and enter a deadlock/loop;
that way I could restart my CTs manually.
In fact I discussed this with the devs: it seems I hit a deadlock when my CentOS 8 CTs
(using nft/netfilter) are stopped/suspended while the hardware node reboots.
They are looking at this potential issue with netfilter and the latest updates.
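For reference, a rough sketch of the "onboot" change described above, assuming the stock vzlist/prlctl tools (exact flags may differ between releases):

# disable autostart for every CT so a bad netfilter interaction cannot
# dead-lock the node again at the next boot
for CT in $(vzlist -H -a -o ctid); do
    prlctl set "$CT" --onboot no    # or: vzctl set "$CT" --onboot no --save
done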

to be continued ... 

Thanks . 


De: "Oleksiy Tkachenko"  
À: "OpenVZ users"  
Envoyé: Mercredi 8 Juillet 2020 23:49:51 
Objet: Re: [Users] Issues after updating to 7.0.14 (136) 

>> ... 
>> Error in ploop_check (check.c:663): Dirty flag is set 
>> ... 
>> # ploop mount 
>> /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/DiskDescriptor.xml 
>> Error in ploop_mount_image (ploop.c:2495): Image 
>> /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds already 
>> used by device /dev/ploop11432 
>> ... 
>> 
>> I am lost , any help appreciated . 

I have heard of two possible solutions (a rough sketch follows below): 
1. Reboot the HW node and stop the CT. Then "ploop mount" the CT's DiskDescriptor.xml and 
run e2fsck on it. Unmount and restart the CT. 
2. If that doesn't help, create a fresh new CT and move the "broken" root.hds there. 
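A rough sketch of option 1, assuming the CT is already stopped and the image is no longer attached anywhere; the /dev/ploopXXXXXp1 name below is a placeholder for the device that "ploop mount" prints:

DD=/vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/DiskDescriptor.xml
ploop mount "$DD"                # attaches the image and prints the ploop device
e2fsck -f -y /dev/ploopXXXXXp1   # fsck the first partition of that device
ploop umount "$DD"               # detach the image again
prlctl start 144dc737-b4e3-4c03-852c-25a6df06cee4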

-- 
Oleksiy 

___ 
Users mailing list 
Users@openvz.org 
https://lists.openvz.org/mailman/listinfo/users 


Re: [Users] Issues after updating to 7.0.14 (136)

2020-07-08 Thread Oleksiy Tkachenko
>> ...
>> Error in ploop_check (check.c:663): Dirty flag is set
>> ...
>> # ploop mount
/vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/DiskDescriptor.xml
>> Error in ploop_mount_image (ploop.c:2495): Image
/vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds already
used by device /dev/ploop11432
>> ...
>>
>> I am lost , any help appreciated  .

I have heard of two possible solutions:
1. Reboot the HW node and stop the CT. Then "ploop mount" the CT's DiskDescriptor.xml and
run e2fsck on it. Unmount and restart the CT.
2. If that doesn't help, create a fresh new CT and move the "broken" root.hds there.

--
Oleksiy
___
Users mailing list
Users@openvz.org
https://lists.openvz.org/mailman/listinfo/users


Re: [Users] Issues after updating to 7.0.14 (136)

2020-07-07 Thread Kevin Drysdale

Hello,

Thanks to all who have replied to this thread so far - my apologies for 
taking so long to get back to you all.


In terms of where I'm seeing the EXT4 errors, they are showing up in the 
kernel log on the node itself, so the output of 'dmesg' is regularly 
seeing entries such as these:


[375095.199203] EXT4-fs (ploop43209p1): Remounting filesystem read-only
[375095.199267] EXT4-fs error (device ploop43209p1) in 
ext4_ext_remove_space:3073: IO failure
[375095.199400] EXT4-fs error (device ploop43209p1) in ext4_ext_truncate:4692: 
IO failure
[375095.199517] EXT4-fs error (device ploop43209p1) in 
ext4_reserve_inode_write:5358: Journal has aborted
[375095.199637] EXT4-fs error (device ploop43209p1) in ext4_truncate:4145: 
Journal has aborted
[375095.199779] EXT4-fs error (device ploop43209p1) in 
ext4_reserve_inode_write:5358: Journal has aborted
[375095.199957] EXT4-fs error (device ploop43209p1) in ext4_orphan_del:2731: 
Journal has aborted
[375095.200138] EXT4-fs error (device ploop43209p1) in 
ext4_reserve_inode_write:5358: Journal has aborted
[461642.709690] EXT4-fs (ploop43209p1): error count since last fsck: 8
[461642.709702] EXT4-fs (ploop43209p1): initial error at time 1593576601: 
ext4_ext_remove_space:3000: inode 136354
[461642.709708] EXT4-fs (ploop43209p1): last error at time 1593576601: 
ext4_reserve_inode_write:5358: inode 136354
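As an aside, the "error count since last fsck" lines come from counters ext4 keeps in the superblock; a hedged way to inspect them on the node, using the device named in the dmesg excerpt above:

tune2fs -l /dev/ploop43209p1 | grep -i error   # shows the recorded FS error count and first/last error
# a full "e2fsck -f" on the unmounted filesystem resets these counters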

Inside the container itself, not much is being logged, since the affected 
container in this particular instance is indeed (as per the errors 
above) mounted read-only due to the errors its root.hdd filesystem is 
experiencing.


Having dug a bit more into what happened here, I suspect that this 
corruption may have come about when the containers were being moved either 
to or from the standby node and the live node, but I can't be 100% sure of 
that.


The picture is further muddied in that the standby node (the node that we 
used for evacuating containers from the node to be updated) was itself 
initially updated to 7.0.14 (135).  However, the live node (which was 
updated a short time after the standby node) appears to have got 7.0.14 
(136).  So I don't know if the issue was in fact with 7.0.14 (135) (which 
was on the standby node, where the containers would have been moved to, 
and moved back from), or on 7.0.14 (136) on the live node.  Were there any 
known issues with 7.0.14 (135) that might correlate with what I'm seeing 
above ?


Anyway, once again, thanks to everyone who has replied so far.  If anyone 
has any further questions or would like any further information, please 
let me know and I will be happy to assist.


Thank you,
Kevin Drysdale.


On Thu, 2 Jul 2020, Jehan PROCACCIA wrote:


Yes, you are right, I do get the same virtuozzo-release as mentioned in the 
initial subject, sorry for the noise.

# cat /etc/virtuozzo-release
OpenVZ release 7.0.14 (136)

but anyway, I don't see any ploop / fsck errors in the host's /var/log/vzctl.log 
or inside the CT; where did you see those errors?

Jehan .

_
De: "jjs - mainphrame" 
À: "OpenVZ users" 
Envoyé: Jeudi 2 Juillet 2020 19:33:23
Objet: Re: [Users] Issues after updating to 7.0.14 (136)

Thanks for that sanity check, the conundrum is resolved. vzlinux-release and 
virtuozzo-release are indeed different things.
Jake

On Thu, Jul 2, 2020 at 10:27 AM Jonathan Wright  wrote:

  /etc/redhat-release and /etc/virtuozzo-release are two different things.

  On 7/2/20 12:16 PM, jjs - mainphrame wrote:
  Jehan - 

  I get the same output here -

  [root@annie ~]# yum repolist  |grep virt
  virtuozzolinux-base    VirtuozzoLinux Base                            
15,415+189
  virtuozzolinux-updates VirtuozzoLinux Updates                             
     0

  I'm baffled as to how you're on 7.8.0 while I'm at 7.0.15 even though I'm 
fully up to date.

  # uname -a
  Linux annie.ufcfan.org 3.10.0-1127.8.2.vz7.151.10 #1 SMP Mon Jun 1 
19:05:52 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux

Jake

On Thu, Jul 2, 2020 at 10:08 AM Jehan PROCACCIA  
wrote:
  no factory , just repos virtuozzolinux-base and openvz-os

# yum repolist  |grep virt
virtuozzolinux-base    VirtuozzoLinux Base    15 415+189
virtuozzolinux-updates VirtuozzoLinux Updates  0

Jehan .

_
De: "jjs - mainphrame" 
À: "OpenVZ users" 
Cc: "Kevin Drysdale" 
Envoyé: Jeudi 2 Juillet 2020 18:22:33
Objet: Re: [Users] Issues after updating to 7.0.14 (136)

Jehan, are you running factory?

My ovz hosts are up to date, and I see:

[root@annie ~]# cat /etc/virtuozzo-release
OpenVZ release 7.0.15 (222)

Jake


On Thu, 

Re: [Users] Issues after updating to 7.0.14 (136)

2020-07-06 Thread Jehan Procaccia IMT

Hello

If it can help, what I did so far to try to re-enable dead CTs

# prlctl stop ldap2
Stopping the CT...
Failed to stop the CT: PRL_ERR_VZCTL_OPERATION_FAILED (Details: Cannot 
lock the Container

)
# cat /vz/lock/144dc737-b4e3-4c03-852c-25a6df06cee4.lck
6227
resuming
# ps auwx | grep 6227
root    6227  0.0  0.0  92140  6984 ?    S    15:10   0:00 
/usr/sbin/vzctl resume 144dc737-b4e3-4c03-852c-25a6df06cee4

# kill -9  6227

still cannot stop the CT  (Cannot lock the Container...)


# df |grep 144dc737-b4e3-4c03-852c-25a6df06cee4
/dev/ploop11432p1  10188052   2546636    7100848  27% 
/vz/root/144dc737-b4e3-4c03-852c-25a6df06cee4
none    1048576 0    1048576   0% 
/vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/dump/Dump/.criu.cgyard.56I2ls

# umount /dev/ploop11432p1

# ploop check -F 
/vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds

Reopen rw /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds
Error in ploop_check (check.c:663): Dirty flag is set

# ploop mount 
/vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/DiskDescriptor.xml
Error in ploop_mount_image (ploop.c:2495): Image 
/vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds 
already used by device /dev/ploop11432

# df -H | grep ploop11432
=> nothing

I am lost , any help appreciated  .

Thanks .

On 06/07/2020 at 15:37, Jehan Procaccia IMT wrote:


Hello,

I am back to the initial problem related to this post: since I updated to 
OpenVZ release 7.0.14 (136) / Virtuozzo Linux release 7.8.0 (609), 
I am also facing corrupted CT status.


I don't see the exact same errors as mentioned by Kevin Drysdale below 
(ploop/fsck), but I am not able to enter certain CTs, nor can I 
stop them.


[root@olb ~]# prlctl stop trans8
Stopping the CT...
Failed to stop the CT: PRL_ERR_VZCTL_OPERATION_FAILED (Details: Cannot 
lock the Container
)

[root@olb ~]# prlctl enter trans8
Unable to get init pid
enter into CT failed

exited from CT 02faecdd-ddb6-42eb-8103-202508f18256

For those CTs that fail to enter or stop, I noticed that there is a second 
device mounted, with a name ending in dump/Dump/.criu.cgyard.4EJB8c:

[root@olb ~]# df -H |grep 02faecdd-ddb6-42eb-8103-202508f18256
/dev/ploop53152p1  11G    2,2G  7,7G  23% /vz/root/02faecdd-ddb6-42eb-8103-202508f18256
none  537M   0  537M   0% /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/dump/Dump/.criu.cgyard.4EJB8c
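A hedged way to list, and (only if the restore is definitely dead) lazily unmount, that leftover CRIU restore mount; the .criu.cgyard suffix is random, so match on the Dump directory:

CT=02faecdd-ddb6-42eb-8103-202508f18256
grep "/vz/private/$CT/dump/Dump" /proc/mounts              # list leftover restore mounts
umount -l /vz/private/$CT/dump/Dump/.criu.cgyard.4EJB8c    # lazy unmount of the stale entry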



[root@olb ~]# prlctl list | grep 02faecdd-ddb6-42eb-8103-202508f18256
{02faecdd-ddb6-42eb-8103-202508f18256}  running 157.159.196.17  CT isptrans8

I rebooted the whole hardware node; since the reboot, here is the related 
vzctl.log:


2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Removing the stale lock file /vz/lock/02faecdd-ddb6-42eb-8103-202508f18256.lck
2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Restoring the Container ...
2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Mount image: /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd
2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
2020-07-06T15:10:38+0200 : Adding delta dev=/dev/ploop53152 img=/vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds (rw)
2020-07-06T15:10:39+0200 : Mounted /dev/ploop53152p1 at /vz/root/02faecdd-ddb6-42eb-8103-202508f18256 fstype=ext4 data=',balloon_ino=12'
2020-07-06T15:10:39+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Container is mounted
2020-07-06T15:10:40+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Setting permissions for image=/vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd
2020-07-06T15:10:40+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Configure memguarantee: 0%
2020-07-06T15:18:12+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Unable to get init pid
2020-07-06T15:18:12+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : enter into CT failed
2020-07-06T15:19:49+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Cannot lock the Container
2020-07-06T15:25:33+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Unable to get init pid
2020-07-06T15:25:33+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : enter into CT failed


On another CT failing to enter/stop, the same kind of logs, plus "Error (criu ...":

2020-07-06T15:10:38+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Restoring the Container ...
2020-07-06T15:10:38+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Mount 

Re: [Users] Issues after updating to 7.0.14 (136)

2020-07-06 Thread Jehan Procaccia IMT

Hello,

I am back to the initial problem related to this post: since I updated to 
OpenVZ release 7.0.14 (136) / Virtuozzo Linux release 7.8.0 (609), 
I am also facing corrupted CT status.


I don't see the exact same errors as mentioned by Kevin Drysdale below 
(ploop/fsck), but I am not able to enter certain CTs, nor can I stop 
them.


[root@olb ~]# prlctl stop trans8
Stopping the CT...
Failed to stop the CT: PRL_ERR_VZCTL_OPERATION_FAILED (Details: Cannot 
lock the Container
)

[root@olb ~]# prlctl enter trans8
Unable to get init pid
enter into CT failed

exited from CT 02faecdd-ddb6-42eb-8103-202508f18256

For those CTs that fail to enter or stop, I noticed that there is a second 
device mounted, with a name ending in dump/Dump/.criu.cgyard.4EJB8c:

[root@olb ~]# df -H |grep 02faecdd-ddb6-42eb-8103-202508f18256
/dev/ploop53152p1  11G    2,2G  7,7G  23% /vz/root/02faecdd-ddb6-42eb-8103-202508f18256
none  537M   0  537M   0% /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/dump/Dump/.criu.cgyard.4EJB8c



[root@olb ~]# prlctl list | grep 02faecdd-ddb6-42eb-8103-202508f18256
{02faecdd-ddb6-42eb-8103-202508f18256}  running 157.159.196.17  CT isptrans8

I rebooted the whole hardware node; since the reboot, here is the related 
vzctl.log:


2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Removing the stale lock file /vz/lock/02faecdd-ddb6-42eb-8103-202508f18256.lck
2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Restoring the Container ...
2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Mount image: /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd
2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
2020-07-06T15:10:38+0200 : Adding delta dev=/dev/ploop53152 img=/vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds (rw)
2020-07-06T15:10:39+0200 : Mounted /dev/ploop53152p1 at /vz/root/02faecdd-ddb6-42eb-8103-202508f18256 fstype=ext4 data=',balloon_ino=12'
2020-07-06T15:10:39+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Container is mounted
2020-07-06T15:10:40+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Setting permissions for image=/vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd
2020-07-06T15:10:40+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Configure memguarantee: 0%
2020-07-06T15:18:12+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Unable to get init pid
2020-07-06T15:18:12+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : enter into CT failed
2020-07-06T15:19:49+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Cannot lock the Container
2020-07-06T15:25:33+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Unable to get init pid
2020-07-06T15:25:33+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : enter into CT failed


On another CT failing to enter/stop, the same kind of logs, plus "Error (criu ...":

2020-07-06T15:10:38+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Restoring the Container ...
2020-07-06T15:10:38+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Mount image: /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd
2020-07-06T15:10:38+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
2020-07-06T15:10:39+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
2020-07-06T15:10:39+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
2020-07-06T15:10:39+0200 : Adding delta dev=/dev/ploop36049 img=/vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds (rw)
2020-07-06T15:10:41+0200 : Mounted /dev/ploop36049p1 at /vz/root/4ae48335-5b63-475d-8629-c8d742cb0ba0 fstype=ext4 data=',balloon_ino=12'
2020-07-06T15:10:41+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Container is mounted
2020-07-06T15:10:41+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Setting permissions for image=/vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd
2020-07-06T15:10:41+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Configure memguarantee: 0%
2020-07-06T15:10:57+0200 vzeventd : Run: /etc/vz/vzevent.d/ve-stop id=4ae48335-5b63-475d-8629-c8d742cb0ba0
2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : (03.038774) Error (criu/util.c:666): exited, status=4
2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : (14.446513)  1: Error 

Re: [Users] Issues after updating to 7.0.14 (136)

2020-07-02 Thread Jehan PROCACCIA
Yes, you are right, I do get the same virtuozzo-release as mentioned in the 
initial subject, sorry for the noise. 

# cat /etc/virtuozzo-release 
OpenVZ release 7.0.14 (136) 

but anyway, I don't see any ploop / fsck errors in the host's /var/log/vzctl.log 
or inside the CT; where did you see those errors? 

Jehan . 


De: "jjs - mainphrame"  
À: "OpenVZ users"  
Envoyé: Jeudi 2 Juillet 2020 19:33:23 
Objet: Re: [Users] Issues after updating to 7.0.14 (136) 

Thanks for that sanity check, the conundrum is resolved. vzlinux-release and 
virtuozzo-release are indeed different things. 
Jake 

On Thu, Jul 2, 2020 at 10:27 AM Jonathan Wright <jonat...@knownhost.com> wrote: 

/etc/redhat-release and /etc/virtuozzo-release are two different things. 
On 7/2/20 12:16 PM, jjs - mainphrame wrote: 


Jehan - 

I get the same output here - 

[root@annie ~]# yum repolist |grep virt 
virtuozzolinux-base VirtuozzoLinux Base 15,415+189 
virtuozzolinux-updates VirtuozzoLinux Updates 0 

I'm baffled as to how you're on 7.8.0 while I'm at 7.0.15 even though I'm fully 
up to date. 

# uname -a 
Linux annie.ufcfan.org 3.10.0-1127.8.2.vz7.151.10 #1 SMP Mon Jun 1 19:05:52 MSK 2020 x86_64 x86_64 
x86_64 GNU/Linux 

Jake 

On Thu, Jul 2, 2020 at 10:08 AM Jehan PROCACCIA <jehan.procac...@imtbs-tsp.eu> wrote: 

no factory , just repos virtuozzolinux-base and openvz-os 

# yum repolist |grep virt 
virtuozzolinux-base VirtuozzoLinux Base 15 415+189 
virtuozzolinux-updates VirtuozzoLinux Updates 0 

Jehan . 


De: "jjs - mainphrame" < [ mailto:j...@mainphrame.com | j...@mainphrame.com ] > 
À: "OpenVZ users" < [ mailto:users@openvz.org | users@openvz.org ] > 
Cc: "Kevin Drysdale" < [ mailto:kevin.drysd...@iomart.com | 
kevin.drysd...@iomart.com ] > 
Envoyé: Jeudi 2 Juillet 2020 18:22:33 
Objet: Re: [Users] Issues after updating to 7.0.14 (136) 

Jehan, are you running factory? 

My ovz hosts are up to date, and I see: 

[root@annie ~]# cat /etc/virtuozzo-release 
OpenVZ release 7.0.15 (222) 

Jake 


On Thu, Jul 2, 2020 at 9:08 AM Jehan Procaccia IMT <jehan.procac...@imtbs-tsp.eu> wrote: 

"updating to 7.0.14 (136)" !? 

I did an update yesterday , I am far behind that version 

# cat /etc/vzlinux-release 
Virtuozzo Linux release 7.8.0 (609) 

# uname -a 
Linux localhost 3.10.0-1127.8.2.vz7.151.14 #1 SMP Tue Jun 9 12:58:54 MSK 2020 
x86_64 x86_64 x86_64 GNU/Linux 

why don't you try to update to latest version ? 


On 29/06/2020 at 12:30, Kevin Drysdale wrote: 

Hello, 

After updating one of our OpenVZ VPS hosting nodes at the end of last week, 
we've started to have issues with corruption apparently occurring inside 
containers. Issues of this nature have never affected the node previously, and 
there do not appear to be any hardware issues that could explain this. 

Specifically, a few hours after updating, we began to see containers 
experiencing errors such as this in the logs: 

[90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25 
[90471.679022] EXT4-fs (ploop35454p1): initial error at time 1593205255: 
ext4_ext_find_extent:904: inode 136399 
[90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922: 
ext4_ext_find_extent:904: inode 136399 
[95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67 
[95189.954582] EXT4-fs (ploop42983p1): initial error at time 1593210174: 
htree_dirblock_to_tree:918: inode 926441: block 3683060 
[95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902: 
ext4_iget:4435: inode 1849777 
[95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42 
[95714.207447] EXT4-fs (ploop60706p1): initial error at time 1593210489: 
ext4_ext_find_extent:904: inode 136272 
[95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063: 
ext4_ext_find_extent:904: inode 136272 

Shutting the containers down and manually mounting and e2fsck'ing their 
filesystems did clear these errors, but each of the containers (which were 
mostly used for running Plesk) had widespread issues with corrupt or missing 
files after the fsck's completed, necessitating their being restored from 
backup. 

Concurrently, we also began to see messages like this appearing in 
/var/log/vzctl.log, which again have never appeared at any point prior to this 
update being installed: 

/var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole (check.c:240): 
Warning: ploop image '/vz/private/8288448/root.hdd/root.hds' is sparse 
/var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in fill_hole (check.c:240): 
Warning: ploop image '/vz/private/8288450/root.hdd/root.hds' is sparse 
/var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in fill_hole (check.c:240): 
Warning: ploop imag

Re: [Users] Issues after updating to 7.0.14 (136)

2020-07-02 Thread jjs - mainphrame
Thanks for that sanity check, the conundrum is resolved. vzlinux-release
and virtuozzo-release are indeed different things.

Jake

On Thu, Jul 2, 2020 at 10:27 AM Jonathan Wright 
wrote:

> /etc/redhat-release and /etc/virtuozzo-release are two different things.
> On 7/2/20 12:16 PM, jjs - mainphrame wrote:
>
> Jehan -
>
> I get the same output here -
>
> [root@annie ~]# yum repolist  |grep virt
> virtuozzolinux-baseVirtuozzoLinux Base
>  15,415+189
> virtuozzolinux-updates VirtuozzoLinux Updates
>  0
>
> I'm baffled as to how you're on 7.8.0 while I'm at 7.0.15 even though I'm
> fully up to date.
>
> # uname -a
> Linux annie.ufcfan.org 3.10.0-1127.8.2.vz7.151.10 #1 SMP Mon Jun 1
> 19:05:52 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux
>
> Jake
>
> On Thu, Jul 2, 2020 at 10:08 AM Jehan PROCACCIA <
> jehan.procac...@imtbs-tsp.eu> wrote:
>
>> no factory , just repos virtuozzolinux-base and openvz-os
>>
>> # yum repolist  |grep virt
>> virtuozzolinux-baseVirtuozzoLinux Base15
>> 415+189
>> virtuozzolinux-updates VirtuozzoLinux
>> Updates  0
>>
>> Jehan .
>>
>> ----------
>> *From: *"jjs - mainphrame" 
>> *To: *"OpenVZ users" 
>> *Cc: *"Kevin Drysdale" 
>> *Sent: *Thursday, 2 July 2020 18:22:33
>> *Subject: *Re: [Users] Issues after updating to 7.0.14 (136)
>>
>> Jehan, are you running factory?
>>
>> My ovz hosts are up to date, and I see:
>>
>> [root@annie ~]# cat /etc/virtuozzo-release
>> OpenVZ release 7.0.15 (222)
>>
>> Jake
>>
>>
>> On Thu, Jul 2, 2020 at 9:08 AM Jehan Procaccia IMT <
>> jehan.procac...@imtbs-tsp.eu> wrote:
>>
>>> "updating to 7.0.14 (136)" !?
>>>
>>> I did an update yesterday , I am far behind that version
>>>
>>> *# cat /etc/vzlinux-release*
>>> *Virtuozzo Linux release 7.8.0 (609)*
>>>
>>> *# uname -a *
>>> *Linux localhost 3.10.0-1127.8.2.vz7.151.14 #1 SMP Tue Jun 9 12:58:54
>>> MSK 2020 x86_64 x86_64 x86_64 GNU/Linux*
>>>
>>> why don't you try to update to latest version ?
>>>
>>>
>>> On 29/06/2020 at 12:30, Kevin Drysdale wrote:
>>>
>>> Hello,
>>>
>>> After updating one of our OpenVZ VPS hosting nodes at the end of last
>>> week, we've started to have issues with corruption apparently occurring
>>> inside containers.  Issues of this nature have never affected the node
>>> previously, and there do not appear to be any hardware issues that could
>>> explain this.
>>>
>>> Specifically, a few hours after updating, we began to see containers
>>> experiencing errors such as this in the logs:
>>>
>>> [90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25
>>> [90471.679022] EXT4-fs (ploop35454p1): initial error at time 1593205255:
>>> ext4_ext_find_extent:904: inode 136399
>>> [90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922:
>>> ext4_ext_find_extent:904: inode 136399
>>> [95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67
>>> [95189.954582] EXT4-fs (ploop42983p1): initial error at time 1593210174:
>>> htree_dirblock_to_tree:918: inode 926441: block 3683060
>>> [95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902:
>>> ext4_iget:4435: inode 1849777
>>> [95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42
>>> [95714.207447] EXT4-fs (ploop60706p1): initial error at time 1593210489:
>>> ext4_ext_find_extent:904: inode 136272
>>> [95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063:
>>> ext4_ext_find_extent:904: inode 136272
>>>
>>> Shutting the containers down and manually mounting and e2fsck'ing their
>>> filesystems did clear these errors, but each of the containers (which were
>>> mostly used for running Plesk) had widespread issues with corrupt or
>>> missing files after the fsck's completed, necessitating their being
>>> restored from backup.
>>>
>>> Concurrently, we also began to see messages like this appearing in
>>> /var/log/vzctl.log, which again have never appeared at any point prior to
>>> this update being installed:
>>>
>>> /var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole
>>> (check.c:240): Warning: ploop image '/vz/private/8288448/root.hdd/root.hds'
>>> is sparse
>>> /var/l

Re: [Users] Issues after updating to 7.0.14 (136)

2020-07-02 Thread Jonathan Wright

/etc/redhat-release and /etc/virtuozzo-release are two different things.

On 7/2/20 12:16 PM, jjs - mainphrame wrote:

Jehan -

I get the same output here -

[root@annie ~]# yum repolist  |grep virt
virtuozzolinux-base    VirtuozzoLinux Base      15,415+189
virtuozzolinux-updates VirtuozzoLinux Updates                0

I'm baffled as to how you're on 7.8.0 while I'm at 7.0.15 even though 
I'm fully up to date.


# uname -a
Linux annie.ufcfan.org 
3.10.0-1127.8.2.vz7.151.10 #1 SMP Mon Jun 1 19:05:52 MSK 2020 x86_64 
x86_64 x86_64 GNU/Linux


Jake

On Thu, Jul 2, 2020 at 10:08 AM Jehan PROCACCIA 
<jehan.procac...@imtbs-tsp.eu> wrote:


no factory , just repos virtuozzolinux-base and openvz-os

# yum repolist  |grep virt
virtuozzolinux-base    VirtuozzoLinux
Base    15 415+189
virtuozzolinux-updates VirtuozzoLinux
Updates  0

Jehan .


*From: *"jjs - mainphrame" <j...@mainphrame.com>
*To: *"OpenVZ users" <users@openvz.org>
*Cc: *"Kevin Drysdale" <kevin.drysd...@iomart.com>
*Sent: *Thursday, 2 July 2020 18:22:33
*Subject: *Re: [Users] Issues after updating to 7.0.14 (136)

Jehan, are you running factory?

My ovz hosts are up to date, and I see:

[root@annie ~]# cat /etc/virtuozzo-release
OpenVZ release 7.0.15 (222)

Jake


On Thu, Jul 2, 2020 at 9:08 AM Jehan Procaccia IMT
<jehan.procac...@imtbs-tsp.eu> wrote:

"updating to 7.0.14 (136)" !?

I did an update yesterday , I am far behind that version

# cat /etc/vzlinux-release
Virtuozzo Linux release 7.8.0 (609)

# uname -a
Linux localhost 3.10.0-1127.8.2.vz7.151.14 #1 SMP Tue Jun 9
12:58:54 MSK 2020 x86_64 x86_64 x86_64 GNU/Linux
why don't you try to update to latest version ?


On 29/06/2020 at 12:30, Kevin Drysdale wrote:

Hello,

After updating one of our OpenVZ VPS hosting nodes at the
end of last week, we've started to have issues with
corruption apparently occurring inside containers.  Issues
of this nature have never affected the node previously,
and there do not appear to be any hardware issues that
could explain this.

Specifically, a few hours after updating, we began to see
containers experiencing errors such as this in the logs:

[90471.678994] EXT4-fs (ploop35454p1): error count since
last fsck: 25
[90471.679022] EXT4-fs (ploop35454p1): initial error at
time 1593205255: ext4_ext_find_extent:904: inode 136399
[90471.679030] EXT4-fs (ploop35454p1): last error at time
1593232922: ext4_ext_find_extent:904: inode 136399
[95189.954569] EXT4-fs (ploop42983p1): error count since
last fsck: 67
[95189.954582] EXT4-fs (ploop42983p1): initial error at
time 1593210174: htree_dirblock_to_tree:918: inode 926441:
block 3683060
[95189.954589] EXT4-fs (ploop42983p1): last error at time
1593276902: ext4_iget:4435: inode 1849777
[95714.207432] EXT4-fs (ploop60706p1): error count since
last fsck: 42
[95714.207447] EXT4-fs (ploop60706p1): initial error at
time 1593210489: ext4_ext_find_extent:904: inode 136272
[95714.207452] EXT4-fs (ploop60706p1): last error at time
1593231063: ext4_ext_find_extent:904: inode 136272

Shutting the containers down and manually mounting and
e2fsck'ing their filesystems did clear these errors, but
each of the containers (which were mostly used for running
Plesk) had widespread issues with corrupt or missing files
after the fsck's completed, necessitating their being
restored from backup.

Concurrently, we also began to see messages like this
appearing in /var/log/vzctl.log, which again have never
appeared at any point prior to this update being installed:

/var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in
fill_hole (check.c:240): Warning: ploop image
'/vz/private/8288448/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in
fill_hole (check.c:240): Warning: ploop image
'/vz/private/8288450/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in
fill_hole (check.c:240): Warning: ploop image
'/vz/private/8288451/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020

Re: [Users] Issues after updating to 7.0.14 (136)

2020-07-02 Thread jjs - mainphrame
Jehan -

I get the same output here -

[root@annie ~]# yum repolist  |grep virt
virtuozzolinux-baseVirtuozzoLinux Base
 15,415+189
virtuozzolinux-updates VirtuozzoLinux Updates
   0

I'm baffled as to how you're on 7.8.0 while I'm at 7.0.15 even though I'm
fully up to date.

# uname -a
Linux annie.ufcfan.org 3.10.0-1127.8.2.vz7.151.10 #1 SMP Mon Jun 1 19:05:52
MSK 2020 x86_64 x86_64 x86_64 GNU/Linux

Jake

On Thu, Jul 2, 2020 at 10:08 AM Jehan PROCACCIA <
jehan.procac...@imtbs-tsp.eu> wrote:

> no factory , just repos virtuozzolinux-base and openvz-os
>
> # yum repolist  |grep virt
> virtuozzolinux-baseVirtuozzoLinux Base15
> 415+189
> virtuozzolinux-updates VirtuozzoLinux
> Updates  0
>
> Jehan .
>
> --
> *From: *"jjs - mainphrame" 
> *To: *"OpenVZ users" 
> *Cc: *"Kevin Drysdale" 
> *Sent: *Thursday, 2 July 2020 18:22:33
> *Subject: *Re: [Users] Issues after updating to 7.0.14 (136)
>
> Jehan, are you running factory?
>
> My ovz hosts are up to date, and I see:
>
> [root@annie ~]# cat /etc/virtuozzo-release
> OpenVZ release 7.0.15 (222)
>
> Jake
>
>
> On Thu, Jul 2, 2020 at 9:08 AM Jehan Procaccia IMT <
> jehan.procac...@imtbs-tsp.eu> wrote:
>
>> "updating to 7.0.14 (136)" !?
>>
>> I did an update yesterday , I am far behind that version
>>
>> *# cat /etc/vzlinux-release*
>> *Virtuozzo Linux release 7.8.0 (609)*
>>
>> *# uname -a *
>> *Linux localhost 3.10.0-1127.8.2.vz7.151.14 #1 SMP Tue Jun 9 12:58:54 MSK
>> 2020 x86_64 x86_64 x86_64 GNU/Linux*
>>
>> why don't you try to update to latest version ?
>>
>>
>> On 29/06/2020 at 12:30, Kevin Drysdale wrote:
>>
>> Hello,
>>
>> After updating one of our OpenVZ VPS hosting nodes at the end of last
>> week, we've started to have issues with corruption apparently occurring
>> inside containers.  Issues of this nature have never affected the node
>> previously, and there do not appear to be any hardware issues that could
>> explain this.
>>
>> Specifically, a few hours after updating, we began to see containers
>> experiencing errors such as this in the logs:
>>
>> [90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25
>> [90471.679022] EXT4-fs (ploop35454p1): initial error at time 1593205255:
>> ext4_ext_find_extent:904: inode 136399
>> [90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922:
>> ext4_ext_find_extent:904: inode 136399
>> [95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67
>> [95189.954582] EXT4-fs (ploop42983p1): initial error at time 1593210174:
>> htree_dirblock_to_tree:918: inode 926441: block 3683060
>> [95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902:
>> ext4_iget:4435: inode 1849777
>> [95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42
>> [95714.207447] EXT4-fs (ploop60706p1): initial error at time 1593210489:
>> ext4_ext_find_extent:904: inode 136272
>> [95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063:
>> ext4_ext_find_extent:904: inode 136272
>>
>> Shutting the containers down and manually mounting and e2fsck'ing their
>> filesystems did clear these errors, but each of the containers (which were
>> mostly used for running Plesk) had widespread issues with corrupt or
>> missing files after the fsck's completed, necessitating their being
>> restored from backup.
>>
>> Concurrently, we also began to see messages like this appearing in
>> /var/log/vzctl.log, which again have never appeared at any point prior to
>> this update being installed:
>>
>> /var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole
>> (check.c:240): Warning: ploop image '/vz/private/8288448/root.hdd/root.hds'
>> is sparse
>> /var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in fill_hole
>> (check.c:240): Warning: ploop image '/vz/private/8288450/root.hdd/root.hds'
>> is sparse
>> /var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in fill_hole
>> (check.c:240): Warning: ploop image '/vz/private/8288451/root.hdd/root.hds'
>> is sparse
>> /var/log/vzctl.log:2020-06-26T21:19:57+0100 : Error in fill_hole
>> (check.c:240): Warning: ploop image '/vz/private/8288452/root.hdd/root.hds'
>> is sparse
>>
>> The basic procedure we follow when updating our nodes is as follows:
>>
>> 1, Update the standby node we keep spare for this process
>> 2. vzmigrate all containers from the live node being updated to the
>> stan

Re: [Users] Issues after updating to 7.0.14 (136)

2020-07-02 Thread Jehan PROCACCIA
no factory , just repos virtuozzolinux-base and openvz-os 

# yum repolist |grep virt 
virtuozzolinux-base VirtuozzoLinux Base 15 415+189 
virtuozzolinux-updates VirtuozzoLinux Updates 0 

Jehan . 


De: "jjs - mainphrame"  
À: "OpenVZ users"  
Cc: "Kevin Drysdale"  
Envoyé: Jeudi 2 Juillet 2020 18:22:33 
Objet: Re: [Users] Issues after updating to 7.0.14 (136) 

Jehan, are you running factory? 

My ovz hosts are up to date, and I see: 

[root@annie ~]# cat /etc/virtuozzo-release 
OpenVZ release 7.0.15 (222) 

Jake 


On Thu, Jul 2, 2020 at 9:08 AM Jehan Procaccia IMT <jehan.procac...@imtbs-tsp.eu> wrote: 



"updating to 7.0.14 (136)" !? 

I did an update yesterday , I am far behind that version 

# cat /etc/vzlinux-release 
Virtuozzo Linux release 7.8.0 (609) 

# uname -a 
Linux localhost 3.10.0-1127.8.2.vz7.151.14 #1 SMP Tue Jun 9 12:58:54 MSK 2020 
x86_64 x86_64 x86_64 GNU/Linux 

why don't you try to update to latest version ? 


On 29/06/2020 at 12:30, Kevin Drysdale wrote: 

Hello, 

After updating one of our OpenVZ VPS hosting nodes at the end of last week, 
we've started to have issues with corruption apparently occurring inside 
containers. Issues of this nature have never affected the node previously, and 
there do not appear to be any hardware issues that could explain this. 

Specifically, a few hours after updating, we began to see containers 
experiencing errors such as this in the logs: 

[90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25 
[90471.679022] EXT4-fs (ploop35454p1): initial error at time 1593205255: 
ext4_ext_find_extent:904: inode 136399 
[90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922: 
ext4_ext_find_extent:904: inode 136399 
[95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67 
[95189.954582] EXT4-fs (ploop42983p1): initial error at time 1593210174: 
htree_dirblock_to_tree:918: inode 926441: block 3683060 
[95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902: 
ext4_iget:4435: inode 1849777 
[95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42 
[95714.207447] EXT4-fs (ploop60706p1): initial error at time 1593210489: 
ext4_ext_find_extent:904: inode 136272 
[95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063: 
ext4_ext_find_extent:904: inode 136272 

Shutting the containers down and manually mounting and e2fsck'ing their 
filesystems did clear these errors, but each of the containers (which were 
mostly used for running Plesk) had widespread issues with corrupt or missing 
files after the fsck's completed, necessitating their being restored from 
backup. 

Concurrently, we also began to see messages like this appearing in 
/var/log/vzctl.log, which again have never appeared at any point prior to this 
update being installed: 

/var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole (check.c:240): 
Warning: ploop image '/vz/private/8288448/root.hdd/root.hds' is sparse 
/var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in fill_hole (check.c:240): 
Warning: ploop image '/vz/private/8288450/root.hdd/root.hds' is sparse 
/var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in fill_hole (check.c:240): 
Warning: ploop image '/vz/private/8288451/root.hdd/root.hds' is sparse 
/var/log/vzctl.log:2020-06-26T21:19:57+0100 : Error in fill_hole (check.c:240): 
Warning: ploop image '/vz/private/8288452/root.hdd/root.hds' is sparse 

The basic procedure we follow when updating our nodes is as follows (a rough 
sketch of the migration loop follows the list): 

1. Update the standby node we keep spare for this process 
2. vzmigrate all containers from the live node being updated to the standby 
node 
3. Update the live node 
4. Reboot the live node 
5. vzmigrate the containers from the standby node back to the live node they 
originally came from 
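For reference, a rough sketch of steps 2 and 5, assuming the plain "vzmigrate <destination> <CTID>" form; online/live-migration options vary by release, so check "vzmigrate --help" first, and the standby hostname is a placeholder: 

STANDBY=standby.example.com
# step 2: evacuate every running container from the live node to the standby node
for CT in $(vzlist -H -o ctid); do
    vzmigrate "$STANDBY" "$CT"
done
# ... update and reboot the live node ...
# step 5 is the same loop run on the standby node, pointing back at the live node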

So the only tool which has been used to affect these containers is 'vzmigrate' 
itself, so I'm at something of a loss as to how to explain the root.hdd images 
for these containers containing sparse gaps. This is something we have never 
done, as we have always been aware that OpenVZ does not support their use 
inside a container's hard drive image. And the fact that these images have 
suddenly become sparse at the same time they have started to exhibit filesystem 
corruption is somewhat concerning. 

We can restore all affected containers from backups, but I wanted to get in 
touch with the list to see if anyone else at any other site has experienced 
these or similar issues after applying the 7.0.14 (136) update. 

Thank you, 
Kevin Drysdale. 




___ 
Users mailing list 
Users@openvz.org 
https://lists.openvz.org/mailman/listinfo/users 





___ 
Users mailing list 
Users@openvz.org

Re: [Users] Issues after updating to 7.0.14 (136)

2020-07-02 Thread Konstantin Bukharov
Hello Kevin,

What was the OpenVZ version *before* the update to 7.0.14-136?

Sparse files for CTs have been around for at least two years.
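To answer that on a given node, a couple of stock rpm/yum queries are usually enough; the package names below are the usual OpenVZ 7 ones and may need adjusting:

rpm -qa --last | egrep 'vzkernel|ploop|vzctl' | head   # installed builds, newest first
yum history list vzkernel ploop                        # transactions that touched them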

Best regards,
Konstantin

-Original Message-
From: users-boun...@openvz.org  On Behalf Of Kevin 
Drysdale
Sent: Monday, June 29, 2020 1:30 PM
To: users@openvz.org
Subject: [Users] Issues after updating to 7.0.14 (136)

Hello,

After updating one of our OpenVZ VPS hosting nodes at the end of last week, 
we've started to have issues with corruption apparently occurring inside 
containers.  Issues of this nature have never affected the node previously, and 
there do not appear to be any hardware issues that could explain this.

Specifically, a few hours after updating, we began to see containers 
experiencing errors such as this in the logs:

[90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25
[90471.679022] EXT4-fs (ploop35454p1): initial error at time 1593205255: ext4_ext_find_extent:904: inode 136399
[90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922: ext4_ext_find_extent:904: inode 136399
[95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67
[95189.954582] EXT4-fs (ploop42983p1): initial error at time 1593210174: htree_dirblock_to_tree:918: inode 926441: block 3683060
[95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902: ext4_iget:4435: inode 1849777
[95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42
[95714.207447] EXT4-fs (ploop60706p1): initial error at time 1593210489: ext4_ext_find_extent:904: inode 136272
[95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063: ext4_ext_find_extent:904: inode 136272

Shutting the containers down and manually mounting and e2fsck'ing their 
filesystems did clear these errors, but each of the containers (which were 
mostly used for running Plesk) had widespread issues with corrupt or missing 
files after the fsck's completed, necessitating their being restored from 
backup.

Concurrently, we also began to see messages like this appearing in 
/var/log/vzctl.log, which again have never appeared at any point prior to this 
update being installed:

/var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole (check.c:240): 
Warning: ploop image '/vz/private/8288448/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in fill_hole (check.c:240): 
Warning: ploop image '/vz/private/8288450/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in fill_hole (check.c:240): 
Warning: ploop image '/vz/private/8288451/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:19:57+0100 : Error in fill_hole (check.c:240): 
Warning: ploop image '/vz/private/8288452/root.hdd/root.hds' is sparse

The basic procedure we follow when updating our nodes is as follows:

1. Update the standby node we keep spare for this process
2. vzmigrate all containers from the live node being updated to the standby node
3. Update the live node
4. Reboot the live node
5. vzmigrate the containers from the standby node back to the live node they originally came from

So the only tool which has been used to affect these containers is 'vzmigrate' 
itself, so I'm at something of a loss as to how to explain the root.hdd images 
for these containers containing sparse gaps.  This is something we have never 
done, as we have always been aware that OpenVZ does not support their use 
inside a container's hard drive image.  And the fact that these images have 
suddenly become sparse at the same time they have started to exhibit filesystem 
corruption is somewhat concerning.
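A quick way to see whether a ploop image really has become sparse is to compare its apparent size with the blocks actually allocated; one of the images named in the warnings above is used as an example:

IMG=/vz/private/8288448/root.hdd/root.hds
ls -ls "$IMG"                                # first column: allocated blocks, vs. the byte size
du -h --apparent-size "$IMG"; du -h "$IMG"   # sparse if apparent size >> actual disk usage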

We can restore all affected containers from backups, but I wanted to get in 
touch with the list to see if anyone else at any other site has experienced 
these or similar issues after applying the 7.0.14 (136) update.

Thank you,
Kevin Drysdale.




___
Users mailing list
Users@openvz.org
https://lists.openvz.org/mailman/listinfo/users


Re: [Users] Issues after updating to 7.0.14 (136)

2020-07-02 Thread jjs - mainphrame
Jehan, are you running factory?

My ovz hosts are up to date, and I see:

[root@annie ~]# cat /etc/virtuozzo-release
OpenVZ release 7.0.15 (222)

Jake


On Thu, Jul 2, 2020 at 9:08 AM Jehan Procaccia IMT <
jehan.procac...@imtbs-tsp.eu> wrote:

> "updating to 7.0.14 (136)" !?
>
> I did an update yesterday , I am far behind that version
>
> *# cat /etc/vzlinux-release*
> *Virtuozzo Linux release 7.8.0 (609)*
>
> *# uname -a *
> *Linux localhost 3.10.0-1127.8.2.vz7.151.14 #1 SMP Tue Jun 9 12:58:54 MSK
> 2020 x86_64 x86_64 x86_64 GNU/Linux*
>
> why don't you try to update to latest version ?
>
>
> On 29/06/2020 at 12:30, Kevin Drysdale wrote:
>
> Hello,
>
> After updating one of our OpenVZ VPS hosting nodes at the end of last
> week, we've started to have issues with corruption apparently occurring
> inside containers.  Issues of this nature have never affected the node
> previously, and there do not appear to be any hardware issues that could
> explain this.
>
> Specifically, a few hours after updating, we began to see containers
> experiencing errors such as this in the logs:
>
> [90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25
> [90471.679022] EXT4-fs (ploop35454p1): initial error at time 1593205255:
> ext4_ext_find_extent:904: inode 136399
> [90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922:
> ext4_ext_find_extent:904: inode 136399
> [95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67
> [95189.954582] EXT4-fs (ploop42983p1): initial error at time 1593210174:
> htree_dirblock_to_tree:918: inode 926441: block 3683060
> [95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902:
> ext4_iget:4435: inode 1849777
> [95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42
> [95714.207447] EXT4-fs (ploop60706p1): initial error at time 1593210489:
> ext4_ext_find_extent:904: inode 136272
> [95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063:
> ext4_ext_find_extent:904: inode 136272
>
> Shutting the containers down and manually mounting and e2fsck'ing their
> filesystems did clear these errors, but each of the containers (which were
> mostly used for running Plesk) had widespread issues with corrupt or
> missing files after the fsck's completed, necessitating their being
> restored from backup.
>
> Concurrently, we also began to see messages like this appearing in
> /var/log/vzctl.log, which again have never appeared at any point prior to
> this update being installed:
>
> /var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole
> (check.c:240): Warning: ploop image '/vz/private/8288448/root.hdd/root.hds'
> is sparse
> /var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in fill_hole
> (check.c:240): Warning: ploop image '/vz/private/8288450/root.hdd/root.hds'
> is sparse
> /var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in fill_hole
> (check.c:240): Warning: ploop image '/vz/private/8288451/root.hdd/root.hds'
> is sparse
> /var/log/vzctl.log:2020-06-26T21:19:57+0100 : Error in fill_hole
> (check.c:240): Warning: ploop image '/vz/private/8288452/root.hdd/root.hds'
> is sparse
>
> The basic procedure we follow when updating our nodes is as follows:
>
> 1. Update the standby node we keep spare for this process
> 2. vzmigrate all containers from the live node being updated to the
> standby node
> 3. Update the live node
> 4. Reboot the live node
> 5. vzmigrate the containers from the standby node back to the live node
> they originally came from
>
> So the only tool which has been used to affect these containers is
> 'vzmigrate' itself, so I'm at something of a loss as to how to explain the
> root.hdd images for these containers containing sparse gaps.  This is
> something we have never done, as we have always been aware that OpenVZ does
> not support their use inside a container's hard drive image.  And the fact
> that these images have suddenly become sparse at the same time they have
> started to exhibit filesystem corruption is somewhat concerning.
>
> We can restore all affected containers from backups, but I wanted to get
> in touch with the list to see if anyone else at any other site has
> experienced these or similar issues after applying the 7.0.14 (136) update.
>
> Thank you,
> Kevin Drysdale.
>
>
>
>
> ___
> Users mailing list
> Users@openvz.org
> https://lists.openvz.org/mailman/listinfo/users
>
>
> ___
> Users mailing list
> Users@openvz.org
> https://lists.openvz.org/mailman/listinfo/users
>
___
Users mailing list
Users@openvz.org
https://lists.openvz.org/mailman/listinfo/users


Re: [Users] Issues after updating to 7.0.14 (136)

2020-07-02 Thread Jehan Procaccia IMT

"updating to 7.0.14 (136)" !?

I did an update yesterday , I am far behind that version

# cat /etc/vzlinux-release
Virtuozzo Linux release 7.8.0 (609)

# uname -a
Linux localhost 3.10.0-1127.8.2.vz7.151.14 #1 SMP Tue Jun 9 12:58:54 
MSK 2020 x86_64 x86_64 x86_64 GNU/Linux

why don't you try to update to latest version ?


On 29/06/2020 at 12:30, Kevin Drysdale wrote:

Hello,

After updating one of our OpenVZ VPS hosting nodes at the end of last 
week, we've started to have issues with corruption apparently 
occurring inside containers.  Issues of this nature have never 
affected the node previously, and there do not appear to be any 
hardware issues that could explain this.


Specifically, a few hours after updating, we began to see containers 
experiencing errors such as this in the logs:


[90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25
[90471.679022] EXT4-fs (ploop35454p1): initial error at time 
1593205255: ext4_ext_find_extent:904: inode 136399
[90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922: 
ext4_ext_find_extent:904: inode 136399

[95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67
[95189.954582] EXT4-fs (ploop42983p1): initial error at time 
1593210174: htree_dirblock_to_tree:918: inode 926441: block 3683060
[95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902: 
ext4_iget:4435: inode 1849777

[95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42
[95714.207447] EXT4-fs (ploop60706p1): initial error at time 
1593210489: ext4_ext_find_extent:904: inode 136272
[95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063: 
ext4_ext_find_extent:904: inode 136272


Shutting the containers down and manually mounting and e2fsck'ing 
their filesystems did clear these errors, but each of the containers 
(which were mostly used for running Plesk) had widespread issues with 
corrupt or missing files after the fsck's completed, necessitating 
their being restored from backup.


Concurrently, we also began to see messages like this appearing in 
/var/log/vzctl.log, which again have never appeared at any point prior 
to this update being installed:


/var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole 
(check.c:240): Warning: ploop image 
'/vz/private/8288448/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in fill_hole 
(check.c:240): Warning: ploop image 
'/vz/private/8288450/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in fill_hole 
(check.c:240): Warning: ploop image 
'/vz/private/8288451/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:19:57+0100 : Error in fill_hole 
(check.c:240): Warning: ploop image 
'/vz/private/8288452/root.hdd/root.hds' is sparse


The basic procedure we follow when updating our nodes is as follows:

1. Update the standby node we keep spare for this process
2. vzmigrate all containers from the live node being updated to the 
standby node

3. Update the live node
4. Reboot the live node
5. vzmigrate the containers from the standby node back to the live 
node they originally came from


So the only tool which has been used to affect these containers is 
'vzmigrate' itself, so I'm at something of a loss as to how to explain 
the root.hdd images for these containers containing sparse gaps.  This 
is something we have never done, as we have always been aware that 
OpenVZ does not support their use inside a container's hard drive 
image.  And the fact that these images have suddenly become sparse at 
the same time they have started to exhibit filesystem corruption is 
somewhat concerning.


We can restore all affected containers from backups, but I wanted to 
get in touch with the list to see if anyone else at any other site has 
experienced these or similar issues after applying the 7.0.14 (136) 
update.


Thank you,
Kevin Drysdale.




___
Users mailing list
Users@openvz.org
https://lists.openvz.org/mailman/listinfo/users



___
Users mailing list
Users@openvz.org
https://lists.openvz.org/mailman/listinfo/users