Unfortunately, that is a separate problem: CRIU lacks user namespace support 
at the moment. It wouldn’t prevent phaul-client from starting on the source.
The migration failure itself is a bit tricky to investigate without remote 
access.

Since you’re saying it’s a clean, fresh install, I might be able to replicate 
the issue in my sandbox. I’ll set up two VMs with OpenVZ 7u15 and see how 
migration works for me.

Pavel Vokhmyanin
Software Developer, Virtuozzo R&D

Otradnaya street 2b/9, “Otradnoe” Techno Park | Moscow | Russia
Phone: +7 (495) 139 80 17, ext 77449 | pvokhmya...@virtuozzo.com
Skype: pvokhmyanin

https://virtuozzo.com/

From: users-boun...@openvz.org [mailto:users-boun...@openvz.org] On Behalf Of 
Joe Dougherty
Sent: Monday, March 22, 2021 1:48 PM
To: OpenVZ users
Subject: Re: [Users] Unable to perform migration between two fairly new OpenVZ 
7 nodes

It looks like the issue might be with the destination server. CRIU works fine 
on the source server, but I get this error when trying to suspend and restore 
on the destination server:

(00.134699) Error (criu/namespaces.c:481): Can't dump nested user namespace for 5051
(00.134708) Error (criu/namespaces.c:836): Can't make userns id
(00.140699) Error (criu/util.c:684): exited, status=1
(00.142811) Error (criu/util.c:684): exited, status=1
(00.144311) Error (criu/cr-dump.c:2275): Dumping FAILED.
Failed to checkpoint the Container
All dump files and logs were saved to /vz/private/10597/dump/Dump.fail
Checkpointing failed
Container is already running

Also confirmed that this command works on both servers: "python 
/usr/lib/python2.7/site-packages/phaul/shell/phaul_client.py --help"

On Mon, Mar 22, 2021 at 6:40 AM Pavel Vokhmyanin 
<pvokhmya...@virtuozzo.com> wrote:
I see, versions are correct, no problems there.

It might be a bit awkward to debug this further over email, but let’s try a 
couple of basic tests.

Does “vzctl suspend %CTID%; vzctl resume %CTID%” succeed? This tests whether 
CRIU has all its dependencies and is operational.
Also, do you get an exception if you run “python 
/usr/lib/python2.7/site-packages/phaul/shell/phaul_client.py --help”?
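
For anyone who wants to run the two checks above in one go, here is a rough 
sketch of a wrapper. It only wraps the commands quoted in this message; the 
function names and the "runner" parameter are made up for illustration, and 
the script is assumed to run as root on the OpenVZ node.

```python
import subprocess

# Path taken from the message above; adjust if phaul is installed elsewhere.
PHAUL_CLIENT = "/usr/lib/python2.7/site-packages/phaul/shell/phaul_client.py"


def diagnostic_commands(ctid):
    """Return the three commands from the message for one container ID."""
    ctid = str(ctid)
    return [
        ["vzctl", "suspend", ctid],
        ["vzctl", "resume", ctid],
        ["python", PHAUL_CLIENT, "--help"],
    ]


def run_diagnostics(ctid, runner=subprocess.call):
    """Run the checks and return a list of (command string, exit code).

    `runner` is a hypothetical hook so the logic can be dry-run without
    vzctl; by default each command is executed with subprocess.call.
    """
    results = []
    suspend_failed = False
    for command in diagnostic_commands(ctid):
        if command[:2] == ["vzctl", "resume"] and suspend_failed:
            continue  # nothing to resume if the suspend did not succeed
        code = runner(command)
        if command[:2] == ["vzctl", "suspend"] and code != 0:
            suspend_failed = True
        results.append((" ".join(command), code))
    return results


if __name__ == "__main__":
    # 10597 is the container ID used elsewhere in this thread.
    for command, code in run_diagnostics(10597):
        print("%s -> exit code %d" % (command, code))
```

A non-zero exit code from the suspend step points at CRIU, while a non-zero 
exit code from the last step points at the phaul installation.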

Right now we know that phaul-client doesn’t start on your source server, but 
we can’t tell exactly what the problem is. Hopefully these tests will give us 
more clues.

From: users-boun...@openvz.org [mailto:users-boun...@openvz.org] On Behalf Of 
Joe Dougherty
Sent: Monday, March 22, 2021 12:56 PM
To: OpenVZ users
Subject: Re: [Users] Unable to perform migration between two fairly new OpenVZ 
7 nodes

Source server:

# rpm -q vzmigrate phaul
vzmigrate-7.0.138-1.vz7.x86_64
phaul-0.1.76-1.vz7.noarch

Destination server:

# rpm -q vzmigrate phaul
vzmigrate-7.0.138-1.vz7.x86_64
phaul-0.1.76-1.vz7.noarch

On Mon, Mar 22, 2021 at 5:50 AM Pavel Vokhmyanin 
<pvokhmya...@virtuozzo.com> wrote:
Hello Joe,

These symptoms indicate that phaul failed to start.
The arguments phaul accepts changed recently. If you have an old vzmigrate 
with a new phaul, or vice versa, you could get this behavior.
You should have either phaul <=0.1.78 + vzmigrate <=7.0.140 OR phaul 0.1.79 + 
vzmigrate >=7.0.142.

Can you tell me which package versions you’re using on the source server?
# rpm -q vzmigrate phaul
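
The pairing rule above can be checked mechanically against the output of that 
rpm query. The sketch below is only an illustration: the function names are 
made up, "phaul 0.1.79" is assumed to mean 0.1.79 or newer, and any 
combination the rule does not mention (for example vzmigrate 7.0.141) is 
treated as a mismatch.

```python
import re


def parse_rpm_version(package_line, name):
    """Extract the dotted version from an `rpm -q` line as a tuple of ints.

    Example input: 'phaul-0.1.76-1.vz7.noarch' with name 'phaul'.
    """
    pattern = r"^" + re.escape(name) + r"-(\d+(?:\.\d+)*)-"
    match = re.match(pattern, package_line.strip())
    if match is None:
        raise ValueError("unexpected rpm -q line for %s: %r" % (name, package_line))
    return tuple(int(part) for part in match.group(1).split("."))


def versions_compatible(phaul, vzmigrate):
    """Apply the pairing rule quoted above to two version tuples."""
    # Old pair: phaul <= 0.1.78 together with vzmigrate <= 7.0.140.
    old_pair = phaul <= (0, 1, 78) and vzmigrate <= (7, 0, 140)
    # New pair: phaul 0.1.79 (assumed: or newer) with vzmigrate >= 7.0.142.
    new_pair = phaul >= (0, 1, 79) and vzmigrate >= (7, 0, 142)
    return old_pair or new_pair


if __name__ == "__main__":
    # Sample lines in the format printed by `rpm -q vzmigrate phaul`.
    vzmigrate = parse_rpm_version("vzmigrate-7.0.138-1.vz7.x86_64", "vzmigrate")
    phaul = parse_rpm_version("phaul-0.1.76-1.vz7.noarch", "phaul")
    print(versions_compatible(phaul, vzmigrate))
```

The check has to pass on both the source and the destination server, since 
vzmigrate and phaul run on each side of the migration.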

From: users-boun...@openvz.org [mailto:users-boun...@openvz.org] On Behalf Of 
Joe Dougherty
Sent: Monday, March 22, 2021 2:26 AM
To: OpenVZ users
Subject: [Users] Unable to perform migration between two fairly new OpenVZ 7 
nodes

I'm attempting to perform migrations between two nodes, one built about a month 
ago and the other built this morning. I can migrate the containers if I stop 
them first, but attempts to migrate them while powered on (both with --live 
and without) fail due to phaul-service.

The command I'm running:

vzmigrate -vvv --keep-dst --ssh="-p 2200 -i /root/.ssh/key" server2 10597

Here's the tail end of the output where it fails:

2021-03-21 19:13:08.979: Warm migration stage started
2021-03-21 19:13:08.979: Compression is enabled
2021-03-21 19:13:09.063: Io multiplexer peer aborted
2021-03-21 19:13:09.063: 2021-03-21 16:13:10.778: Phaul service failed to live migrate CT
2021-03-21 19:13:09.064: 2021-03-21 16:13:10.778: cmd 'runphaulmigr' error [-73] : Phaul service failed to live migrate CT
2021-03-21 19:13:09.064: Phaul service failed to live migrate CT
2021-03-21 19:13:09.064: Phaul failed to live migrate CT (/var/log/phaul.log)
2021-03-21 19:13:09.064: 2021-03-21 16:13:10.779: cleaning : rename : /vz/private/10597 -> /vz/private/10597.migrated
2021-03-21 19:13:09.064: 2021-03-21 16:13:10.779: cleaning : destroy CT 10597
2021-03-21 19:13:09.073: 2021-03-21 16:13:10.788: cleaning : 'rmdir' dir : /vz/root/10597
2021-03-21 19:13:09.074: 2021-03-21 16:13:10.788: can not find entry for delete : [/vz/root/10597]
2021-03-21 19:13:09.074: 2021-03-21 16:13:10.788: cleaning : rename : /vz/private/10597 -> /vz/private/10597.migrated
2021-03-21 19:13:09.074: 2021-03-21 16:13:10.788: can not move '/vz/private/10597' -> '/vz/private/10597.migrated' : No such file or directory
2021-03-21 19:13:09.074: 2021-03-21 16:13:10.788: Can't do correct cleaning: can not move '/vz/private/10597' -> '/vz/private/10597.migrated' : No such file or directory
2021-03-21 19:13:09.074: 2021-03-21 16:13:10.788: unlocking 10597
2021-03-21 19:13:09.075: Can't move/copy CT 10597 -> CT 10597, [], [] : Phaul failed to live migrate CT (/var/log/phaul.log)
2021-03-21 19:13:09.075: cleaning : 'rm' file : /vz/dump/10597-criu_err.log
2021-03-21 19:13:09.075: can not find entry for delete : [/vz/dump/10597-criu_err.log]
2021-03-21 19:13:10.075: unlocking 10597
2021-03-21 19:13:10.075: close channel

There is no log file.

# cat /var/log/phaul.log
cat: /var/log/phaul.log: No such file or directory

Any ideas on how I can fix this so I can begin performing migrations without 
having to power off the containers first?

Thank you.

-Joe

_______________________________________________
Users mailing list
Users@openvz.org
https://lists.openvz.org/mailman/listinfo/users


--
-Joe Dougherty
Chief Operating Officer
Secure Dragon LLC
www.SecureDragon.net