On Mon, 2013-01-21 at 13:32 +0400, Stanislav Kinsbursky wrote:
> > On Fri, 2013-01-18 at 16:26 +0100, Roman Haefeli wrote:
> >> Hi all
> >>
> >> Only recently I discovered that online migration seems to work for us
> >> now. A CT on NFS, or NFS mounted inside a CT, is a non-issue now.
> >>
> >> We are running all our CTs on an NFS filesystem shared between
> >> hostnodes. While checkpointing and restoring work flawlessly with that
> >> setup, I noticed that "vzctl chkpnt CTID" writes to the NFS mount more
> >> slowly than a dd write, for instance.
> >>
> >> 'dd if=/dev/zero of=/mnt/nfs/deleteme bs=1M' writes at approx. 70MB/s.
> >> 'iotop' shows that 'vzctl chkpnt CTID' writes at only 18MB/s to the
> >> same dir.
> >>
> >> However, the write speed is similar to dd's when I use the 'bs=8k'
> >> option for dd. This makes me assume that quite some write performance
> >> could be gained if checkpointing wrote bigger blocks at a time. I
> >> haven't read the respective source code to confirm my assumption that
> >> small block sizes are used, as my skills are far too limited, but if
> >> that really is the case, wouldn't it make sense to use bigger writes
> >> in order to improve checkpointing performance?
> >>
> >> What do you think?
> >
> > I found a way to speed up checkpointing so that it uses the maximum
> > possible write speed. On the hostnodes we mount the NFS share with the
> > 'sync' mount option (in order to avoid mutual storage lags between the
> > CTs). However, when using 'async' the write speed no longer depends on
> > the block size and is always fast, which means checkpointing is also
> > fast with 'async'. As we still want the CTs' private areas to be
> > mounted with 'sync', the solution was to use a separate mount for the
> > /vz/dump directory with the mount option 'async'. This way we can
> > achieve maximum checkpointing speed (which is ~70MB/s on our machines).
>
> Hello, Roman.
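[Editor's note: the split described above could look roughly like the following /etc/fstab sketch. The server name and export paths are made-up examples, not taken from this thread.]

```
# /etc/fstab sketch - server name and export paths are examples only.
# CT private areas keep 'sync'; only the dump directory gets 'async'.
nfsserver:/export/vz    /vz       nfs  sync,hard   0 0
nfsserver:/export/dump  /vz/dump  nfs  async,hard  0 0
```

The second mount overlays /vz/dump, so writes under /vz keep the slower but safer 'sync' semantics while checkpoint images written to /vz/dump get the fast 'async' behaviour.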
> It's not correct to compare checkpointing to dd, because checkpointing
> doesn't perform sequential writes. We perform a lot of disk seek
> operations during checkpointing.
I see.

> And yes, NFS works much faster in async mode than in sync (async mode
> masks seek operations and allows many writes to be performed before
> waiting for the attribute update from the server). This can help you
> reduce CPT time.

It significantly does so in our case.

> But you have to fsync the resulting checkpoint image on the source node
> to make sure that it's consistent on shared storage before resuming on
> another node.

The checkpointing and restoring is done by Pacemaker - by the ManageVE
resource agent [1], to be precise. I checked the script, and as far as I
can see it doesn't take any precautions to make sure everything is
synced before restoring. I had the impression that everything was
running fine. What would be the effect of restoring from an incomplete
dump file? Would I immediately notice it (if it works at all)?

Roman

[1] https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/ManageVE

_______________________________________________
Users mailing list
Users@openvz.org
https://lists.openvz.org/mailman/listinfo/users
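[Editor's note: regarding the fsync concern above, here is a minimal sketch of flushing a dump file to stable storage before the other node restores from it. The dump path is a hypothetical stand-in, not the real /vz/dump layout, and 'sync FILE' requires coreutils >= 8.24; older systems fall back to a plain 'sync'.]

```shell
# Sketch: flush a dump file to stable storage before restoring elsewhere.
# DUMPFILE is a made-up stand-in for a real /vz/dump/Dump.<CTID> image.
DUMPFILE=/tmp/Dump.101

# Write the file in small 8k blocks, similar to the checkpointer's pattern.
dd if=/dev/zero of="$DUMPFILE" bs=8k count=4 2>/dev/null

# Flush just this file if coreutils supports it, else flush everything.
sync "$DUMPFILE" 2>/dev/null || sync

echo "flushed $(stat -c%s "$DUMPFILE") bytes"
```

With an async mount, only after this step is it safe for the resource agent on the other node to run 'vzctl restore'.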