Re: [Xen-devel] [Planning for Xen-4.6] Migration v2
On 11/26/2014 03:54 AM, Andrew Cooper wrote:

> Hello,
>
> The purpose of this email is to plan how to progress the migration v2 series through to being merged. I believe I have CC'd everyone with a specific interest in this area, but apologies if I have missed anyone.
>
> Migration v2 is in exclusive use in XenServer 6.5. We primarily developed migration v2 because we needed a 32-bit to 64-bit toolstack upgrade path. The code has all the features XenServer previously supported, and we consider it fully baked and without any known bugs, including transparent legacy-to-v2 conversion on upgrade.
>
> We did endeavour to get migration v2 into Xen 4.5, but regrettably this did not happen. A consequence of this, along with the code being in XenServer 6.5, is that the wire format is now set in stone. Luckily, it has been explicitly designed to be easy to extend in a forward-compatible manner, so this is not a problem moving forward.
>
> The expectation is that the migration v2 code will completely replace the existing migration code, which will involve removing xc_domain_save.c and xc_domain_restore.c, as well as assorted other orphaned code in libxenctrl and libxenguest.
>
> There are 3 areas of concern which have been identified so far.
>
> 1) TMEM support
>
> Migration v2 doesn't currently have any tmem migration support. The maintainers have been asked whether they actually expect legacy tmem migration to work, but I have not heard any reply yet. At the very least, migration v2 tmem support would want some new thought put into the wire protocol. I am hoping that, as TMEM is still tech preview and still in the process of having XSA-15 fixed, working tmem migration v2 is not insisted upon as a prerequisite.
>
> 2) Remus/COLO support
>
> Migration v2 doesn't currently have any Remus support. There was a draft series which added Remus support, and showed that it was particularly simple to add Remus support to migration v2. I integrated several bugfixes as a side effect of that series, but the actual Remus content needed a refresh. This got delayed behind the Remus libxl effort. It is my hope that the Remus maintainers can refresh that series and provide assistance while testing.

Sure, I'm planning to refresh the patches as soon as the Xen 4.6 merge window opens. I am also going to start work on the libxl side, because the libxl part of migration v2 is already done (although not fully finished?). We also hope COLO support will go into Xen 4.6.

> 3) Libxl and xl support
>
> Libxl and xl have as many problems as the libxc code did when it comes to incompatible wire formats and layering violations. In particular, it is not possible to determine the bitness of the sending libxl-saverestore-helper, meaning that legacy conversion requires active administrator input, or at least a passive assumption that the bitness is the same.
>
> There is an xl/libxl part of the migration v2 series which attempts to rectify this all in one go, as there is no alternative way of doing so. The libxl section of the series is certainly not yet complete, but specific queries to the maintainers have thus far gone unanswered. On the other hand, the series does basically WorksForMe, including transparent legacy upgrade, suggesting that it is at least in an appropriate ballpark.
>
> *) Specific non-requirements:
>
> There have been issues identified with dynamic (in a p2m sense) guests and migration, which result in failed migration or image corruption. While these issues certainly want fixing, they are bugs which exist in the legacy code. As such, they are not prerequisites to fix before v2 can be accepted.
>
> Anyway, it is my hope that this planning email can help get things on track to start pursuing active development again as soon as the 4.6 dev window opens, with the aim to get all the code merged as early as possible in the dev window to allow as much testing as possible.
>
> ~Andrew

-- 
Thanks,
Yang.
___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Planning for Xen-4.6] Migration v2
On Wed, 2014-11-26 at 17:39 +, Andrew Cooper wrote:

> > IMHO this is fine. It essentially means that for xl users there is some delayed gratification wrt the promise of migration between non-alike dom0s. The migration from 4.5 (legacy) to 4.6 (v2) won't support such migrations, but the next step from 4.6 (v2) to 4.7 (v2) will.
>
> Two options exist.
>
> 1) Assume that the sending bitness is the same as the receiving bitness. This is already the status quo, and will require that the two dom0s are the same width.

As I said above, I think this is absolutely acceptable as a transitional step.

> 2) Allow the administrator to specify the bitness of the sending side. In this case, xl 4.5 (legacy) to 4.6 (v2) works even cross-bitness.

If this is trivial to plumb in and you are motivated to do so, then this seems like a reasonable enough stretch goal.

Ian.
Re: [Xen-devel] [Planning for Xen-4.6] Migration v2
On Tue, Nov 25, Andrew Cooper wrote:

> The purpose of this email is to plan how to progress the migration v2 series through to being merged. I believe I have CC'd everyone with a specific interest in this area, but apologies if I have missed anyone.

While you mow that lawn, did you guys think of handling downtime of the migrated VM? I added some knobs to abort migration in a very libxc-specific way. What I would like to see is a simple user interface for virsh/xl to control the downtime. See the thread "limit downtime during life migration from xl/virsh":
http://lists.xenproject.org/archives/html/xen-devel/2014-03/msg00785.html

Olaf
Re: [Xen-devel] [Planning for Xen-4.6] Migration v2
On Wed, Nov 26, Andrew Cooper wrote:

> It is certainly my hope going forward that different knobs can be exposed. One thing I think would be interesting is some proper calculations of the delta in the dirty set, and offering a threshold which chooses between "pause and complete" or "abort the migration and complain that the VM is too active".

The "pause and complete" step is what causes unexpected time jumps in the guest. It would be nice if that could be controlled with a knob.

Olaf
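The threshold idea discussed above could look roughly like the sketch below. It is only an illustration: the names (`Decision`, `decide`), the 4 KiB page size, and the inputs (per-iteration dirty page counts, a measured bandwidth, a downtime limit, an iteration cap) are all assumptions and do not correspond to any existing libxc, libxl or xl interface.

```python
from enum import Enum

PAGE_SIZE = 4096  # bytes per page; assumed for illustration


class Decision(Enum):
    KEEP_ITERATING = "keep iterating"
    PAUSE_AND_COMPLETE = "pause and complete"
    ABORT = "abort: VM is too active"


def estimated_downtime_s(dirty_pages, bandwidth_bytes_per_s):
    """Time needed to send the remaining dirty pages with the guest paused."""
    return dirty_pages * PAGE_SIZE / bandwidth_bytes_per_s


def decide(dirty_history, bandwidth_bytes_per_s, max_downtime_s, max_iterations):
    """Pick the next step from the dirty page count of each iteration so far.

    dirty_history must hold at least one sample; the last entry is the
    most recent iteration.
    """
    current = dirty_history[-1]
    if estimated_downtime_s(current, bandwidth_bytes_per_s) <= max_downtime_s:
        # The final copy fits inside the downtime the administrator allows.
        return Decision.PAUSE_AND_COMPLETE
    # A negative delta between iterations means the dirty set is shrinking.
    converging = len(dirty_history) < 2 or current < dirty_history[-2]
    if converging and len(dirty_history) < max_iterations:
        return Decision.KEEP_ITERATING
    return Decision.ABORT
```

With a knob like `max_downtime_s`, the pause is only taken when the estimated final copy is short enough; otherwise the migration either keeps iterating or gives up instead of causing a long time jump in the guest.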
Re: [Xen-devel] [Planning for Xen-4.6] Migration v2
On Tue, 2014-11-25 at 19:54 +, Andrew Cooper wrote:

> There is an xl/libxl part of the migration v2 series which attempts to rectify this all in one go, as there is no alternative way of doing so. The libxl section of the series is certainly not yet complete, but specific queries to the maintainers have thus far gone unanswered. On the other hand, the series does basically WorksForMe, including transparent legacy upgrade, suggesting that it is at least in an appropriate ballpark.

Is this, from [PATCH 27/29] [VERY RFC] tools/libxl: Support restoring legacy streams:

    This WorksForMe in the success case, but the error handling is certainly lacking. Specifically, the conversion script's output fd can't be closed until the v2 read loop has exited (cleanly or otherwise), without risking a close()/open() race silently replacing the fd behind the loop's back. However, it can't be closed when the read loop exits, as the conversion script child might still be alive, and would prefer terminating cleanly to failing with a bad FD.

    Obviously, having one error handler block for the success/failure of the other side is a no-go, and would still involve preselecting which was expected to exit first.

    Does anyone have any clever ideas of how to asynchronously collect the events "the conversion script has exited", "the save helper has exited" and "the v2 read loop has finished" given the available infrastructure, to kick off a combined cleanup of all 3?

?

I said then:

    This is probably one for Ian when he gets back, but a state machine which is cranked in response to the callbacks from the various completion events might be one way to approach this.

Prodding Ian again (by moving to the To: line...)

Were there any other questions? I've had a scrobble through the bit of v7 which 00/29 suggests might contain them, but that's the only one I saw.

Ian.
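The "state machine cranked by completion callbacks" suggestion can be sketched as follows. This is a language-neutral illustration written in Python, not libxl code: the class name, the event names and the `complete()` method are all hypothetical, and a real implementation would hook into libxl's own event and child-process callbacks.

```python
class CombinedCleanup:
    """Run one cleanup once all three completion events have been seen."""

    # Hypothetical event names, taken from the wording of the question above.
    EVENTS = frozenset({
        "conversion_script_exited",
        "save_helper_exited",
        "read_loop_finished",
    })

    def __init__(self, cleanup):
        self._pending = set(self.EVENTS)
        self._all_ok = True
        self._done = False
        self._cleanup = cleanup  # called as cleanup(all_ok) exactly once

    def complete(self, event, ok=True):
        """Crank the state machine from a completion callback, in any order."""
        if event not in self.EVENTS:
            raise ValueError(f"unknown event: {event}")
        if self._done:
            return  # cleanup already ran; ignore late or duplicate reports
        self._pending.discard(event)
        self._all_ok = self._all_ok and ok
        if not self._pending:
            # Nothing is using the shared fd any more, so it is safe to
            # close it and tear everything down together.
            self._done = True
            self._cleanup(self._all_ok)
```

Because no event handler blocks on another, there is no need to preselect which of the three is expected to finish first; whichever callback arrives last triggers the cleanup.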
Re: [Xen-devel] [Planning for Xen-4.6] Migration v2
On Tue, 2014-11-25 at 19:54 +, Andrew Cooper wrote:

> 3) Libxl and xl support
>
> Libxl and xl have as many problems as the libxc code did when it comes to incompatible wire formats and layering violations. In particular, it is not possible to determine the bitness of the sending libxl-saverestore-helper, meaning that legacy conversion requires active administrator input, or at least a passive assumption that the bitness is the same.

IOW when migrating legacy-to-new we have the same restriction as we do today in the purely legacy world, which is that the two dom0s must have matching bit widths?

IMHO this is fine. It essentially means that for xl users there is some delayed gratification wrt the promise of migration between non-alike dom0s. The migration from 4.5 (legacy) to 4.6 (v2) won't support such migrations, but the next step from 4.6 (v2) to 4.7 (v2) will.

Ian.
Re: [Xen-devel] [Planning for Xen-4.6] Migration v2
On 26/11/14 16:50, Ian Campbell wrote:

> On Tue, 2014-11-25 at 19:54 +, Andrew Cooper wrote:
> > 3) Libxl and xl support
> >
> > Libxl and xl have as many problems as the libxc code did when it comes to incompatible wire formats and layering violations. In particular, it is not possible to determine the bitness of the sending libxl-saverestore-helper, meaning that legacy conversion requires active administrator input, or at least a passive assumption that the bitness is the same.
>
> IOW when migrating legacy-to-new we have the same restriction as we do today in the purely legacy world, which is that the two dom0s must have matching bit widths?

The legacy-to-new conversion removes bitness from the equation, but the bitness of the legacy side is an input parameter to the conversion.

For XenServer, this is easy, as all older versions of XenServer are 32-bit. This version and future versions will use the new format, where bitness is specifically irrelevant.

For xl, this is harder. There exist both 32-bit and 64-bit versions doing legacy migration, and on the receiving side it is impossible to determine which one sent the stream, given only the incoming stream.

> IMHO this is fine. It essentially means that for xl users there is some delayed gratification wrt the promise of migration between non-alike dom0s. The migration from 4.5 (legacy) to 4.6 (v2) won't support such migrations, but the next step from 4.6 (v2) to 4.7 (v2) will.

Two options exist.

1) Assume that the sending bitness is the same as the receiving bitness. This is already the status quo, and will require that the two dom0s are the same width.

2) Allow the administrator to specify the bitness of the sending side. In this case, xl 4.5 (legacy) to 4.6 (v2) works even cross-bitness.

~Andrew
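The two options can be combined into one rule: use the administrator's value when one is given, otherwise fall back to the receiver's own width. The sketch below illustrates that rule only; `legacy_source_bitness` and its parameters are hypothetical names and do not reflect any actual xl option or libxl API.

```python
import struct


def native_bitness():
    """Pointer width, in bits, of the process doing the receiving."""
    return struct.calcsize("P") * 8


def legacy_source_bitness(admin_override=None, receiver_bitness=None):
    """Choose the bitness to feed into the legacy stream conversion.

    Option 1: with no override, assume the sender matches the receiver.
    Option 2: an explicit administrator value (32 or 64) takes precedence.
    """
    if receiver_bitness is None:
        receiver_bitness = native_bitness()
    if admin_override is None:
        return receiver_bitness
    if admin_override not in (32, 64):
        raise ValueError("sender bitness must be 32 or 64")
    return admin_override
```

The stream itself is never inspected, which matches the constraint stated above: the receiver cannot tell the sender's width from the incoming legacy stream alone.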
[Xen-devel] [Planning for Xen-4.6] Migration v2
Hello,

The purpose of this email is to plan how to progress the migration v2 series through to being merged. I believe I have CC'd everyone with a specific interest in this area, but apologies if I have missed anyone.

Migration v2 is in exclusive use in XenServer 6.5. We primarily developed migration v2 because we needed a 32-bit to 64-bit toolstack upgrade path. The code has all the features XenServer previously supported, and we consider it fully baked and without any known bugs, including transparent legacy-to-v2 conversion on upgrade.

We did endeavour to get migration v2 into Xen 4.5, but regrettably this did not happen. A consequence of this, along with the code being in XenServer 6.5, is that the wire format is now set in stone. Luckily, it has been explicitly designed to be easy to extend in a forward-compatible manner, so this is not a problem moving forward.

The expectation is that the migration v2 code will completely replace the existing migration code, which will involve removing xc_domain_save.c and xc_domain_restore.c, as well as assorted other orphaned code in libxenctrl and libxenguest.

There are 3 areas of concern which have been identified so far.

1) TMEM support

Migration v2 doesn't currently have any tmem migration support. The maintainers have been asked whether they actually expect legacy tmem migration to work, but I have not heard any reply yet. At the very least, migration v2 tmem support would want some new thought put into the wire protocol. I am hoping that, as TMEM is still tech preview and still in the process of having XSA-15 fixed, working tmem migration v2 is not insisted upon as a prerequisite.

2) Remus/COLO support

Migration v2 doesn't currently have any Remus support. There was a draft series which added Remus support, and showed that it was particularly simple to add Remus support to migration v2. I integrated several bugfixes as a side effect of that series, but the actual Remus content needed a refresh. This got delayed behind the Remus libxl effort. It is my hope that the Remus maintainers can refresh that series and provide assistance while testing.

3) Libxl and xl support

Libxl and xl have as many problems as the libxc code did when it comes to incompatible wire formats and layering violations. In particular, it is not possible to determine the bitness of the sending libxl-saverestore-helper, meaning that legacy conversion requires active administrator input, or at least a passive assumption that the bitness is the same.

There is an xl/libxl part of the migration v2 series which attempts to rectify this all in one go, as there is no alternative way of doing so. The libxl section of the series is certainly not yet complete, but specific queries to the maintainers have thus far gone unanswered. On the other hand, the series does basically WorksForMe, including transparent legacy upgrade, suggesting that it is at least in an appropriate ballpark.

*) Specific non-requirements:

There have been issues identified with dynamic (in a p2m sense) guests and migration, which result in failed migration or image corruption. While these issues certainly want fixing, they are bugs which exist in the legacy code. As such, they are not prerequisites to fix before v2 can be accepted.

Anyway, it is my hope that this planning email can help get things on track to start pursuing active development again as soon as the 4.6 dev window opens, with the aim to get all the code merged as early as possible in the dev window to allow as much testing as possible.

~Andrew
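For readers unfamiliar with what "easy to extend in a forward-compatible manner" can mean for a frozen wire format, the sketch below shows one common approach: a stream of type/length/body records in which a reader skips unknown records that are flagged optional and rejects unknown mandatory ones. Everything here is an assumption made for illustration (the header layout, the `OPTIONAL_BIT` flag, the function names); it is a generic example and should not be read as the migration v2 stream specification.

```python
import io
import struct

HEADER = struct.Struct("<II")  # assumed layout: 32-bit type, 32-bit body length
OPTIONAL_BIT = 0x80000000      # assumed flag: set on records that may be skipped


def pack_record(rtype, body):
    """Serialise one record as header followed by body."""
    return HEADER.pack(rtype, len(body)) + body


def read_records(stream, handlers):
    """Dispatch each record to its handler, skipping unknown optional ones."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return  # clean end of stream
        if len(header) < HEADER.size:
            raise ValueError("truncated record header")
        rtype, length = HEADER.unpack(header)
        body = stream.read(length)
        if len(body) < length:
            raise ValueError("truncated record body")
        if rtype in handlers:
            handlers[rtype](body)
        elif rtype & OPTIONAL_BIT:
            continue  # record added by a newer sender; safe to ignore
        else:
            raise ValueError(f"unknown mandatory record type {rtype:#x}")
```

Under this scheme a newer sender can add optional records without breaking older receivers, while anything the receiver must understand stays mandatory and fails loudly.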