Re: [Xen-ia64-devel] Re: PATCH: live migration
Le Mardi 25 Juillet 2006 19:07, Alex Williamson a écrit : On Tue, 2006-07-25 at 10:43 +0200, Tristan Gingold wrote: +static int xc_ia64_shadow_control(int xc_handle, + uint32_t domid, + unsigned int sop, + unsigned long *dirty_bitmap, + unsigned long pages, + xc_shadow_control_stats_t *stats) +{ +if (dirty_bitmap != NULL) { +int i; + +/* Touch the page so that it is in the TC. + FIXME: improve this!!! */ +for (i = 0; i pages / 8; i += PAGE_SIZE) +dirty_bitmap[i / sizeof (unsigned long)] = 0; +dirty_bitmap[pages / BITS_PER_LONG] = 0; +} Hi Tristan, Ok, I think I understand this now, but it still seems hard to follow. What about something like this: --- unsigned char *bmap = (unsigned char *)dirty_bitmap; unsigned long bmap_bytes = ((pages + BITS_PER_LONG - 1) ~(BITS_PER_LONG - 1)) / 8; unsigned int bmap_pages = ((bmap_bytes + PAGE_SIZE - 1) ~(PAGE_SIZE - 1)) / PAGE_SIZE; for (i = 0 ; i bmap_pages ; i++) bmap[i * PAGE_SIZE] = 0; Ok. But the last byte has still to be touched: Suppose the bitmap is 1 PAGE_SIZE long, ie bmap_pages is 1. Suppose dirty_bitmap is allocated accross two physical page (possible because malloc'ed). The for loop will only touch the first physical page! --- It's still a little scary that we're relying on touching memory for it to have a TC entry. I guess I can live with that for now. Thanks, I am too. Contraty to xc_ia64_get_pfn_list which can restart, xc_shadow_control can't be restarted. Therefore if the TC is flushed, the live migration failed. Fortunatly the whole process can be restarted: the domain does not crash. In my opinion the best method is to use guest pseudo physical address instead of virtual address. The logic would be more complex (the virtual to physical conversion has to be done in Linux) but much more reliable. I have to look on ppc sources. Tristan. ___ Xen-ia64-devel mailing list Xen-ia64-devel@lists.xensource.com http://lists.xensource.com/xen-ia64-devel
Re: [Xen-ia64-devel] Re: PATCH: live migration
Le Lundi 24 Juillet 2006 20:47, Alex Williamson a écrit : Hi Tristan, Sorry for the delay, I didn't have as much spare time last week at OLS as I was hoping. A few comments on this patch below. Thanks, Alex On Tue, 2006-07-18 at 11:03 +0200, Tristan Gingold wrote: + +#define BITS_PER_LONG (sizeof(unsigned long) * 8) +#define BITMAP_SIZE ((max_pfn + BITS_PER_LONG - 1) / 8) Looks like this is just borrowed from the x86 flavors, but I'm not sure it makes sense. It appears we're rounding BITMAP_SIZE up, but why not round it up to an even multiple of longs? Would something like this work better: (((max_pfn + (BITS_PER_LONG - 1)) ~(BITS_PER_LONG - 1)) / 8) Makes sense! +/* Dirtied pages won't be saved. + slightly wasteful to peek the whole array evey time, + but this is fast enough for the moment. */ +if (!last_iter) { +/* FIXME!! */ +for (i = 0; i BITMAP_SIZE; i += PAGE_SIZE) +to_send[i] = 0; This zero'ing loop is repeated in several places, but it doesn't make sense to me. BITMAP_SIZE is in bytes, to_send is an unsigned long pointer, and the PAGE_SIZE increment seems rather random. Looks like it should segfault and only very sparsely zero the bitmap array. Am I missing the point? You are right about the possible segfault: I should have written to_send[i/sizeof(unsigned long))]. The purpose of this loop is to setup the translation cache. I will create a function to do this and the hypercall. + free (page_array); - +free (to_send); +free (to_skip); Shouldn't we check for NULL before free'ing? if (to_send) free(to_send); etc... This is not required according to ansi-c: The free function causes the space pointed to by ptr to be deallocated, that is, made available for further allocation. If ptr is a null pointer, no action occurs. + atomic64_set (d-arch.shadow_fault_count, 0); + atomic64_set (d-arch.shadow_dirty_count, 0); + + d-arch.shadow_bitmap_size = (d-max_pages + 63) ~63; 63 may be an obvious value, but I prefer the (BITS_PER_LONG - 1) usage in the userspace tools. Magic numbers are bad. Sure. + if ( sc-pages d-arch.shadow_bitmap_size ) + sc-pages = d-arch.shadow_bitmap_size; + +#define chunk (8*1024) /* Transfer and clear in 1kB chunks for L1 cache. */ Please move this #define out of the function and rename it to something in all caps so it's easy to recognize as a macro. Definitly a style point! This was copied from x86. + for ( i = 0; i sc-pages; i += chunk ) + { + int bytes = sc-pages - i) chunk) ? + chunk : (sc-pages - i)) + 7) / 8; + + if ( copy_to_guest_offset( +sc-dirty_bitmap, +i/(8*sizeof(unsigned long)), +d-arch.shadow_bitmap +(i/(8*sizeof(unsigned long))), BITS_PER_LONG would seem to be a useful simplification here. Ok. + if ( sc-pages d-arch.shadow_bitmap_size ) + sc-pages = d-arch.shadow_bitmap_size; + + if ( copy_to_guest(sc-dirty_bitmap, + d-arch.shadow_bitmap, + (((sc-pages+7)/8)+sizeof(unsigned long)-1) / + sizeof(unsigned long)) ) A comment might be in order for the calculations going on in this last parameter. Ok. +/* Bitmap of shadow dirty bits. + Set iff shadow mode is enabled. */ +u64 *shadow_bitmap; +/* Length (in byte) of shadow bitmap. */ +unsigned long shadow_bitmap_size; The usage of shadow_bitmap_size seems to be in bits. Is this comment correct? Oups, you are right :-) Thank you for this review. I will update the patch. Tristan. ___ Xen-ia64-devel mailing list Xen-ia64-devel@lists.xensource.com http://lists.xensource.com/xen-ia64-devel
[Xen-ia64-devel] Re: PATCH: live migration
Hi Tristan, Sorry for the delay, I didn't have as much spare time last week at OLS as I was hoping. A few comments on this patch below. Thanks, Alex On Tue, 2006-07-18 at 11:03 +0200, Tristan Gingold wrote: + +#define BITS_PER_LONG (sizeof(unsigned long) * 8) +#define BITMAP_SIZE ((max_pfn + BITS_PER_LONG - 1) / 8) Looks like this is just borrowed from the x86 flavors, but I'm not sure it makes sense. It appears we're rounding BITMAP_SIZE up, but why not round it up to an even multiple of longs? Would something like this work better: (((max_pfn + (BITS_PER_LONG - 1)) ~(BITS_PER_LONG - 1)) / 8) +/* Dirtied pages won't be saved. + slightly wasteful to peek the whole array evey time, + but this is fast enough for the moment. */ +if (!last_iter) { +/* FIXME!! */ +for (i = 0; i BITMAP_SIZE; i += PAGE_SIZE) +to_send[i] = 0; This zero'ing loop is repeated in several places, but it doesn't make sense to me. BITMAP_SIZE is in bytes, to_send is an unsigned long pointer, and the PAGE_SIZE increment seems rather random. Looks like it should segfault and only very sparsely zero the bitmap array. Am I missing the point? + free (page_array); - +free (to_send); +free (to_skip); Shouldn't we check for NULL before free'ing? if (to_send) free(to_send); etc... + + atomic64_set (d-arch.shadow_fault_count, 0); + atomic64_set (d-arch.shadow_dirty_count, 0); + + d-arch.shadow_bitmap_size = (d-max_pages + 63) ~63; 63 may be an obvious value, but I prefer the (BITS_PER_LONG - 1) usage in the userspace tools. Magic numbers are bad. + + if ( sc-pages d-arch.shadow_bitmap_size ) + sc-pages = d-arch.shadow_bitmap_size; + +#define chunk (8*1024) /* Transfer and clear in 1kB chunks for L1 cache. */ Please move this #define out of the function and rename it to something in all caps so it's easy to recognize as a macro. + for ( i = 0; i sc-pages; i += chunk ) + { + int bytes = sc-pages - i) chunk) ? + chunk : (sc-pages - i)) + 7) / 8; + + if ( copy_to_guest_offset( +sc-dirty_bitmap, +i/(8*sizeof(unsigned long)), +d-arch.shadow_bitmap +(i/(8*sizeof(unsigned long))), BITS_PER_LONG would seem to be a useful simplification here. + + if ( sc-pages d-arch.shadow_bitmap_size ) + sc-pages = d-arch.shadow_bitmap_size; + + if ( copy_to_guest(sc-dirty_bitmap, + d-arch.shadow_bitmap, + (((sc-pages+7)/8)+sizeof(unsigned long)-1) / + sizeof(unsigned long)) ) A comment might be in order for the calculations going on in this last parameter. +/* Bitmap of shadow dirty bits. + Set iff shadow mode is enabled. */ +u64 *shadow_bitmap; +/* Length (in byte) of shadow bitmap. */ +unsigned long shadow_bitmap_size; The usage of shadow_bitmap_size seems to be in bits. Is this comment correct? -- Alex Williamson HP Open Source Linux Org. ___ Xen-ia64-devel mailing list Xen-ia64-devel@lists.xensource.com http://lists.xensource.com/xen-ia64-devel