Re: [Xen-devel] Re: [XenPPC] [PATCH 2/7] xencomm take 3: xen side preparation for consolidation.
On Tue, Aug 14, 2007 at 09:48:00AM -0500, Hollis Blanchard wrote:
> However, there are a few places below where you call memcpy() without
> checking the result of xencomm_maddr_to_vaddr(). Actually, I see the
> same issue in the original code in a few places... We should be very
> very careful here, since a guest passing a bad paddr could result in Xen
> overwriting 0x0.

Thank you for the comments. The next patch (3/7) addresses those issues, i.e. checking guest-supplied values and avoiding races. I intentionally kept this patch (2/7) as small as possible, leaving those fixes to the next patch (3/7).

Since we can work around the populate physmap issue, it's OK for me to drop multi-page support. But we need the next patch or something similar. If you dislike the implementation, I'm willing to refine it. Please suggest an alternative.

--
yamahata

___
Xen-ppc-devel mailing list
Xen-ppc-devel@lists.xensource.com
http://lists.xensource.com/xen-ppc-devel
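The hazard raised in the review — calling memcpy() on an unchecked address translation, so a bad guest paddr makes the hypervisor write through NULL — can be sketched as follows. This is a minimal illustration with hypothetical helper names (`maddr_to_vaddr_checked`, `copy_from_guest_paddr`), not the actual Xen code; it only shows the checked-translation pattern the thread is arguing for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a paddr -> hypervisor-virtual translation.
 * Returns NULL when the guest-supplied address is invalid. */
static void *maddr_to_vaddr_checked(unsigned long maddr)
{
    if (maddr == 0)          /* translation failed upstream */
        return NULL;
    return (void *)maddr;    /* identity map, for this sketch only */
}

/* Copy from a guest physical address, refusing to touch NULL.
 * Returns the number of bytes NOT copied, matching xencomm's
 * convention of reporting residual bytes rather than -1. */
static unsigned long copy_from_guest_paddr(void *to, unsigned long maddr,
                                           unsigned long n)
{
    void *src = maddr_to_vaddr_checked(maddr);
    if (src == NULL)
        return n;            /* nothing copied; caller sees full residual */
    memcpy(to, src, n);
    return 0;
}
```

With this shape, a bad maddr leaves the destination untouched instead of dereferencing address 0.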
[XenPPC] Re: [Xen-devel] [PATCH 0/7] xencomm take 3: preparation for consolidation and various fixes
On 14/8/07 10:50, "Isaku Yamahata" <[EMAIL PROTECTED]> wrote:
> This take 3 patch series is for xencomm consolidation and various fixes.
> I split up the xen side patch into 4 smaller ones and cleaned them up.
> The Linux side patch is the same as the previous one.
> The issue is the complexity and the multi page support.
> So I think that 3/7, 4/7 and 7/7 are moot; the other patches can be merged.

All applied but 3, 4, 7.

-- Keir
Re: [XenPPC] [PATCH 2/7] xencomm take 3: xen side preparation for consolidation.
This is definitely needed and I apologize for my maddr/vaddr confusion in the first place. However, there are a few places below where you call memcpy() without checking the result of xencomm_maddr_to_vaddr(). Actually, I see the same issue in the original code in a few places... We should be very very careful here, since a guest passing a bad paddr could result in Xen overwriting 0x0. On Tue, 2007-08-14 at 18:50 +0900, Isaku Yamahata wrote: > # HG changeset patch > # User [EMAIL PROTECTED] > # Date 1187077583 -32400 > # Node ID 4dbbedee6bb8d594287940470a61b8c0c56daf9c > # Parent 68867379b785a9a6fd37ca75be64fa7b5e3b8a2b > [xen, xencomm] preparetion for xencomm consolidation. > Xen/powerpc runs in real mode so that it uses maddr interchangably with > vaddr. > But it isn't the case in xen/ia64. It is necessary to convert maddr to vaddr > to access the page. maddr_to_virt() doesn't convert on powerpc, so it should > work on both archtechture. > PATCHNAME: xencomm_consolidation_maddr_vaddr > > Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]> > > diff -r 68867379b785 -r 4dbbedee6bb8 xen/common/xencomm.c > --- a/xen/common/xencomm.cTue Aug 14 16:44:42 2007 +0900 > +++ b/xen/common/xencomm.cTue Aug 14 16:46:23 2007 +0900 > @@ -34,6 +34,15 @@ static int xencomm_debug = 1; /* extreme > #define xencomm_debug 0 > #endif > > +static void* > +xencomm_maddr_to_vaddr(unsigned long maddr) > +{ > +if (maddr == 0) > +return NULL; > + > +return maddr_to_virt(maddr); > +} > + > static unsigned long > xencomm_inline_from_guest(void *to, const void *from, unsigned int n, > unsigned int skip) > @@ -54,7 +63,7 @@ xencomm_inline_from_guest(void *to, cons > src_maddr = paddr_to_maddr(src_paddr); > if (xencomm_debug) > printk("%lx[%d] -> %lx\n", src_maddr, bytes, (unsigned long)to); > -memcpy(to, (void *)src_maddr, bytes); > +memcpy(to, maddr_to_virt(src_maddr), bytes); > src_paddr += bytes; > to += bytes; > n -= bytes; > @@ -89,7 +98,8 @@ xencomm_copy_from_guest(void *to, const > return 
xencomm_inline_from_guest(to, from, n, skip); > > /* first we need to access the descriptor */ > -desc = (struct xencomm_desc *)paddr_to_maddr((unsigned long)from); > +desc = (struct xencomm_desc *) > +xencomm_maddr_to_vaddr(paddr_to_maddr((unsigned long)from)); > if (desc == NULL) > return n; > > @@ -130,7 +140,7 @@ xencomm_copy_from_guest(void *to, const > > if (xencomm_debug) > printk("%lx[%d] -> %lx\n", src_maddr, bytes, dest); > -memcpy((void *)dest, (void *)src_maddr, bytes); > +memcpy((void *)dest, maddr_to_virt(src_maddr), bytes); > from_pos += bytes; > to_pos += bytes; > } > @@ -161,7 +171,7 @@ xencomm_inline_to_guest(void *to, const > dest_maddr = paddr_to_maddr(dest_paddr); > if (xencomm_debug) > printk("%lx[%d] -> %lx\n", (unsigned long)from, bytes, > dest_maddr); > -memcpy((void *)dest_maddr, (void *)from, bytes); > +memcpy(maddr_to_virt(dest_maddr), (void *)from, bytes); > dest_paddr += bytes; > from += bytes; > n -= bytes; > @@ -196,7 +206,8 @@ xencomm_copy_to_guest(void *to, const vo > return xencomm_inline_to_guest(to, from, n, skip); > > /* first we need to access the descriptor */ > -desc = (struct xencomm_desc *)paddr_to_maddr((unsigned long)to); > +desc = (struct xencomm_desc *) > +xencomm_maddr_to_vaddr(paddr_to_maddr((unsigned long)to)); > if (desc == NULL) > return n; > > @@ -236,7 +247,7 @@ xencomm_copy_to_guest(void *to, const vo > > if (xencomm_debug) > printk("%lx[%d] -> %lx\n", source, bytes, dest_maddr); > -memcpy((void *)dest_maddr, (void *)source, bytes); > +memcpy(maddr_to_virt(dest_maddr), (void *)source, bytes); > from_pos += bytes; > to_pos += bytes; > } > @@ -264,7 +275,8 @@ int xencomm_add_offset(void **handle, un > return xencomm_inline_add_offset(handle, bytes); > > /* first we need to access the descriptor */ > -desc = (struct xencomm_desc *)paddr_to_maddr((unsigned long)*handle); > +desc = (struct xencomm_desc *) > +xencomm_maddr_to_vaddr(paddr_to_maddr((unsigned long)*handle)); > if (desc == NULL) > return -1; > > @@ 
-310,7 +322,8 @@ int xencomm_handle_is_null(void *handle) > if (xencomm_is_inline(handle)) > return xencomm_inline_addr(handle) == 0; > > -desc = (struct xencomm_desc *)paddr_to_maddr((unsigned long)handle); > +desc = (struct xencomm_desc *) > +xencomm_maddr_to_vaddr(paddr_to_maddr((unsigned long)handle)); > if (desc == NULL) > return 1;
[XenPPC] [PATCH 5/7] xencomm take 3: linux side various fixes and preparation for consolidation
# HG changeset patch # User [EMAIL PROTECTED] # Date 1186975024 -32400 # Node ID 06791935b2cb9d69e94ca89ca8febcda627017b2 # Parent d2f9b7e3623114e6a45c916f21b348fda122fa8e [linux, xencomm]Various fixes common xencomm.c for ia64 xencomm consolidation - move xen_guest_handle() macro into include/xen/xencomm.h ia64 also uses it. - is_kern_addr() is powerpc specific. and other arch doesn't implement it. It will be defined in linux/include/asm-ia64/xen/xencomm.h - fix error recovery path of xencomm_create() xencomm_free() requires pseudo physical address, not virtual address. - add one BUG_ON() to xencomm_create_mini() for alignment requirement - use xencomm_pa() instead of __pa() in xencomm_map() and xencomm_map_no_alloc(). They should work for statically allocated area. On ia64 it isn't in straight mapping area so that xencomm_pa() is necessary. - add gcc bug work around. gcc 4.1.2 doesn't handle properly variables on stack with align attribute. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16660 PATCHNAME: fix_xencomm_create_in_common_code Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>) diff -r d2f9b7e36231 -r 06791935b2cb arch/powerpc/platforms/xen/setup.h --- a/arch/powerpc/platforms/xen/setup.hThu Aug 09 16:16:28 2007 +0100 +++ b/arch/powerpc/platforms/xen/setup.hMon Aug 13 12:17:04 2007 +0900 @@ -40,8 +40,6 @@ static inline u64 jiffies_to_ns(unsigned return j * (10UL / HZ); } -#define xen_guest_handle(hnd) ((hnd).p) - extern struct page *alloc_foreign_page(void); extern void free_foreign_page(struct page *page); diff -r d2f9b7e36231 -r 06791935b2cb drivers/xen/core/xencomm.c --- a/drivers/xen/core/xencomm.cThu Aug 09 16:16:28 2007 +0100 +++ b/drivers/xen/core/xencomm.cMon Aug 13 12:17:04 2007 +0900 @@ -23,6 +23,9 @@ #include #include #include +#ifdef __ia64__ +#include/* for is_kern_addr() */ +#endif static int xencomm_init(struct xencomm_desc *desc, void *buffer, unsigned long bytes) @@ -111,7 +114,7 @@ static int xencomm_create(void *buffer, rc = 
xencomm_init(desc, buffer, bytes); if (rc) { printk("%s failure: %d\n", "xencomm_init", rc); - xencomm_free(desc); + xencomm_free((void *)__pa(desc)); return rc; } @@ -146,6 +149,7 @@ static int xencomm_create_mini(void *buf { int rc = 0; struct xencomm_desc *desc; + BUG_ON(((unsigned long)xc_desc) % sizeof(*xc_desc) != 0); desc = (void *)xc_desc; @@ -170,7 +174,7 @@ void *xencomm_map(void *ptr, unsigned lo if (rc || desc == NULL) return NULL; - return (void *)__pa(desc); + return xencomm_pa(desc); } void *__xencomm_map_no_alloc(void *ptr, unsigned long bytes, @@ -188,5 +192,5 @@ void *__xencomm_map_no_alloc(void *ptr, if (rc) return NULL; - return (void *)__pa(desc); + return xencomm_pa(desc); } diff -r d2f9b7e36231 -r 06791935b2cb include/xen/xencomm.h --- a/include/xen/xencomm.h Thu Aug 09 16:16:28 2007 +0100 +++ b/include/xen/xencomm.h Mon Aug 13 12:17:04 2007 +0900 @@ -35,10 +35,30 @@ extern void *__xencomm_map_no_alloc(void extern void *__xencomm_map_no_alloc(void *ptr, unsigned long bytes, struct xencomm_mini *xc_area); -#define xencomm_map_no_alloc(ptr, bytes) \ - ({struct xencomm_mini xc_desc\ - __attribute__((__aligned__(sizeof(struct xencomm_mini;\ - __xencomm_map_no_alloc(ptr, bytes, &xc_desc);}) +#if 0 +#define XENCOMM_MINI_ALIGNED(xc_desc, n) \ + struct xencomm_mini xc_desc ## _base[(n)] \ + __attribute__((__aligned__(sizeof(struct xencomm_mini; \ + struct xencomm_mini* xc_desc = &xc_desc ## _base[0]; +#else +/* + * gcc bug workaround: + * http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16660 + * gcc doesn't handle properly stack variable with + * __attribute__((__align__(sizeof(struct xencomm_mini + */ +#define XENCOMM_MINI_ALIGNED(xc_desc, n) \ + unsigned char xc_desc ## _base[((n) + 1 ) * \ + sizeof(struct xencomm_mini)];\ + struct xencomm_mini *xc_desc = (struct xencomm_mini*) \ + ((unsigned long)xc_desc ## _base + \ +(sizeof(struct xencomm_mini) - \ + ((unsigned long)xc_desc ## _base) % \ + sizeof(struct xencomm_mini))); +#endif +#define 
xencomm_map_no_alloc(ptr, bytes) \ + ({XENCOMM_MINI_ALIGNED(xc_desc, 1); \ + __xencomm_map_no_alloc(ptr, bytes, xc_desc);}) /* provided by architecture code: */ extern unsigned long xencomm_vtop(unsigned long
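The XENCOMM_MINI_ALIGNED workaround above sidesteps gcc PR16660 by over-allocating one extra element and aligning the pointer by hand instead of trusting `__attribute__((aligned))` on a stack variable. The arithmetic it uses can be sketched in isolation (`MINI_SIZE` is a stand-in for `sizeof(struct xencomm_mini)`, not the real value):

```c
#include <assert.h>
#include <stdint.h>

#define MINI_SIZE 32UL  /* stand-in for sizeof(struct xencomm_mini) */

/* Mirror of the patch's manual-alignment expression:
 *   base + (MINI_SIZE - base % MINI_SIZE)
 * Note it ALWAYS advances, even when base is already aligned --
 * which is exactly why the macro allocates (n + 1) elements. */
static uintptr_t align_up(uintptr_t base)
{
    return base + (MINI_SIZE - base % MINI_SIZE);
}
```

The result is always MINI_SIZE-aligned and always lands inside the over-allocated buffer, regardless of where gcc placed `xc_desc_base` on the stack.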
[XenPPC] [PATCH 6/7] xencomm take 3: linux side introduce struct xencomm_handle *
# HG changeset patch # User [EMAIL PROTECTED] # Date 1186473244 -32400 # Node ID 7b88b56c310b11abe510cf32fc5095bc98979168 # Parent 06791935b2cb9d69e94ca89ca8febcda627017b2 [xencomm] introduce opaque type struct xencomm_handle* for xencommized value This patch is preparation for xencomm consolidation. powerpc uses void * for xencommized value, on the other hand IA64 uses struct xencomm_handle *. Unify it with struct xencomm_handle *. PATCHNAME: introduce_xencomm_handle Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]> diff -r 06791935b2cb -r 7b88b56c310b arch/powerpc/platforms/xen/hcall.c --- a/arch/powerpc/platforms/xen/hcall.cMon Aug 13 12:17:04 2007 +0900 +++ b/arch/powerpc/platforms/xen/hcall.cTue Aug 07 16:54:04 2007 +0900 @@ -56,7 +56,7 @@ int HYPERVISOR_console_io(int cmd, int count, char *str) { - void *desc; + struct xencomm_handle *desc; int rc; desc = xencomm_map_no_alloc(str, count); @@ -76,7 +76,8 @@ int HYPERVISOR_event_channel_op(int cmd, { int rc; - void *desc = xencomm_map_no_alloc(op, sizeof(evtchn_op_t)); + struct xencomm_handle *desc = + xencomm_map_no_alloc(op, sizeof(evtchn_op_t)); if (desc == NULL) return -EINVAL; @@ -92,7 +93,7 @@ EXPORT_SYMBOL(HYPERVISOR_event_channel_o int HYPERVISOR_xen_version(int cmd, void *arg) { - void *desc; + struct xencomm_handle *desc; const unsigned long hcall = __HYPERVISOR_xen_version; int argsize; int rc; @@ -144,7 +145,8 @@ EXPORT_SYMBOL(HYPERVISOR_xen_version); int HYPERVISOR_physdev_op(int cmd, void *op) { - void *desc = xencomm_map_no_alloc(op, sizeof(physdev_op_t)); + struct xencomm_handle *desc = + xencomm_map_no_alloc(op, sizeof(physdev_op_t)); int rc; if (desc == NULL) @@ -163,8 +165,8 @@ int HYPERVISOR_sched_op(int cmd, void *a { int argsize = 0; int rc = -EINVAL; - void *desc; - evtchn_port_t *ports = NULL; + struct xencomm_handle *desc; + struct xencomm_handle *ports = NULL; switch (cmd) { case SCHEDOP_yield: @@ -187,7 +189,7 @@ int HYPERVISOR_sched_op(int cmd, void *a if (ports == NULL) return 
-ENOMEM; - set_xen_guest_handle(sched_poll.ports, ports); + set_xen_guest_handle(sched_poll.ports, (evtchn_port_t *)ports); memcpy(arg, &sched_poll, sizeof(sched_poll)); } @@ -222,7 +224,7 @@ int HYPERVISOR_suspend(unsigned long sre struct sched_shutdown sched_shutdown = { .reason = SHUTDOWN_suspend, }; - void *desc; + struct xencomm_handle *desc; desc = xencomm_map_no_alloc(&sched_shutdown, sizeof(struct sched_shutdown)); @@ -234,7 +236,7 @@ int HYPERVISOR_kexec_op(unsigned long op int HYPERVISOR_kexec_op(unsigned long op, void *args) { unsigned long argsize; - void *desc; + struct xencomm_handle *desc; switch (op) { case KEXEC_CMD_kexec_get_range: @@ -316,8 +318,8 @@ static int xenppc_privcmd_domctl(privcmd { xen_domctl_t kern_op; xen_domctl_t __user *user_op = (xen_domctl_t __user *)hypercall->arg[0]; - void *op_desc; - void *desc = NULL; + struct xencomm_handle *op_desc; + struct xencomm_handle *desc = NULL; int ret = 0; if (copy_from_user(&kern_op, user_op, sizeof(xen_domctl_t))) @@ -349,7 +351,7 @@ static int xenppc_privcmd_domctl(privcmd ret = -ENOMEM; set_xen_guest_handle(kern_op.u.getmemlist.buffer, -desc); +(void *)desc); break; case XEN_DOMCTL_getpageframeinfo: break; @@ -362,7 +364,7 @@ static int xenppc_privcmd_domctl(privcmd ret = -ENOMEM; set_xen_guest_handle(kern_op.u.getpageframeinfo2.array, -desc); +(void *)desc); break; case XEN_DOMCTL_shadow_op: @@ -376,7 +378,7 @@ static int xenppc_privcmd_domctl(privcmd ret = -ENOMEM; set_xen_guest_handle(kern_op.u.shadow_op.dirty_bitmap, -desc); +(void *)desc); } break; case XEN_DOMCTL_max_mem: @@ -391,7 +393,7 @@ static int xenppc_privcmd_domctl(privcmd ret = -ENOMEM; set_xen_guest_handle(kern_op.u.vcpucontext.ctxt, -desc); +(void *)desc); break; case XEN_DOMCTL_getvcpuinfo: break; @@ -405,7 +407,7 @@ static int xenppc_privcmd_domctl(privcmd
[XenPPC] [PATCH 4/7] xencomm take 3: xen side multiple page support
# HG changeset patch # User [EMAIL PROTECTED] # Date 1187077969 -32400 # Node ID b53a87fec6dc7a9f2cc3b3b7df72fa8b2c7fe081 # Parent cf7a9141e7f884a14cfab8d3785c95dea85749be [xen, xencomm] xencomm multiple page support Current implementation doesn't allow struct xencomm_desc::address array to be more than single page. On IA64 it causes 64GB+ domain creation failure. This patch generalizes xencomm to allow multipage xencomm_desc::address array. PATCHNAME: xencomm_multiple_page Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]> diff -r cf7a9141e7f8 -r b53a87fec6dc xen/common/xencomm.c --- a/xen/common/xencomm.c Tue Aug 14 17:17:01 2007 +0900 +++ b/xen/common/xencomm.c Tue Aug 14 16:52:49 2007 +0900 @@ -17,6 +17,7 @@ * * Authors: Hollis Blanchard <[EMAIL PROTECTED]> * Tristan Gingold <[EMAIL PROTECTED]> + * Isaku Yamahata <[EMAIL PROTECTED]> multiple page support */ #include @@ -81,10 +82,11 @@ xencomm_desc_cross_page_boundary(unsigne } struct xencomm_ctxt { -struct xencomm_desc *desc; - +struct xencomm_desc __user *desc_in_paddr; uint32_t nr_addrs; + struct page_info *page; +unsigned long *address; }; static uint32_t @@ -94,21 +96,9 @@ xencomm_ctxt_nr_addrs(const struct xenco } static unsigned long* -xencomm_ctxt_address(struct xencomm_ctxt *ctxt, int i) -{ -return &ctxt->desc->address[i]; -} - -/* check if struct xencomm_desc and address array cross page boundary */ -static int -xencomm_ctxt_cross_page_boundary(struct xencomm_ctxt *ctxt) -{ -unsigned long saddr = (unsigned long)ctxt->desc; -unsigned long eaddr = -(unsigned long)&ctxt->desc->address[ctxt->nr_addrs] - 1; -if ((saddr >> PAGE_SHIFT) != (eaddr >> PAGE_SHIFT)) -return 1; -return 0; +xencomm_ctxt_address(struct xencomm_ctxt *ctxt) +{ +return ctxt->address; } static int @@ -136,15 +126,38 @@ xencomm_ctxt_init(const void* handle, st return -EINVAL; } -ctxt->desc = desc; ctxt->nr_addrs = desc->nr_addrs; /* copy before use. * It is possible for a guest domain to * modify concurrently. 
*/ +ctxt->desc_in_paddr = (struct xencomm_desc*)handle; ctxt->page = page; -if (xencomm_ctxt_cross_page_boundary(ctxt)) { -put_page(page); -return -EINVAL; +ctxt->address = &desc->address[0]; +return 0; +} + +static int +xencomm_ctxt_next(struct xencomm_ctxt *ctxt, int i) +{ +BUG_ON(i >= ctxt->nr_addrs); +/* in i == 0 case, we already calculated in xecomm_addr_init() */ +if (i != 0) +ctxt->address++; + +/* When crossing page boundary, machine address must be calculated. */ +if (((unsigned long)ctxt->address & ~PAGE_MASK) == 0) { +unsigned long paddr = +(unsigned long)&(ctxt->desc_in_paddr->address[i]); +struct page_info *page; +int ret; + +ret = xencomm_get_page(paddr, &page); +if (ret == 0) { +put_page(ctxt->page); +ctxt->page = page; +ctxt->address = xencomm_vaddr(paddr, page); +} +return ret; } return 0; } @@ -236,11 +249,14 @@ xencomm_copy_from_guest(void *to, const /* iterate through the descriptor, copying up to a page at a time */ while ((to_pos < n) && (i < xencomm_ctxt_nr_addrs(&ctxt))) { -unsigned long src_paddr = *xencomm_ctxt_address(&ctxt, i); +unsigned long src_paddr; unsigned int pgoffset; unsigned int chunksz; unsigned int chunk_skip; +if (xencomm_ctxt_next(&ctxt, i)) +goto out; +src_paddr = *xencomm_ctxt_address(&ctxt); if (src_paddr == XENCOMM_INVALID) { i++; continue; @@ -353,11 +369,14 @@ xencomm_copy_to_guest(void *to, const vo /* iterate through the descriptor, copying up to a page at a time */ while ((from_pos < n) && (i < xencomm_ctxt_nr_addrs(&ctxt))) { -unsigned long dest_paddr = *xencomm_ctxt_address(&ctxt, i); +unsigned long dest_paddr; unsigned int pgoffset; unsigned int chunksz; unsigned int chunk_skip; +if (xencomm_ctxt_next(&ctxt, i)) +goto out; +dest_paddr = *xencomm_ctxt_address(&ctxt); if (dest_paddr == XENCOMM_INVALID) { i++; continue; @@ -401,21 +420,28 @@ int xencomm_add_offset(void **handle, un { struct xencomm_ctxt ctxt; int i = 0; +int res = 0; if (xencomm_is_inline(*handle)) return xencomm_inline_add_offset(handle, bytes); 
-if (xencomm_ctxt_init(handle, &ctxt)) -return -1; +res = xencomm_ctxt_init(handle, &ctxt); +if (res != 0) +return res; /* iterate through the descriptor incrementing addresses */ while ((bytes > 0) && (i < xencomm_ctxt_nr_addrs(&ctxt))) { -unsigned long *addr
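The key trick in xencomm_ctxt_next() above is detecting when the incremented `address` pointer has just stepped onto a fresh page, because that is the moment a new paddr-to-maddr translation (and a new page reference) is needed. A minimal sketch of that test, with an assumed 4 KiB page size:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096UL
#define PAGE_MASK (~(PAGE_SIZE - 1))

/* After advancing an element pointer, report whether it crossed onto
 * a new page -- the condition under which xencomm_ctxt_next() must
 * re-translate the descriptor address and re-take a page reference. */
static int entered_new_page(uintptr_t addr)
{
    return (addr & ~PAGE_MASK) == 0;  /* offset within the page is zero */
}
```

Only page-boundary crossings pay the translation cost; within a page, the walk is a plain pointer increment.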
[XenPPC] [PATCH 7/7] xencomm take 3: linux side remove xencomm page size limit
# HG changeset patch # User [EMAIL PROTECTED] # Date 1185948239 -32400 # Node ID d2884d84d3fd5c52a6699eb7b98ffaf3b8f7a422 # Parent 7b88b56c310b11abe510cf32fc5095bc98979168 [linux, xencomm] remove xencomm page size limit. Currently xencomm has page size limit so that a domain with many memory (e.g. 100GB~) can't be created. Now that xencomm of xen side accepts struct xencomm_desc whose address array crosses page boundary. Thus it isn't necessary to allocate single page not to cross page boundary. We can allocate exact sized memory. Note that struct xencomm_desc can't cross page boundary and slab allocator returns sizeof(void*) aligned pointer. Where sizeof(*desc) > sizeof(void*), e.g. 32 bit environment, the slab allocator return pointer doesn't gurantee that struct xencomm_desc doesn't cross page boundary. So we fall back to page allocator. PATCHNAME: remove_xencomm_page_size_limit_common_code Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]> diff -r 7b88b56c310b -r d2884d84d3fd drivers/xen/core/xencomm.c --- a/drivers/xen/core/xencomm.cTue Aug 07 16:54:04 2007 +0900 +++ b/drivers/xen/core/xencomm.cWed Aug 01 15:03:59 2007 +0900 @@ -68,25 +68,54 @@ static int xencomm_init(struct xencomm_d return 0; } -/* XXX use slab allocator */ -static struct xencomm_desc *xencomm_alloc(gfp_t gfp_mask) -{ - struct xencomm_desc *desc; - - desc = (struct xencomm_desc *)__get_free_page(gfp_mask); - if (desc == NULL) - return NULL; - - desc->nr_addrs = (PAGE_SIZE - sizeof(struct xencomm_desc)) / +static struct xencomm_desc *xencomm_alloc(gfp_t gfp_mask, + void *buffer, unsigned long bytes) +{ + struct xencomm_desc *desc; + unsigned long buffer_ulong = (unsigned long)buffer; + unsigned long start = buffer_ulong & PAGE_MASK; + unsigned long end = (buffer_ulong + bytes) | ~PAGE_MASK; + unsigned long nr_addrs = (end - start + 1) >> PAGE_SHIFT; + unsigned long size = sizeof(*desc) + + sizeof(desc->address[0]) * nr_addrs; + + /* +* slab allocator returns at least sizeof(void*) aligned 
pointer. +* When sizeof(*desc) > sizeof(void*), struct xencomm_desc might +* cross page boundary. +*/ + if (sizeof(*desc) > sizeof(void*)) { + unsigned long order = get_order(size); + desc = (struct xencomm_desc *)__get_free_pages(gfp_mask, + order); + if (desc == NULL) + return NULL; + + desc->nr_addrs = + ((PAGE_SIZE << order) - sizeof(struct xencomm_desc)) / sizeof(*desc->address); - + } else { + desc = kmalloc(size, gfp_mask); + if (desc == NULL) + return NULL; + + desc->nr_addrs = nr_addrs; + } return desc; } void xencomm_free(struct xencomm_handle *desc) { - if (desc && !((ulong)desc & XENCOMM_INLINE_FLAG)) - free_page((unsigned long)__va(desc)); + if (desc && !((ulong)desc & XENCOMM_INLINE_FLAG)) { + struct xencomm_desc *desc__ = (struct xencomm_desc*)desc; + if (sizeof(*desc__) > sizeof(void*)) { + unsigned long size = sizeof(*desc__) + + sizeof(desc__->address[0]) * desc__->nr_addrs; + unsigned long order = get_order(size); + free_pages((unsigned long)__va(desc), order); + } else + kfree(__va(desc)); + } } static int xencomm_create(void *buffer, unsigned long bytes, struct xencomm_desc **ret, gfp_t gfp_mask) @@ -105,7 +134,7 @@ static int xencomm_create(void *buffer, BUG_ON(buffer == NULL); /* 'bytes' is non-zero */ - desc = xencomm_alloc(gfp_mask); + desc = xencomm_alloc(gfp_mask, buffer, bytes); if (!desc) { printk("%s failure\n", "xencomm_alloc"); return -ENOMEM;
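The exact-size allocation in this patch hinges on computing how many page addresses the descriptor must hold for a given buffer. A self-contained sketch of the patch's own arithmetic (rounding the start down and the end up to page boundaries), assuming a 4 KiB page:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Number of address[] slots needed for [buffer, buffer+bytes),
 * computed as the patch's xencomm_alloc() does.  Note that when the
 * buffer ends exactly on a page boundary this arithmetic reserves
 * one extra slot (the end is rounded up past one-past-the-end). */
static unsigned long nr_addrs_for(uintptr_t buffer, unsigned long bytes)
{
    uintptr_t start = buffer & PAGE_MASK;
    uintptr_t end   = (buffer + bytes) | ~PAGE_MASK;
    return (end - start + 1) >> PAGE_SHIFT;
}
```

A one-byte buffer needs a single slot; a two-byte buffer straddling a page boundary needs two, which is why the allocation can no longer assume a fixed single page.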
[XenPPC] [PATCH 3/7] xencomm take 3: xen side fix invalid or racy access
# HG changeset patch
# User [EMAIL PROTECTED]
# Date 1187079421 -32400
# Node ID cf7a9141e7f884a14cfab8d3785c95dea85749be
# Parent 4dbbedee6bb8d594287940470a61b8c0c56daf9c
[xen, xencomm] fix various invalid or racy xencomm accesses.

- Xencomm should check struct xencomm_desc alignment.
- Xencomm should check that struct xencomm_desc itself (8 bytes) doesn't cross a page boundary. Otherwise a hostile guest kernel can pass a pointer that crosses a page boundary, and xencomm then accesses an unrelated page.
- Xencomm shouldn't access struct xencomm_desc::nr_addrs multiple times. Copy it to a local area and use the copy. Otherwise a hostile guest can modify it at the same time.
- Xencomm should check whether the struct xencomm_desc::address[] array crosses a page boundary. Otherwise xencomm may access unrelated pages.
- Xencomm should get_page()/put_page() after the address conversion from paddr to maddr, because Xen supports SMP and the balloon driver. Otherwise another vcpu may free the page at the same time. Such domain behaviour doesn't make sense, but nothing prevents it.
PATCHNAME: fix_xencomm_invalid_racy_access Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]> diff -r 4dbbedee6bb8 -r cf7a9141e7f8 xen/common/xencomm.c --- a/xen/common/xencomm.c Tue Aug 14 16:46:23 2007 +0900 +++ b/xen/common/xencomm.c Tue Aug 14 17:17:01 2007 +0900 @@ -35,12 +35,149 @@ static int xencomm_debug = 1; /* extreme #endif static void* -xencomm_maddr_to_vaddr(unsigned long maddr) -{ +xencomm_vaddr(unsigned long paddr, struct page_info *page) +{ +return (void*)((paddr & ~PAGE_MASK) | (unsigned long)page_to_virt(page)); +} + +/* get_page() to prevent from another vcpu freeing the page */ +static int +xencomm_get_page(unsigned long paddr, struct page_info **page) +{ +unsigned long maddr = paddr_to_maddr(paddr); if (maddr == 0) -return NULL; - -return maddr_to_virt(maddr); +return -EFAULT; + +*page = maddr_to_page(maddr); +if (get_page(*page, current->domain) == 0) { +if (page_get_owner(*page) != current->domain) { +/* + * This page might be a page granted by another domain, or + * this page is freed with decrease reservation hypercall at + * the same time. + */ +gdprintk(XENLOG_WARNING, + "bad page is passed. paddr 0x%lx maddr 0x%lx\n", + paddr, maddr); +return -EFAULT; +} + +/* Try again. 
*/ +cpu_relax(); +return -EAGAIN; +} + +return 0; +} + +/* check if struct desc doesn't cross page boundry */ +static int +xencomm_desc_cross_page_boundary(unsigned long paddr) +{ +unsigned long offset = paddr & ~PAGE_MASK; +if (offset > PAGE_SIZE - sizeof(struct xencomm_desc)) +return 1; +return 0; +} + +struct xencomm_ctxt { +struct xencomm_desc *desc; + +uint32_t nr_addrs; +struct page_info *page; +}; + +static uint32_t +xencomm_ctxt_nr_addrs(const struct xencomm_ctxt *ctxt) +{ +return ctxt->nr_addrs; +} + +static unsigned long* +xencomm_ctxt_address(struct xencomm_ctxt *ctxt, int i) +{ +return &ctxt->desc->address[i]; +} + +/* check if struct xencomm_desc and address array cross page boundary */ +static int +xencomm_ctxt_cross_page_boundary(struct xencomm_ctxt *ctxt) +{ +unsigned long saddr = (unsigned long)ctxt->desc; +unsigned long eaddr = +(unsigned long)&ctxt->desc->address[ctxt->nr_addrs] - 1; +if ((saddr >> PAGE_SHIFT) != (eaddr >> PAGE_SHIFT)) +return 1; +return 0; +} + +static int +xencomm_ctxt_init(const void* handle, struct xencomm_ctxt *ctxt) +{ +struct page_info *page; +struct xencomm_desc *desc; +int ret; + +/* avoid unaligned access */ +if ((unsigned long)handle % __alignof__(*desc) != 0) +return -EINVAL; +if (xencomm_desc_cross_page_boundary((unsigned long)handle)) +return -EINVAL; + +/* first we need to access the descriptor */ +ret = xencomm_get_page((unsigned long)handle, &page); +if (ret) +return ret; + +desc = xencomm_vaddr((unsigned long)handle, page); +if (desc->magic != XENCOMM_MAGIC) { +printk("%s: error: %p magic was 0x%x\n", __func__, desc, desc->magic); +put_page(page); +return -EINVAL; +} + +ctxt->desc = desc; +ctxt->nr_addrs = desc->nr_addrs; /* copy before use. + * It is possible for a guest domain to + * modify concurrently. 
+ */ +ctxt->page = page; +if (xencomm_ctxt_cross_page_boundary(ctxt)) { +put_page(page); +return -EINVAL; +} +return 0; +} + +static void +xencomm_ctxt_done(struct xencomm_ctxt *ctxt) +{ +put_page(ctxt->page); +} + +static int +xencomm_copy_chunk_from(unsigned long to, unsigned long paddr, +unsigned int len) +{ +struct p
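One of the validity checks this patch introduces is cheap to show in isolation: a guest-supplied descriptor pointer must be rejected if the fixed-size descriptor header would straddle a page boundary, since the two halves could land on unrelated machine pages. A sketch with an assumed 4 KiB page and an 8-byte header stand-in (`DESC_HDR` substitutes for `sizeof(struct xencomm_desc)`):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096UL
#define DESC_HDR  8UL   /* stand-in for sizeof(struct xencomm_desc) */

/* Reject a descriptor whose header would cross a page boundary,
 * following the shape of patch 3/7's
 * xencomm_desc_cross_page_boundary(). */
static int desc_crosses_page(uintptr_t paddr)
{
    uintptr_t offset = paddr & (PAGE_SIZE - 1);
    return offset > PAGE_SIZE - DESC_HDR;
}
```

Combined with the alignment check, this guarantees the 8-byte header can be read from a single validated page.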
[XenPPC] [PATCH 1/7] xencomm take 3: xen side trivial bug fix
# HG changeset patch # User [EMAIL PROTECTED] # Date 1187077482 -32400 # Node ID 68867379b785a9a6fd37ca75be64fa7b5e3b8a2b # Parent cd51fa91956be20dbd744d46117f7f989e08c334 [xen, xencomm] xencomm trivial bug fix - fix return address of xencomm_copy_to_guest() - fix xencomm_add_offset() PATCHNAME: xencomm_trivial_fix Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]> diff -r cd51fa91956b -r 68867379b785 xen/common/xencomm.c --- a/xen/common/xencomm.c Sun Aug 12 14:50:02 2007 -0600 +++ b/xen/common/xencomm.c Tue Aug 14 16:44:42 2007 +0900 @@ -232,7 +232,7 @@ xencomm_copy_to_guest(void *to, const vo dest_maddr = paddr_to_maddr(dest_paddr + chunk_skip); if (dest_maddr == 0) -return -1; +return n - from_pos; if (xencomm_debug) printk("%lx[%d] -> %lx\n", source, bytes, dest_maddr); @@ -279,6 +279,11 @@ int xencomm_add_offset(void **handle, un unsigned int pgoffset; unsigned int chunksz; unsigned int chunk_skip; + +if (dest_paddr == XENCOMM_INVALID) { +i++; +continue; +} pgoffset = dest_paddr % PAGE_SIZE; chunksz = PAGE_SIZE - pgoffset; @@ -291,6 +296,8 @@ int xencomm_add_offset(void **handle, un desc->address[i] += chunk_skip; } bytes -= chunk_skip; + +i++; } return 0; }
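Patch 1/7's two fixes to xencomm_add_offset() — skipping XENCOMM_INVALID slots and adding the missing `i++` so the walk actually advances — can be sketched as a standalone loop. This is a simplified reconstruction, not the real function: the diff above only shows fragments, and the handling of a fully consumed chunk (marking the slot XENCOMM_INVALID) is assumed here.

```c
#include <assert.h>

#define XENCOMM_INVALID (~0UL)
#define PAGE_SIZE 4096UL

/* Simplified sketch of the corrected offset walk: skip invalid
 * slots, and advance the index on every iteration (the missing
 * i++ that the patch adds). */
static void add_offset(unsigned long *address, int nr, unsigned long bytes)
{
    int i = 0;
    while (bytes > 0 && i < nr) {
        unsigned long paddr = address[i];
        unsigned long pgoffset, chunksz, chunk_skip;

        if (paddr == XENCOMM_INVALID) {
            i++;            /* skip already-consumed slot */
            continue;
        }
        pgoffset = paddr % PAGE_SIZE;
        chunksz  = PAGE_SIZE - pgoffset;   /* bytes left in this page */
        chunk_skip = (bytes < chunksz) ? bytes : chunksz;
        if (chunk_skip == chunksz)
            address[i] = XENCOMM_INVALID;  /* assumed: whole chunk consumed */
        else
            address[i] += chunk_skip;
        bytes -= chunk_skip;
        i++;                /* the fix: advance unconditionally */
    }
}
```

Without the final `i++`, the original loop re-processed slot 0 forever once the first chunk was partially skipped.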
[XenPPC] [PATCH 2/7] xencomm take 3: xen side preparation for consolidation.
# HG changeset patch # User [EMAIL PROTECTED] # Date 1187077583 -32400 # Node ID 4dbbedee6bb8d594287940470a61b8c0c56daf9c # Parent 68867379b785a9a6fd37ca75be64fa7b5e3b8a2b [xen, xencomm] preparation for xencomm consolidation. Xen/powerpc runs in real mode, so it uses maddr interchangeably with vaddr. But that isn't the case in xen/ia64; it is necessary to convert maddr to vaddr to access the page. maddr_to_virt() doesn't convert on powerpc, so it should work on both architectures. PATCHNAME: xencomm_consolidation_maddr_vaddr Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]> diff -r 68867379b785 -r 4dbbedee6bb8 xen/common/xencomm.c --- a/xen/common/xencomm.c Tue Aug 14 16:44:42 2007 +0900 +++ b/xen/common/xencomm.c Tue Aug 14 16:46:23 2007 +0900 @@ -34,6 +34,15 @@ static int xencomm_debug = 1; /* extreme #define xencomm_debug 0 #endif +static void* +xencomm_maddr_to_vaddr(unsigned long maddr) +{ +if (maddr == 0) +return NULL; + +return maddr_to_virt(maddr); +} + static unsigned long xencomm_inline_from_guest(void *to, const void *from, unsigned int n, unsigned int skip) @@ -54,7 +63,7 @@ xencomm_inline_from_guest(void *to, cons src_maddr = paddr_to_maddr(src_paddr); if (xencomm_debug) printk("%lx[%d] -> %lx\n", src_maddr, bytes, (unsigned long)to); -memcpy(to, (void *)src_maddr, bytes); +memcpy(to, maddr_to_virt(src_maddr), bytes); src_paddr += bytes; to += bytes; n -= bytes; @@ -89,7 +98,8 @@ xencomm_copy_from_guest(void *to, const return xencomm_inline_from_guest(to, from, n, skip); /* first we need to access the descriptor */ -desc = (struct xencomm_desc *)paddr_to_maddr((unsigned long)from); +desc = (struct xencomm_desc *) +xencomm_maddr_to_vaddr(paddr_to_maddr((unsigned long)from)); if (desc == NULL) return n; @@ -130,7 +140,7 @@ xencomm_copy_from_guest(void *to, const if (xencomm_debug) printk("%lx[%d] -> %lx\n", src_maddr, bytes, dest); -memcpy((void *)dest, (void *)src_maddr, bytes); +memcpy((void *)dest, maddr_to_virt(src_maddr), bytes); from_pos += bytes; to_pos += bytes; } @@ -161,7 +171,7 @@ xencomm_inline_to_guest(void *to, const dest_maddr = paddr_to_maddr(dest_paddr); if (xencomm_debug) printk("%lx[%d] -> %lx\n", (unsigned long)from, bytes, dest_maddr); -memcpy((void *)dest_maddr, (void *)from, bytes); +memcpy(maddr_to_virt(dest_maddr), (void *)from, bytes); dest_paddr += bytes; from += bytes; n -= bytes; @@ -196,7 +206,8 @@ xencomm_copy_to_guest(void *to, const vo return xencomm_inline_to_guest(to, from, n, skip); /* first we need to access the descriptor */ -desc = (struct xencomm_desc *)paddr_to_maddr((unsigned long)to); +desc = (struct xencomm_desc *) +xencomm_maddr_to_vaddr(paddr_to_maddr((unsigned long)to)); if (desc == NULL) return n; @@ -236,7 +247,7 @@ xencomm_copy_to_guest(void *to, const vo if (xencomm_debug) printk("%lx[%d] -> %lx\n", source, bytes, dest_maddr); -memcpy((void *)dest_maddr, (void *)source, bytes); +memcpy(maddr_to_virt(dest_maddr), (void *)source, bytes); from_pos += bytes; to_pos += bytes; } @@ -264,7 +275,8 @@ int xencomm_add_offset(void **handle, un return xencomm_inline_add_offset(handle, bytes); /* first we need to access the descriptor */ -desc = (struct xencomm_desc *)paddr_to_maddr((unsigned long)*handle); +desc = (struct xencomm_desc *) +xencomm_maddr_to_vaddr(paddr_to_maddr((unsigned long)*handle)); if (desc == NULL) return -1; @@ -310,7 +322,8 @@ int xencomm_handle_is_null(void *handle) if (xencomm_is_inline(handle)) return xencomm_inline_addr(handle) == 0; -desc = (struct xencomm_desc *)paddr_to_maddr((unsigned long)handle); +desc = (struct xencomm_desc *) +xencomm_maddr_to_vaddr(paddr_to_maddr((unsigned long)handle)); if (desc == NULL) return 1;
[XenPPC] [PATCH 0/7] xencomm take 3: preparation for consolidation and various fixes
This take 3 patch series is for xencomm consolidation and various fixes. I split up the xen side patch into 4 smaller ones and cleaned them up. The Linux side patch is the same as the previous one. The remaining issue is the complexity and the multi page support. So I think that 3/7, 4/7 and 7/7 are moot; the other patches can be merged.

[PATCH 1/7] xencomm take 3: xen side trivial bug fix
[PATCH 2/7] xencomm take 3: xen side preparation for consolidation.
[PATCH 3/7] xencomm take 3: xen side fix invalid or racy access
[PATCH 4/7] xencomm take 3: xen side multiple page support
[PATCH 5/7] xencomm take 3: linux side various fixes and preparation for consolidation
[PATCH 6/7] xencomm take 3: linux side introduce struct xencomm_handle *
[PATCH 7/7] xencomm take 3: linux side remove xencomm page size limit

Changes from take 2:
xen side
- Split up into 4 patches and cleaned them up.
- Added more race checks.
Linux side:
- No change.

Changes from take 1:
xen side: various fixes and preparation for consolidation
- Sorted out the maddr and vaddr issue.
- Return an error with a warning message instead of panicking the domain.
linux side: various fixes and preparation for consolidation
- Updated the gcc workaround. It is a generic issue, not ia64 specific.

Thanks,