Re: [Xen-devel] Re: [XenPPC] [PATCH 2/7] xencomm take 3: xen side preparation for consolidation.

2007-08-14 Thread Isaku Yamahata
On Tue, Aug 14, 2007 at 09:48:00AM -0500, Hollis Blanchard wrote:

> However, there are a few places below where you call memcpy() without
> checking the result of xencomm_maddr_to_vaddr(). Actually, I see the
> same issue in the original code in a few places... We should be very
> very careful here, since a guest passing a bad paddr could result in Xen
> overwriting 0x0.

Thank you for the comments. The next patch (3/7) addresses those issues,
i.e. checking guest-supplied values and avoiding races.
I intentionally kept this patch (2/7) as small as possible, leaving those
fixes to the next patch (3/7).

Since we can work around the populate physmap issue,
it's OK with me to drop multi-page support.
But we need the next patch or something similar.
If you dislike the implementation, I'm willing to refine it.
Please suggest an alternative.
-- 
yamahata

___
Xen-ppc-devel mailing list
Xen-ppc-devel@lists.xensource.com
http://lists.xensource.com/xen-ppc-devel


[XenPPC] Re: [Xen-devel] [PATCH 0/7] xencomm take 3: preparation for consolidation and various fixes

2007-08-14 Thread Keir Fraser



On 14/8/07 10:50, "Isaku Yamahata" <[EMAIL PROTECTED]> wrote:

> This take 3 patch series is for xencomm consolidation and various fixes.
> I split up the xen side patch into 4 smaller ones and cleaned them up.
> The Linux side patch is the same as the previous one.
> The open issues are the complexity and the multi-page support,
> so I think 3/7, 4/7 and 7/7 are moot; the other patches can be merged.

All applied but 3, 4, 7.

 -- Keir




Re: [XenPPC] [PATCH 2/7] xencomm take 3: xen side preparation for consolidation.

2007-08-14 Thread Hollis Blanchard
This is definitely needed and I apologize for my maddr/vaddr confusion
in the first place.

However, there are a few places below where you call memcpy() without
checking the result of xencomm_maddr_to_vaddr(). Actually, I see the
same issue in the original code in a few places... We should be very
very careful here, since a guest passing a bad paddr could result in Xen
overwriting 0x0.

On Tue, 2007-08-14 at 18:50 +0900, Isaku Yamahata wrote:
> # HG changeset patch
> # User [EMAIL PROTECTED]
> # Date 1187077583 -32400
> # Node ID 4dbbedee6bb8d594287940470a61b8c0c56daf9c
> # Parent  68867379b785a9a6fd37ca75be64fa7b5e3b8a2b
> [xen, xencomm] preparation for xencomm consolidation.
> Xen/powerpc runs in real mode, so it uses maddr interchangeably with
> vaddr.
> That isn't the case in xen/ia64, where it is necessary to convert maddr
> to vaddr to access the page. maddr_to_virt() doesn't convert on powerpc,
> so it should work on both architectures.
> PATCHNAME: xencomm_consolidation_maddr_vaddr
> 
> Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>
> 
> diff -r 68867379b785 -r 4dbbedee6bb8 xen/common/xencomm.c
> --- a/xen/common/xencomm.cTue Aug 14 16:44:42 2007 +0900
> +++ b/xen/common/xencomm.cTue Aug 14 16:46:23 2007 +0900
> @@ -34,6 +34,15 @@ static int xencomm_debug = 1; /* extreme
>  #define xencomm_debug 0
>  #endif
> 
> +static void*
> +xencomm_maddr_to_vaddr(unsigned long maddr)
> +{
> +if (maddr == 0)
> +return NULL;
> +
> +return maddr_to_virt(maddr);
> +}
> +
>  static unsigned long
>  xencomm_inline_from_guest(void *to, const void *from, unsigned int n,
>  unsigned int skip)
> @@ -54,7 +63,7 @@ xencomm_inline_from_guest(void *to, cons
>  src_maddr = paddr_to_maddr(src_paddr);
>  if (xencomm_debug)
>  printk("%lx[%d] -> %lx\n", src_maddr, bytes, (unsigned long)to);
> -memcpy(to, (void *)src_maddr, bytes);
> +memcpy(to, maddr_to_virt(src_maddr), bytes);
>  src_paddr += bytes;
>  to += bytes;
>  n -= bytes;
> @@ -89,7 +98,8 @@ xencomm_copy_from_guest(void *to, const 
>  return xencomm_inline_from_guest(to, from, n, skip);
> 
>  /* first we need to access the descriptor */
> -desc = (struct xencomm_desc *)paddr_to_maddr((unsigned long)from);
> +desc = (struct xencomm_desc *)
> +xencomm_maddr_to_vaddr(paddr_to_maddr((unsigned long)from));
>  if (desc == NULL)
>  return n;
> 
> @@ -130,7 +140,7 @@ xencomm_copy_from_guest(void *to, const 
> 
>  if (xencomm_debug)
>  printk("%lx[%d] -> %lx\n", src_maddr, bytes, dest);
> -memcpy((void *)dest, (void *)src_maddr, bytes);
> +memcpy((void *)dest, maddr_to_virt(src_maddr), bytes);
>  from_pos += bytes;
>  to_pos += bytes;
>  }
> @@ -161,7 +171,7 @@ xencomm_inline_to_guest(void *to, const 
>  dest_maddr = paddr_to_maddr(dest_paddr);
>  if (xencomm_debug)
>  printk("%lx[%d] -> %lx\n", (unsigned long)from, bytes, 
> dest_maddr);
> -memcpy((void *)dest_maddr, (void *)from, bytes);
> +memcpy(maddr_to_virt(dest_maddr), (void *)from, bytes);
>  dest_paddr += bytes;
>  from += bytes;
>  n -= bytes;
> @@ -196,7 +206,8 @@ xencomm_copy_to_guest(void *to, const vo
>  return xencomm_inline_to_guest(to, from, n, skip);
> 
>  /* first we need to access the descriptor */
> -desc = (struct xencomm_desc *)paddr_to_maddr((unsigned long)to);
> +desc = (struct xencomm_desc *)
> +xencomm_maddr_to_vaddr(paddr_to_maddr((unsigned long)to));
>  if (desc == NULL)
>  return n;
> 
> @@ -236,7 +247,7 @@ xencomm_copy_to_guest(void *to, const vo
> 
>  if (xencomm_debug)
>  printk("%lx[%d] -> %lx\n", source, bytes, dest_maddr);
> -memcpy((void *)dest_maddr, (void *)source, bytes);
> +memcpy(maddr_to_virt(dest_maddr), (void *)source, bytes);
>  from_pos += bytes;
>  to_pos += bytes;
>  }
> @@ -264,7 +275,8 @@ int xencomm_add_offset(void **handle, un
>  return xencomm_inline_add_offset(handle, bytes);
> 
>  /* first we need to access the descriptor */
> -desc = (struct xencomm_desc *)paddr_to_maddr((unsigned long)*handle);
> +desc = (struct xencomm_desc *)
> +xencomm_maddr_to_vaddr(paddr_to_maddr((unsigned long)*handle));
>  if (desc == NULL)
>  return -1;
> 
> @@ -310,7 +322,8 @@ int xencomm_handle_is_null(void *handle)
>  if (xencomm_is_inline(handle))
>  return xencomm_inline_addr(handle) == 0;
> 
> -desc = (struct xencomm_desc *)paddr_to_maddr((unsigned long)handle);
> +desc = (struct xencomm_desc *)
> +xencomm_maddr_to_vaddr(paddr_to_maddr((unsigned long)handle));
>  if (desc == NULL)
>  return 1;
> 

[XenPPC] [PATCH 5/7] xencomm take 3: linux side various fixes and preparation for consolidation

2007-08-14 Thread Isaku Yamahata
# HG changeset patch
# User [EMAIL PROTECTED]
# Date 1186975024 -32400
# Node ID 06791935b2cb9d69e94ca89ca8febcda627017b2
# Parent  d2f9b7e3623114e6a45c916f21b348fda122fa8e
[linux, xencomm] Various fixes to common xencomm.c for ia64 xencomm consolidation
  - move the xen_guest_handle() macro into include/xen/xencomm.h;
ia64 also uses it.
  - is_kern_addr() is powerpc-specific and other arches don't implement it.
It will be defined in linux/include/asm-ia64/xen/xencomm.h.
  - fix the error recovery path of xencomm_create():
xencomm_free() requires a pseudo-physical address, not a virtual address.
  - add a BUG_ON() to xencomm_create_mini() for its alignment requirement
  - use xencomm_pa() instead of __pa() in xencomm_map() and
xencomm_map_no_alloc().
They should work for statically allocated areas; on ia64 those aren't in
the straight mapping area, so xencomm_pa() is necessary.
  - work around a gcc bug: gcc 4.1.2 doesn't properly handle
stack variables with an align attribute.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16660

PATCHNAME: fix_xencomm_create_in_common_code

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>

diff -r d2f9b7e36231 -r 06791935b2cb arch/powerpc/platforms/xen/setup.h
--- a/arch/powerpc/platforms/xen/setup.hThu Aug 09 16:16:28 2007 +0100
+++ b/arch/powerpc/platforms/xen/setup.hMon Aug 13 12:17:04 2007 +0900
@@ -40,8 +40,6 @@ static inline u64 jiffies_to_ns(unsigned
	return j * (1000000000UL / HZ);
 }
 
-#define xen_guest_handle(hnd)  ((hnd).p)
-
 extern struct page *alloc_foreign_page(void);
 extern void free_foreign_page(struct page *page);
 
diff -r d2f9b7e36231 -r 06791935b2cb drivers/xen/core/xencomm.c
--- a/drivers/xen/core/xencomm.cThu Aug 09 16:16:28 2007 +0100
+++ b/drivers/xen/core/xencomm.cMon Aug 13 12:17:04 2007 +0900
@@ -23,6 +23,9 @@
 #include 
 #include 
 #include 
+#ifdef __ia64__
+#include/* for is_kern_addr() */
+#endif
 
 static int xencomm_init(struct xencomm_desc *desc,
void *buffer, unsigned long bytes)
@@ -111,7 +114,7 @@ static int xencomm_create(void *buffer, 
rc = xencomm_init(desc, buffer, bytes);
if (rc) {
printk("%s failure: %d\n", "xencomm_init", rc);
-   xencomm_free(desc);
+   xencomm_free((void *)__pa(desc));
return rc;
}
 
@@ -146,6 +149,7 @@ static int xencomm_create_mini(void *buf
 {
int rc = 0;
struct xencomm_desc *desc;
+   BUG_ON(((unsigned long)xc_desc) % sizeof(*xc_desc) != 0);
 
desc = (void *)xc_desc; 
 
@@ -170,7 +174,7 @@ void *xencomm_map(void *ptr, unsigned lo
if (rc || desc == NULL)
return NULL;
 
-   return (void *)__pa(desc);
+   return xencomm_pa(desc);
 }
 
 void *__xencomm_map_no_alloc(void *ptr, unsigned long bytes, 
@@ -188,5 +192,5 @@ void *__xencomm_map_no_alloc(void *ptr, 
if (rc)
return NULL;
  
-   return (void *)__pa(desc);
+   return xencomm_pa(desc);
 }
diff -r d2f9b7e36231 -r 06791935b2cb include/xen/xencomm.h
--- a/include/xen/xencomm.h Thu Aug 09 16:16:28 2007 +0100
+++ b/include/xen/xencomm.h Mon Aug 13 12:17:04 2007 +0900
@@ -35,10 +35,30 @@ extern void *__xencomm_map_no_alloc(void
 extern void *__xencomm_map_no_alloc(void *ptr, unsigned long bytes, 
struct xencomm_mini *xc_area);
 
-#define xencomm_map_no_alloc(ptr, bytes) \
-   ({struct xencomm_mini xc_desc\
-   __attribute__((__aligned__(sizeof(struct xencomm_mini;\
-   __xencomm_map_no_alloc(ptr, bytes, &xc_desc);})
+#if 0
+#define XENCOMM_MINI_ALIGNED(xc_desc, n)   \
+   struct xencomm_mini xc_desc ## _base[(n)]   \
+   __attribute__((__aligned__(sizeof(struct xencomm_mini;  \
+   struct xencomm_mini* xc_desc = &xc_desc ## _base[0];
+#else
+/*
+ * gcc bug workaround:
+ * http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16660
+ * gcc doesn't handle properly stack variable with
+ * __attribute__((__align__(sizeof(struct xencomm_mini
+ */
+#define XENCOMM_MINI_ALIGNED(xc_desc, n)   \
+   unsigned char xc_desc ## _base[((n) + 1 ) * \
+  sizeof(struct xencomm_mini)];\
+   struct xencomm_mini *xc_desc = (struct xencomm_mini*)   \
+   ((unsigned long)xc_desc ## _base +  \
+(sizeof(struct xencomm_mini) - \
+ ((unsigned long)xc_desc ## _base) %   \
+ sizeof(struct xencomm_mini)));
+#endif
+#define xencomm_map_no_alloc(ptr, bytes)   \
+   ({XENCOMM_MINI_ALIGNED(xc_desc, 1); \
+   __xencomm_map_no_alloc(ptr, bytes, xc_desc);})
 
 /* provided by architecture code: */
 extern unsigned long xencomm_vtop(unsigned long 

[XenPPC] [PATCH 6/7] xencomm take 3: linux side introduce struct xencomm_handle *

2007-08-14 Thread Isaku Yamahata
# HG changeset patch
# User [EMAIL PROTECTED]
# Date 1186473244 -32400
# Node ID 7b88b56c310b11abe510cf32fc5095bc98979168
# Parent  06791935b2cb9d69e94ca89ca8febcda627017b2
[xencomm] introduce the opaque type struct xencomm_handle * for xencommized values
This patch is preparation for xencomm consolidation.
powerpc uses void * for xencommized values; IA64, on the other hand, uses
struct xencomm_handle *. Unify on struct xencomm_handle *.
PATCHNAME: introduce_xencomm_handle

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>

diff -r 06791935b2cb -r 7b88b56c310b arch/powerpc/platforms/xen/hcall.c
--- a/arch/powerpc/platforms/xen/hcall.cMon Aug 13 12:17:04 2007 +0900
+++ b/arch/powerpc/platforms/xen/hcall.cTue Aug 07 16:54:04 2007 +0900
@@ -56,7 +56,7 @@
 
 int HYPERVISOR_console_io(int cmd, int count, char *str)
 {
-   void *desc;
+   struct xencomm_handle *desc;
int rc;
 
desc = xencomm_map_no_alloc(str, count);
@@ -76,7 +76,8 @@ int HYPERVISOR_event_channel_op(int cmd,
 {
int rc;
 
-   void *desc = xencomm_map_no_alloc(op, sizeof(evtchn_op_t));
+   struct xencomm_handle *desc =
+   xencomm_map_no_alloc(op, sizeof(evtchn_op_t));
if (desc == NULL)
return -EINVAL;
 
@@ -92,7 +93,7 @@ EXPORT_SYMBOL(HYPERVISOR_event_channel_o
 
 int HYPERVISOR_xen_version(int cmd, void *arg)
 {
-   void *desc;
+   struct xencomm_handle *desc;
const unsigned long hcall = __HYPERVISOR_xen_version;
int argsize;
int rc;
@@ -144,7 +145,8 @@ EXPORT_SYMBOL(HYPERVISOR_xen_version);
 
 int HYPERVISOR_physdev_op(int cmd, void *op)
 {
-   void *desc = xencomm_map_no_alloc(op, sizeof(physdev_op_t));
+   struct xencomm_handle *desc =
+   xencomm_map_no_alloc(op, sizeof(physdev_op_t));
int rc;
 
if (desc == NULL)
@@ -163,8 +165,8 @@ int HYPERVISOR_sched_op(int cmd, void *a
 {
int argsize = 0;
int rc = -EINVAL;
-   void *desc;
-   evtchn_port_t *ports = NULL;
+   struct xencomm_handle *desc;
+   struct xencomm_handle *ports = NULL;
 
switch (cmd) {
case SCHEDOP_yield:
@@ -187,7 +189,7 @@ int HYPERVISOR_sched_op(int cmd, void *a
if (ports == NULL)
return -ENOMEM;
 
-   set_xen_guest_handle(sched_poll.ports, ports);
+   set_xen_guest_handle(sched_poll.ports, (evtchn_port_t *)ports);
memcpy(arg, &sched_poll, sizeof(sched_poll));
 
}
@@ -222,7 +224,7 @@ int HYPERVISOR_suspend(unsigned long sre
struct sched_shutdown sched_shutdown = {
.reason = SHUTDOWN_suspend,
};
-   void *desc;
+   struct xencomm_handle *desc;
 
desc = xencomm_map_no_alloc(&sched_shutdown, sizeof(struct 
sched_shutdown));
 
@@ -234,7 +236,7 @@ int HYPERVISOR_kexec_op(unsigned long op
 int HYPERVISOR_kexec_op(unsigned long op, void *args)
 {
unsigned long argsize;
-   void *desc;
+   struct xencomm_handle *desc;
 
switch (op) {
case KEXEC_CMD_kexec_get_range:
@@ -316,8 +318,8 @@ static int xenppc_privcmd_domctl(privcmd
 {
xen_domctl_t kern_op;
xen_domctl_t __user *user_op = (xen_domctl_t __user *)hypercall->arg[0];
-   void *op_desc;
-   void *desc = NULL;
+   struct xencomm_handle *op_desc;
+   struct xencomm_handle *desc = NULL;
int ret = 0;
 
if (copy_from_user(&kern_op, user_op, sizeof(xen_domctl_t)))
@@ -349,7 +351,7 @@ static int xenppc_privcmd_domctl(privcmd
ret = -ENOMEM;
 
set_xen_guest_handle(kern_op.u.getmemlist.buffer,
-desc);
+(void *)desc);
break;
case XEN_DOMCTL_getpageframeinfo:
break;
@@ -362,7 +364,7 @@ static int xenppc_privcmd_domctl(privcmd
ret = -ENOMEM;
 
set_xen_guest_handle(kern_op.u.getpageframeinfo2.array,
-desc);
+(void *)desc);
break;
case XEN_DOMCTL_shadow_op:
 
@@ -376,7 +378,7 @@ static int xenppc_privcmd_domctl(privcmd
ret = -ENOMEM;
 
set_xen_guest_handle(kern_op.u.shadow_op.dirty_bitmap,
-desc);
+(void *)desc);
}
break;
case XEN_DOMCTL_max_mem:
@@ -391,7 +393,7 @@ static int xenppc_privcmd_domctl(privcmd
ret = -ENOMEM;
 
set_xen_guest_handle(kern_op.u.vcpucontext.ctxt,
-desc);
+(void *)desc);
break;
case XEN_DOMCTL_getvcpuinfo:
break;
@@ -405,7 +407,7 @@ static int xenppc_privcmd_domctl(privcmd
  

[XenPPC] [PATCH 4/7] xencomm take 3: xen side multiple page support

2007-08-14 Thread Isaku Yamahata
# HG changeset patch
# User [EMAIL PROTECTED]
# Date 1187077969 -32400
# Node ID b53a87fec6dc7a9f2cc3b3b7df72fa8b2c7fe081
# Parent  cf7a9141e7f884a14cfab8d3785c95dea85749be
[xen, xencomm] xencomm multiple page support
The current implementation doesn't allow the struct xencomm_desc::address
array to span more than a single page. On IA64 this causes domain creation
to fail for 64GB+ domains. This patch generalizes xencomm to allow a
multi-page xencomm_desc::address array.
PATCHNAME: xencomm_multiple_page

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>

diff -r cf7a9141e7f8 -r b53a87fec6dc xen/common/xencomm.c
--- a/xen/common/xencomm.c  Tue Aug 14 17:17:01 2007 +0900
+++ b/xen/common/xencomm.c  Tue Aug 14 16:52:49 2007 +0900
@@ -17,6 +17,7 @@
  *
  * Authors: Hollis Blanchard <[EMAIL PROTECTED]>
  *  Tristan Gingold <[EMAIL PROTECTED]>
+ *  Isaku Yamahata <[EMAIL PROTECTED]> multiple page support
  */
 
 #include 
@@ -81,10 +82,11 @@ xencomm_desc_cross_page_boundary(unsigne
 }
 
 struct xencomm_ctxt {
-struct xencomm_desc *desc;
-
+struct xencomm_desc __user *desc_in_paddr;
 uint32_t nr_addrs;
+
 struct page_info *page;
+unsigned long *address;
 };
 
 static uint32_t
@@ -94,21 +96,9 @@ xencomm_ctxt_nr_addrs(const struct xenco
 }
 
 static unsigned long*
-xencomm_ctxt_address(struct xencomm_ctxt *ctxt, int i)
-{
-return &ctxt->desc->address[i];
-}
-
-/* check if struct xencomm_desc and address array cross page boundary */
-static int
-xencomm_ctxt_cross_page_boundary(struct xencomm_ctxt *ctxt)
-{
-unsigned long saddr = (unsigned long)ctxt->desc;
-unsigned long eaddr =
-(unsigned long)&ctxt->desc->address[ctxt->nr_addrs] - 1;
-if ((saddr >> PAGE_SHIFT) != (eaddr >> PAGE_SHIFT))
-return 1;
-return 0;
+xencomm_ctxt_address(struct xencomm_ctxt *ctxt)
+{
+return ctxt->address;
 }
 
 static int
@@ -136,15 +126,38 @@ xencomm_ctxt_init(const void* handle, st
 return -EINVAL;
 }
 
-ctxt->desc = desc;
 ctxt->nr_addrs = desc->nr_addrs; /* copy before use.
   * It is possible for a guest domain to
   * modify concurrently.
   */
+ctxt->desc_in_paddr = (struct xencomm_desc*)handle;
 ctxt->page = page;
-if (xencomm_ctxt_cross_page_boundary(ctxt)) {
-put_page(page);
-return -EINVAL;
+ctxt->address = &desc->address[0];
+return 0;
+}
+
+static int
+xencomm_ctxt_next(struct xencomm_ctxt *ctxt, int i)
+{
+BUG_ON(i >= ctxt->nr_addrs);
+/* in the i == 0 case, the address was already calculated in xencomm_ctxt_init() */
+if (i != 0)
+ctxt->address++;
+
+/* When crossing page boundary, machine address must be calculated. */
+if (((unsigned long)ctxt->address & ~PAGE_MASK) == 0) {
+unsigned long paddr =
+(unsigned long)&(ctxt->desc_in_paddr->address[i]);
+struct page_info *page;
+int ret;
+
+ret = xencomm_get_page(paddr, &page);
+if (ret == 0) {
+put_page(ctxt->page);
+ctxt->page = page;
+ctxt->address = xencomm_vaddr(paddr, page);
+}
+return ret;
 }
 return 0;
 }
@@ -236,11 +249,14 @@ xencomm_copy_from_guest(void *to, const 
 
 /* iterate through the descriptor, copying up to a page at a time */
 while ((to_pos < n) && (i < xencomm_ctxt_nr_addrs(&ctxt))) {
-unsigned long src_paddr = *xencomm_ctxt_address(&ctxt, i);
+unsigned long src_paddr;
 unsigned int pgoffset;
 unsigned int chunksz;
 unsigned int chunk_skip;
 
+if (xencomm_ctxt_next(&ctxt, i))
+goto out;
+src_paddr = *xencomm_ctxt_address(&ctxt);
 if (src_paddr == XENCOMM_INVALID) {
 i++;
 continue;
@@ -353,11 +369,14 @@ xencomm_copy_to_guest(void *to, const vo
 
 /* iterate through the descriptor, copying up to a page at a time */
 while ((from_pos < n) && (i < xencomm_ctxt_nr_addrs(&ctxt))) {
-unsigned long dest_paddr = *xencomm_ctxt_address(&ctxt, i);
+unsigned long dest_paddr;
 unsigned int pgoffset;
 unsigned int chunksz;
 unsigned int chunk_skip;
 
+if (xencomm_ctxt_next(&ctxt, i))
+goto out;
+dest_paddr = *xencomm_ctxt_address(&ctxt);
 if (dest_paddr == XENCOMM_INVALID) {
 i++;
 continue;
@@ -401,21 +420,28 @@ int xencomm_add_offset(void **handle, un
 {
 struct xencomm_ctxt ctxt;
 int i = 0;
+int res = 0;
 
 if (xencomm_is_inline(*handle))
 return xencomm_inline_add_offset(handle, bytes);
 
-if (xencomm_ctxt_init(handle, &ctxt))
-return -1;
+res = xencomm_ctxt_init(handle, &ctxt);
+if (res != 0)
+return res;
 
 /* iterate through the descriptor incrementing addresses */
 while ((bytes > 0) && (i < xencomm_ctxt_nr_addrs(&ctxt))) {
-unsigned long *addr

[XenPPC] [PATCH 7/7] xencomm take 3: linux side remove xencomm page size limit

2007-08-14 Thread Isaku Yamahata
# HG changeset patch
# User [EMAIL PROTECTED]
# Date 1185948239 -32400
# Node ID d2884d84d3fd5c52a6699eb7b98ffaf3b8f7a422
# Parent  7b88b56c310b11abe510cf32fc5095bc98979168
[linux, xencomm] remove the xencomm page size limit.
Currently xencomm has a page size limit, so a domain with a large amount
of memory (e.g. 100GB+) can't be created.

Now the xen side of xencomm accepts a struct xencomm_desc whose address
array crosses a page boundary, so it is no longer necessary to allocate a
single page just to avoid crossing one; we can allocate exactly sized
memory instead.
Note that struct xencomm_desc itself can't cross a page boundary, and the
slab allocator only returns sizeof(void*)-aligned pointers. Where
sizeof(*desc) > sizeof(void*), e.g. in a 32-bit environment, the pointer
returned by the slab allocator doesn't guarantee that struct xencomm_desc
stays within one page, so we fall back to the page allocator.
PATCHNAME: remove_xencomm_page_size_limit_common_code

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>

diff -r 7b88b56c310b -r d2884d84d3fd drivers/xen/core/xencomm.c
--- a/drivers/xen/core/xencomm.cTue Aug 07 16:54:04 2007 +0900
+++ b/drivers/xen/core/xencomm.cWed Aug 01 15:03:59 2007 +0900
@@ -68,25 +68,54 @@ static int xencomm_init(struct xencomm_d
return 0;
 }
 
-/* XXX use slab allocator */
-static struct xencomm_desc *xencomm_alloc(gfp_t gfp_mask)
-{
-   struct xencomm_desc *desc;
-
-   desc = (struct xencomm_desc *)__get_free_page(gfp_mask);
-   if (desc == NULL)
-   return NULL;
-
-   desc->nr_addrs = (PAGE_SIZE - sizeof(struct xencomm_desc)) /
+static struct xencomm_desc *xencomm_alloc(gfp_t gfp_mask,
+ void *buffer, unsigned long bytes)
+{
+   struct xencomm_desc *desc;
+   unsigned long buffer_ulong = (unsigned long)buffer;
+   unsigned long start = buffer_ulong & PAGE_MASK;
+   unsigned long end = (buffer_ulong + bytes) | ~PAGE_MASK;
+   unsigned long nr_addrs = (end - start + 1) >> PAGE_SHIFT;
+   unsigned long size = sizeof(*desc) +
+   sizeof(desc->address[0]) * nr_addrs;
+
+   /*
+* slab allocator returns at least sizeof(void*) aligned pointer.
+* When sizeof(*desc) > sizeof(void*), struct xencomm_desc might
+* cross page boundary.
+*/
+   if (sizeof(*desc) > sizeof(void*)) {
+   unsigned long order = get_order(size);
+   desc = (struct xencomm_desc *)__get_free_pages(gfp_mask,
+  order);
+   if (desc == NULL)
+   return NULL;
+
+   desc->nr_addrs =
+   ((PAGE_SIZE << order) - sizeof(struct xencomm_desc)) /
sizeof(*desc->address);
-
+   } else {
+   desc = kmalloc(size, gfp_mask);
+   if (desc == NULL)
+   return NULL;
+
+   desc->nr_addrs = nr_addrs;
+   }
return desc;
 }
 
 void xencomm_free(struct xencomm_handle *desc)
 {
-   if (desc && !((ulong)desc & XENCOMM_INLINE_FLAG))
-   free_page((unsigned long)__va(desc));
+   if (desc && !((ulong)desc & XENCOMM_INLINE_FLAG)) {
+   struct xencomm_desc *desc__ = (struct xencomm_desc*)desc;
+   if (sizeof(*desc__) > sizeof(void*)) {
+   unsigned long size = sizeof(*desc__) +
+   sizeof(desc__->address[0]) * desc__->nr_addrs;
+   unsigned long order = get_order(size);
+   free_pages((unsigned long)__va(desc), order);
+   } else
+   kfree(__va(desc));
+   }
 }
 
 static int xencomm_create(void *buffer, unsigned long bytes, struct 
xencomm_desc **ret, gfp_t gfp_mask)
@@ -105,7 +134,7 @@ static int xencomm_create(void *buffer, 
 
BUG_ON(buffer == NULL); /* 'bytes' is non-zero */
 
-   desc = xencomm_alloc(gfp_mask);
+   desc = xencomm_alloc(gfp_mask, buffer, bytes);
if (!desc) {
printk("%s failure\n", "xencomm_alloc");
return -ENOMEM;

[XenPPC] [PATCH 3/7] xencomm take 3: xen side fix invalid or racy access

2007-08-14 Thread Isaku Yamahata
# HG changeset patch
# User [EMAIL PROTECTED]
# Date 1187079421 -32400
# Node ID cf7a9141e7f884a14cfab8d3785c95dea85749be
# Parent  4dbbedee6bb8d594287940470a61b8c0c56daf9c
[xen, xencomm] fix various invalid or racy xencomm accesses.
- Xencomm should check the alignment of struct xencomm_desc.
- Xencomm should check that struct xencomm_desc itself (8 bytes) doesn't
  cross a page boundary. Otherwise a hostile guest kernel could pass a
  pointer that crosses a page boundary, and xencomm would access an
  unrelated page.
- Xencomm shouldn't access struct xencomm_desc::nr_addrs multiple times.
  Copy it to a local area and use the copy; otherwise a hostile guest
  can modify it concurrently.
- Xencomm should check whether the struct xencomm_desc::address[] array
  crosses a page boundary. Otherwise xencomm may access unrelated pages.
- Xencomm should get_page()/put_page() after converting an address from
  paddr to maddr, because xen supports SMP and the balloon driver.
  Otherwise another vcpu may free the page at the same time.
  Such domain behaviour doesn't make sense, but nothing prevents it.

PATCHNAME: fix_xencomm_invalid_racy_access

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>

diff -r 4dbbedee6bb8 -r cf7a9141e7f8 xen/common/xencomm.c
--- a/xen/common/xencomm.c  Tue Aug 14 16:46:23 2007 +0900
+++ b/xen/common/xencomm.c  Tue Aug 14 17:17:01 2007 +0900
@@ -35,12 +35,149 @@ static int xencomm_debug = 1; /* extreme
 #endif
 
 static void*
-xencomm_maddr_to_vaddr(unsigned long maddr)
-{
+xencomm_vaddr(unsigned long paddr, struct page_info *page)
+{
+return (void*)((paddr & ~PAGE_MASK) | (unsigned long)page_to_virt(page));
+}
+
+/* get_page() to prevent from another vcpu freeing the page */
+static int
+xencomm_get_page(unsigned long paddr, struct page_info **page)
+{
+unsigned long maddr = paddr_to_maddr(paddr);
 if (maddr == 0)
-return NULL;
-
-return maddr_to_virt(maddr);
+return -EFAULT;
+
+*page = maddr_to_page(maddr);
+if (get_page(*page, current->domain) == 0) {
+if (page_get_owner(*page) != current->domain) {
+/*
+ * This page might be a page granted by another domain, or
+ * this page is freed with decrease reservation hypercall at
+ * the same time.
+ */
+gdprintk(XENLOG_WARNING,
+ "bad page is passed. paddr 0x%lx maddr 0x%lx\n",
+ paddr, maddr);
+return -EFAULT;
+}
+
+/* Try again. */
+cpu_relax();
+return -EAGAIN;
+}
+
+return 0;
+}
+
+/* check if struct desc doesn't cross page boundary */
+static int
+xencomm_desc_cross_page_boundary(unsigned long paddr)
+{
+unsigned long offset = paddr & ~PAGE_MASK;
+if (offset > PAGE_SIZE - sizeof(struct xencomm_desc))
+return 1;
+return 0;
+}
+
+struct xencomm_ctxt {
+struct xencomm_desc *desc;
+
+uint32_t nr_addrs;
+struct page_info *page;
+};
+
+static uint32_t
+xencomm_ctxt_nr_addrs(const struct xencomm_ctxt *ctxt)
+{
+return ctxt->nr_addrs;
+}
+
+static unsigned long*
+xencomm_ctxt_address(struct xencomm_ctxt *ctxt, int i)
+{
+return &ctxt->desc->address[i];
+}
+
+/* check if struct xencomm_desc and address array cross page boundary */
+static int
+xencomm_ctxt_cross_page_boundary(struct xencomm_ctxt *ctxt)
+{
+unsigned long saddr = (unsigned long)ctxt->desc;
+unsigned long eaddr =
+(unsigned long)&ctxt->desc->address[ctxt->nr_addrs] - 1;
+if ((saddr >> PAGE_SHIFT) != (eaddr >> PAGE_SHIFT))
+return 1;
+return 0;
+}
+
+static int
+xencomm_ctxt_init(const void* handle, struct xencomm_ctxt *ctxt)
+{
+struct page_info *page;
+struct xencomm_desc *desc;
+int ret;
+
+/* avoid unaligned access */
+if ((unsigned long)handle % __alignof__(*desc) != 0)
+return -EINVAL;
+if (xencomm_desc_cross_page_boundary((unsigned long)handle))
+return -EINVAL;
+
+/* first we need to access the descriptor */
+ret = xencomm_get_page((unsigned long)handle, &page);
+if (ret)
+return ret;
+
+desc = xencomm_vaddr((unsigned long)handle, page);
+if (desc->magic != XENCOMM_MAGIC) {
+printk("%s: error: %p magic was 0x%x\n", __func__, desc, desc->magic);
+put_page(page);
+return -EINVAL;
+}
+
+ctxt->desc = desc;
+ctxt->nr_addrs = desc->nr_addrs; /* copy before use.
+  * It is possible for a guest domain to
+  * modify concurrently.
+  */
+ctxt->page = page;
+if (xencomm_ctxt_cross_page_boundary(ctxt)) {
+put_page(page);
+return -EINVAL;
+}
+return 0;
+}
+
+static void
+xencomm_ctxt_done(struct xencomm_ctxt *ctxt)
+{
+put_page(ctxt->page);
+}
+
+static int
+xencomm_copy_chunk_from(unsigned long to, unsigned long paddr,
+unsigned int  len)
+{
+struct p

[XenPPC] [PATCH 1/7] xencomm take 3: xen side trivial bug fix

2007-08-14 Thread Isaku Yamahata
# HG changeset patch
# User [EMAIL PROTECTED]
# Date 1187077482 -32400
# Node ID 68867379b785a9a6fd37ca75be64fa7b5e3b8a2b
# Parent  cd51fa91956be20dbd744d46117f7f989e08c334
[xen, xencomm] xencomm trivial bug fix
- fix the return value of xencomm_copy_to_guest()
- fix xencomm_add_offset()
PATCHNAME: xencomm_trivial_fix

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>

diff -r cd51fa91956b -r 68867379b785 xen/common/xencomm.c
--- a/xen/common/xencomm.c  Sun Aug 12 14:50:02 2007 -0600
+++ b/xen/common/xencomm.c  Tue Aug 14 16:44:42 2007 +0900
@@ -232,7 +232,7 @@ xencomm_copy_to_guest(void *to, const vo
 
 dest_maddr = paddr_to_maddr(dest_paddr + chunk_skip);
 if (dest_maddr == 0)
-return -1;
+return n - from_pos;
 
 if (xencomm_debug)
 printk("%lx[%d] -> %lx\n", source, bytes, dest_maddr);
@@ -279,6 +279,11 @@ int xencomm_add_offset(void **handle, un
 unsigned int pgoffset;
 unsigned int chunksz;
 unsigned int chunk_skip;
+
+if (dest_paddr == XENCOMM_INVALID) {
+i++;
+continue;
+}
 
 pgoffset = dest_paddr % PAGE_SIZE;
 chunksz = PAGE_SIZE - pgoffset;
@@ -291,6 +296,8 @@ int xencomm_add_offset(void **handle, un
 desc->address[i] += chunk_skip;
 }
 bytes -= chunk_skip;
+
+i++;
 }
 return 0;
 }

[XenPPC] [PATCH 2/7] xencomm take 3: xen side preparation for consolidation.

2007-08-14 Thread Isaku Yamahata
# HG changeset patch
# User [EMAIL PROTECTED]
# Date 1187077583 -32400
# Node ID 4dbbedee6bb8d594287940470a61b8c0c56daf9c
# Parent  68867379b785a9a6fd37ca75be64fa7b5e3b8a2b
[xen, xencomm] preparation for xencomm consolidation.
Xen/powerpc runs in real mode, so it uses maddr interchangeably with
vaddr.
That isn't the case in xen/ia64, where it is necessary to convert maddr
to vaddr to access the page. maddr_to_virt() doesn't convert on powerpc,
so it should work on both architectures.
PATCHNAME: xencomm_consolidation_maddr_vaddr

Signed-off-by: Isaku Yamahata <[EMAIL PROTECTED]>

diff -r 68867379b785 -r 4dbbedee6bb8 xen/common/xencomm.c
--- a/xen/common/xencomm.c  Tue Aug 14 16:44:42 2007 +0900
+++ b/xen/common/xencomm.c  Tue Aug 14 16:46:23 2007 +0900
@@ -34,6 +34,15 @@ static int xencomm_debug = 1; /* extreme
 #define xencomm_debug 0
 #endif
 
+static void*
+xencomm_maddr_to_vaddr(unsigned long maddr)
+{
+if (maddr == 0)
+return NULL;
+
+return maddr_to_virt(maddr);
+}
+
 static unsigned long
 xencomm_inline_from_guest(void *to, const void *from, unsigned int n,
 unsigned int skip)
@@ -54,7 +63,7 @@ xencomm_inline_from_guest(void *to, cons
 src_maddr = paddr_to_maddr(src_paddr);
 if (xencomm_debug)
 printk("%lx[%d] -> %lx\n", src_maddr, bytes, (unsigned long)to);
-memcpy(to, (void *)src_maddr, bytes);
+memcpy(to, maddr_to_virt(src_maddr), bytes);
 src_paddr += bytes;
 to += bytes;
 n -= bytes;
@@ -89,7 +98,8 @@ xencomm_copy_from_guest(void *to, const 
 return xencomm_inline_from_guest(to, from, n, skip);
 
 /* first we need to access the descriptor */
-desc = (struct xencomm_desc *)paddr_to_maddr((unsigned long)from);
+desc = (struct xencomm_desc *)
+xencomm_maddr_to_vaddr(paddr_to_maddr((unsigned long)from));
 if (desc == NULL)
 return n;
 
@@ -130,7 +140,7 @@ xencomm_copy_from_guest(void *to, const 
 
 if (xencomm_debug)
 printk("%lx[%d] -> %lx\n", src_maddr, bytes, dest);
-memcpy((void *)dest, (void *)src_maddr, bytes);
+memcpy((void *)dest, maddr_to_virt(src_maddr), bytes);
 from_pos += bytes;
 to_pos += bytes;
 }
@@ -161,7 +171,7 @@ xencomm_inline_to_guest(void *to, const 
 dest_maddr = paddr_to_maddr(dest_paddr);
 if (xencomm_debug)
 printk("%lx[%d] -> %lx\n", (unsigned long)from, bytes, dest_maddr);
-memcpy((void *)dest_maddr, (void *)from, bytes);
+memcpy(maddr_to_virt(dest_maddr), (void *)from, bytes);
 dest_paddr += bytes;
 from += bytes;
 n -= bytes;
@@ -196,7 +206,8 @@ xencomm_copy_to_guest(void *to, const vo
 return xencomm_inline_to_guest(to, from, n, skip);
 
 /* first we need to access the descriptor */
-desc = (struct xencomm_desc *)paddr_to_maddr((unsigned long)to);
+desc = (struct xencomm_desc *)
+xencomm_maddr_to_vaddr(paddr_to_maddr((unsigned long)to));
 if (desc == NULL)
 return n;
 
@@ -236,7 +247,7 @@ xencomm_copy_to_guest(void *to, const vo
 
 if (xencomm_debug)
 printk("%lx[%d] -> %lx\n", source, bytes, dest_maddr);
-memcpy((void *)dest_maddr, (void *)source, bytes);
+memcpy(maddr_to_virt(dest_maddr), (void *)source, bytes);
 from_pos += bytes;
 to_pos += bytes;
 }
@@ -264,7 +275,8 @@ int xencomm_add_offset(void **handle, un
 return xencomm_inline_add_offset(handle, bytes);
 
 /* first we need to access the descriptor */
-desc = (struct xencomm_desc *)paddr_to_maddr((unsigned long)*handle);
+desc = (struct xencomm_desc *)
+xencomm_maddr_to_vaddr(paddr_to_maddr((unsigned long)*handle));
 if (desc == NULL)
 return -1;
 
@@ -310,7 +322,8 @@ int xencomm_handle_is_null(void *handle)
 if (xencomm_is_inline(handle))
 return xencomm_inline_addr(handle) == 0;
 
-desc = (struct xencomm_desc *)paddr_to_maddr((unsigned long)handle);
+desc = (struct xencomm_desc *)
+xencomm_maddr_to_vaddr(paddr_to_maddr((unsigned long)handle));
 if (desc == NULL)
 return 1;
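The contract of the xencomm_maddr_to_vaddr() helper this patch introduces can be illustrated with a small stand-alone sketch. It models the ia64 case, where the hypervisor runs with translation on and maddr_to_virt() must rebase a machine address into a mapped window before it can be dereferenced; the directmap array and all _sketch names below are invented for illustration and are not Xen APIs:

```c
#include <assert.h>
#include <stddef.h>

/* Invented directmap backing maddrs 0x00..0xff; on powerpc the
 * translation would be the identity, so the same code works there. */
static unsigned char fake_directmap[0x100];

static void *maddr_to_virt_sketch(unsigned long maddr)
{
    return fake_directmap + maddr;
}

/* The patch's helper: a zero maddr (a failed paddr_to_maddr() lookup)
 * must yield NULL rather than a pointer to the base of the directmap,
 * so every caller can bail out before handing the result to memcpy(). */
static void *xencomm_maddr_to_vaddr_sketch(unsigned long maddr)
{
    if (maddr == 0)
        return NULL;
    return maddr_to_virt_sketch(maddr);
}
```

This is exactly the guard Hollis asked for: without the NULL check, a guest-supplied bad paddr would translate to maddr 0 and the subsequent memcpy() would scribble over whatever the directmap base maps.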
 

[XenPPC] [PATCH 0/7] xencomm take 3: preparation for consolidation and various fixes

2007-08-14 Thread Isaku Yamahata

This take 3 patch series is for xencomm consolidation and various fixes.
I split the xen side patch into four smaller ones and cleaned them up.
The Linux side patch is the same as the previous one.
The open issues are the added complexity and the multi-page support,
so I think 3/7, 4/7 and 7/7 are moot; the other patches can be merged.


[PATCH 1/7] xencomm take 3: xen side trivial bug fix
[PATCH 2/7] xencomm take 3: xen side preparetion for consolidation.
[PATCH 3/7] xencomm take 3: xen side fix invalid or racy access
[PATCH 4/7] xencomm take 3: xen side multiple page support
[PATCH 5/7] xencomm take 3: linux side various fixes and preparation for consolidation
[PATCH 6/7] xencomm take 3: linux side introduce struct xencomm_handle *
[PATCH 7/7] xencomm take 3: linux side remove xencomm page size limit

Changes from take 2:
xen side
  - split up into 4 patches and cleaned them up
  - added more race checks
Linux side:
  - no change

Changes from take 1:
xen side: various fixes and preparation for consolidation
  - sorted out the maddr and vaddr issue
  - returned an error with a warning message instead of panicking the domain
linux side: various fixes and preparation for consolidation
  - updated the gcc workaround; it is a generic issue, not ia64 specific

Thanks,
