[Xen-ia64-devel] RE: [PATCH 04/15] ia64/pv_ops: introduce pv_info which describes some random info.

2008-04-22 Thread Dong, Eddie

 
 Rather than making these binary patches, why not make them fast
 syscalls and using a vdso page. Some of the privileged instructions
 are simply reads and we could have that information in a read-only
 data page, so there is no need to do a context switch at all. Others
 could benefit from a fast system call that doesn't do a full context
 switch. 

The issue is that we don't want to change the Linux code much, otherwise
it won't be accepted. If we keep the same logic as the original Linux code
but go through an indirect function call, binary patching can later
recover the performance.

 
 It would be nice if we could come up with a generic implementation for
 such a vdso style interface that could be shared between
 xen/kvm/lguest. 
 
Introducing a new mechanism and then changing the Linux code to use it
is a different story.
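
For reference, the read-only data page idea quoted above can be sketched
roughly as below. This is only an illustration under assumptions: the page
layout, the field names and the pv_get_tpr/pv_set_tpr helpers are all
hypothetical, and the "page" is an ordinary static variable rather than
memory mapped by a hypervisor.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout of a page the hypervisor maps read-only into the guest. */
struct pv_shared_page {
    uint64_t psr_i;   /* virtual interrupt-enable bit */
    uint64_t tpr;     /* virtual task priority register */
    uint64_t itm;     /* virtual interval timer match */
};

/* Stand-in for the mapped page; a real guest would be handed its address. */
static struct pv_shared_page demo_page = { .psr_i = 1, .tpr = 0x10, .itm = 1000 };
static const volatile struct pv_shared_page *shared = &demo_page;

/* A pure read is an ordinary load: no trap and no context switch. */
static inline uint64_t pv_get_tpr(void)
{
    return shared->tpr;
}

/* A write still needs the hypervisor; modelled here as a counted "fast call". */
static unsigned long fast_calls;

static void pv_set_tpr(uint64_t val)
{
    fast_calls++;          /* where the fast system call would happen */
    demo_page.tpr = val;   /* the hypervisor side updates the page */
}
```

Reading goes through the const pointer only, so the guest side never writes
the page; every update is funnelled through the one call that would enter
the hypervisor.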

Eddie

___
Xen-ia64-devel mailing list
Xen-ia64-devel@lists.xensource.com
http://lists.xensource.com/xen-ia64-devel


[Xen-ia64-devel] RE: [PATCH 04/15] ia64/pv_ops: introduce pv_info which describes some random info.

2008-04-22 Thread Dong, Eddie
Jes Sorensen wrote:
 Dong, Eddie wrote:
 Rather than making these binary patches, why not make them fast
 syscalls and using a vdso page. Some of the privileged instructions
 are simply reads and we could have that information in a read-only
 data page, so there is no need to do a context switch at all. Others
 could benefit from a fast system call that doesn't do a full context
 switch.
 
 The issue is we don't want to change Linux code a lot, otherwise it
 won't be accepted. If we use same logic with original Linux,
 but it is using indirect function call now, binary patching could
 help in performance.
 
 Hi Eddie,
 
 Sorry but this is a wrong assumption. If the code is correct then
 there is no reason why it will not be accepted. It's far more
 important to avoid ugly clutter that makes the code hard to maintain.
 

My understanding is that code such as the IVT table is well tuned, and it
is really difficult to persuade people to replace those privileged
resource access instructions, such as mov GRx=CRy, with a vdso or
something equivalent. For privileged resource accesses in C code, as Isaku
mentioned, we need to consider native too.

Anyway, binary patching is just an optimization that x86 uses, and there
is no reason IA64 can't adopt it, at least for replacing indirect function
calls with direct function calls.

Eddie



RE: [Xen-ia64-devel] pv_ops: imntrinsic pv_ops

2008-04-07 Thread Dong, Eddie
Isaku Yamahata wrote:
 Hi Eddie.
 
 I commited some clean ups based on your patch.
 Could you please review it?
 

It looks like you still prefer an intermediate symbol,
paravirt_/ia64_native_xxx, to wrap ia64_xxx. When I saw so much similar,
bulky code in the patch, it felt dirty. I prefer that we don't touch
those gcc_intrinsic.h macros.

If you don't like the way I wrap it, i.e. macro-level #ifdef/#else,
you could simply provide a new header file for CONFIG_PARAVIRT which
uses pv_ops.name for all of those same macros.

For example, in intrinsic.h:

#ifdef CONFIG_PARAVIRT_GUEST
#include paravirt_intrinsic.h
#else 
 gcc_intrinsic.h or intel_intrinsic.h.


Eddie



[Xen-ia64-devel] pv_ops progress ask for suggestion

2008-04-07 Thread Dong, Eddie
Tony & all:
Recently we have completed the pv_ops work for IVT.s by using dual
compile, along with many cleanups that simplify the changes to upstream
code. All the C code touching privileged instructions now uses indirect
function calls (to be binary patched into direct function calls in
future), and the IVT table is dual compiled to minimize the impact on the
native IVT table. However, we face a dilemma in handling kernel/entry.S
and in choosing a generic policy for the other ASM files.
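
To make the indirect function call approach concrete, here is a minimal
sketch. It is not the code from the patch series: the table is trimmed to
two hooks, every demo_*/virt_* name is made up for illustration, and the
privileged instructions are replaced by plain variables.

```c
#include <assert.h>

/* Hypothetical, trimmed-down ops table; the real pv_cpu_ops has many more hooks. */
struct demo_pv_cpu_ops {
    unsigned long (*get_psr_i)(void);   /* read the interrupt-enable bit */
    void (*rsm_i)(void);                /* clear the interrupt-enable bit */
};

/* "Native" implementation: stands in for the real privileged instructions. */
static unsigned long native_psr_i = 1;
static unsigned long native_get_psr_i(void) { return native_psr_i; }
static void native_rsm_i(void) { native_psr_i = 0; }

/* "Hypervisor" implementation: works on a virtual copy instead of trapping. */
static unsigned long virt_psr_i = 1;
static unsigned long virt_get_psr_i(void) { return virt_psr_i; }
static void virt_rsm_i(void) { virt_psr_i = 0; }

/* Installed from the very start of boot; defaults to native. */
static struct demo_pv_cpu_ops demo_pv_cpu_ops = {
    .get_psr_i = native_get_psr_i,
    .rsm_i     = native_rsm_i,
};

/* A guest would do this once, early, when it detects the hypervisor. */
static void demo_use_hypervisor_ops(void)
{
    demo_pv_cpu_ops.get_psr_i = virt_get_psr_i;
    demo_pv_cpu_ops.rsm_i     = virt_rsm_i;
}

/* Call sites never name an implementation; each pays one indirect call. */
static void demo_irq_disable(void) { demo_pv_cpu_ops.rsm_i(); }
static unsigned long demo_irqs_enabled(void) { return demo_pv_cpu_ops.get_psr_i(); }
```

The indirect call in demo_irq_disable() is what later binary patching would
turn into a direct call or in-place code.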

In entry.S there are around 17 privileged instructions. Some of them must
be paravirtualized: 2 cover instructions and 1 RFI (the latter because of
a Xen hypervisor issue). The other 15 are instructions that are
privileged in Xen, such as CR accesses, and could be paravirtualized for
performance reasons.

Now we have 2 choices:
Alt1: Dual compile entry.S like IVT.s (dual compile every ASM file that
needs virtualization).
    Pros: Same policy as the IVT; the same macros are used for the
          replacement.
    Cons: Other ASM files such as sn/kernel/pio_phys.S need to be dual
          compiled too. And unlike the IVT table, the memory occupied by
          dual compiled code can't be freed easily, since its size is
          not fixed. All future ASM code that touches privileged
          instructions may need to be dual compiled as well.

Alt2: Use indirect calls, as in the C code, for everything except the IVT
and the gate page (dual compile only the IVT & gate page, which have a
fixed size and are performance critical).
    Pros: Flexible for future ASM code (just use the same macros, no
          dual compile requirement).
    Cons: Two solutions for ASM code, and a slight performance loss due
          to the indirect function call (future patching will convert it
          to a direct function call or to in-place code).


Any suggestions?

Thanks, eddie



[Xen-ia64-devel] cpu ops

2008-04-07 Thread Dong, Eddie
In the current approach we have cpu ops like eoi/set_tpr/get_tpr/set_itm/
set_kr0/set_kr2.../set_kr7 etc.
I think a simpler alternative is to just export setreg/getreg
as cpu ops.

The benefits would be:
1: A simpler pv_ops interface.
2: Hypervisor neutral. Today we only virtualize around 15 AR/CR
reads/writes, but that may grow in future, since different hypervisors
may do this in different ways.
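
A rough sketch of what a setreg/getreg pair could look like follows. The
register numbers, the demo_*/hv_* names and the choice to virtualize only
TPR are assumptions for illustration; they are not taken from the patch.

```c
#include <assert.h>

/* Hypothetical register numbers; real code would use the kernel's own. */
enum demo_reg { DEMO_REG_CR_TPR, DEMO_REG_CR_ITM, DEMO_REG_AR_KR0, DEMO_REG_COUNT };

/* The whole cpu ops interface shrinks to two hooks. */
struct demo_cpu_ops {
    unsigned long (*getreg)(int regnum);
    void (*setreg)(int regnum, unsigned long val);
};

static unsigned long hw_regs[DEMO_REG_COUNT];   /* stand-in for the hardware */

static unsigned long native_getreg(int regnum)
{
    assert(regnum >= 0 && regnum < DEMO_REG_COUNT);
    return hw_regs[regnum];
}

static void native_setreg(int regnum, unsigned long val)
{
    assert(regnum >= 0 && regnum < DEMO_REG_COUNT);
    hw_regs[regnum] = val;
}

/* A hypervisor backend picks which registers it virtualizes in one switch. */
static unsigned long virt_tpr;

static unsigned long hv_getreg(int regnum)
{
    switch (regnum) {
    case DEMO_REG_CR_TPR: return virt_tpr;
    default:              return native_getreg(regnum);
    }
}

static void hv_setreg(int regnum, unsigned long val)
{
    switch (regnum) {
    case DEMO_REG_CR_TPR: virt_tpr = val; break;
    default:              native_setreg(regnum, val); break;
    }
}

static const struct demo_cpu_ops demo_native_ops = { native_getreg, native_setreg };
static const struct demo_cpu_ops demo_hv_ops     = { hv_getreg, hv_setreg };
```

A different hypervisor that virtualizes more registers only adds cases to
its own switch; the interface itself does not change.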

Would you like to see this happen?
thanks, eddie



RE: [Xen-ia64-devel] cpu ops

2008-04-07 Thread Dong, Eddie
Isaku Yamahata wrote:
 On Mon, Apr 07, 2008 at 05:25:54PM +0800, Dong, Eddie wrote:
 In current approach, we have cpu ops like
 eoi/set_tpr/get_tpr,/set_itm /set_kr0/set_kr2.../set_kr7 etc.
 I think there is another simple alternative is to simply export
 setreg/getreg for cpu ops.
 
 The benefit of this could be:
  1: Simple in pv_ops I/F
  2: hypervisor neutral. Today we only virtualize around 15 AR/CR
  read/write, But future it may extends since different Hypervisor
 may do in different way.
 
 Sounds reasonable.

I have a patch ready for this, based on my earlier version that removed
paravirt_xxx/ia64_native_xxx; I can rebase it.

 Although it would result in big switch, presumably we can eliminate

Yes, many switch cases. It may hurt performance before patching.

 runtime switch cost by specifically handling setreg/getreg which
 might complicate binary patch slightly.
Yes, but the performance will eventually come back after patching.
Eddie.



[Xen-ia64-devel] RE: pv_ops progress ask for suggestion

2008-04-07 Thread Dong, Eddie
Isaku Yamahata wrote:
 On Mon, Apr 07, 2008 at 05:47:38PM +0800, Dong, Eddie wrote:
 Tony & all:
  Recently we have completed the IVT.s pv_ops by using dual
 compile, and also many cleanups to simplify the changes to upstream
 code. All the C code touching privilege instruction is replaced with
 indirect function call (will be binary patched to use direct function
 call in future), and IVT table is dual compiled to minimize impact to
 native IVT table, but we get some dilemma in handling kernel/entry.S
 and also generic policy for other ASM files.
 
  In entry.S, there are around 17 privilege instructions, some of
 them must be paravirtualized including 2 cover instructions, and 1
 RFI (this one is due to Xen hypervisor issue). There are other 15
 privilege instructions (In Xen) such as CR access that could be
 paravirtualized for performance reason.
 
 Probably we can discuss well with the concrete patch.
 So I'll post the patches.
 (Creating the reviewable patch set may take a while though.)

If it is a 200-line patch, that is perfect. If it is a 2000+ line patch,
I would prefer 200 lines of pseudo code.

 
 
  Now we have 2 choices:
  Alt1:  Dual compile entry.S like IVT.s (dual compile all ASM
 files if it needs virtualization)
  pros: Same policy with iVT, use same MACRO to
 replacement.
  cons: There are other ASM files such as
 sn/kernel/pio_phys.S need to be dual compiled too.
  And unlike IVT table, the memory occupied by
 dual compiled code won't be able to be freed easily since the size is
 not fixed. Also all future ASM code touch privilege instruction may
 need to be dual compiled too.
 
 I suppose the more generalized problem is
 - The memory for unused pv code/data won't be executed/referenced
   so that it can be freed somehow.
   Is it worth while to do that? And how to do it?

For the IVT table (64K) & gate page (1 page) it can be done, except for
relocating the IP-relative symbols.

 Looking at current xen code size it might be worth while,
 but not so big win.

Agreed to some extent. It depends on how strictly we want the code to be
perfect.


 This is not ia64 specific issues, and should be addressed
 in arch generic way. This hasn't been addressed even on x86.

X86 doesn't use dual compile.

Eddie



[Xen-ia64-devel] pv_ops: intrinsic ops2

2008-04-02 Thread Dong, Eddie
In the current patch series we have many definitions:

+#define ia64_itci  ia64_native_itci
+#define ia64_itcd  ia64_native_itcd
+#define ia64_itri  ia64_native_itri
+#define ia64_itrd  ia64_native_itrd
+#define ia64_tpa   ia64_native_tpa
+#define ia64_set_ibr   ia64_native_set_ibr
+#define ia64_set_pkr   ia64_native_set_pkr
+#define ia64_set_pmc   ia64_native_set_pmc
+#define ia64_set_pmd   ia64_native_set_pmd
+#define ia64_set_rr    ia64_native_set_rr
+#define ia64_get_cpuid ia64_native_get_cpuid
+#define ia64_get_ibr   ia64_native_get_ibr
+#define ia64_get_pkr   ia64_native_get_pkr
+#define ia64_get_pmc   ia64_native_get_pmc
+#define ia64_get_pmd   ia64_native_get_pmd



These come from gcc_intrin.h, for example:

-#define ia64_itci(addr)        asm volatile ("itc.i %0;;" :: "r"(addr) : "memory")
+#define ia64_native_itci(addr) asm volatile ("itc.i %0;;" :: "r"(addr) : "memory")



The question is: we don't actually have a xen_itci anywhere in the
patch, so should we remove the indirection of new
ia64_itci -> ia64_native_itci -> old ia64_itci?
The two are identical, so the change is redundant, at least for
the moment.

Thanks, eddie



RE: [Xen-ia64-devel] pv_ops: imntrinsic pv_ops

2008-04-02 Thread Dong, Eddie
Isaku Yamahata wrote:
 On Wed, Apr 02, 2008 at 01:51:28PM +0800, Dong, Eddie wrote:
 Current definition of intrinsic APIs seems to be too expansive, this
 one 
 
 give alternative way to do simply and reduce some changes.
 If this applies, further simplification can be applied.
 Thx, eddie
 
 Interesting approach.
 If we can replace most of them, I'll apply.
 But half converted state is inconsistent.

Can't it replace the others? All of them can be done
this way.

 
 Defining those function by macro is a good idea.
 But, undef/redefine CONFIG_PARAVIRT looks ugly and
 defining conflicting name would be confusing.

Ideally they should be in a separate file, or at the end of paravirt.c,
where the #undef is clean.

 
 I guess your concern is removing bunch of #define ia64_xxx ...

Not exactly. I just think it is cleaner and smaller in patch size.

 (And yes, I agree with you to clean them up.)
 So how about something like the following?
 
 in intrinsic.h
 
 #ifdef CONFIG_PARAVIRT
 #define IA64_INTRINSIC_API(name)  paravirt. ## name

Do u mean pv_cpu_ops. ## name ?

 #else
 #define IA64_INTRINSIC_API(name)  ia64_native_ ## name
 #endif
 
 #define ia64_fc   IA64_INTRINSIC_API(fc)
 ...
 
 and keep ia64_native_xxx definitions.

I want to keep the ia64_xxx definitions, since renaming them is an
unnecessary change. BTW, if we look at the diff against the original, it
looks better.

 This doesn't depend on the number of arguments.

??? It is always one parameter.
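
As a side illustration of the macro idea discussed here (generating real
functions from macro-only intrinsics by token pasting), a sketch could look
like the following. The demo_* intrinsics are simple arithmetic stand-ins,
not the real ia64 ones, and DEMO_NATIVE_API1 is a hypothetical name.

```c
#include <assert.h>

/*
 * Stand-ins for intrinsics that exist only as macros. A macro has no
 * address, so it cannot be stored in a function pointer table directly.
 */
#define demo_thash(addr)      ((addr) >> 3)
#define demo_get_cpuid(index) ((index) + 100UL)

/*
 * Generate a real function native_<name>() that wraps the macro <name>.
 * The ## operator pastes the tokens before <name> is macro-expanded.
 */
#define DEMO_NATIVE_API1(type, name, para1)          \
static type native_ ## name(unsigned long para1)     \
{                                                    \
    return name(para1);                              \
}

DEMO_NATIVE_API1(unsigned long, demo_thash, addr)
DEMO_NATIVE_API1(unsigned long, demo_get_cpuid, index)

/* The generated functions can now populate an ops table. */
struct demo_cpu_ops {
    unsigned long (*thash)(unsigned long addr);
    unsigned long (*get_cpuid)(unsigned long index);
};

static const struct demo_cpu_ops demo_ops = {
    .thash     = native_demo_thash,
    .get_cpuid = native_demo_get_cpuid,
};
```

Each one-line DEMO_NATIVE_API1() use replaces a hand-written wrapper
function, which is where the reduction in patch size comes from.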

Thanks, eddie



[Xen-ia64-devel] pv_ops: imntrinsic pv_ops

2008-04-01 Thread Dong, Eddie
The current definition of the intrinsic APIs seems too expansive; this
patch offers an alternative that is simpler and reduces some of the
changes. If it is applied, further simplification becomes possible.
Thx, eddie



Simplify intrinsic API handling.

Signed-off-by: Yaozu (Eddie) Dong [EMAIL PROTECTED]

diff --git a/arch/ia64/kernel/paravirt.c b/arch/ia64/kernel/paravirt.c
index 4b01c44..6ce4f60 100644
--- a/arch/ia64/kernel/paravirt.c
+++ b/arch/ia64/kernel/paravirt.c
@@ -3,6 +3,7 @@
  *
  * Copyright (c) 2008 Isaku Yamahata yamahata at valinux co jp
  *VA Linux Systems Japan K.K.
+ * Yaozu (Eddie) Dong [EMAIL PROTECTED]
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -53,29 +54,18 @@ struct pv_init_ops pv_init_ops;
 
 /* ia64_native_xxx are macros so that we have to make them real functions */
 
-static void
-ia64_native_fc_func(unsigned long addr)
-{
-   ia64_native_fc(addr);
-}
-
-static unsigned long
-ia64_native_thash_func(unsigned long addr)
-{
-   return ia64_native_thash(addr);
+#define NATIVE_INTRINSIC_API1(type, name, para1)   \
+static type native_ ## name(unsigned long para1) { \
+   return name(para1); \
 }
 
-static unsigned long
-ia64_native_get_cpuid_func(int index)
-{
-   return ia64_native_get_cpuid(index);
-}
-
-static unsigned long
-ia64_native_get_pmd_func(int index)
-{
-   return ia64_native_get_pmd(index);
-}
+#undef CONFIG_PARAVIRT
+NATIVE_INTRINSIC_API1(void, ia64_fc, addr)
+NATIVE_INTRINSIC_API1(unsigned long, ia64_thash, addr)
+NATIVE_INTRINSIC_API1(unsigned long, ia64_get_cpuid, addr)
+NATIVE_INTRINSIC_API1(unsigned long, ia64_get_pmd, index)
+NATIVE_INTRINSIC_API1(void, ia64_intrin_local_irq_restore, flags)
+#define CONFIG_PARAVIRT
 
 static unsigned long
 ia64_native_get_eflag_func(void)
@@ -217,17 +207,11 @@ ia64_native_get_psr_i_func(void)
return ia64_native_get_psr_i();
 }
 
-static void
-ia64_native_intrin_local_irq_restore_func(unsigned long flags)
-{
-   ia64_native_intrin_local_irq_restore(flags);
-}
-
 struct pv_cpu_ops pv_cpu_ops = {
-   .fc = ia64_native_fc_func,
-   .thash  = ia64_native_thash_func,
-   .get_cpuid  = ia64_native_get_cpuid_func,
-   .get_pmd= ia64_native_get_pmd_func,
+   .fc = native_ia64_fc,
+   .thash  = native_ia64_thash,
+   .get_cpuid  = native_ia64_get_cpuid,
+   .get_pmd= native_ia64_get_pmd,
.get_eflag  = ia64_native_get_eflag_func,
.set_eflag  = ia64_native_set_eflag_func,
.get_psr= ia64_native_get_psr_func,
@@ -252,7 +236,7 @@ struct pv_cpu_ops pv_cpu_ops = {
.rsm_i  = ia64_native_rsm_i_func,
.get_psr_i  = ia64_native_get_psr_i_func,
.intrin_local_irq_restore
-   = ia64_native_intrin_local_irq_restore_func,
+   = native_ia64_intrin_local_irq_restore,
 };
 
 
/***
***
@@ -335,3 +319,5 @@ ia64_native_do_steal_accounting(unsigned long *new_itm)
 struct pv_time_ops pv_time_ops = {
.do_steal_accounting = ia64_native_do_steal_accounting,
 };
+
+
diff --git a/include/asm-ia64/gcc_intrin.h
b/include/asm-ia64/gcc_intrin.h
index b9fa3f4..fda12e7 100644
--- a/include/asm-ia64/gcc_intrin.h
+++ b/include/asm-ia64/gcc_intrin.h
@@ -4,6 +4,7 @@
  *
  * Copyright (C) 2002,2003 Jun Nakajima [EMAIL PROTECTED]
  * Copyright (C) 2002,2003 Suresh Siddha [EMAIL PROTECTED]
+ * Copyright (C) 2008 Yaozu (Eddie) Dong [EMAIL PROTECTED]
  */
 
 #include <linux/compiler.h>
@@ -28,6 +29,13 @@ extern void ia64_bad_param_for_getreg (void);
 register unsigned long ia64_r13 asm ("r13") __used;
 #endif
 
+#ifdef CONFIG_PARAVIRT
+#define INTRINSIC_INS1(name, para1, ins)   \
+   pv_cpu_ops.name(para1)
+#else
+#define INTRINSIC_INS1(name, para1, ins)   ins
+#endif
+
 #define ia64_native_setreg(regnum, val)                             \
 ({                                                                  \
    switch (regnum) {                                                \
@@ -381,8 +389,8 @@ register unsigned long ia64_r13 asm ("r13") __used;
 
 #define ia64_invala() asm volatile ("invala" ::: "memory")
 
-#define ia64_native_thash(addr)                                     \
-({                                                                  \
+#define ia64_thash(addr)                                            \
+   INTRINSIC_INS1(thash, addr,{                                     \
    __u64 ia64_intri_res;                                            \
    asm volatile ("thash %0=%1" : "=r"(ia64_intri_res) : "r" (addr)); \
    ia64_intri_res;                                                  \
@@ -437,8 +445,8 @@ register unsigned long ia64_r13 asm ("r13") __used;
 #define ia64_native_set_rr(index, val)                              \
    asm volatile ("mov rr[%0]=%1" :: "r"(index), "r"(val) : "memory");
 
-#define ia64_native_get_cpuid(index)                                \
-({                                                                  \
+#define ia64_get_cpuid(index)                                       \
+   INTRINSIC_INS1(get_cpuid, index, {                               \
    __u64 ia64_intri_res;                                            \
    asm volatile ("mov %0=cpuid[%r1]" : "=r"(ia64_intri_res) : "rO"(index)); \
    ia64_intri_res;                                                  \
@@ -473,8 +481,8 @@ register unsigned 

[Xen-ia64-devel] pv_ops: file name

2008-03-28 Thread Dong, Eddie
Isaku:
When looking at the new files in kernel & the new directory, I am
wondering whether we can merge the paravirt_core.c code into paravirt.c.
Both are very small and similar in purpose, and x86 keeps this in one
file.
Another file is paravirt_entry.c; would paravirt_patch.c be a more
accurate name?
The same goes for the header files.

Thanks, eddie



RE: [Xen-ia64-devel] pv_ops: entry.S simplification

2008-03-28 Thread Dong, Eddie
Isaku Yamahata wrote:
 On Fri, Mar 28, 2008 at 01:43:23PM +0800, Dong, Eddie wrote:
 
 Eventually those running_on_xen checks should be removed somehow.
 Are you just thinking that the multi compile with binary patching
 should be introduced after the first merge?
 Or do you have any idea other than the multi compile with binary
 patching? 
 
 
 Dual compile every change may be not necessary for me.
 The reason for IVT is that code there is very critical and
 stakeholders won't change them to steal registers. They even don't
 want a single change without full hand of performance data + stress
 test. 
 
 In entry.S, steal clobber register is easy.
 
 ia64_switch_to(), ia64_leave_syscall() and ia64_leave_kernel()
 are also performance critical, aren't they?
 
 

If we rank the performance critical items, I would put the IVT 1st,
followed by the fast system call. This one can be 3rd.

An IVT handler is in the range of 20-50 instructions, and a fast hypercall
may be fewer than 100 instructions. For ia64_switch_to, the scheduler
switch code is on the order of several hundred instructions, in my
understanding.

Putting a pv_ops indirect function call here introduces just 3-10
additional instructions. Is this your concern?
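
To put the numbers above side by side, a small helper like the one below
can be used. It only does the division; the instruction counts fed into it
are the rough estimates from this mail, and instruction count is of course
only a crude proxy for real cost.

```c
#include <assert.h>

/* Relative cost, in percent, of adding `extra` instructions to a path of `base`. */
static double overhead_percent(unsigned int extra, unsigned int base)
{
    assert(base > 0);
    return 100.0 * extra / base;
}
```

With the figures above, 3 extra instructions on a 50-instruction IVT
handler is 6%, and 10 extra on a 20-instruction handler is 50%. Assuming,
purely for illustration, a 300-instruction switch path, the same 3-10
instructions come to between 1% and about 3.3%.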

Thanks, eddie




[Xen-ia64-devel] include/asm-ia64/xen/hypercall.h

2008-03-27 Thread Dong, Eddie
It seems some of the APIs in that file are dead code. Is this patch,
which removes the dead or dom0-only code, OK?

Signed-off-by: Yaozu (Eddie) Dong [EMAIL PROTECTED]

diff --git a/arch/ia64/xen/Makefile b/arch/ia64/xen/Makefile
index 605b757..dc8fee6 100644
--- a/arch/ia64/xen/Makefile
+++ b/arch/ia64/xen/Makefile
@@ -5,7 +5,7 @@
 KBUILD_AFLAGS += -D__IA64_ASM_PARAVIRTUALIZED_XEN
  
 obj-y := hypercall.o time.o xenivt.o xensetup.o xen_pv_ops.o irq_xen.o \
-hypervisor.o util.o xencomm.o xcom_hcall.o xcom_asm.o paravirt_xen.o
+hypervisor.o util.o xencomm.o xcom_hcall.o paravirt_xen.o
 
 obj-y += ../kernel/ivt.o
 
diff --git a/arch/ia64/xen/xcom_asm.S b/arch/ia64/xen/xcom_asm.S
deleted file mode 100644
index 8747908..000
--- a/arch/ia64/xen/xcom_asm.S
+++ /dev/null
@@ -1,27 +0,0 @@
-/*
- * xencomm suspend support
- * Support routines for Xen
- *
- * Copyright (C) 2005 Dan Magenheimer [EMAIL PROTECTED]
- */
-#include <asm/asmmacro.h>
-#include <xen/interface/xen.h>
-
-/*
- * Stub for suspend.
- * Just force the stacked registers to be written in memory.
- */
-GLOBAL_ENTRY(xencomm_arch_hypercall_suspend)
-   ;;
-   alloc r20=ar.pfs,0,0,6,0
-   mov r2=__HYPERVISOR_sched_op
-   ;;
-   /* We don't want to deal with RSE.  */
-   flushrs
-   mov r33=r32
-   mov r32=2 // SCHEDOP_shutdown
-   ;;
-   break 0x1000
-   ;;
-   br.ret.sptk.many b0
-END(xencomm_arch_hypercall_suspend)
diff --git a/arch/ia64/xen/xcom_hcall.c b/arch/ia64/xen/xcom_hcall.c
index bfddbd7..4a89a74 100644
--- a/arch/ia64/xen/xcom_hcall.c
+++ b/arch/ia64/xen/xcom_hcall.c
@@ -401,17 +401,6 @@ xencomm_hypercall_memory_op(unsigned int cmd, void
*arg)
 }
 EXPORT_SYMBOL_GPL(xencomm_hypercall_memory_op);
 
-int
-xencomm_hypercall_suspend(unsigned long srec)
-{
-   struct sched_shutdown arg;
-
-   arg.reason = SHUTDOWN_suspend;
-
-   return xencomm_arch_hypercall_suspend(
-   xencomm_map_no_alloc(&arg, sizeof(arg)));
-}
-
 long
 xencomm_hypercall_vcpu_op(int cmd, int cpu, void *arg)
 {
@@ -443,16 +432,3 @@ xencomm_hypercall_opt_feature(void *arg)
xencomm_map_no_alloc(arg,
 sizeof(struct
xen_ia64_opt_feature)));
 }
-
-int
-xencomm_hypercall_fpswa_revision(unsigned int *revision)
-{
-   struct xencomm_handle *desc;
-
-   desc = xencomm_map_no_alloc(revision, sizeof(*revision));
-   if (desc == NULL)
-   return -EINVAL;
-
-   return xencomm_arch_hypercall_fpswa_revision(desc);
-}
-EXPORT_SYMBOL_GPL(xencomm_hypercall_fpswa_revision);
diff --git a/include/asm-ia64/xen/hypercall.h
b/include/asm-ia64/xen/hypercall.h
index 075b9e1..77dda9d 100644
--- a/include/asm-ia64/xen/hypercall.h
+++ b/include/asm-ia64/xen/hypercall.h
@@ -313,38 +313,7 @@ HYPERVISOR_unexpose_foreign_p2m(unsigned long gpfn,
domid_t domid)
 }
 #endif
 
-static inline int
-xencomm_arch_hypercall_perfmon_op(unsigned long cmd,
- struct xencomm_handle *arg,
- unsigned long count)
-{
-   return _hypercall4(int, ia64_dom0vp_op,
-  IA64_DOM0VP_perfmon, cmd, arg, count);
-}
-
-static inline int
-xencomm_arch_hypercall_fpswa_revision(struct xencomm_handle *arg)
-{
-   return _hypercall2(int, ia64_dom0vp_op,
-  IA64_DOM0VP_fpswa_revision, arg);
-}
 
-static inline int
-xencomm_arch_hypercall_ia64_debug_op(unsigned long cmd,
-unsigned long domain,
-struct xencomm_handle *arg)
-{
-   return _hypercall3(int, ia64_debug_op, cmd, domain, arg);
-}
-
-static inline int
-HYPERVISOR_add_io_space(unsigned long phys_base,
-   unsigned long sparse,
-   unsigned long space_number)
-{
-   return _hypercall4(int, ia64_dom0vp_op,
IA64_DOM0VP_add_io_space,
-  phys_base, sparse, space_number);
-}
 
 /* for balloon driver */
 #define HYPERVISOR_update_va_mapping(va, new_val, flags) (0)
@@ -355,16 +324,9 @@ HYPERVISOR_add_io_space(unsigned long phys_base,
 #define HYPERVISOR_callback_op xencomm_hypercall_callback_op
 #define HYPERVISOR_multicall xencomm_hypercall_multicall
 #define HYPERVISOR_xen_version xencomm_hypercall_xen_version
-#define HYPERVISOR_console_io xencomm_hypercall_console_io
-#define HYPERVISOR_hvm_op xencomm_hypercall_hvm_op
 #define HYPERVISOR_memory_op xencomm_hypercall_memory_op
-#define HYPERVISOR_xenoprof_op xencomm_hypercall_xenoprof_op
-#define HYPERVISOR_perfmon_op xencomm_hypercall_perfmon_op
-#define HYPERVISOR_fpswa_revision xencomm_hypercall_fpswa_revision
-#define HYPERVISOR_suspend xencomm_hypercall_suspend
 #define HYPERVISOR_vcpu_op xencomm_hypercall_vcpu_op
 #define HYPERVISOR_opt_feature xencomm_hypercall_opt_feature
-#define HYPERVISOR_kexec_op xencomm_hypercall_kexec_op
 
 /* to compile gnttab_copy_grant_page() in drivers/xen/core/gnttab.c */
 #define 

RE: [Xen-ia64-devel] pv_ops: ministate.h typo fix

2008-03-27 Thread Dong, Eddie
Isaku Yamahata wrote:
 On Thu, Mar 27, 2008 at 12:20:37PM +0800, Dong, Eddie wrote:
 
 - shuffle instructions of XEN_BSW_1 and xen DO_XEN_MIN().
   Is this for producing better bundles? Please elaborate on this.
   If so, I'll take as another patch.
 
 ??? Which code are u talking for?
 
 The following hunks. The instruction order was changed.
 What's the purpose?

Yes, this is to make the bundles compact; otherwise the IVT table
size would overflow in dispatch_to_fault_handler.

Thanks, eddie



RE: [Xen-ia64-devel] pv_ops: entry.S simplification

2008-03-27 Thread Dong, Eddie
Isaku Yamahata wrote:
 Hi Eddie.
 
 I looked into entry.S closely.
 Unfortunately I found that ia64_leave_syscall() and
 ia64_leave_kernel() includes invirtualizable instructions,
 cover instruction with psr.ic = 0 so that those paravirtualization
 is inevitable. (ia64_switch_to() doesn't need paravirtualization
 though.) 

Yes, there are 2 kinds of instructions we must modify: one is cover when
psr.ic=0, the other is RFI, which can't be handled by Xen today.

For now I have temporarily put in a running_on_xen check; I am working on
using indirect function calls through pv_ops.

Or do you mean there are still cover instructions that were missed?

 
 Does it really work? Probably just seeing login prompt test doesn't
 reveal the issues.
 
 thanks,
 

I can log in and do minimal operations; I haven't run a stress test.
But while I was coding, whenever a cover with PSR.ic=0 or an RFI was
missed, the guest soon died.

Thanks, eddie



RE: [Xen-ia64-devel] Where to compile additional IVT.S

2008-03-27 Thread Dong, Eddie
Isaku Yamahata wrote:
 arch/ia64/kernel/ivt.o is overwritten.
 Building again under arch/ia64/kernel would cause trouble.
 What do you think the following?
 
 ia64/pv_ops: compile paravirtualized assembly files into each pv dirs.
 
 compile ivt.S and switch_leave.S into each pv instance dir.
 With this patch, arch/ia64/kernel/Makefile can be simpler than before.
 

I already posted a similar one, and it seems to be much simpler.
Any issues?

Eddie



RE: [Xen-ia64-devel] pv_ops: move binary patching to later after CPU initialization

2008-03-27 Thread Dong, Eddie
Isaku Yamahata wrote:
 I guess you just followed x86 way, but delaying until check_bug()
 is too late for IA64 case because of at least ia64_get_cpuid().

No. Binary patching is just an optimization, while the pv_ops hooks
are installed at the very beginning.

 At this moment I'm not sure how late binary patching can
 be delayed, though.
 Presumably it is necessary to revise boot protocol.

Any time, as long as the SMC sequence is taken into account.
Putting it together with the original Linux patching code will keep
things simple.
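
The ordering argued for here (hooks valid from the start, patching applied
later purely as an optimization) can be modelled in plain C as below. This
is only a model under assumptions: real binary patching rewrites
instruction bundles, while this sketch caches a direct target per call
site, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long (*getreg_fn)(void);

static unsigned long native_get(void) { return 1; }
static unsigned long hv_get(void)     { return 2; }

/* The hook is valid from the very beginning of boot; it defaults to native. */
static getreg_fn pv_get = native_get;

/* A "call site": until patched, it dispatches through the hook. */
struct demo_call_site {
    getreg_fn direct;   /* NULL means not yet patched */
};

static unsigned long demo_call(struct demo_call_site *site)
{
    return site->direct ? site->direct() : pv_get();
}

/* Run once, late in boot: bind every recorded site to the current target. */
static void demo_patch(struct demo_call_site *sites, size_t n)
{
    for (size_t i = 0; i < n; i++)
        sites[i].direct = pv_get;
}
```

A call site returns the same result before and after demo_patch(), which is
why the patching step can be moved later without affecting correctness.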

 
 Renaming xen_paravirt_patch() to xen_patch() seems reasonable,
 so I applied only the renaming part.
 

I strongly suggest you check in the entry.S patch first; then everything
will be very simple.

And your concern above can be solved too.

Thanks, eddie



RE: [Xen-ia64-devel] pv_ops: entry.S simplification

2008-03-27 Thread Dong, Eddie
Isaku Yamahata wrote:
 Oh, I misunderstood your patch.
 I thought it just revert entry.S to original state. But it
 paravirtualized cover and rfi with running_on_xen check.
 Now I'm convinced that your patch works. Only one comment on
 the patch itself is,
 #ifdef CONFIG_XEN is necessary for !CONFIG_XEN case.
 
 
 Then the left issue is 'if the patch is acceptable for the upstream'.
 The purpose of reducing the total patch size is eventually
 to make ia64/xen domU patches more acceptable for the upstream.
 However with the patch you reintroduced running_on_xen check which
 we have eliminated. That contradicts with the pv_ops principle.
 It's a trade off between the patch size and the patch cleanness.

That is a temporary solution; I am working on using indirect function
calls.

 
 Eventually those running_on_xen checks should be removed somehow.
 Are you just thinking that the multi compile with binary patching
 should be introduced after the first merge?
 Or do you have any idea other than the multi compile with binary
 patching? 
 

Dual compiling every change may not be necessary, in my view.
The reason for doing it for the IVT is that the code there is very
critical, and stakeholders won't change it to steal registers. They don't
even want a single change without a full set of performance data + stress
tests.

In entry.S, stealing a clobber register is easy.

 
 Anyway it's linux-ia64 people that finally determines what way is
 better. To be honest I'm not sure which way is more acceptable.
 So let's discuss with linux-ia64.
 
 
Yes, we can, but keeping that big patch is a problem for now.
Also, switch_entry.S has ugly code that must use binary patching
to patch it for Xen.

Meanwhile, as you mentioned yesterday, where to do the binary patching
is an issue, especially for the native case. The current approach doesn't
consider native patching, which is more important than Xen patching for
now.

Eddie



[Xen-ia64-devel] remove dead pv_irq_ops.init_IRQ_late

2008-03-26 Thread Dong, Eddie
commit d9c6c77dbb20cd5cc9ffbbe8e2398eb737a83162
Author: root [EMAIL PROTECTED]
Date:   Wed Mar 26 14:14:31 2008 +0800

Remove paravirt_init_IRQ_late since it is never activated.

Signed-off-by: Yaozu (Eddie) Dong [EMAIL PROTECTED]

diff --git a/arch/ia64/kernel/irq_ia64.c b/arch/ia64/kernel/irq_ia64.c
index 10d9a5d..ceafde9 100644
--- a/arch/ia64/kernel/irq_ia64.c
+++ b/arch/ia64/kernel/irq_ia64.c
@@ -665,7 +665,6 @@ init_IRQ (void)
pfm_init_percpu();
 #endif
platform_irq_init();
-   paravirt_init_IRQ_late();
 }
 
 void
diff --git a/include/asm-ia64/paravirt.h b/include/asm-ia64/paravirt.h
index 3721eff..285f7ff 100644
--- a/include/asm-ia64/paravirt.h
+++ b/include/asm-ia64/paravirt.h
@@ -164,7 +164,6 @@ __iosapic_write(char __iomem *iosapic, unsigned int
reg, u32 val)
 
 struct pv_irq_ops {
void (*init_IRQ_early)(void);
-   void (*init_IRQ_late)(void);
 
int (*assign_irq_vector)(int irq);
void (*free_irq_vector)(int vector);
@@ -184,13 +183,6 @@ paravirt_init_IRQ_early(void)
pv_irq_ops.init_IRQ_early();
 }
 
-static inline void
-paravirt_init_IRQ_late(void)
-{
-   if (pv_irq_ops.init_IRQ_late)
-   pv_irq_ops.init_IRQ_late();
-}
-
 static inline int
 assign_irq_vector(int irq)
 {
@@ -266,7 +258,6 @@ paravirt_do_steal_accounting(unsigned long *new_itm)
 #define paravirt_inst_patch_module(start, end) do { } while (0)
 
 #define paravirt_init_IRQ_early()  do { } while (0)
-#define paravirt_init_IRQ_late()   do { } while (0)
 
 #define paravirt_init_missing_ticks_accounting(cpu)do { } while (0)
 #define paravirt_do_steal_accounting() 0


irq_ia64_clean1.patch
Description: irq_ia64_clean1.patch

[Xen-ia64-devel] pv_ops: RFC: paravirt_init_IRQ_early

2008-03-26 Thread Dong, Eddie
Currently, paravirt_init_IRQ_early is used to register
IA64_IPI_RESCHEDULE/IA64_IPI_LOCAL_TLB_FLUSH for the different
hypervisors/native. That is not straightforward from the name; how about
something like:
pv_irq_ops.register_ipi ?
We could let it register IA64_IPI_VECTOR too.

I am not a native English speaker, so I am not sure. Comments?

Thanks, eddie



[Xen-ia64-devel] pv_ops: move binary patching to later after CPU initialization

2008-03-26 Thread Dong, Eddie

 arch/ia64/kernel/paravirt.c   |8 +++-
 arch/ia64/kernel/paravirt_core.c  |   17 ++---
 arch/ia64/kernel/paravirt_entry.c |3 ++-
 arch/ia64/kernel/setup.c  |3 +++
 arch/ia64/xen/paravirt_xen.c  |8 +---
 arch/ia64/xen/xen_pv_ops.c|4 
 arch/ia64/xen/xensetup.S  |   10 --
 include/asm-ia64/paravirt.h   |1 +
 8 files changed, 20 insertions(+), 34 deletions(-)

So far it is still NULL for both native and Xen.

Thanks, eddie


Defer binary patching from the very beginning of boot until after CPU
initialization is done.
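
The reason for the deferral can be sketched in plain C. The comment removed by this patch says flush_icache_range() depends on ia64_i_cache_stride_shift, which cpu_init() sets up. The code below is only a model under stated assumptions: the *_sketch names are hypothetical, the stride value 5 is made up, and the byte-sized stride seen before initialization is just how this model degrades, not a claim about the real routine.

```c
/* Hypothetical stand-in for ia64_i_cache_stride_shift; 0 means "not set yet". */
static unsigned long i_cache_stride_shift;

/* Counts the flush operations a stride-based range flush would issue. */
static unsigned long flush_icache_range_sketch(unsigned long start,
                                               unsigned long end)
{
    unsigned long stride = 1UL << i_cache_stride_shift;
    unsigned long addr = start & ~(stride - 1);   /* align down to the stride */
    unsigned long flushes = 0;

    for (; addr < end; addr += stride)
        flushes++;                                /* one fc.i per stride */
    return flushes;
}

static unsigned long flushes_seen_by_patch;

/* Stand-in for a patch hook that rewrites one 16-byte bundle. */
static void patch_one_bundle(void)
{
    flushes_seen_by_patch = flush_icache_range_sketch(0x1000, 0x1010);
}

struct pv_init_ops_sketch {
    void (*patch)(void);
};

static struct pv_init_ops_sketch pv_init_ops_sketch = {
    .patch = patch_one_bundle,
};

/* Assumed 32-byte stride, chosen only for the example. */
static void cpu_init_sketch(void)   { i_cache_stride_shift = 5; }
static void check_bugs_sketch(void) { pv_init_ops_sketch.patch(); }

/* Boot order after the change: CPU setup first, binary patching afterwards. */
static unsigned long boot_sketch(void)
{
    cpu_init_sketch();
    check_bugs_sketch();
    return flushes_seen_by_patch;
}
```

Once the hook runs after CPU setup, the generic flush routine sees a valid stride and the private per-bundle flush loop is no longer needed.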

Signed-off-by: Yaozu (Eddie) Dong [EMAIL PROTECTED]

diff --git a/arch/ia64/kernel/paravirt.c b/arch/ia64/kernel/paravirt.c
index 37bad82..b7340dd 100644
--- a/arch/ia64/kernel/paravirt.c
+++ b/arch/ia64/kernel/paravirt.c
@@ -39,12 +39,18 @@ struct pv_info pv_info = {
.name = "bare hardware"
 };
 
+static void native_patch(void)
+{
+}
+
 
/***

  * pv_init_ops
  * initialization hooks.
  */
 
-struct pv_init_ops pv_init_ops;
+struct pv_init_ops pv_init_ops = {
+   .patch = native_patch,
+};
 
 
/***

  * pv_cpu_ops
diff --git a/arch/ia64/kernel/paravirt_core.c
b/arch/ia64/kernel/paravirt_core.c
index 6b7c70f..003ce1f 100644
--- a/arch/ia64/kernel/paravirt_core.c
+++ b/arch/ia64/kernel/paravirt_core.c
@@ -21,20 +21,7 @@
  */
 
 #include <asm/paravirt_core.h>
-
-/*
- * flush_icache_range() can't be used here.
- * we are here before cpu_init() which initializes
- * ia64_i_cache_stride_shift. flush_icache_range() uses it.
- */
-void __init_or_module
-paravirt_flush_i_cache_range(const void *instr, unsigned long size)
-{
-   unsigned long i;
-
-   for (i = 0; i < size; i += sizeof(bundle_t))
-       asm volatile ("fc.i %0":: "r"(instr + i): "memory");
-}
+#include <asm/pgtable.h>
 
 bundle_t* __init_or_module
 paravirt_get_bundle(unsigned long tag)
@@ -162,7 +149,7 @@ paravirt_write_inst(unsigned long tag, cmp_inst_t inst)
default:
BUG();
}
-   paravirt_flush_i_cache_range(bundle, sizeof(*bundle));
+   flush_icache_range((unsigned long)bundle, (unsigned long)(bundle+1));
 }
 
 /* for debug */
diff --git a/arch/ia64/kernel/paravirt_entry.c
b/arch/ia64/kernel/paravirt_entry.c
index 708287a..857d2a1 100644
--- a/arch/ia64/kernel/paravirt_entry.c
+++ b/arch/ia64/kernel/paravirt_entry.c
@@ -20,6 +20,7 @@
 
 #include <asm/paravirt_core.h>
 #include <asm/paravirt_entry.h>
+#include <asm/pgtable.h>
 
 /* br.cond.sptk.many target25B1 */
 typedef union inst_b1 {
@@ -56,7 +57,7 @@ __paravirt_entry_apply(unsigned long tag, const void
*target)
inst.l = inst_b1.l;
 
paravirt_write_inst(tag, inst);
-   paravirt_flush_i_cache_range(bundle, sizeof(*bundle));
+   flush_icache_range((unsigned long)bundle, (unsigned long)(bundle+1));
 }
 
 static void __init
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index 24561d3..6634ba7 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -987,6 +987,9 @@ cpu_init (void)
 void __init
 check_bugs (void)
 {
+#ifdef CONFIG_PARAVIRT_GUEST
+pv_init_ops.patch();
+#endif
    ia64_patch_mckinley_e9((unsigned long) __start___mckinley_e9_bundles,
                           (unsigned long) __end___mckinley_e9_bundles);
 }
diff --git a/arch/ia64/xen/paravirt_xen.c b/arch/ia64/xen/paravirt_xen.c
index aa12cb5..969478e 100644
--- a/arch/ia64/xen/paravirt_xen.c
+++ b/arch/ia64/xen/paravirt_xen.c
@@ -28,7 +28,7 @@ const static struct paravirt_entry xen_entries[]
__initdata = {
 };
 
 void __init
-xen_entry_patch(void)
+xen_patch(void)
 {
extern const struct paravirt_entry_patch
__start_paravirt_entry[];
extern const struct paravirt_entry_patch
__stop_paravirt_entry[];
@@ -39,12 +39,6 @@ xen_entry_patch(void)
 
sizeof(xen_entries)/sizeof(xen_entries[0]));
 }
 
-void __init
-xen_paravirt_patch(void)
-{
-   xen_entry_patch();
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/arch/ia64/xen/xen_pv_ops.c b/arch/ia64/xen/xen_pv_ops.c
index 3601b79..a2da7b2 100644
--- a/arch/ia64/xen/xen_pv_ops.c
+++ b/arch/ia64/xen/xen_pv_ops.c
@@ -38,6 +38,9 @@
 #include irq_xen.h
 #include time.h
 
+/* TODO: move xen_patch to this file */
+extern void xen_patch(void);
+
 
/***

  * general info
  */
@@ -157,6 +160,7 @@ xen_post_smp_prepare_boot_cpu(void)
 
 static const struct pv_init_ops xen_init_ops __initdata = {
.banner = xen_banner,
+   .patch = xen_patch,
 
.reserve_memory = xen_reserve_memory,
 
diff --git a/arch/ia64/xen/xensetup.S b/arch/ia64/xen/xensetup.S
index cb3432b..0df93d8 100644
--- a/arch/ia64/xen/xensetup.S
+++ b/arch/ia64/xen/xensetup.S
@@ -45,16 +45,6 @@ GLOBAL_ENTRY(early_xen_setup)
;;
 #endif
 
-#ifdef 

[Xen-ia64-devel] RE: Xen common code across architecture

2008-03-26 Thread Dong, Eddie
Jeremy Fitzhardinge wrote:
 Dong, Eddie wrote:
 Jeremy/Andrew:
 
 Isaku Yamahata, I and some other IA64/Xen community members are
 working together to enable pv_ops for IA64 Linux. This patch is a
 preparation to move the common arch/x86/xen/events.c to drivers/xen
 (contents are identical) against the mm tree; it is based on Yamahata's
 IA64/pv_ops patch series. In case you want a brief view of the
 whole pv_ops/IA64 patch series, please refer to the IA64 Linux
 mailing list.
 
 
 How do you want to manage this work?  I'm currently basing off
 Ingo+tglx's x86.git tree.  Would you like me to track these kinds of
 common-code changes in my tree, while you maintain a separate
 ia64-specific tree?
 

Hi, Jeremy:
I didn't realize there was a Xen pv_ops downstream tree. Yes, if
you can take it first, that would be great!

BTW, where is your latest tree? I got one from
git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git, but it
seems to be pretty old.

commit ab9c232286c2b77be78441c2d8396500b045777e
Merge: 8bd0983... 2855568...
Author: Linus Torvalds [EMAIL PROTECTED]
Date:   Fri Oct 12 16:16:41 2007 -0700

Merge branch 'upstream' of
git://git.kernel.org/pub/scm/linux/kernel/git/jga

Thanks, eddie



RE: [Xen-ia64-devel] pv_ops: ministate.h typo fix

2008-03-26 Thread Dong, Eddie
Isaku Yamahata wrote:
 Hi Eddie.
 The attached patch does many things. Could you explain?
 
 - convert cover argument in SAVE_MIN_WITH_COVER(_R19) into COVER.
   This seems correct. I'll take this part.
 
 - convert __COVER argument into COVER.
   Using conflicting argument is a bad practice.

This is what the original Linux code uses. I don't think we need to
convert from COVER to __COVER; I guess you read it as the cover
instruction.

 
 - shuffle instructions of XEN_BSW_1 and xen DO_XEN_MIN().
   Is this for producing better bundles? Please elaborate on this.
   If so, I'll take as another patch.

??? Which code are you talking about?

 
 - churning header file inclusion.
   I need to rethink to do this with another mail you posted as
   where to compile. I'll answer it to that mail.
   I'm now inclined to move ia64/kernel/minstate.h under
   include/asm-ia64/native/.

This is not in my patch, right?
If you want to move it back and forth, can you do that after my queued
patches are taken or modified? Rebasing is a headache.

Thanks, eddie



[Xen-ia64-devel] pv_ops: hypercall.S cleanup

2008-03-26 Thread Dong, Eddie
Most hypercalls are identical in source code, so using a common
macro to define the 0/1/2-parameter hypercalls is much simpler.
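
The same idea can be illustrated in C, where one macro per parameter count stamps out each stub. This is only an analogy to the assembly macros in the patch: the HCALL_* numbers, do_break() and the sketch_* names are hypothetical, and do_break() merely records its arguments instead of issuing a break instruction.

```c
/* Hypothetical hypercall numbers, not the real HYPERPRIVOP_* values. */
enum { HCALL_GET_PSR = 1, HCALL_SET_TPR = 2, HCALL_SET_RR = 3 };

static unsigned long last_hcall, last_arg0, last_arg1;

/* Stand-in for the break instruction: records what was requested. */
static unsigned long do_break(unsigned long hcall,
                              unsigned long a0, unsigned long a1)
{
    last_hcall = hcall;
    last_arg0 = a0;
    last_arg1 = a1;
    return hcall;
}

/* One generator macro per parameter count replaces the hand-written stubs. */
#define HCALL0(name, hcall) \
    static unsigned long name(void) \
    { return do_break(hcall, 0, 0); }

#define HCALL1(name, hcall) \
    static unsigned long name(unsigned long a0) \
    { return do_break(hcall, a0, 0); }

#define HCALL2(name, hcall) \
    static unsigned long name(unsigned long a0, unsigned long a1) \
    { return do_break(hcall, a0, a1); }

HCALL0(sketch_get_psr, HCALL_GET_PSR)
HCALL1(sketch_set_tpr, HCALL_SET_TPR)
HCALL2(sketch_set_rr,  HCALL_SET_RR)
```

Adding another hypercall then costs one line, which is where the large deletion count in the diffstat comes from.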

 arch/ia64/xen/hypercall.S     |  154 +-
 include/asm-ia64/xen/privop.h |   26 ---
 2 files changed, 51 insertions(+), 129 deletions(-)

Signed-off-by: Yaozu (Eddie) Dong [EMAIL PROTECTED]

diff --git a/arch/ia64/xen/hypercall.S b/arch/ia64/xen/hypercall.S
index 615dad9..ce7b015 100644
--- a/arch/ia64/xen/hypercall.S
+++ b/arch/ia64/xen/hypercall.S
@@ -2,79 +2,64 @@
  * Support routines for Xen hypercalls
  *
  * Copyright (C) 2005 Dan Magenheimer [EMAIL PROTECTED]
+ * Copyright (C) 2008 Yaozu (Eddie) Dong [EMAIL PROTECTED]
  */
 
 #include asm/asmmacro.h
 #include asm/intrinsics.h
 
-GLOBAL_ENTRY(xen_get_psr)
-   XEN_HYPER_GET_PSR
-   br.ret.sptk.many rp
-   ;;
-END(xen_get_psr)
-
-GLOBAL_ENTRY(xen_get_ivr)
-   XEN_HYPER_GET_IVR
-   br.ret.sptk.many rp
-   ;;
-END(xen_get_ivr)
-
-GLOBAL_ENTRY(xen_get_tpr)
-   XEN_HYPER_GET_TPR
-   br.ret.sptk.many rp
-   ;;
-END(xen_get_tpr)
-
-GLOBAL_ENTRY(xen_set_tpr)
-   mov r8=r32
-   XEN_HYPER_SET_TPR
-   br.ret.sptk.many rp
-   ;;
-END(xen_set_tpr)
-
-GLOBAL_ENTRY(xen_eoi)
-   mov r8=r32
-   XEN_HYPER_EOI
-   br.ret.sptk.many rp
-   ;;
-END(xen_eoi)
-
-GLOBAL_ENTRY(xen_thash)
-   mov r8=r32
-   XEN_HYPER_THASH
-   br.ret.sptk.many rp
-   ;;
-END(xen_thash)
-
-GLOBAL_ENTRY(xen_set_itm)
-   mov r8=r32
-   XEN_HYPER_SET_ITM
-   br.ret.sptk.many rp
-   ;;
-END(xen_set_itm)
+/*
+ * Hypercalls without parameter.
+ */
+#define __HCALL0(name,hcall)   \
+   GLOBAL_ENTRY(name); \
+   break   hcall;  \
+   br.ret.sptk.many rp;\
+   END(name)
 
-GLOBAL_ENTRY(xen_ptcga)
-   mov r8=r32
-   mov r9=r33
-   XEN_HYPER_PTC_GA
-   br.ret.sptk.many rp
-   ;;
-END(xen_ptcga)
+/*
+ * Hypercalls with 1 parameter.
+ */
+#define __HCALL1(name,hcall)   \
+   GLOBAL_ENTRY(name); \
+   mov r8=r32; \
+   break   hcall;  \
+   br.ret.sptk.many rp;\
+   END(name)
 
-GLOBAL_ENTRY(xen_get_rr)
-   mov r8=r32
-   XEN_HYPER_GET_RR
-   br.ret.sptk.many rp
-   ;;
-END(xen_get_rr)
+/*
+ * Hypercalls with 2 parameters.
+ */
+#define __HCALL2(name,hcall)   \
+   GLOBAL_ENTRY(name); \
+   mov r8=r32; \
+   mov r9=r33; \
+   break   hcall;  \
+   br.ret.sptk.many rp;\
+   END(name)
+
+__HCALL0(xen_get_psr, HYPERPRIVOP_GET_PSR)
+__HCALL0(xen_get_ivr, HYPERPRIVOP_GET_IVR)
+__HCALL0(xen_get_tpr, HYPERPRIVOP_GET_TPR)
+__HCALL0(xen_hyper_ssm_i, HYPERPRIVOP_SSM_I)
+
+__HCALL1(xen_set_tpr, HYPERPRIVOP_SET_TPR)
+__HCALL1(xen_eoi, HYPERPRIVOP_EOI)
+__HCALL1(xen_thash, HYPERPRIVOP_THASH)
+__HCALL1(xen_set_itm, HYPERPRIVOP_SET_ITM)
+__HCALL1(xen_get_rr, HYPERPRIVOP_GET_RR)
+__HCALL1(xen_fc, HYPERPRIVOP_FC)
+__HCALL1(xen_get_cpuid, HYPERPRIVOP_GET_CPUID)
+__HCALL1(xen_get_pmd, HYPERPRIVOP_GET_PMD)
+
+__HCALL2(xen_ptcga, HYPERPRIVOP_PTC_GA)
+__HCALL2(xen_set_rr, HYPERPRIVOP_SET_RR)
+__HCALL2(xen_set_kr, HYPERPRIVOP_SET_KR)
 
-GLOBAL_ENTRY(xen_set_rr)
-   mov r8=r32
-   mov r9=r33
-   XEN_HYPER_SET_RR
-   br.ret.sptk.many rp
-   ;;
-END(xen_set_rr)
+#ifdef CONFIG_IA32_SUPPORT
+__HCALL1(xen_get_eflag, HYPERPRIVOP_GET_EFLAG)
+__HCALL1(xen_set_eflag, HYPERPRIVOP_SET_EFLAG) // refer SDM vol1 3.1.8
+#endif /* CONFIG_IA32_SUPPORT */
 
 GLOBAL_ENTRY(xen_set_rr0_to_rr4)
mov r8=r32
@@ -87,45 +72,6 @@ GLOBAL_ENTRY(xen_set_rr0_to_rr4)
;;
 END(xen_set_rr0_to_rr4)
 
-GLOBAL_ENTRY(xen_set_kr)
-   mov r8=r32
-   mov r9=r33
-   XEN_HYPER_SET_KR
-   br.ret.sptk.many rp
-END(xen_set_kr)
-
-GLOBAL_ENTRY(xen_fc)
-   mov r8=r32
-   XEN_HYPER_FC
-   br.ret.sptk.many rp
-END(xen_fc)
-
-GLOBAL_ENTRY(xen_get_cpuid)
-   mov r8=r32
-   XEN_HYPER_GET_CPUID
-   br.ret.sptk.many rp
-END(xen_get_cpuid)
-
-GLOBAL_ENTRY(xen_get_pmd)
-   mov r8=r32
-   XEN_HYPER_GET_PMD
-   br.ret.sptk.many rp
-END(xen_get_pmd)
-
-#ifdef CONFIG_IA32_SUPPORT
-GLOBAL_ENTRY(xen_get_eflag)
-   XEN_HYPER_GET_EFLAG
-   br.ret.sptk.many rp
-END(xen_get_eflag)
-
-// some bits aren't set if pl!=0, see SDM vol1 3.1.8
-GLOBAL_ENTRY(xen_set_eflag)
-   mov r8=r32
-   XEN_HYPER_SET_EFLAG
-   br.ret.sptk.many rp
-END(xen_set_eflag)
-#endif /* CONFIG_IA32_SUPPORT */
-
 GLOBAL_ENTRY(xen_send_ipi)
mov r14=r32
mov r15=r33
diff --git a/include/asm-ia64/xen/privop.h
b/include/asm-ia64/xen/privop.h
index 7657d37..e69380a 100644
--- a/include/asm-ia64/xen/privop.h
+++ b/include/asm-ia64/xen/privop.h
@@ -35,22 +35,8 @@
 #define XEN_HYPER_ITC_Ibreak HYPERPRIVOP_ITC_I
 #define XEN_HYPER_SSM_I  

[Xen-ia64-devel] RE: Xen common code across architecture

2008-03-25 Thread Dong, Eddie
Dong, Eddie wrote:
 Jeremy/Andrew:
 
 Isaku Yamahata, I and some other IA64/Xen community members are
 working together to enable pv_ops for IA64 Linux. This patch is a
 preparation to move the common arch/x86/xen/events.c to drivers/xen
 (contents are identical) against the mm tree; it is based on
 Yamahata's IA64/pv_ops patch series.
 In case you want a brief view of the whole pv_ops/IA64 patch series,
 please refer to the IA64 Linux mailing list.
 
 Thanks, Eddie
 
 
 Fix a typo. Merged one is attached too.


Signed-off-by: Yaozu (Eddie) Dong [EMAIL PROTECTED]

--- drivers/xen/events_old.c2008-03-25 14:31:40.503525471 +0800
+++ drivers/xen/events.c2008-03-25 14:19:39.841851430 +0800
@@ -37,7 +37,7 @@
 #include xen/interface/xen.h
 #include xen/interface/event_channel.h
 
-#include xen-ops.h
+#include xen/xen-ops.h
 
 /*
  * This lock protects updates to the following mapping and
reference-count


typo
Description: typo


move_xenirq3.patch
Description: move_xenirq3.patch
___
Xen-ia64-devel mailing list
Xen-ia64-devel@lists.xensource.com
http://lists.xensource.com/xen-ia64-devel

[Xen-ia64-devel] pv_ops: ministate.h typo fix

2008-03-21 Thread Dong, Eddie
The macro parameter COVER in DO_SAVE_MIN won't be replaced by
the COVER macro in inst.h, since it has already been substituted
by the time the preprocessor expands SAVE_MIN_WITH_COVER and
the related macros.
Thanks, eddie
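
The preprocessor behavior relied on here can be checked with a small C sketch. The names cover_insn, save_ifs and clear_ifs are placeholders invented for the example, and the two-argument DO_SAVE_MIN is a cut-down stand-in for the real macro; only the expansion rule is the point.

```c
#include <string.h>

/* Two-level stringification so the argument is fully expanded first. */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* Stand-in for the COVER macro that inst.h would provide. */
#define COVER cover_insn

/* The parameter is also called COVER; inside the body it names the argument. */
#define DO_SAVE_MIN(COVER, SAVE_IFS) COVER; SAVE_IFS

/* The argument COVER is expanded through the object-like macro above. */
#define SAVE_MIN_WITH_COVER DO_SAVE_MIN(COVER, save_ifs)
/* An empty first argument leaves nothing where the parameter was. */
#define SAVE_MIN            DO_SAVE_MIN(, clear_ifs)

static const char *with_cover(void)    { return STR(SAVE_MIN_WITH_COVER); }
static const char *without_cover(void) { return STR(SAVE_MIN); }

/* Small helper: does the expanded text mention the given token? */
static int mentions(const char *text, const char *token)
{
    return strstr(text, token) != NULL;
}
```

In this sketch the expansion of SAVE_MIN_WITH_COVER contains cover_insn, while the expansion of SAVE_MIN does not, so a parameter named COVER and a macro named COVER can coexist.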




Fix the DO_SAVE_MIN macro typo, and move some
instructions to make the bundles more compact.

Signed-off-by: Yaozu (Eddie) Dong [EMAIL PROTECTED]

diff --git a/arch/ia64/kernel/ivt.S b/arch/ia64/kernel/ivt.S
index d1cebe5..f2306ae 100644
--- a/arch/ia64/kernel/ivt.S
+++ b/arch/ia64/kernel/ivt.S
@@ -75,7 +75,6 @@
 # define DBG_FAULT(i)
 #endif
 
-#include inst_paravirt.h
 #include minstate.h
 
 #define FAULT(n)
\
diff --git a/arch/ia64/kernel/minstate.h b/arch/ia64/kernel/minstate.h
index 10a412c..9e18fb0 100644
--- a/arch/ia64/kernel/minstate.h
+++ b/arch/ia64/kernel/minstate.h
@@ -2,6 +2,7 @@
 #include asm/cache.h
 
 #include entry.h
+#include inst_paravirt.h
 
 #ifdef __IA64_ASM_PARAVIRTUALIZED_NATIVE
 /*
@@ -29,7 +30,7 @@
  * Note that psr.ic is NOT turned on by this macro.  This is so that
  * we can pass interruption state as arguments to a handler.
  */
-#define DO_SAVE_MIN(__COVER,SAVE_IFS,EXTRA)
\
+#define DO_SAVE_MIN(COVER,SAVE_IFS,EXTRA)
\
mov r16=IA64_KR(CURRENT);   /* M */
\
mov r27=ar.rsc; /* M */
\
mov r20=r1; /* A */
\
@@ -38,7 +39,7 @@
mov r26=ar.pfs; /* I */
\
MOV_FROM_IIP(r28);  /* M */
\
mov r21=ar.fpsr;/* M */
\
-   __COVER;/* B;; (or nothing) */
\
+   COVER;  /* B;; (or nothing) */
\
;;
\
adds r16=IA64_TASK_THREAD_ON_USTACK_OFFSET,r16;
\
;;
\
@@ -194,6 +195,6 @@
st8 [r25]=r10;  /* ar.ssd */\
;;
 
-#define SAVE_MIN_WITH_COVERDO_SAVE_MIN(cover, mov r30=cr.ifs,)
-#define SAVE_MIN_WITH_COVER_R19DO_SAVE_MIN(cover, mov
r30=cr.ifs, mov r15=r19)
+#define SAVE_MIN_WITH_COVERDO_SAVE_MIN(COVER, mov r30=cr.ifs,)
+#define SAVE_MIN_WITH_COVER_R19DO_SAVE_MIN(COVER, mov
r30=cr.ifs, mov r15=r19)
 #define SAVE_MIN   DO_SAVE_MIN( , mov r30=r0, )
diff --git a/arch/ia64/xen/xenivt.S b/arch/ia64/xen/xenivt.S
index 2d509f2..17987af 100644
--- a/arch/ia64/xen/xenivt.S
+++ b/arch/ia64/xen/xenivt.S
@@ -13,9 +13,8 @@
 #include asm/kregs.h
 #include asm/pgtable.h
 
-#include asm/xen/inst.h
-#include asm/xen/minstate.h
 #include ../kernel/minstate.h
+#include asm/xen/minstate.h
 
.section .text,ax
 GLOBAL_ENTRY(xen_event_callback)
diff --git a/include/asm-ia64/xen/inst.h b/include/asm-ia64/xen/inst.h
index a8fb2ac..1e92d02 100644
--- a/include/asm-ia64/xen/inst.h
+++ b/include/asm-ia64/xen/inst.h
@@ -414,10 +414,10 @@
movl r30 = XSI_B1NAT;   \
;;  \
ld8 r30 = [r30];\
+   mov r31 = 1;\
;;  \
mov ar.unat = r30;  \
movl r30 = XSI_BANKNUM; \
-   mov r31 = 1;\
;;  \
st4 [r30] = r31;\
movl r30 = XSI_BANK1_R16;   \
diff --git a/include/asm-ia64/xen/minstate.h
b/include/asm-ia64/xen/minstate.h
index 67bbf79..7cdebc2 100644
--- a/include/asm-ia64/xen/minstate.h
+++ b/include/asm-ia64/xen/minstate.h
@@ -25,17 +25,16 @@
  * Note that psr.ic is NOT turned on by this macro.  This is so that
  * we can pass interruption state as arguments to a handler.
  */
-#define DO_SAVE_MIN(__COVER,SAVE_IFS,EXTRA)
\
+#define DO_SAVE_MIN(COVER,SAVE_IFS,EXTRA)
\
mov r16=IA64_KR(CURRENT);   /* M */
\
mov r27=ar.rsc; /* M */
\
mov r20=r1; /* A */
\
mov r25=ar.unat;/* M */
\
MOV_FROM_IPSR(r29); /* M */
\
-   mov r26=ar.pfs; /* I */
\
MOV_FROM_IIP(r28);  /* M */
\
mov r21=ar.fpsr;/* M */
\
-   __COVER;/* B;; (or nothing) */
\
-   ;;
\
+   mov r26=ar.pfs; /* I */
\
+   COVER;  /* B;; (or nothing) */
\
adds r16=IA64_TASK_THREAD_ON_USTACK_OFFSET,r16;
\
;;
\
ld1 r17=[r16];  /* load
current-thread.on_ustack flag */   \
@@ -80,17 +79,17 @@
 .mem.offset 8,0; st8.spill [r17]=r9,16;
\
 ;;
\
 .mem.offset 0,0; st8.spill [r16]=r10,24;
\
+   movl r8=XSI_PRECOVER_IFS;
\
 .mem.offset 8,0; st8.spill [r17]=r11,24;
\
 ;;
\
/* xen special handling for possibly lazy cover */
\
/* XXX: SAVE_MIN case in dispatch_ia32_handler: mov r30=r0 */
\
-   movl r8=XSI_PRECOVER_IFS;
\
;;
\
ld8 r30=[r8];
\
-   ;;
\
+(pUStk)sub r18=r18,r22;/* r18=RSE.ndirty*8 */
\
st8 [r16]=r28,16;   /* save cr.iip */
\
+   ;;  \
st8 [r17]=r30,16;   /* save cr.ifs */

[Xen-ia64-devel] pv_ops: IVT.s replacement to cover all sensitive instructions

2008-03-21 Thread Dong, Eddie
Replace all sensitive instructions in the dual-compiled ivt.S.


Now the total change against upstream is:
 89 files changed, 8857 insertions(+), 1059 deletions(-)
The all-in-one patch file is 11643 lines.

If we can separate out the movement of the common files from
arch/x86/xen to drivers/xen, an additional 1.2-1.4K lines can be saved.

Thanks, eddie


Signed-off-by: Yaozu (Eddie) Dong [EMAIL PROTECTED]

diff --git a/arch/ia64/kernel/ivt.S b/arch/ia64/kernel/ivt.S
index f2306ae..d516bf4 100644
--- a/arch/ia64/kernel/ivt.S
+++ b/arch/ia64/kernel/ivt.S
@@ -19,6 +19,7 @@
  * Copyright (c) 2008 Isaku Yamahata yamahata at valinux co jp
  *VA Linux Systems Japan K.K.
  *pv_ops.
+ *  Yaozu (Eddie) Dong [EMAIL PROTECTED]
  */
 /*
  * This file defines the interruption vector table used by the CPU.
@@ -338,7 +339,7 @@ ENTRY(alt_itlb_miss)
DBG_FAULT(3)
MOV_FROM_IFA(r16)   // get address that caused the TLB miss
movl r17=PAGE_KERNEL
-   MOV_FROM_IPSR(r21)
+   MOV_FROM_IPSR(p0,r21)
movl r19=(((1 << IA64_MAX_PHYS_BITS) - 1) & ~0xfff)
mov r31=pr
;;
@@ -378,7 +379,7 @@ ENTRY(alt_dtlb_miss)
movl r17=PAGE_KERNEL
MOV_FROM_ISR(r20)
movl r19=(((1 << IA64_MAX_PHYS_BITS) - 1) & ~0xfff)
-   MOV_FROM_IPSR(r21)
+   MOV_FROM_IPSR(p0,r21)
mov r31=pr
mov r24=PERCPU_ADDR
;;
@@ -417,7 +418,7 @@ ENTRY(alt_dtlb_miss)
dep r21=-1,r21,IA64_PSR_ED_BIT,1
;;
or r19=r19,r17  // insert PTE control bits into r19
-(p6)   mov cr.ipsr=r21
+   MOV_FROM_IPSR(p6,r21)
;;
ITC_D(p7, r19, r18) // insert the TLB entry
mov pr=r31,-1
@@ -618,9 +619,9 @@ ENTRY(iaccess_bit)
/*
 * Erratum 10 (IFA may contain incorrect address) has NoFix
status.
 */
-   mov r17=cr.ipsr
+   MOV_FROM_IPSR(p0,r17)
;;
-   mov r18=cr.iip
+   MOV_FROM_IIP(r18)
tbit.z p6,p0=r17,IA64_PSR_IS_BIT// IA64 instruction set?
;;
 (p6)   mov r16=r18 // if so, use cr.iip
instead of cr.ifa
@@ -745,7 +746,7 @@ ENTRY(break_fault)
 */
DBG_FAULT(11)
mov.m r16=IA64_KR(CURRENT)  // M2 r16 - current
task (12 cyc)
-   MOV_FROM_IPSR(r29)  // M2 (12 cyc)
+   MOV_FROM_IPSR(p0,r29)   // M2 (12 cyc)
mov r31=pr  // I0 (2 cyc)
 
MOV_FROM_IIM(r17)   // M2 (2 cyc)
@@ -1057,11 +1058,11 @@ ENTRY(dispatch_illegal_op_fault)
.prologue
.body
SAVE_MIN_WITH_COVER
-   ssm psr.ic | PSR_DEFAULT_BITS
+   SSM_PSR_IC_AND_DEFAULT_BITS(r3,r24)
;;
srlz.i  // guarantee that interruption collection is on
;;
-(p15)  ssm psr.i   // restore psr.i
+   SSM_PSR_I(p15,r3)   // restore psr.i
adds r3=8,r2// set up second base pointer for SAVE_REST
;;
alloc r14=ar.pfs,0,0,1,0// must be first in insn group
@@ -1109,15 +1110,15 @@ ENTRY(non_syscall)
// suitable spot...
 
alloc r14=ar.pfs,0,0,2,0
-   mov out0=cr.iim
+   MOV_FROM_IIM(out0)
add out1=16,sp
adds r3=8,r2// set up second base pointer
for SAVE_REST
 
-   ssm psr.ic | PSR_DEFAULT_BITS
+   SSM_PSR_IC_AND_DEFAULT_BITS(r15,r24)
;;
srlz.i  // guarantee that interruption
collection is on
;;
-(p15)  ssm psr.i   // restore psr.i
+   SSM_PSR_I(p15,r15)  // restore psr.i
movl r15=ia64_leave_kernel
;;
SAVE_REST
@@ -1143,14 +1144,14 @@ ENTRY(dispatch_unaligned_handler)
SAVE_MIN_WITH_COVER
;;
alloc r14=ar.pfs,0,0,2,0// now it's safe (must
be first in insn group!)
-   mov out0=cr.ifa
+   MOV_FROM_IFA(out0)
adds out1=16,sp
 
-   ssm psr.ic | PSR_DEFAULT_BITS
+   SSM_PSR_IC_AND_DEFAULT_BITS(r3,r24)
;;
srlz.i  // guarantee that
interruption collection is on
;;
-(p15)  ssm psr.i   // restore psr.i
+   SSM_PSR_I(p15,r3)   // restore psr.i
adds r3=8,r2// set up second base
pointer
;;
SAVE_REST
@@ -1182,17 +1183,17 @@ ENTRY(dispatch_to_fault_handler)
 */
SAVE_MIN_WITH_COVER_R19
alloc r14=ar.pfs,0,0,5,0
-   mov out0=r15
MOV_FROM_ISR(out1)
MOV_FROM_IFA(out2)
MOV_FROM_IIM(out3)
MOV_FROM_ITIR(out4)
;;
-   ssm psr.ic | PSR_DEFAULT_BITS
+   SSM_PSR_IC_AND_DEFAULT_BITS(r3, out0)
+   mov out0=r15
;;
srlz.i  // guarantee that
interruption collection is on
;;
-(p15)  ssm psr.i   // restore psr.i
+   

RE: [Xen-ia64-devel] pv_ops: IVT.s replacement to cover all sensitiveinstructions

2008-03-21 Thread Dong, Eddie
Akio Takebe wrote:
 Hi, Eddie
 
 diff --git a/arch/ia64/kernel/ivt.S b/arch/ia64/kernel/ivt.S
 index f2306ae..d516bf4 100644
 --- a/arch/ia64/kernel/ivt.S
 +++ b/arch/ia64/kernel/ivt.S
 @@ -19,6 +19,7 @@
  * Copyright (c) 2008 Isaku Yamahata yamahata at valinux co jp
  *VA Linux Systems Japan K.K.
  *pv_ops.
 + *  Yaozu (Eddie) Dong [EMAIL PROTECTED]
  */
 /*
  * This file defines the interruption vector table used by the CPU.
 @@ -338,7 +339,7 @@ ENTRY(alt_itlb_miss)
  DBG_FAULT(3)
  MOV_FROM_IFA(r16)   // get address that caused the TLB miss
  movl r17=PAGE_KERNEL
 -MOV_FROM_IPSR(r21)
 +MOV_FROM_IPSR(p0,r21)
 Why do you specify p0 for the macro?
 Is it not necessary to perform the mov?

Originally it was just a mov, but there is one place that needs a
predicate; see alt_dtlb_miss in ivt.S. So the macro in inst.h now takes
a predicate. Or do you mean we should keep a separate macro for the
predicated form?
Actually p0 is the default predicate for every unpredicated instruction,
so the compiled code is the same.

Thanks, eddie
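
As a rough C analogy of the point about p0 (not ia64 assembly, and not the real inst.h macro), a macro that takes a predicate behaves like the plain form whenever the predicate is always true. P0, fake_ipsr and MOV_FROM_IPSR_SKETCH are hypothetical names made up for this sketch.

```c
/* Hypothetical stand-in: P0 models the always-true predicate register p0. */
#define P0 1

/* Made-up value standing in for the contents of cr.ipsr. */
static unsigned long fake_ipsr = 0x1234;

/* Predicated form: the move happens only when pred is true. */
#define MOV_FROM_IPSR_SKETCH(pred, dst) \
    do { if (pred) (dst) = fake_ipsr; } while (0)

/* Unconditional use: passing P0 gives the same result as a plain move. */
static unsigned long read_unconditionally(void)
{
    unsigned long r21 = 0;
    MOV_FROM_IPSR_SKETCH(P0, r21);
    return r21;
}

/* Conditional use, as in the one place that needs a real predicate. */
static unsigned long read_if(int p6)
{
    unsigned long r21 = 0;
    MOV_FROM_IPSR_SKETCH(p6, r21);
    return r21;
}
```

With a constant true predicate a C compiler can drop the test entirely, which mirrors the claim that the generated code is the same with p0.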



RE: [Xen-ia64-devel] [Q] How to download pv_ops git tree

2008-03-21 Thread Dong, Eddie
Akio Takebe wrote:
 Hi, Isaku and all
 
 I'd like to start pv_ops work.
 I'm not familiar with git command.
 
 I tried to git clone Isaku's git tree, but I cannot download it.
 (I can download Xiantao's kvm-ia64 git tree.)
 Did I make a mistake? And can I use the git protocol instead of http?
 
 # git clone
 http://people.valinux.co.jp/~yamahata/xen-ia64/linux-2.6-xen-ia64.git
 Initialized empty Git repository in /root/pv_ops/linux-2.6-xen-ia64/.git/
 Cannot get remote repository information.
 Perhaps git-update-server-info needs to be run there?
 
 Best Regards,
 
 Akio Takebe
 
I am not a git expert either :( But would you please check whether
http_proxy is set correctly? Xiantao's tree supports the git protocol,
I think.

We also suffered from this for a couple of days :)
Eddie



RE: [Xen-ia64-devel] Where to compile additional IVT.S

2008-03-20 Thread Dong, Eddie
Dong, Eddie wrote:
 Alex/Isaku:
   Currently the Makefile compiles the additional ivt.S in
 kernel/; another approach is to compile it in xen/.
   The latter has the following benefits:
   1: The Makefile is easier to read and easier to extend for more
 hypervisors.
   2: A Xen-specific minstate.h can live in arch/ia64/xen/, like the
 one under arch/ia64/kernel.
 
 
   I am not a Makefile expert; I am just using this example to explain
 the idea. Suggestions?
 thanks, eddie
 
 
Here is the formal patch for this.

Thanks, eddie



Move the second compile of ivt.S to the per-hypervisor subdirectory.

Signed-off-by: Yaozu (Eddie) Dong [EMAIL PROTECTED]

diff --git a/arch/ia64/kernel/Makefile b/arch/ia64/kernel/Makefile
index 3e9a162..78ec040 100644
--- a/arch/ia64/kernel/Makefile
+++ b/arch/ia64/kernel/Makefile
@@ -80,16 +80,3 @@ $(obj)/gate-data.o: $(obj)/gate.so
 #
 AFLAGS_ivt.o += -D__IA64_ASM_PARAVIRTUALIZED_NATIVE
 
-# xen multi compile
-$(obj)/xen_%.o: $(src)/%.S FORCE
-   $(call if_changed_dep,as_o_S)
-
-#
-# xenivt.o
-#
-obj-$(CONFIG_XEN) += xen_ivt.o
-ifeq ($(CONFIG_XEN), y)
-targets += xen_ivt.o
-$(obj)/build-in.o: xen_ivt.o
-endif
-AFLAGS_xen_ivt.o += -D__IA64_ASM_PARAVIRTUALIZED_XEN
diff --git a/arch/ia64/xen/Makefile b/arch/ia64/xen/Makefile
index 87e29d2..605b757 100644
--- a/arch/ia64/xen/Makefile
+++ b/arch/ia64/xen/Makefile
@@ -2,7 +2,11 @@
 # Makefile for Xen components
 #
 
+KBUILD_AFLAGS += -D__IA64_ASM_PARAVIRTUALIZED_XEN
+ 
 obj-y := hypercall.o time.o xenivt.o xensetup.o xen_pv_ops.o irq_xen.o
\
 hypervisor.o util.o xencomm.o xcom_hcall.o xcom_asm.o
paravirt_xen.o
 
+obj-y += ../kernel/ivt.o
+
 obj-$(CONFIG_IA64_GENERIC) += machvec.o
diff --git a/arch/ia64/xen/xenivt.S b/arch/ia64/xen/xenivt.S
index c688aaa..2d509f2 100644
--- a/arch/ia64/xen/xenivt.S
+++ b/arch/ia64/xen/xenivt.S
@@ -13,7 +13,6 @@
 #include asm/kregs.h
 #include asm/pgtable.h
 
-#define __IA64_ASM_PARAVIRTUALIZED_XEN
 #include asm/xen/inst.h
 #include asm/xen/minstate.h
 #include ../kernel/minstate.h


ivt_simplify.patch
Description: ivt_simplify.patch

[Xen-ia64-devel] Xen common code across architecture

2008-03-20 Thread Dong, Eddie
Jeremy and all:
The current Xen kernel code is in arch/x86/xen, but the Xen dynamic
irqchip (events.c) is common to other architectures such as IA64. We
are in the process of enabling pv_ops for IA64 and want to reuse the same
code. Do we need to move the code to some common place? Suggestions?
Thanks, eddie



[Xen-ia64-devel] RE: simplify hw_irq.h

2008-03-19 Thread Dong, Eddie
Either is fine.

-Original Message-
From: Isaku Yamahata [mailto:[EMAIL PROTECTED] 
Sent: March 19, 2008 10:45
To: Dong, Eddie
Cc: Alex Williamson; xen-ia64-devel@lists.xensource.com
Subject: Re: simplify hw_irq.h

Hi Eddie.
Thank you for the patches.

ia64_vector is for the iosapic redirect vector, which is 8 bits wide, isn't it?
So just unconditionally replacing u8 with u16 seems unreasonable.
How about the following?

#ifndef CONFIG_PARAVIRT
typedef u8 ia64_vector;
#else
typedef u16 ia64_vector;
#endif


On Tue, Mar 18, 2008 at 09:30:19PM +0800, Dong, Eddie wrote:

 This one should be safe and easy to get accepted; it removes a use of
 CONFIG_XEN.
 
 
 Signed-off-by: Yaozu (Eddie) Dong [EMAIL PROTECTED]
 
 diff --git a/include/asm-ia64/hw_irq.h b/include/asm-ia64/hw_irq.h
 index 80009cd..f670433 100644
 --- a/include/asm-ia64/hw_irq.h
 +++ b/include/asm-ia64/hw_irq.h
 @@ -15,11 +15,7 @@
  #include asm/ptrace.h
  #include asm/smp.h
 
 -#ifndef CONFIG_XEN
 -typedef u8 ia64_vector;
 -#else
  typedef u16 ia64_vector;
 -#endif
 
  /*
   * 0 special



-- 
yamahata



[Xen-ia64-devel] RE: pv_ops: entry.S simplification

2008-03-19 Thread Dong, Eddie
Here is a follow-up patch to delete the dead file, then.

Thanks, eddie


entry32.patch
Description: entry32.patch

[Xen-ia64-devel] Where to compile additional IVT.S

2008-03-19 Thread Dong, Eddie
Alex/Isaku:
Currently the Makefile compiles the additional ivt.S in
kernel/; another approach is to compile it in xen/.
The latter has the following benefits:
1: The Makefile is easier to read and easier to extend for more
hypervisors.
2: A Xen-specific minstate.h can live in arch/ia64/xen/, like the
one under arch/ia64/kernel.


I am not a Makefile expert; I am just using this example to explain
the idea. Suggestions?
thanks, eddie







diff --git a/arch/ia64/kernel/Makefile b/arch/ia64/kernel/Makefile
index 3e9a162..78ec040 100644
--- a/arch/ia64/kernel/Makefile
+++ b/arch/ia64/kernel/Makefile
@@ -80,16 +80,3 @@ $(obj)/gate-data.o: $(obj)/gate.so
 #
 AFLAGS_ivt.o += -D__IA64_ASM_PARAVIRTUALIZED_NATIVE
 
-# xen multi compile
-$(obj)/xen_%.o: $(src)/%.S FORCE
-   $(call if_changed_dep,as_o_S)
-
-#
-# xenivt.o
-#
-obj-$(CONFIG_XEN) += xen_ivt.o
-ifeq ($(CONFIG_XEN), y)
-targets += xen_ivt.o
-$(obj)/build-in.o: xen_ivt.o
-endif
-AFLAGS_xen_ivt.o += -D__IA64_ASM_PARAVIRTUALIZED_XEN
diff --git a/arch/ia64/kernel/ivt.S b/arch/ia64/kernel/ivt.S
index d1cebe5..e0c5ec8 100644
--- a/arch/ia64/kernel/ivt.S
+++ b/arch/ia64/kernel/ivt.S
@@ -75,8 +75,13 @@
 # define DBG_FAULT(i)
 #endif
 
-#include inst_paravirt.h
+#ifdef __IA64_ASM_PARAVIRTUALIZED_XEN
+#include asm/xen/inst.h
 #include minstate.h
+#else
+#include asm/native/inst.h
+#endif
+#include ../kernel/minstate.h
 
 #define FAULT(n)
\
mov r31=pr;
\
diff --git a/arch/ia64/xen/Makefile b/arch/ia64/xen/Makefile
index 87e29d2..a6b5b9a 100644
--- a/arch/ia64/xen/Makefile
+++ b/arch/ia64/xen/Makefile
@@ -2,7 +2,14 @@
 # Makefile for Xen components
 #
 
+extra-y += xen-ivt.S
+KBUILD_AFLAGS += -D__IA64_ASM_PARAVIRTUALIZED_XEN
+ 
 obj-y := hypercall.o time.o xenivt.o xensetup.o xen_pv_ops.o irq_xen.o
\
-hypervisor.o util.o xencomm.o xcom_hcall.o xcom_asm.o
paravirt_xen.o
+hypervisor.o util.o xencomm.o xcom_hcall.o xcom_asm.o \
+paravirt_xen.o xen-ivt.o
+
+$(obj)/xen-ivt.S:
+   cp $(obj)/../kernel/ivt.S $(obj)/xen-ivt.S 
 
 obj-$(CONFIG_IA64_GENERIC) += machvec.o



RE: [Xen-ia64-devel] RE: pv_ops polish: config option head file

2008-03-18 Thread Dong, Eddie
Alex Williamson wrote:
 On Tue, 2008-03-18 at 11:36 +0800, Dong, Eddie wrote:
 
I think CONFIG_XEN might become something like
 CONFIG_PARAVIRT_XEN, which will be dependent on CONFIG_PARAVIRT. 
 There might also be CONFIG_PARAVIRT_LGUEST, CONFIG_PARAVIRT_KVM,
 etc...  I think that 
 
 Then a single image won't be able to run on both lguest/Xen/KVM.
 This is worse than running_on_xen dynamic condition check.
 
Huh?  I never said you couldn't enable more than one
 CONFIG_PARAVIRT_FOO flavor in the same binary.
 

How about simply using CONFIG_PARAVIRT?
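
For reference, the single-image case discussed above can be sketched in C: every flavor is compiled in and one is picked at boot. The enum, the *_sketch names and paravirt_select() are hypothetical; only the "bare hardware" name comes from the pv_info definition in the patches.

```c
#include <string.h>

/* Cut-down stand-in for struct pv_info. */
struct pv_info_sketch {
    const char *name;
};

/* Both flavors are built into the same image. */
static const struct pv_info_sketch native_info = { "bare hardware" };
static const struct pv_info_sketch xen_info    = { "Xen" };

/* Defaults to native until early boot code decides otherwise. */
static const struct pv_info_sketch *pv_info_sketch = &native_info;

/* Hypothetical result of early hypervisor detection. */
enum hypervisor_sketch { HV_NONE, HV_XEN };

/* Pick the flavor once at boot; later code only looks at pv_info_sketch. */
static const char *paravirt_select(enum hypervisor_sketch detected)
{
    switch (detected) {
    case HV_XEN:
        pv_info_sketch = &xen_info;
        break;
    default:
        pv_info_sketch = &native_info;
        break;
    }
    return pv_info_sketch->name;
}

/* Helper: is the image currently running as the named flavor? */
static int running_on(const char *name)
{
    return strcmp(pv_info_sketch->name, name) == 0;
}
```

Under this scheme a config option only decides which flavors are built in, and the choice between them stays a boot-time decision.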



[Xen-ia64-devel] simplify hw_irq.h

2008-03-18 Thread Dong, Eddie
Alex Williamson wrote:
 Hi Isaku,
 
Here's some cleanup to arch/ia64/kernel/time.c.  I removed
 time_resume() since it's not called from anywhere.  I think this file
 still needs some work; any PV guest is going to need something like
 this, so it would be nice to isolate the Xen specific parts and have
 everything else in PARAVIRT_GUEST code instead of XEN.  This might be
 an opportunity for another pv_ops structure.  Maybe we should also
 create a is_paravirt_guest() macro to clearly distinguish Xen-isms
 from things we think apply to all PV guests.  This should probably
 live in asm/paravirt.h and include asm/xen/hypervisor.h so we can
 just include one file and get both is_paravirt_guest() and
 is_running_on_xen(). Thanks,
 
   Alex
 
 Signed-off-by: Alex Williamson [EMAIL PROTECTED]
 ---
 
  time.c |   58 +++---
  1 file changed, 7 insertions(+), 51 deletions(-)
 
 diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c
 index 1bb0362..cae777e 100644
 --- a/arch/ia64/kernel/time.c
 +++ b/arch/ia64/kernel/time.c
 @@ -31,10 +31,10 @@
 
  #include asm/xen/hypervisor.h
  #ifdef CONFIG_XEN
 +#include asm/percpu.h
  #include linux/kernel_stat.h
  #include linux/posix-timers.h
  #include xen/interface/vcpu.h
 -#include asm/percpu.h
  #endif
 
  #include fsyscall_gtod_data.h
 @@ -283,7 +283,7 @@ __setup(nojitter, nojitter_setup);
 
  #ifdef CONFIG_XEN
  /* taken from i386/kernel/time-xen.c */
 -static void init_missing_ticks_accounting(int cpu)
 +static void xen_init_missing_ticks_accounting(int cpu)
  {
   struct vcpu_register_runstate_memory_area area;
   struct vcpu_runstate_info *runstate = per_cpu(runstate, cpu);
 @@ -301,63 +301,19 @@ static void init_missing_ticks_accounting(int cpu)
        + runstate->time[RUNSTATE_offline];
  }
 
 -static int xen_ia64_settimefoday_after_resume;
 +static int xen_ia64_settimeofday_after_resume;
 
  static int __init __xen_ia64_settimeofday_after_resume(char *str)
  {
 - xen_ia64_settimefoday_after_resume = 1;
 + xen_ia64_settimeofday_after_resume = 1;
   return 1;
  }
 
 -__setup(xen_ia64_settimefoday_after_resume,
 +__setup(xen_ia64_settimeofday_after_resume,
__xen_ia64_settimeofday_after_resume);
 
 -/* Called after suspend, to resume time.  */
 -void
 -time_resume(void)
 -{
 - unsigned int cpu;
 -
 - /* Just trigger a tick.  */
 - ia64_cpu_local_tick();
 -
 - if (xen_ia64_settimefoday_after_resume) {
 - /* do_settimeofday() resets timer interplator */
 - struct timespec xen_time;
 - int ret;
 - efi_gettimeofday(xen_time);
 -
 - ret = do_settimeofday(xen_time);
 - WARN_ON(ret);
 - } else {
 -#if 0
 - /* adjust EFI time */
 - struct timespec my_time = CURRENT_TIME;
 - struct timespec xen_time;
 - static timespec diff;
 - struct xen_domctl domctl;
 - int ret;
 -
 - efi_gettimeofday(xen_time);
 - diff = timespec_sub(xen_time, my_time);
 - domctl.cmd = XEN_DOMCTL_settimeoffset;
 - domctl.domain = DOMID_SELF;
 - domctl.u.settimeoffset.timeoffset_seconds = diff.tv_sec;
 - ret = HYPERVISOR_domctl_op(domctl);
 - WARN_ON(ret);
 -#endif
 - /* itc_clocksource remembers the last timer status in
 -  * itc_jitter_data. Forget it */
 - clocksource_resume();
 - }
 -
 - for_each_online_cpu(cpu)
 - init_missing_ticks_accounting(cpu);
 -
 - touch_softlockup_watchdog();
 -}
  #else
 -#define init_missing_ticks_accounting(cpu) do {} while (0)
 +#define xen_init_missing_ticks_accounting(cpu) do {} while (0)
  #endif
 
  void __devinit
 @@ -455,7 +411,7 @@ ia64_init_itm (void)
   clocksource_itc.rating = 50;
 
   if (is_running_on_xen())
 - init_missing_ticks_accounting(smp_processor_id());
 + xen_init_missing_ticks_accounting(smp_processor_id());
 
   /* avoid softlock up message when cpu is unplug and plugged again. */
   touch_softlockup_watchdog();
 
 
 

This one removes a CONFIG_XEN conditional and should be safe and easy
to get accepted.


Signed-off-by: Yaozu (Eddie) Dong [EMAIL PROTECTED]

diff --git a/include/asm-ia64/hw_irq.h b/include/asm-ia64/hw_irq.h
index 80009cd..f670433 100644
--- a/include/asm-ia64/hw_irq.h
+++ b/include/asm-ia64/hw_irq.h
@@ -15,11 +15,7 @@
 #include asm/ptrace.h
 #include asm/smp.h

-#ifndef CONFIG_XEN
-typedef u8 ia64_vector;
-#else
 typedef u16 ia64_vector;
-#endif

 /*
  * 0 special


x1
Description: x1

[Xen-ia64-devel] (no subject)

2008-03-18 Thread Dong, Eddie
The following uses of CONFIG_XEN are largely historical. With
CONFIG_PARAVIRT, this code should always be enabled, so replacing
CONFIG_XEN with CONFIG_PARAVIRT makes more sense.

Signed-off-by: Yaozu (Eddie) Dong [EMAIL PROTECTED]




diff --git a/arch/ia64/kernel/Makefile b/arch/ia64/kernel/Makefile
index a80dd3f..61643f8 100644
--- a/arch/ia64/kernel/Makefile
+++ b/arch/ia64/kernel/Makefile
@@ -91,8 +91,8 @@ $(obj)/xen_%.o: $(src)/%.S FORCE
 #
 # xenivt.o, xen_switch_leave.o
 #
-obj-$(CONFIG_XEN) += xen_ivt.o xen_switch_leave.o
-ifeq ($(CONFIG_XEN), y)
+obj-$(CONFIG_PARAVIRT) += xen_ivt.o xen_switch_leave.o
+ifeq ($(CONFIG_PARAVIRT), y)
 targets += xen_ivt.o xen_switch_leave.o
 $(obj)/build-in.o: xen_ivt.o xen_switch_leave.o
 endif
diff --git a/arch/ia64/kernel/salinfo.c b/arch/ia64/kernel/salinfo.c
index 91bc631..dd6b986 100644
--- a/arch/ia64/kernel/salinfo.c
+++ b/arch/ia64/kernel/salinfo.c
@@ -378,7 +378,7 @@ salinfo_log_open(struct inode *inode, struct file *file)
data->open = 0;
return -ENOMEM;
}
-#ifdef CONFIG_XEN
+#ifdef CONFIG_PARAVIRT
if (is_running_on_xen()) {
ia64_mca_xencomm_t *entry;
unsigned long flags;
@@ -408,7 +408,7 @@ salinfo_log_release(struct inode *inode, struct file *file)
struct salinfo_data *data = entry->data;
 
if (data->state == STATE_NO_DATA) {
-#ifdef CONFIG_XEN
+#ifdef CONFIG_PARAVIRT
if (is_running_on_xen()) {
struct list_head *pos, *n;
ia64_mca_xencomm_t *found_entry = NULL;
diff --git a/include/asm-ia64/hw_irq.h b/include/asm-ia64/hw_irq.h
diff --git a/include/asm-ia64/sal.h b/include/asm-ia64/sal.h
index 2965112..8aeefd2 100644
--- a/include/asm-ia64/sal.h
+++ b/include/asm-ia64/sal.h
@@ -682,7 +682,7 @@ ia64_sal_clear_state_info (u64 sal_info_type)
 /* Get the processor and platform information logged by SAL with respect to the machine
  * state at the time of the MCAs, INITs, CMCs, or CPEs.
  */
-#ifdef CONFIG_XEN
+#ifdef CONFIG_PARAVIRT
 static inline u64 ia64_sal_get_state_info_size (u64 sal_info_type);
 typedef struct ia64_mca_xencomm_t {
void *record;
@@ -697,7 +697,7 @@ static inline u64
 ia64_sal_get_state_info (u64 sal_info_type, u64 *sal_info)
 {
struct ia64_sal_retval isrv;
-#ifdef CONFIG_XEN
+#ifdef CONFIG_PARAVIRT
if (is_running_on_xen()) {
ia64_mca_xencomm_t *entry;
struct xencomm_handle *desc = NULL;


x2
Description: x2

RE: [Xen-ia64-devel] (no subject)

2008-03-18 Thread Dong, Eddie
Yes, but running_on_xen is already there.

Alex Williamson wrote:
 On Tue, 2008-03-18 at 21:51 +0800, Dong, Eddie wrote:
 Following CONFIG_XEN is kind of historic issue, with CONFIG_PARAVIRT,
 those code should be always enabled, so replacing with
 CONFIG_PARAVIRT makes more sense.
 
I disagree, these are xen specific.
 
   Alex
 
 
 diff --git a/arch/ia64/kernel/Makefile b/arch/ia64/kernel/Makefile
 index a80dd3f..61643f8 100644
 --- a/arch/ia64/kernel/Makefile
 +++ b/arch/ia64/kernel/Makefile
 @@ -91,8 +91,8 @@ $(obj)/xen_%.o: $(src)/%.S FORCE
  #
  # xenivt.o, xen_switch_leave.o
  #
 -obj-$(CONFIG_XEN) += xen_ivt.o xen_switch_leave.o
 -ifeq ($(CONFIG_XEN), y)
 +obj-$(CONFIG_PARAVIRT) += xen_ivt.o xen_switch_leave.o
 +ifeq ($(CONFIG_PARAVIRT), y)
  targets += xen_ivt.o xen_switch_leave.o
  $(obj)/build-in.o: xen_ivt.o xen_switch_leave.o
  endif
 diff --git a/arch/ia64/kernel/salinfo.c b/arch/ia64/kernel/salinfo.c
 index 91bc631..dd6b986 100644
 --- a/arch/ia64/kernel/salinfo.c
 +++ b/arch/ia64/kernel/salinfo.c
 @@ -378,7 +378,7 @@ salinfo_log_open(struct inode *inode, struct file *file)
  data->open = 0;
  return -ENOMEM;
  }
 -#ifdef CONFIG_XEN
 +#ifdef CONFIG_PARAVIRT
  if (is_running_on_xen()) {
  ia64_mca_xencomm_t *entry;
  unsigned long flags;
 @@ -408,7 +408,7 @@ salinfo_log_release(struct inode *inode, struct file *file)
  struct salinfo_data *data = entry->data;
 
  if (data->state == STATE_NO_DATA) {
 -#ifdef CONFIG_XEN
 +#ifdef CONFIG_PARAVIRT
  if (is_running_on_xen()) {
  struct list_head *pos, *n;
  ia64_mca_xencomm_t *found_entry = NULL;
 diff --git a/include/asm-ia64/hw_irq.h b/include/asm-ia64/hw_irq.h
 diff --git a/include/asm-ia64/sal.h b/include/asm-ia64/sal.h
 index 2965112..8aeefd2 100644
 --- a/include/asm-ia64/sal.h
 +++ b/include/asm-ia64/sal.h
 @@ -682,7 +682,7 @@ ia64_sal_clear_state_info (u64 sal_info_type)
  /* Get the processor and platform information logged by SAL with respect to the machine
   * state at the time of the MCAs, INITs, CMCs, or CPEs.
   */
 -#ifdef CONFIG_XEN
 +#ifdef CONFIG_PARAVIRT
  static inline u64 ia64_sal_get_state_info_size (u64 sal_info_type);
  typedef struct ia64_mca_xencomm_t {
  void *record;
 @@ -697,7 +697,7 @@ static inline u64
  ia64_sal_get_state_info (u64 sal_info_type, u64 *sal_info)
  {
  struct ia64_sal_retval isrv;
 -#ifdef CONFIG_XEN
 +#ifdef CONFIG_PARAVIRT
  if (is_running_on_xen()) {
  ia64_mca_xencomm_t *entry;
  struct xencomm_handle *desc = NULL;


RE: [Xen-ia64-devel] RE: pv_ops polish: config option head file

2008-03-18 Thread Dong, Eddie

 How about just simply use CONFIG_PARAVIRT ?
 
Then how do you specify that you want a kernel built with Xen
 support, but not KVM?
 

Mmm, this comes down to what level of detail we want the user to choose.
Given that RHEL wants one image, such a sub-option would only be for
in-house development, even if multiple IA64 VMMs really do appear.
We can argue about the usage model.


Leaving some code for this is OK, but at least for the places that
already have a running_on_xen condition, we don't need CONFIG_XEN
(CONFIG_PARAVIRT is enough).
Also, for the Xen-specific files, i.e. the Xen wrapper code,
we can treat the whole directory as one unit and either compile it or skip it.
Does this make sense?



[Xen-ia64-devel] remove CONFIG_XEN for those already embraced in xen directory

2008-03-18 Thread Dong, Eddie
Xen-specific directories are only compiled when Xen is enabled, so keeping
CONFIG_XEN in each file is redundant.



diff --git a/arch/ia64/xen/xen_pv_ops.c b/arch/ia64/xen/xen_pv_ops.c
index 93a5c64..0c978e8 100644
--- a/arch/ia64/xen/xen_pv_ops.c
+++ b/arch/ia64/xen/xen_pv_ops.c
@@ -210,10 +210,8 @@ static void __init
 xen_post_paging_init(void)
 {
 #ifdef notyet /* XXX: notyet dma api paravirtualization*/
-#ifdef CONFIG_XEN
xen_contiguous_bitmap_init(max_pfn);
 #endif
-#endif
 }
 
 static void __init
diff --git a/arch/ia64/xen/xenpal.S b/arch/ia64/xen/xenpal.S
index 0e05210..57dca95 100644
--- a/arch/ia64/xen/xenpal.S
+++ b/arch/ia64/xen/xenpal.S
@@ -13,9 +13,7 @@
 #include asm/paravirt_nop.h
 
 GLOBAL_ENTRY(xen_pal_call_static)
-#ifdef CONFIG_XEN
BR_IF_NATIVE(native_pal_call_static, r22, p7)
-#endif
.prologue ASM_UNW_PRLG_RP|ASM_UNW_PRLG_PFS, ASM_UNW_PRLG_GRSAVE(5)
alloc loc1 = ar.pfs,4,5,0,0
movl loc2 = pal_entry_point
@@ -30,21 +28,16 @@ GLOBAL_ENTRY(xen_pal_call_static)
mov loc4=ar.rsc // save RSE configuration
;;
mov ar.rsc=0 // put RSE in enforced lazy, LE mode
-#ifdef CONFIG_XEN
mov r9 = r8
XEN_HYPER_GET_PSR
;;
mov loc3 = r8
mov r8 = r9
;;
-#else
-   mov loc3 = psr
-#endif
mov loc0 = rp
.body
mov r30 = in2
 
-#ifdef CONFIG_XEN
// this is low priority for paravirtualization, but is called
// from the idle loop so confuses privop counting
movl r31=XSI_PSR_I_ADDR
@@ -57,13 +50,6 @@ GLOBAL_ENTRY(xen_pal_call_static)
mov r31 = in3
mov b7 = loc2
;;
-#else
-   mov r31 = in3
-   mov b7 = loc2
-
-(p7)   rsm psr.i
-   ;;
-#endif
mov rp = r8
br.cond.sptk.many b7
 1: mov psr.l = loc3


x3
Description: x3

RE: [Xen-ia64-devel] (no subject)

2008-03-18 Thread Dong, Eddie
Alex Williamson wrote:
 On Tue, 2008-03-18 at 22:19 +0800, Dong, Eddie wrote:
 Yes, but running_on_xen is already there.
 
Will it be there if we only have a kernel compiled with PV KVM
 support?  Are we going to stub out any *_xen_* function/macro in that
 case?
 
Well, firstly it is harmless. And do you mean we should also add
CONFIG_NATIVE when building with Xen or KVM?

It is minor, so I leave it up to you and Isaku to decide case by case.
For my part I am pursuing a small patch size, and I checked that the x86
side has only 2 uses of CONFIG_XEN in common files.
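
As a rough illustration of the stubbing question raised above (a sketch under stated assumptions, not the actual xen-ia64 header): a common kernel idiom is to let the runtime test collapse to a constant when support is compiled out, so call sites in common code need no #ifdef. XEN_SUPPORT_ENABLED, xen_specific_setup() and common_init() are hypothetical names; XEN_SUPPORT_ENABLED merely stands in for a Kconfig symbol such as CONFIG_XEN.

```c
/*
 * Sketch only. XEN_SUPPORT_ENABLED is a hypothetical stand-in for a
 * Kconfig symbol such as CONFIG_XEN; the function names are illustrative.
 */
#define XEN_SUPPORT_ENABLED 0

#if XEN_SUPPORT_ENABLED
static int running_on_xen;              /* would be set during early boot */
#define is_running_on_xen()     (running_on_xen)
#else
#define is_running_on_xen()     (0)     /* constant, so the branch is dead code */
#endif

static int xen_setup_calls;             /* counts calls, for demonstration only */

static void xen_specific_setup(void)
{
	xen_setup_calls++;
}

/* Common code: the call site carries no #ifdef of its own. */
static int common_init(void)
{
	if (is_running_on_xen())
		xen_specific_setup();
	return xen_setup_calls;
}
```

With the stub in place, common_init() never reaches the Xen path when support is compiled out, which is the sense in which the check is "harmless" in a non-Xen build.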

Thx, eddie



[Xen-ia64-devel] remove CONFIG_XEN_IA64_EXPOSE_P2M for now

2008-03-18 Thread Dong, Eddie
CONFIG_XEN_IA64_EXPOSE_P2M could be dropped from the first domU-only patch
to keep the patch small, since it is a performance optimization.

Thx, eddie


x4
Description: x4

RE: [Xen-ia64-devel] RE: pv_ops polish: config option head file

2008-03-18 Thread Dong, Eddie
Alex Williamson wrote:
 On Tue, 2008-03-18 at 22:18 +0800, Dong, Eddie wrote:
 How about just simply use CONFIG_PARAVIRT ?
 
Then how do you specify that you want a kernel built with Xen
 support, but not KVM? 
 
 
 Mmm, this comes down to what level of detail we want the user to choose.
 Given that RHEL wants one image, such a sub-option would only be for
 in-house development, even if multiple IA64 VMMs really do appear.
 We can argue about the usage model.
 
 
 Leaving some code for this is OK, but at least for the places that
 already have a running_on_xen condition, we don't need CONFIG_XEN
 (CONFIG_PARAVIRT is enough).
 Also, for the Xen-specific files, i.e. the Xen wrapper code,
 we can treat the whole directory as one unit and either compile it or skip it.
 Does this make sense?
 
Hmm, I still disagree.  The way the Kconfigs are structured now, we
 have:
 
 PARAVIRT_GUEST
   - PARAVIRT
   - XEN
 
 PARAVIRT_GUEST adds no code, but enables the other config options. 
 XEN is dependent on PARAVIRT.  IMHO, PARAVIRT should enable the pv_ops
 functionality, but not add the Xen specific code.  You can imagine a

Yes, if full pv_ops is enabled, those kinds of issues will all go away, as
on the x86 side.
Actually I compromised on your suggestion to leave some running_on_xen
in the code, though I still think the majority of them should be converted
to pv_ops.

With full pv_ops, running_on_xen can disappear too.

In this case, full pv_ops will solve CONFIG_XEN too, but since we may
not have enough resources to complete it in the short term, I agree
we may leave some CONFIG_XEN and running_on_xen.

For the directories dedicated to Xen, I don't think we need
in-code CONFIG_XEN any more.

For the running_on_xen + CONFIG_XEN cases, it is a coding style issue.
The long-term goal is to use full pv_ops; mca_xencomm_list is one of the
candidates IMO.

But for now, leaving running_on_xen or CONFIG_XEN is OK with me.
Whether it needs the double condition is up to you guys.

 PV KVM option or LGUEST option that wants PARAVIRT, but not XEN (or
 all of them together in one binary).  I think which VMMs you want to
 support is a reasonable level of detail for someone configuring a
 kernel to select. The granularity also shows upstream that we've

It is always a tradeoff. If an LGUEST or KVM hypervisor comes soon,
I bet full pv_ops will come soon too...

 thought about generic PV support and we're not just trying to dump a
 bunch of Xen-only code into the tree.

In this case, it is hard to call mca_xencomm_list Xen-only code; I think
it could be abstracted as a generic PV mechanism. But we can leave that
for the future.

 
We don't need to be constantly concerned with RH's config, we need
 to look at the bigger picture for what's right in Linux.  We can make
 sure RH's config has everything we want for a single binary later as
 long as we enable that possibility in what we're doing.  Thanks,
 
   Alex

Thanks, eddie



RE: [Xen-ia64-devel] RE: pv_ops polish: config option head file

2008-03-17 Thread Dong, Eddie
Alex Williamson wrote:
 On Fri, 2008-03-14 at 17:19 +0800, Dong, Eddie wrote:
 Oh, either is OK, I just want to make sure we are on the same page.
 Please keep it there. But we need to make sure generic_defconfig can
 include the Xen machine vector in the current case. Some Makefile/source
 changes are needed for that; I think Red Hat uses generic_defconfig.
 
I don't think any distributions use defconfig directly.  The RH
 kernel config is significantly different.  I don't think we need to
 touch the defconfig for now.  It's useful to have an example domU

Sure, that is fine and can be a future minor task.

 config (I think I'm actually the one who requested it) as we put the
 pieces together.  We need to keep that integration in mind, but we're
 a long way from that point.  Thanks,

One thing I want to clarify first: once the pv_ops patch series gets in,
RH will eventually build only one image to run on different platforms,
including the Xen machine. In that case all the code under CONFIG_XEN
today must be checked in either as generic code or as pv_ops, except for
dual-compiled code such as the IVT and gate page.

In other words, CONFIG_XEN will mostly disappear. Right?

The Xen machine vector also needs to be compiled in under
CONFIG_IA64_GENERIC.

Thanks, eddie



RE: [Xen-ia64-devel] pv_ops polish: remove fsys.S changes

2008-03-17 Thread Dong, Eddie

 Commit a07b265c618c279a84bac8f75f5acba1c1646200 is quite intrusive:
 it moved code from entry.S to a new file, switch_entry.S, and creates
 1000 lines of patch.
 
 Can we at least stay in the original file?
 
 At least xen/ia64 needs to paravirtualize ia64_switch_to,

I did a scan of Alex's tree and found that the diff between entry.S and
xenentry.S is very small (see the attachment); probably the KR set
needs to be modified.
For the rest, what I saw is just sensitive instruction replacement, which
we can do with indirect function calls or leave for the future.


 ia64_leave_syscall and ia64_leave_kernel.
 The discussion for hand written assembly code indicates that single
 source and multiple compile.

This is easy for the IVT and gate page, but for the normal APIs I would be
conservative unless an indirect function call can't solve the problem.

 I think the clean way to handle those three functions is to split them
 out of entry.S. Cluttering entry.S with ifdefs would be ugly.
 Do you have a better idea?

Maybe my finding above is wrong; otherwise, leaving it for now is fine.

Eddie



entry_s_diff.patch
Description: entry_s_diff.patch

RE: [Xen-ia64-devel] RE: pv_ops polish: config option head file

2008-03-17 Thread Dong, Eddie
Alex Williamson wrote:
 On Mon, 2008-03-17 at 14:07 +0800, Dong, Eddie wrote:
 
 One thing I want to clarify first: once the pv_ops patch series
 gets in, RH will eventually build only one image to run on
 different platforms, including the Xen machine. In that case all the
 code under CONFIG_XEN today must be checked in either as generic
 code or as pv_ops, except for dual-compiled code such as the IVT
 and gate page.
 
 In other words, CONFIG_XEN will mostly disappear. Right?
 
 Xen machine vector also needs to be compiled when in
 CONFIG_IA64_GENERIC.
 
I think CONFIG_XEN might become something like CONFIG_PARAVIRT_XEN,
 which will be dependent on CONFIG_PARAVIRT.  There might also be
 CONFIG_PARAVIRT_LGUEST, CONFIG_PARAVIRT_KVM, etc...  I think that

Then a single image won't be able to run on all of lguest/Xen/KVM. This is
worse than the dynamic running_on_xen condition check.

 would fit the typical Linux model of being able to selectively include
 features.  We'll need to make sure distributions set these the way we

Yes if the feature is alternative one.

 want and maybe add it to the defconfig once the code is stabilized. 

Agree.

 The Xen machine vector will need to be included in
 CONFIG_IA64_GENERIC, but it will also depend on CONFIG_PARAVIRT_XEN. 
 Does that sound reasonable? Thanks,
 
Yes.

Eddie



[Xen-ia64-devel] pv_ops: kernel/inst_native.h

2008-03-16 Thread Dong, Eddie

Isaku/Alex:
There is a new file, kernel/inst_native.h, that defines the
pv macros for native. I would suggest the following changes:
1: Move it to a public header location such as include/asm-ia64,
since some other files will use it too.
2: A further question is how we generate the dual-compiled
code. If we use symbolic links, then I would suggest we put it in
include/asm-ia64/native. Comments?

3: How about this style of macro?
#define MOV_FROM_CR(reg,crx) \
mov reg = cr.##crx


Using this we can reduce the patch size and avoid mistakes.
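
To make the parameterized-macro idea in item 3 concrete, here is a minimal sketch in plain C. It is an illustration under assumptions, not the contents of inst_native.h: the macros expand to strings so the expansion can be inspected, and PV_SKETCH_XEN and the pv_read_cr_* names are hypothetical. Note that in plain C, pasting `.` onto an identifier with `##` does not form a valid token, so the sketch uses stringification instead; preprocessed assembler sources may behave differently.

```c
#include <string.h>

/*
 * Sketch only: the macros expand to strings so the result can be inspected
 * from C. PV_SKETCH_XEN and pv_read_cr_* are hypothetical names.
 */
#ifdef PV_SKETCH_XEN
/* Paravirtualized flavour: route the read through a per-register helper. */
#define MOV_FROM_CR(reg, crx)   "pv_read_cr_" #crx " -> " #reg
#else
/* Native flavour: one parameterized macro covers every control register. */
#define MOV_FROM_CR(reg, crx)   "mov " #reg " = cr." #crx
#endif

/* Two call sites share one macro instead of needing one macro per register. */
static const char *read_ivr_text(void) { return MOV_FROM_CR(r8, ivr); }
static const char *read_itm_text(void) { return MOV_FROM_CR(r9, itm); }

/* Returns 1 when the native expansion has the expected text. */
static int native_expansion_ok(void)
{
	return strcmp(read_ivr_text(), "mov r8 = cr.ivr") == 0 &&
	       strcmp(read_itm_text(), "mov r9 = cr.itm") == 0;
}
```

The point of the single macro is that switching flavours touches one definition rather than every per-register macro, which is where the smaller patch and fewer mistakes would come from.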

Thx, eddie



[Xen-ia64-devel] pv_ops polish: config option head file

2008-03-14 Thread Dong, Eddie
Isaku:
Regarding the patchset / git tree at
http://people.valinux.co.jp/~yamahata/xen-ia64/linux-2.6-xen-ia64.git/,
I have some questions:

1:  I saw some config options such as:
CONFIG_PARAVIRT
CONFIG_PARAVIRT_ALT
CONFIG_PARAVIRT_ENTRY
CONFIG_PARAVIRT_NOP_B_PATCH
CONFIG_PARAVIRT_GUEST

I am not sure what is best, but it seems we expose too much here;
x86 has just one, CONFIG_PARAVIRT. I suggest we mainly use
one unless we have strong reasons otherwise.

2: config file
I saw you generated a new config file specifically for domU
(xen_domu_wip_defconfig), and I am wondering if it is what Red Hat wants. I
think RH will build only one image for various machines, including the PV
guest, in one release. So I suggest we remove the new config file
xen_domu_wip_defconfig and instead put CONFIG_PARAVIRT into each existing
config file.

Comments?
Thanks, eddie




[Xen-ia64-devel] pv_ops polish: remove fsys.S changes

2008-03-14 Thread Dong, Eddie
Per discussion, CONFIG_XEN needs to be removed gradually, since pv_ops
covers this concept.
Because of this, we defer the fsys.S changes until later, when they can
use indirect function calls.

Temporarily undo them for now.

Thanks, eddie



commit a882270f415e717a3694f2762f348ab285fb55ce
Author: root [EMAIL PROTECTED]
Date:   Fri Mar 14 16:18:49 2008 +0800

Undo performance optimization items temporary for fsys.S per
discussion to make first patch to upstream simple.

Signed-off-by: Yaozu (Eddie) Dong [EMAIL PROTECTED]

diff --git a/arch/ia64/kernel/fsys.S b/arch/ia64/kernel/fsys.S
index 7d97e37..4484197 100644
--- a/arch/ia64/kernel/fsys.S
+++ b/arch/ia64/kernel/fsys.S
@@ -570,34 +570,11 @@ ENTRY(fsys_fallback_syscall)
adds r17=-1024,r15
movl r14=sys_call_table
;;
-#ifdef CONFIG_XEN
-   movl r18=running_on_xen;;
-   ld4 r18=[r18];;
-   // p14 = running_on_xen
-   // p15 = !running_on_xen
-   cmp.ne p14,p15=r0,r18
-   ;;
-(p14)  movl r18=XSI_PSR_I_ADDR;;
-(p14)  ld8 r18=[r18]
-(p14)  mov r29=1;;
-(p14)  st1 [r18]=r29
-(p15)  rsm psr.i
-#else
rsm psr.i
-#endif
shladd r18=r17,3,r14
;;
ld8 r18=[r18]   // load normal (heavy-weight) syscall entry-point
-#ifdef CONFIG_XEN
-(p14)  mov r27=r8
-(p14)  XEN_HYPER_GET_PSR
-   ;;
-(p14)  mov r29=r8
-(p14)  mov r8=r27
-(p15)  mov r29=psr // read psr (12 cyc load latency)
-#else
mov r29=psr // read psr (12 cyc load latency)
-#endif
mov r27=ar.rsc
mov r21=ar.fpsr
mov r26=ar.pfs
@@ -709,25 +686,7 @@ GLOBAL_ENTRY(fsys_bubble_down)
mov rp=r14  // I0   set the real return addr
and r3=_TIF_SYSCALL_TRACEAUDIT,r3   // A
;;
-#ifdef CONFIG_XEN
-   movl r14=running_on_xen;;
-   ld4 r14=[r14];;
-   // p14 = running_on_xen
-   // p15 = !running_on_xen
-   cmp.ne p14,p15=r0,r14
-   ;;
-(p14)  movl r28=XSI_PSR_I_ADDR;;
-(p14)  ld8 r28=[r28];;
-(p14)  adds r28=-1,r28;;   // event_pending
-(p14)  ld1 r14=[r28];;
-(p14)  cmp.ne.unc p13,p14=r14,r0;;
-(p13)  XEN_HYPER_SSM_I
-(p14)  adds r28=1,r28;;// event_mask
-(p14)  st1 [r28]=r0;;
-(p15)  ssm psr.i
-#else
ssm psr.i   // M2   we're on kernel stacks now, reenable irqs
-#endif
cmp.eq p8,p0=r3,r0  // A
 (p10)  br.cond.spnt.many ia64_ret_from_syscall // B    return if bad call-frame or r15 is a NaT
 


fsys_S_undo.patch
Description: fsys_S_undo.patch

[Xen-ia64-devel] RE: pv_ops polish: config option head file

2008-03-14 Thread Dong, Eddie
Isaku Yamahata wrote:
 On Fri, Mar 14, 2008 at 03:39:15PM +0800, Dong, Eddie wrote:
 Isaku:
  Targeting the patchset or git tree

http://people.valinux.co.jp/~yamahata/xen-ia64/linux-2.6-xen-ia64.git/,
 I got some questions:
 
 Thank you for comments.
 
 
 1:   I saw some config options such as:
 CONFIG_PARAVIRT
 CONFIG_PARAVIRT_ALT
 CONFIG_PARAVIRT_ENTRY
 CONFIG_PARAVIRT_NOP_B_PATCH
 CONFIG_PARAVIRT_GUEST
 
  I am not sure what is best, but seems we expose too much here,
 and X86 just have one CONFIG_PARAVIRT. I suggest we can go mainly
 using one especially we have strong reasons.
 
 In fact I'm sorting them out right now as a part of pv_cpu_ops
 clean up.

Great!

 They are just historical leftovers.
 Presumably we'll have only CONFIG_PARAVIRT and CONFIG_PARAVIRT_GUEST.
 (X86 has both CONFIG_PARAVIRT and CONFIG_PARAVIRT_GUEST.
 Please make sure.)

Oh, Yes it is in latest tree now:)
 
 
 2: config file
  I saw you generated a new config file specifically for domU
 (xen_domu_wip_defconfig), I am wondering is it is what Redhat want. I
 think RH will only build one image for various machine including PV
 guest in one release. So I suggest we remove the new config file
 xen_domu_wip_defconfig, but put CONFIG_PARAVIRT into each existing
 config files.
 
 I put the file there because others may want to know my config.
 I haven't intended to push the file to the upstream. I should have
 written so in the commit log message.
 Hmm, I can also remove it and put my config somewhere else.
 Either way is okay because the file is just for other's convenience.
 Which do you prefer, removing it or updating the commit log?

Oh, either is OK, I just want to make sure we are on the same page.
Please keep it there. But we need to make sure generic_defconfig can
include the Xen machine vector in the current case. Some Makefile/source
changes are needed for that; I think Red Hat uses generic_defconfig.

Thx, eddie



RE: [Xen-ia64-devel] pv_ops polish: remove fsys.S changes

2008-03-14 Thread Dong, Eddie
Isaku Yamahata wrote:
 Hi Eddie.
 Thank you for your patch.
 the change is already isolated as the commit of
 d81f732b0d57371bfc220b1a1027ab18ea9a5265.
 So what we need to do is just dropping the change set.
 The same would apply to the gate page paravirtualization change set.
 I'll take care of it.
 
 Do you have any other changesets to be dropped for minimal domU?
 

Commit a07b265c618c279a84bac8f75f5acba1c1646200 is quite intrusive: it
moved code from entry.S to a new file, switch_entry.S, and creates 1000
lines of patch.

Can we at least stay in the original file?

Thanks, eddie



[Xen-ia64-devel] pv_ops enable

2008-03-14 Thread Dong, Eddie
Isaku/Alex:
There are certain features that target dom0, or that are
extended domU features such as EXEC support etc. I would suggest we
drop them temporarily until some basic domU code gets in. If so,
then patches like the following can be dropped; many other files are
similar and can be dropped as well.

diff --git a/arch/ia64/kernel/machine_kexec.c b/arch/ia64/kernel/machine_kexec.c
index eaec78a..bf2e473 100644
--- a/arch/ia64/kernel/machine_kexec.c
+++ b/arch/ia64/kernel/machine_kexec.c
@@ -25,6 +25,9 @@
 #include asm/meminit.h
 #include asm/processor.h
 #ifdef CONFIG_XEN
+#ifdef notyet
+#include xen/interface/kexec.h
+#endif
 #include asm/kexec.h
 #endif

@@ -131,7 +134,13 @@ void machine_kexec(struct kimage *image)
for(;;);
 }
 #else /* CONFIG_XEN */
-/* notyet */
+#ifdef notyet
+void machine_kexec_setup_load_arg(xen_kexec_image_t *xki, struct kimage *image)
+{
+   xki->reboot_code_buffer =
+   kexec_page_to_pfn(image->control_code_page) << PAGE_SHIFT;
+}
+#endif
 #endif /* CONFIG_XEN */

 void arch_crash_save_vmcoreinfo(void)




Basically the original CSETs 226/227 can temporarily be removed from
Alex's tree.
Also in Alex's tree, I noticed that most driver directory patches are
dropped since they are dom0 features, but I see intel-agp.c is imported.
Is that a mistake? Can we remove it now?


commit 3a0f146c2b00f9b48dd3e23c4bdf16e5c1775259
Author: Isaku Yamahata [EMAIL PROTECTED]
Date:   Mon Jan 21 18:45:15 2008 +0900

ia64/xen: import patches under drivers

Thanks, eddie



RE: [Xen-ia64-devel] Paravirt_ops/hybrid directions and next steps

2008-03-13 Thread Dong, Eddie
Tristan:
We are talking about the pv_ops interface calling convention, not the
hypervisor API convention. The two should not conflict, because we still
have a hypervisor wrapper that can do the conversion.

One thing I have in mind is that when we do pv_ops, we stand in a
hypervisor-neutral position. Only when we implement the Xen hypervisor
wrapper for pv_ops do we stand on the Xen side.

But yes, we use single source, dual compile to generate the code in
place. Actually those pv_cpu_asm_ops won't be used frequently; most of
them are not used at all. So even if we adopt this policy, there are very
few places that would use a formal pv_ops for ASM code and thus imply the
calling convention. The IVT and gate table/page don't have this issue.
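
A small C sketch of the layering described above, under stated assumptions: every name in it (pv_cpu_ops_sketch, hypercall_sketch, the HYPER_* opcodes, the simulated register values) is hypothetical, and it models the idea in C rather than the real ia64 assembly calling conventions. Callers see only the hypervisor-neutral table; the Xen wrapper is the one place that converts to the hypervisor's own convention.

```c
#include <stdint.h>

/* Hypervisor-neutral table; all names here are hypothetical. */
struct pv_cpu_ops_sketch {
	uint64_t (*get_psr)(void);
	void (*set_itm)(uint64_t val);
};

/* Native implementation: touches (simulated) hardware state directly. */
static uint64_t native_psr = 0x1000;
static uint64_t native_itm;
static uint64_t native_get_psr(void) { return native_psr; }
static void native_set_itm(uint64_t val) { native_itm = val; }

/*
 * Simulated hypervisor API with its own convention: an opcode plus an
 * argument block, with any result returned in args[0].
 */
enum { HYPER_GET_PSR = 1, HYPER_SET_ITM = 2 };
static uint64_t xen_shadow_psr = 0x2000;
static uint64_t xen_shadow_itm;

static void hypercall_sketch(int op, uint64_t *args)
{
	if (op == HYPER_GET_PSR)
		args[0] = xen_shadow_psr;
	else if (op == HYPER_SET_ITM)
		xen_shadow_itm = args[0];
}

/* Xen wrappers: convert the pv_ops convention to the hypervisor one. */
static uint64_t xen_get_psr(void)
{
	uint64_t args[1] = { 0 };
	hypercall_sketch(HYPER_GET_PSR, args);
	return args[0];
}

static void xen_set_itm(uint64_t val)
{
	uint64_t args[1] = { val };
	hypercall_sketch(HYPER_SET_ITM, args);
}

static const struct pv_cpu_ops_sketch native_ops = { native_get_psr, native_set_itm };
static const struct pv_cpu_ops_sketch xen_ops = { xen_get_psr, xen_set_itm };

/* Callers go through this pointer and never test for a specific hypervisor. */
static const struct pv_cpu_ops_sketch *pv_cpu_ops = &native_ops;

static void pv_select(int on_xen)
{
	pv_cpu_ops = on_xen ? &xen_ops : &native_ops;
}
```

In this arrangement a change to the hypervisor API convention would only touch the xen_* wrappers, while the table's signature, which is what common code depends on, stays the same.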

Thanks, eddie 

-Original Message-
From: Tristan Gingold [mailto:[EMAIL PROTECTED] 
Sent: March 11, 2008 17:24
To: Dong, Eddie
Cc: Alex Williamson; xen-ia64-devel
Subject: Re: [Xen-ia64-devel] Paravirt_ops/hybrid directions and next steps

Hi,

just a point about call convention: I don't think switching to PAL static
convention is a good idea as it doesn't work well with xen hyperprivop
because of banked registers.

Tristan.



RE: [Xen-ia64-devel] Paravirt_ops/hybrid directions and next steps

2008-03-10 Thread Dong, Eddie
Alex Williamson wrote:
 On Mon, 2008-03-10 at 13:56 +0800, Dong, Eddie wrote:
 Alex  all:
  I exchanged some ideas with Isaku to discuss the gap and status
 of pv_ops support in IA64, Isaku did a lot of work toward pv_ops
 since his previous forward backport patch. Great thanks to Isaku.
  The attached doc is a draft for some of the key gaps and current
 status. I think it is time for another cross major company meeting to
 discuss how we cooperate and how effectively go with pv_ops. Mostly
 Isaku and I are in same page for what IA64 pv_ops should look like
 now, though some details may have different understanding.
 
 Any ideas?
 
 Hi Eddie,
 
Much thanks to you and Isaku for leading this effort.  I'm open to
 another conference call, but maybe we can discuss some items here on
 the mailing list too.  I saw that Isaku has created a wiki page on
 the Xen wiki and started a new git project on Gitorious.org.  The
 wiki page seems like a good place to keep track of who is working on
 which chunk and the status.  For the git side, I would suggest that
 the model might be that each developer has a project on gitorious.org
 and sends out patches or pull requests to have a single upstream
 reference.  Isaku's tree seems to be a good focal point for now if
 he's willing to take on the task of accepting code from others.
 
The 2.6.26 merge window will likely open before too long, so we
 also need to do some coordination with Tony Luck and the other

We are unable to get the whole solution (dom0) upstream in the
near future, since x86 has not completed it yet. OSVs are unable to build
a single image for everything, so I think they may stay with the current
solution a little longer, until x86 is solved. I don't care that much
which kernel version IA64 pv_ops lands in. Once Tony starts to take the
patches, the rest will be easy.

 upstream developers.  Are they going to be interested in putting in
 pieces at each upstream merge window, or should we build up a
 complete solution for domU support in Isaku's tree or Tony's testing

Yes, we need to get a clear message first. In the doc, I was assuming
the maintainer needs to see the whole patch set, even though he may take
the patches slowly, especially at the beginning.

 branch first?  We also need to be careful about submitting patch sets
 that stand on their own and are bisect-able.  It's likely going to

Agreed, so we should at least get the xen-ia64/kvm-ia64 people to buy in
to the patch first, so that we can push together.

 take several kernel merge windows before we get full domU support,
 let alone dom0. 
 
In your slide set, you mention removing running_on_xen since it
 conflicts with pv_ops.  I think this is a really good goal, but I have
 doubts about whether it's achievable.  We're not likely to make a

My assumption is that the Linux maintainers won't want to see
running_on_xen, running_on_kvm, running_on_lguest,
running_on_hybrid_xen, running_on_hybrid_kvm etc. It is too ugly.

But if you mean we keep it temporarily for now, I am fine with that.

 pv_ops to fit every corner case, and we may have to resort to an ugly
 direct test for xen.  Let's try to avoid them, but we already have a
 few cases of checking machine vector names for this type of thing in
 other parts of the ia64 code.  Thanks,

Could you point me to the code where you feel pv_ops may be hard to apply?

 
   Alex

Thanks, eddie



RE: [Xen-ia64-devel] Paravirt_ops/hybrid directions and next steps

2008-03-10 Thread Dong, Eddie

   I don't have any architecture specific examples off the top of my
 head, but how about skipping serial port detection on dom0?  It's
 rather Xen specific and we haven't yet come up with a way to hide
 Xen's UART (ioport & mmio) from dom0.  KVM/Lguest wouldn't care about
 this, so it may not be worthy of a pv_op.  Thanks,
 
Hi, Alex. 
We can keep this detail in mind while enabling pv_ops, to see whether
we can find the right abstraction. I assume we will need 1-2 months to
make it fully pv_ops.

Thanks, Eddie





RE: [Xen-ia64-devel] [PATCH] unify vtlb and vhpt

2008-03-03 Thread Dong, Eddie

 Limiting the entry to be not moved to VHPT head could solve this
 issue but again the code will be complicated.
 
 Sharing VTLB/VHPT memory could be simply used here, and the patch
 will be smaller and simpler IMO.

My concept is just sharing vTLB/VHPT memory. 
As long as sharing the pool of collision chain,
distinction of vTLB/VHPT can't be avoided

I am not sure about that statement. Putting the vTLB on the physical
VHPT side is mixing things, not only sharing. What I mean here is
something like the following pseudo code (init code and a lot of
cleanup are definitely missing from this pseudo code).

This way, we don't impact the low level VHPT walk, and the concept
clearly distinguishes vTLB & VHPT.


diff -r ff90abf572f2 xen/arch/ia64/vmx/vtlb.c
--- a/xen/arch/ia64/vmx/vtlb.c  Fri Jan 18 14:11:20 2008 -0700
+++ b/xen/arch/ia64/vmx/vtlb.c  Tue Mar 04 02:18:33 2008 +0800
@@ -398,7 +398,9 @@ static thash_data_t *__alloc_chain(thash

     cch = cch_alloc(hcb);
     if (cch == NULL) {
-        thash_recycle_cch_all(hcb);
+        vcpu = container_of(hcb, vcpu, vtlb);
+        thash_recycle_cch_all(vcpu->vtlb);
+        thash_recycle_cch_all(vcpu->vhpt);
         cch = cch_alloc(hcb);
     }
     return cch;
@@ -440,12 +442,13 @@ static void vtlb_insert(VCPU *v, u64 pte
         }
         cch = cch->next;
     }
+    vcpu = container_of(hcb, vcpu, vtlb);
     if (hash_table->len >= MAX_CCN_DEPTH) {
         thash_recycle_cch(hcb, hash_table);
-        cch = cch_alloc(hcb);
+        cch = cch_alloc(vcpu->vhpt);
     }
     else {
-        cch = __alloc_chain(hcb);
+        cch = __alloc_chain(vcpu->vhpt);
     }
     cch->page_flags = pte;
     cch->itir = itir;
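
To illustrate the sharing idea in the pseudo patch above, here is a
minimal, self-contained C sketch. It is not the Xen code: the types and
names (chain_pool, hash_tbl, vcpu_tables, alloc_chain, table_insert) are
made up for the example, and each table has a single bucket for brevity.
The only point it tries to show is that both tables draw collision chain
nodes from one pool, and that exhaustion recycles both tables before
retrying.

```c
#include <stddef.h>

#define POOL_SIZE 4   /* tiny on purpose, so exhaustion is easy to hit */

struct chain_node {
    struct chain_node *next;
    unsigned long pte;
};

/* One pool of collision chain nodes, shared by vTLB and VHPT. */
struct chain_pool {
    struct chain_node nodes[POOL_SIZE];
    struct chain_node *free_list;
};

/* Hypothetical hash table reduced to a single bucket. */
struct hash_tbl {
    struct chain_node *head;
    struct chain_pool *pool;
};

struct vcpu_tables {
    struct chain_pool pool;
    struct hash_tbl vtlb;
    struct hash_tbl vhpt;
};

static void tables_init(struct vcpu_tables *t)
{
    t->pool.free_list = NULL;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        t->pool.nodes[i].next = t->pool.free_list;
        t->pool.free_list = &t->pool.nodes[i];
    }
    t->vtlb.head = NULL;
    t->vtlb.pool = &t->pool;
    t->vhpt.head = NULL;
    t->vhpt.pool = &t->pool;
}

/* Give every node chained on one table back to the shared pool. */
static void recycle_all(struct hash_tbl *tbl)
{
    while (tbl->head) {
        struct chain_node *n = tbl->head;
        tbl->head = n->next;
        n->next = tbl->pool->free_list;
        tbl->pool->free_list = n;
    }
}

static struct chain_node *pool_alloc(struct chain_pool *p)
{
    struct chain_node *n = p->free_list;
    if (n)
        p->free_list = n->next;
    return n;
}

/* Allocate a node; on exhaustion recycle BOTH tables, then retry. */
static struct chain_node *alloc_chain(struct vcpu_tables *t)
{
    struct chain_node *n = pool_alloc(&t->pool);
    if (n == NULL) {
        recycle_all(&t->vtlb);
        recycle_all(&t->vhpt);
        n = pool_alloc(&t->pool);
    }
    return n;
}

/* Insert into either table; the node always comes from the shared pool. */
static void table_insert(struct vcpu_tables *t, struct hash_tbl *tbl,
                         unsigned long pte)
{
    struct chain_node *n = alloc_chain(t);
    n->pte = pte;
    n->next = tbl->head;
    tbl->head = n;
}
```

With a four-node pool, two inserts into each table use up the pool, and
a fifth insert empties both tables before it succeeds. The low level
walk of either table never has to know that the other one exists.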



RE: [Xen-ia64-devel] [PATCH] unify vtlb and vhpt

2008-02-29 Thread Dong, Eddie
[EMAIL PROTECTED] wrote:
 Hi Kouya,
 
 to be honest I have mixed feelings about your patch.  I like it but I
 don't really understand its purpose.  See my comment.
 
 I still think it would be nice if the vTLB were TR-mapped.

This is the same as sharing vTLB/VHPT memory. Either a single TR
or a double TR (your case) can solve the problem.

 
 
 Quoting Kouya Shimura [EMAIL PROTECTED]:
 
 Dong, Eddie writes:
 This can be simply solved by increasing vTLB size, or
 use same memory with VHPT.
 
 The problem is, how much size is suitable?
 There is a trade off. The larger size consumes a time for ptc.e
 emulation and causes a serious slowdown for a Windows guest.
 

How frequently does Windows issue PTC.E? In the current setup, the
VHPT is 16MB while the vTLB is 32K, so I would think purging the VHPT
dominates.

 Ok.
 
 Currently vTLB size is configurable but ordinary users
 can't understand what vTLB is.

??? This is not true unless the user (developer) has no concept of
virtualization. In my experience, I have trouble explaining what the
host VHPT in the VMM is for a guest, but it is pretty easy to explain
the meaning of vTLB, whose original meaning is guest TLB. The issue in
today's Xen/IA64 is that the so-called vTLB is not equal to the real
guest TLB. (guest TLB = vTR + vTLB + something in VHPT + something in
machine TLB)

If you want to rename vTLB to something else, I will vote yes.

 A purpose of this patch is to make users free from
 setting vTLB size.

This is the same as sharing memory between VTLB & VHPT.

 
 By merging vTLB and VHPT the user can't anymore set the size of the
 vTLB. This is obvious.  But is your patch different from increasing
 vTLB size ? Did I miss a point ?
 
 I am not sure it is a good idea to remove vTLB size.  On a real
 processor the TLB structure is fixed and defined.

Yes, but this is probably OK since the vTLB isn't equal to the guest
TLB :( Ideally the guest TLB should have a fixed size.

Sharing memory makes the concept clear for me. I.e. the VHPT is the
VHPT, while the vTLB holds those entries that can't be put into the VHPT.

With this patch, if a vTLB entry in a collision chain has to become
the head of a VHPT entry, it is a real dilemma whether to put it at the
head or not. A GP fault on a reserved bit could be used here, with a
performance penalty, but that is really not good, and it could happen
again as long as the VHPT entry head is kept for the vTLB (the TC could
go away soon). Restricting such entries from being moved to the VHPT
head could solve this issue, but again the code would be complicated.

Sharing vTLB/VHPT memory could simply be used here, and the patch
would be smaller and simpler IMO.

 
 To tell the truth, I rewrote the vtlb_thash() function before. See.

http://lists.xensource.com/archives/html/xen-ia64-devel/2007-08/msg00108.html
 
 I think the algorithm is the same as HW.
 I did a reverse engineering on a Montecito processor.
 (I'm afraid Montvale use the different algorithm...)

That could be so in reality, I don't know :) But we still have to
treat it as different, since we can't guarantee it is the same :(


 
 This seems to be the same algorithm as the one for Madison.  Cf
 Matthew Chapman pages.
 
 Tristan.

Thanks, eddie




[Xen-ia64-devel] RE: [PATCH 0/5] RFC: ia64/pv_ops: ia64 intrinsics paravirtualization

2008-02-28 Thread Dong, Eddie
Isaku Yamahata wrote:
 Hi. Thank you for comments on asm code paravirtualization.
 Its direction is getting clear. Although it hasn't been finished yet,
 I'd like to start discussion on ia64 intrinsics paravirtualization.
 This patch set is just for discussion so that it is a subset of
 xen Linux/ia64 domU paravirtualization, not self complete.
 You can get the full patched tree by typing
 git clone
 http://people.valinux.co.jp/~yamahata/xen-ia64/linux-2.6-xen-ia64.git/
 
 
 A paravirtualized guest wants to replace ia64 intrinsics, i.e.
 the operations defined in include/asm-ia64/gcc_intrin.h or
 include/asm-ia64/intel_intrin.h, with its own version.
 (At least xenLinux/ia64 does.)
 So we need a sort of interface to do so.
 I want to discuss on which direction to go for, please comment.
 
 
 This paravirtualization corresponds to the part of x86 pv_ops,
   Performance critical code written in C. They are basically indirect
   function call via pv_xxx_ops. For performance, each pv instance is
   allowed to binary patch in order to replace function call
   instruction with their predefined instructions in place.
   The ia64 intrinsics correspond to this kind of interface.
 
 The discussion points so far are
 - binary patching should be mandatory or optional?
   The current patch requires binary patch, but some people think
   requiring binary patch for pv instances is a bad idea.
   I think by providing reasonable helper functions set, binary patch
   won't be burden for pv instances.
 
 - How differ from x86 pv_ops?
   Some people think that the very similarity to x86 pv_ops is
   important. I guess they're thinking so considering maintenance
   cost. Anyway ia64 is already different from x86, so such difference
   doesn't matter as long as ia64 paravirtualization interface is
 clean enough for maintenance. 
 
 Note: the way can differ from one operation from another, but it
 might cause some inconsistency.
 The following ways are proposed so far.
 
 
 * Option 1: the current way
 The code would look like
 static inline unsigned long
 paravirt_get_cpuid(int index)
 {
         register __u64 ia64_intri_res asm ("r8");
         register __u64 __index asm ("r8") = index;
         asm volatile (paravirt_alt_inst("mov %0=cpuid[%r1]",
                                         PARAVIRT_INST_GET_CPUID):
                       "=r"(ia64_intri_res): "0O"(__index));
         return ia64_intri_res;
 }
 #define ia64_get_cpuid  paravirt_get_cpuid
 
 note:
 Using r8 is derived from xen hypercall abi.
 We have to define which register should be used or can be
 clobbered. 
 
   Pros:
   - in-place binary patch is possible.
 (We may want to pad with nop. How many?)
   - native case performance is good.
   - native case doesn't need any modification.
 
   Cons:
   - binary patch is required for pv instances.
   - Probably current implementation might be too xen-biased.
 Reviewing them would be necessary for hypervisor neutrality.
 
 * Option 2: direct branch
 The code would look like
 static inline unsigned long
 paravirt_get_cpuid(int index)
 {
         register __u64 ia64_intri_res asm ("r8");
         register __u64 __index asm ("r8") = index;
         register __u64 ret_addr asm ("r9");
         asm volatile (paravirt_alt_inst(
                         "br.cond b0=native_get_cpuid",
                         /* or brl.cond for fast hypercall */
                         PARAVIRT_INST_GET_CPUID):
                       "=r"(ia64_intri_res), "=r"(ret_addr):
                       "0O"(__index):
                       "b0");
         return ia64_intri_res;
 }
 #define ia64_get_cpuid  paravirt_get_cpuid
 
 note:
 Using r8 is derived from xen hypercall abi.
 We have to define which register should be used or can be
 clobbered. 
 
   Pros:
   - in-place binary patch is possible.
 (We may want to pad with nop. How many?)
   - so that performance would be good for native case using it.
 
   Cons:
   - binary patch is required for pv instances.
   - native case needs binary patch for optimal performance.
 
 * Option 3: indirect branch
 The code would look like
 static inline unsigned long
 paravirt_get_cpuid(int index)
 {
register __u64 ia64_intri_res asm (r8);
register __u64 __index asm (r8) = index;
register __u64 func asm (r9);
asm volatile (paravirt_alt_inst(
 mov %1 = pv_cpu_ops
 add %1 = %1, PV_CPU_GET_CPUID_OFFSET
 ld8 %1 = [%1]
 mov b1 = %1
 br.cond b0=b1
  PARAVIRT_INST_GET_CPUID):
  =r(ia64_intri_res),
  =r(func):

[Xen-ia64-devel] RE: [kvm-ia64-devel] [PATCH 0/4] ia64/xen: paravirtualization of hand written assembly code

2008-02-25 Thread Dong, Eddie
Keith Owens wrote:
 Isaku Yamahata (on Mon, 25 Feb 2008 12:16:42 +0900) wrote:
 Hi. The patch I sent before was too large, so it was dropped from
 the mailing list. I'm sending it again in a smaller size.
 This patch set is the xen paravirtualization of hand written assembly
 code. And I expect that much clean up is necessary before merge.
 We really need the feedback before starting actual clean up as
 Eddie already said before. 
 
 Eddie discussed how to clean up and suggested several ways.
  1: Dual IVT source code, dual IVT table. (The way this patch set
     adopted)
  2: Same IVT source code, but dual/multiple compile to generate
     dual/multiple IVT table using assembler macro.
  3: Single IVT table, using indirect function call for pv_ops using
     branch/binary patching.
 
 At this moment my preference is the option 2. Please comment.
 
 A combination of options (2) and (3) would work.  Have a single source
 file for the IVT, using conditional macros.  Use that source file to
 build (at least) two copies of the IVT, for native and any virtualized

Thanks, we are getting more comments now :)
I would like to take this chance to go into a little more detail
on the sub-alternatives.

For all of the above, we need to replace IVT source code as in the
following example:
@@ -102,7 +116,7 @@
  *  - the faulting virtual address uses unimplemented address bits
  *  - the faulting virtual address has no valid page table mapping
  */
-   mov r16=cr.ifa  // get address that caused the TLB miss
+   _READ_IFA(r16, r24, r25)
 #ifdef CONFIG_HUGETLB_PAGE
    movl r18=PAGE_SHIFT
    mov r25=cr.itir

For #2 (dual compile, dual IVT instance), we now have the following
sub-alternatives:

A) Generate code in place like following:
+#ifdef CONFIG_XEN
+#define _READ_IFA(regr, clob1, clob2)  \
+   movl clob1=XSI_IFA;;\
+   ld8 regr=[clob1];;
+#endif

+#ifdef CONFIG_NATIVE
+#define _READ_IFA(regr, clob1, clob2)  \
+   mov regr=cr.ifa;
+#endif

In this approach we don't do a function call/jump; all the code
for the different hypervisors is generated in place. More
importantly, it doesn't require any fixed clobber registers, i.e.
any register found spare can be used as a clob register.

If we go with this approach, the coding effort is minimized and
the current Xen code can simply be merged into this model.

Cons:  No explicit pv_asm_ops function table, so the divergence from
x86's pv_ops is bigger.

B) Direct jump
This model uses a function call (actually a jump) in those primitive
pv MACROs.


+GLOBAL_ENTRY(xen_read_ifa)
+   mov b0=r24;
+   movl r25=XSI_IFA;;
+   ld8 r24=[r25];;
+   br.cond.sptk b0
+END(xen_read_ifa)

+#ifdef CONFIG_XEN
+#define _READ_IFA(regr, clob1, clob2)  \
+   movl r24=1f;\
+   br.sptk.many xen_read_ifa;; \
+1: \
+   mov regr=r24;;
+#endif

Pros: less code generated in place.
Cons: needs clob registers, and probably fixed
clob registers.

C) Indirect function call
This model is closest to what pv_ops means. The previous
solutions don't actually refer to the function table.

Pros: It is possible for C & ASM to share the same pv_ops code with a
wrapper on the C side, and it could support the single IVT table solution.

Cons: Needs more clobber registers and changes to the IVT source code.

+#define _READ_IFA(regr, clob1, clob2)  \
+   mov r24=_READ_IFA_OPS_INDEX;\
+   movl r25=pv_cpu_asm_ops;;   \
+   add r25=r24,r25;;   \
+   ld8 r25=[r25];  \
+   movl r24=1f;;   \
+   mov b0=r25;;\
+   br.sptk.many b0;;   \
+1: \
+   mov regr=r24;;
+
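
The dual-compile idea in sub-alternative A (one handler source, one
generated instance per hypervisor type, no function table) can be
mimicked in plain C with the preprocessor. This is only an analogy
under stated assumptions: the real thing is ia64 assembly built twice
with different CONFIG options, whereas here the "two builds" are two
macro instantiations in one file. The names fake_cr_ifa, fake_xsi_ifa,
DEFINE_TLB_MISS_HANDLER and the 16KB page shift are made up for the
example.

```c
#include <stdint.h>

/* Hypothetical stand-ins for cr.ifa and the Xen shared-info slot. */
static uint64_t fake_cr_ifa;
static uint64_t fake_xsi_ifa;

/* The only part that differs between the two instances. */
#define NATIVE_READ_IFA(dst) ((dst) = fake_cr_ifa)
#define XEN_READ_IFA(dst)    ((dst) = fake_xsi_ifa)

/*
 * A single copy of the handler "source".  READ_IFA is expanded in
 * place, so each instance contains its own code and no indirect call.
 */
#define DEFINE_TLB_MISS_HANDLER(name, READ_IFA)                     \
    static uint64_t name(void)                                      \
    {                                                               \
        uint64_t ifa;                                               \
        READ_IFA(ifa);                                              \
        return ifa >> 14;   /* page number, assuming 16KB pages */  \
    }

/* "Compile" the same source twice, once per target. */
DEFINE_TLB_MISS_HANDLER(native_tlb_miss, NATIVE_READ_IFA)
DEFINE_TLB_MISS_HANDLER(xen_tlb_miss, XEN_READ_IFA)
```

As in sub-alternative A, nothing here looks like a pv_ops table; the
choice between the native and Xen instance has to be made once, at the
point where the table (here, the function) is selected.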


Binary patching at boot time can convert C to B or A, or convert B to
A, if certain conditions are met, such as clob registers & code size.
So the run time performance degradation relative to native is
minimized. The only difference is that we get more nop ops in the
native IVT table (patching will convert the unused code space to nop
instructions, or maybe use a relative jump to skip the spare code).

#A is the easiest from an effort point of view (no need to reorganize
a mass of IVT code), and #A doesn't need binary patching,
but the code quality may not be that good in current Xen, such as:

@@ -192,7 +235,17 @@
 */
adds r24=__DIRTY_BITS_NO_ED|_PAGE_PL_0|_PAGE_AR_RW,r23
;;
+#ifdef CONFIG_XEN
+(p7)   mov r25=r8
+(p7)   mov r8=r24
+   ;;
+(p7)   XEN_HYPER_ITC_D
+   ;;
+(p7)   mov r8=r25
+   ;;
+#else
 (p7)   itc.d r24
+#endif
;;
 #ifdef CONFIG_SMP




#C (also #B) needs massive IVT source code changes to find clob registers.


 modes.  The native copy of the IVT starts at label ia64_ivt in section
 .text.ivt, as it does now.  Any IVT versions for virtualized mode are
 defined as __cpuinitdata, so they are discarded after boot, unless

Looks like you prefer #A of the above dual compile option, right?
If most people agree with this, we 

[Xen-ia64-devel] RE: [kvm-ia64-devel] [PATCH 13/28] ia64/xen: introduce xen hypercall routines necessary for domU.

2008-02-25 Thread Dong, Eddie

 IA64 pv_ops frame work doesn't exist yet so that xen code does
 in order to boot on both native and xen for now.
 I expect those check will be eliminated during developing ia64 pv_ops.

Qing He & I are working on the pv_ops framework; hopefully we can get
a draft soon :)

Eddie



[Xen-ia64-devel] RE: paravirt_ops support in IA64

2008-02-18 Thread Dong, Eddie
Isaku Yamahata wrote:
 
  2: Same IVT source code, but dual/multiple compile to generate
 dual/multiple IVT tables. I.e. we replace those primitive ops
 (sensitive instructions) with a MACRO which uses a compile option for
 the different hypervisor types. The pseudo code of the MACRO could be
 (taking the read of CR.IVR as an example):
 
 AltA:
 #define ASM_READ_IVR    /* read IVR to GR24 */
 #ifdef XEN
  breg1 = return address
  br xen_readivr
 #else   /* native */
  mov  GR24=CR.IVR;
 #endif
  Or
 AltB:
 #define ASM_READ_IVR    /* read IVR to GR24 */
 #ifdef XEN
  in place code of function xen_readivr
 #else   /* native */
  mov  GR24=CR.IVR;
 #endif
 
  From a maintenance point of view the effort is minimized,
 but it is not exactly what X86 pv_ops looks like.
 
  Both approaches will cause a code size issue, but AltB is
 much worse in this area, while AltA needs one additional BR clobber
 register
 
 
 Pros:
 - single code
 - hopefully less maintenance cost compared to #1
 
 Cons:
 - requires restriction on register usage. And we need to define its
   convention.
   When modifying ivt.S in the future after converting ivt.S,
   those conventions must be kept in mind.
 - suboptimal for paravirtualized case compared to #1 case
 
 
  3: Single IVT table, using indirect function call for pv_ops.
  This is more like X86 pv_ops, but we need to pay 2
 additional BR clobber registers due to indirect function call, like
 following pseudo code: 
 
 AltC:
  breg0 = pv_ops base
  breg0 += offset for this pv_ops
  breg1 = return address;
  br  breg0.  /* pv_ops clobbered breg0/breg1 */
 
 
  For both #2 & #3, we need to modify Linux IVT code to get
 clobber register for those MACROs, #3 need 2 br registers and 1-2 GR
 registers for the function body. #2A needs least clobber register,
 just 1-2 GR registers.
 
 #2B may also need clobber 1(or 2?) GR registers depending on the
 original instruction.

Yes, the number of clobbered GRs is almost the same for all Alts.

 
 Pros:
 - single code/binary
 - less maintenance cost
 
 Cons:
 - requires restriction on register usage. And we need to define its
   convention.
   When modifying ivt.S in the future after converting ivt.S,
   those conventions must be kept in mind.
 - more clobbered register (for AltC)
 - suboptimal even for native case.

After binary patching, the native side won't be impacted.
We can do in place patching, i.e. replace the whole AltC
code dynamically with mov GRx=CR.IVR;nop;nop...
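
As a rough illustration of the in place patching described here
(overwrite the reserved site with the native sequence and fill what is
left with nops), below is a small C simulation over a byte buffer. It
makes no attempt to model ia64 bundles: the one-byte "opcodes", the
patch_site record and the patch_one/patch_all helpers are all
hypothetical, and a real implementation would also have to flush the
instruction cache after writing.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Made-up one-byte opcodes standing in for real instruction bundles. */
enum { OP_NOP = 0x00, OP_PV_CALL = 0x01, OP_MOV_IVR = 0x02 };

/* One entry per patchable site, as a build-time table might record it. */
struct patch_site {
    size_t offset;  /* start of the site inside the code buffer */
    size_t len;     /* bytes reserved for the site */
};

/*
 * Overwrite one site with the replacement sequence and pad the rest of
 * the reserved space with nops.  Returns 0 on success, or -1 if the
 * replacement does not fit (the site is then left untouched).
 */
static int patch_one(uint8_t *code, const struct patch_site *site,
                     const uint8_t *repl, size_t repl_len)
{
    if (repl_len > site->len)
        return -1;
    memcpy(code + site->offset, repl, repl_len);
    memset(code + site->offset + repl_len, OP_NOP, site->len - repl_len);
    return 0;
}

/* Patch every recorded site; returns the number of sites patched. */
static size_t patch_all(uint8_t *code, const struct patch_site *sites,
                        size_t nsites, const uint8_t *repl, size_t repl_len)
{
    size_t done = 0;
    for (size_t i = 0; i < nsites; i++)
        if (patch_one(code, &sites[i], repl, repl_len) == 0)
            done++;
    return done;
}
```

A site that is too small for the replacement simply stays as the
indirect call, which matches the point that patching is an optional
optimization on top of a working pv_ops path.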

 
 Presumably we can use binary patching technique to mitigate those
 overhead. Probably for native case, we can convert those branch with
 single instruction.
 For example we can make 'br breg0' into direct branch.

If it is single IVT table, we don't know the target address of
the function call.

 AltD(AltC'):
 breg1 = return address;
 br  native_pv_ops_ops   <=== binary patch at boot time
 

?? Are you talking about AltA?

thanks, Eddie



[Xen-ia64-devel] paravirt_ops support in IA64

2008-02-17 Thread Dong, Eddie
Hi, Tony & all:
Recently the Xen-IA64 community has been considering adding
paravirt_ops support, to keep in sync with X86 and reduce maintenance
effort. With pv_ops, sensitive instructions and some high level
primitive functionality (such as MMU ops) are replaced with pv_ops,
i.e. calls through a function table whose function pointers are
initialized at Linux startup time, depending on which hypervisor (or
native hardware) is running underneath.
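
For readers who have not looked at the x86 side, the function table
idea can be sketched in a few lines of C. This is a toy under stated
assumptions, not the x86 or the proposed ia64 interface: the struct
pv_cpu_ops_sketch, the single read_ivr hook, pv_init() and the fake
register values are all invented for the illustration.

```c
#include <stdint.h>

/* Hypothetical table; the field names are illustrative only. */
struct pv_cpu_ops_sketch {
    const char *name;
    uint64_t (*read_ivr)(void);
};

static uint64_t fake_cr_ivr = 0x30;   /* pretend hardware register */
static uint64_t fake_xen_ivr = 0x41;  /* pretend hypervisor answer */

static uint64_t native_read_ivr(void) { return fake_cr_ivr; }
static uint64_t xen_read_ivr(void)    { return fake_xen_ivr; }

static const struct pv_cpu_ops_sketch native_ops = { "native", native_read_ivr };
static const struct pv_cpu_ops_sketch xen_ops    = { "xen",    xen_read_ivr };

/* Defaults to native, so calls are safe before pv_init() runs. */
static const struct pv_cpu_ops_sketch *pv_cpu_ops = &native_ops;

/* Called once at startup with the result of hypervisor detection. */
static void pv_init(int running_on_xen)
{
    pv_cpu_ops = running_on_xen ? &xen_ops : &native_ops;
}

/* All callers go through the table; none of them tests for Xen. */
static uint64_t pv_read_ivr(void)
{
    return pv_cpu_ops->read_ivr();
}
```

The cost is one indirect call per operation, which is cheap from C but,
as discussed in the rest of this mail, awkward in IVT assembly where no
stack and few spare registers are available.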

With this, we can reuse a lot of code from X86, such as the irqchip,
and have similar dma support and similar xenoprof/PMU profiling
support etc. The CPU side pv_ops, however, is quite different,
especially for the ASM code, since an IA64 processor doesn't have
memory/stack ready in most IVT handler code.

In X86, the ASM side pv_ops can save clobber registers to the stack
and do a function call, but IA64 can't, because memory access is
unavailable.

#define DISABLE_INTERRUPTS(clobbers)                                    \
        PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_irq_disable), clobbers, \
                  pushl %eax; pushl %ecx; pushl %edx;                   \
                  call *%cs:pv_irq_ops+PV_IRQ_irq_disable;              \
                  popl %edx; popl %ecx; popl %eax)                      \


One of the first big arguments is how to support the ASM IVT
handler code. Some ideas discussed include:

1: Dual IVT source code, dual IVT table.
This is what Xen currently does, and it is probably not warmly
welcomed, since it is not upstream yet and carries a maintenance effort.
2: Same IVT source code, but dual/multiple compile to generate
dual/multiple IVT tables. I.e. we replace those primitive ops (sensitive
instructions) with a MACRO which uses a compile option for the different
hypervisor types.
The pseudo code of the MACRO could be (taking the read of CR.IVR
as an example):

AltA:
#define ASM_READ_IVR    /* read IVR to GR24 */
#ifdef XEN
        breg1 = return address
        br xen_readivr
#else   /* native */
        mov  GR24=CR.IVR;
#endif
Or
AltB:
#define ASM_READ_IVR    /* read IVR to GR24 */
#ifdef XEN
        in place code of function xen_readivr
#else   /* native */
        mov  GR24=CR.IVR;
#endif

From a maintenance point of view the effort is minimized,
but it is not exactly what X86 pv_ops looks like.

Both approaches will cause a code size issue, but AltB is
much worse in this area, while AltA needs one additional BR clobber
register.

3: Single IVT table, using indirect function call for pv_ops.
This is more like X86 pv_ops, but we need to pay 2
additional BR clobber registers due to indirect function call, like
following pseudo code:

AltC:
breg0 = pv_ops base
breg0 += offset for this pv_ops
breg1 = return address;
br  breg0.  /* pv_ops clobbered breg0/breg1 */


For both #2 & #3, we need to modify the Linux IVT code to get
clobber registers for those MACROs. #3 needs 2 br registers and 1-2 GR
registers for the function body. #2A needs the fewest clobber
registers, just 1-2 GR registers.


In X86, there is another enhancement (dynamic patching) based on
pv_ops. The purpose is to improve CPU branch prediction by converting
indirect function calls to direct function calls for both C & ASM code.
We may take a similar approach some time later too.

We really need advice from the community before we jump into
coding.
I have CCed some active members that I thought may be interested in
pv_ops, since a KVM-IA64 mailing list doesn't exist yet.

Thanks a lot, Eddie




RE: [Xen-ia64-devel] Re: [Xen-devel] XenLinux/IA64 domU forward port

2008-02-14 Thread Dong, Eddie
Isaku Yamahata wrote:
 Dong, Eddie wrote:
 I guess we are talking in different angle which hide the real
 issues. We
 have multiple alternaitves:
 1:  pv_ops
 2: pv_ops + binary patching to convert those indirect function call
 to direct function call like in X86
 3: pure binary patching
 
 For community,
 #1 need many effort like Jeremy spent in X86 side, it could last for
 6-12 months, #2 is based on #1, the additional effort is very small,
 probably 2-4 weeks. #3 is not pv_ops, it may need 2-3 months effort.
 
 Per my understanding to previous Yamahata san's patch, it address
 part of #3 effort. I.e. #A of #3. What I want to suggest is #2.
 
 Hmm, by pv_ops you mean a set of functions which are grouped, right?
 My current implementation does
 #define ia64_fc(addr)   paravirt_fc(addr)
 ...
 But do you want to make them indirect call?
 i.e. something like
 #define ia64_fc(addr)   pv_ops->fc(addr)

That is what X86 pv_ops does, for example in the following pv_ops
wrapper for the halt instruction in X86.

static inline void halt(void)
{
PVOP_VCALL0(pv_irq_ops.safe_halt);
}

The key issue is that the current approach (putting the native
instruction in by default) always needs binary patching, while pv_ops
doesn't assume that.
In X86, only cli/sti/iret & "sti; sysexit" are patch sites.
Xen/X86 patches cli/sti from an indirect call to a direct call.
Anyway, whether to patch or not depends entirely on the hypervisor
itself.

 
 
 With pv_ops, all those instruction both in A/B/C are already
 replaced by source level pv_ops code, so no binary patching is
 needed. The only patching needed in #2 is to convert indirect
 function call to direct function call for some hot APIs, for example
 X86 does for cli/sti. The majority of pv_ops are not patched. 
 
 So basically #2 & #3 approach is kind of conflict, and we probably
 need to decide which way to go earlier.
 
 It's not difficult to make #A of #3 to #A of #2.


Yes, but the issue is that pv_ops based patching is purely optional,
while this patch makes it permanent. And the 3-5 cycles saved by
patching are too small to matter for a huge C function; that should be
addressed at a later stage.

Looking at the X86 code, only arch/x86/kernel/entry_32.S may be
patched, such as sysenter_entry(). None of the C code is.

For IA64, it is the IVT code if we take the same policy as X86, i.e.
#C is critical and may need patching. (Note: patching or not is still
optional even for #C.)

 (At least for making the current implementation into #A of #2,
 but it requires more work and performance degrade.)
 However I don't see any advantage #A of #2 than #A of #3.

We don't need #A at all.

 If it is necessary to call some other function for #A of #3,
 it is possible to rewrite instructions into something like
 
 mov reg = 1f
 br target25 (relocation is necessary)
 1:
 
 So left issues are how many instructions (or bundles) should be
 reserved for each operations and what is their calling convention.
 Although currently I put instructions for native as default case,
 you can put the above sequence if you desire.

The issue is that we don't have clobber registers here. That is
why I say pv_ops for ASM is the key challenge, and we need to change
the IVT code a lot to get clobber registers. That is why adding pv_ops
support is a big challenge, while patching or not is not that
difficult. Even if we want to patch it, we need to get the pv_ops code
done and then do the optimization.

For the C based code we can always use scratch registers.

 Given that #A of #2 is for performance critical path, so that
 not using usual stacked calling convension would be acceptable.
 As you already proposed, PAL static calling convention is a candidate.

Not at all. #A is already in C code, and thus memory is available,
so the C calling convention can be applied seamlessly.

A PAL-like convention is only for the IVT code.

 
 However I don't see any advantage to switch from the current
 convention (using r8, r9...) for #A at this moment.

I don't object here either. I just leave the question open and let
the Linux guys decide.
To stay neutral, I wouldn't let a Xen specific implementation influence
the pv_ops design, since the latter is hypervisor neutral. But I can
accept either.

 It is necessary to discuss with linux-ia64 people to see if it's
 acceptable or not. If we found it necessary to change the convention,
 it wouldn't be so difficult to do so. But it should be after
 discussion with linux-ia64. Not now.
 
 

Actually I don't oppose binary patching, but my point is that we can't
assume patching is a must for every hypervisor. Leaving the code
as native by default enforces this assumption.
Also, I think we should get pv_ops done first and then do the
optimization (patching); the reverse sequence will just mean more
effort for the whole community. Once we get pv_ops done, the framework
used in this patch can be extended to that code base and we can
decide which ones need patching.

Per my understanding to this patch, I think the 90% effort is forward

RE: [Xen-ia64-devel] Question about migration

2008-02-05 Thread Dong, Eddie
 
   The kernel guarantees applications only see time move forward, even
 across multiple CPUs.  See: 
 
 kernel/timer.c:time_interpolator_get_counter()
 
 We never return a time before last_cycle unless booted with the
 nojitter options. 
 

I echo this too.
I was told some time ago that the crystal used in IPF platforms is
usually more expensive than in other platforms, and thus much more
accurate.
Normally the small difference won't cause an application to see a
backward ITC value, but live migration under the current Xen time
virtualization policy is another story. It could be a headache :(

Hopefully some time later, with Tukwila, we can live with hybrid
virtualization, so the problem gets solved by the HW trapping
application ITC reads :)
Or, if some platform has a high resolution platform time, we can
restore the physical ITC at VP switch time. (platform_time + per VP
offset)
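
To make the "platform_time + per VP offset" remark a bit more concrete,
here is a small C sketch that combines it with the never-go-backward
rule quoted above. It is an assumption-laden toy, not Xen's time
virtualization policy and not the kernel's time interpolator: the
vp_time struct, guest_itc_read() and vp_time_migrate() are invented
names, and a real version would need atomic updates when several CPUs
can read at once.

```c
#include <stdint.h>

/* Hypothetical per-VP bookkeeping. */
struct vp_time {
    int64_t  offset;     /* added to the host's platform time */
    uint64_t last_seen;  /* largest value handed to the guest so far */
};

/*
 * Guest-visible counter: platform time plus the per-VP offset, clamped
 * so that the guest never observes a value lower than an earlier one.
 */
static uint64_t guest_itc_read(struct vp_time *vp, uint64_t platform_time)
{
    uint64_t now = platform_time + (uint64_t)vp->offset;
    if (now < vp->last_seen)
        now = vp->last_seen;
    vp->last_seen = now;
    return now;
}

/*
 * After moving to a new host, pick the offset so that the counter
 * continues from the last value the guest saw.
 */
static void vp_time_migrate(struct vp_time *vp, uint64_t new_platform_time)
{
    vp->offset = (int64_t)(vp->last_seen - new_platform_time);
}
```

The clamp hides small backward steps, while recomputing the offset at
migration keeps the guest counter continuous even when the two hosts'
platform times are unrelated.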

Eddie



RE: [Xen-ia64-devel] paravirt_ops and its alternatives

2008-02-05 Thread Dong, Eddie
Kouya Shimura wrote:
 Dong, Eddie writes:
  3: irq chip paravirt_ops, xen irq chip or vSAPIC?
 
 Is xen irqchip really necessary?

The X86 side already pushed the xen irq chip upstream, so I think
it should be easy to do the same thing on the IPF side.

 
 In current PV implementation, an evtchn interrupt is injected and
 reflected directly to a guest OS. 
 See reflect_event()@xen-unstable.hg/xen/arch/ia64/xen/faults.c and
 [EMAIL PROTECTED]/arch/ia64/xen/xenivt.c 
 

Yes, this is the xen irq chip. It should have better performance than
vSAPIC.
We need to redo this based on the upstream xen irq chip code, plus
debugging; it is more work than vSAPIC, but I would love to see us go
in this direction, since x86 already pushed it.

 There is no intermediate layer there.
 I think that the same mechanism can work in paravirt_ops.
 
 Perhaps I might misunderstand something. :-)
 

Just a difference in terms :) Basically we are talking about the same thing.

thx, eddie



RE: [Xen-ia64-devel] paravirt_ops and its alternatives

2008-02-05 Thread Dong, Eddie
Isaku Yamahata wrote:

 Dual compile could be a good approach. Another alternative will be
 X86 pv_ops like dynamic binary patching per compile time hints. The
 later method also uses macro for those different path, but this
 macro will generate a special section including the information for
 runtime patch base on pv_ops hook. (some kind of similar to
 Yamahata's binary patch method though it addresses C side code)
 
 With dynamic pv_ops patch, the native machine side will again see
 original single instruction + some nop. So I guess the performance
 lose will be very minor. Both approach is very similar and could be
 left to Linux community's favorite in future :)
 
 Actually we already adopted the dual compilation approach for the gate page.
 See gate.S in xenLinux arch/ia64/kernel/gate.S and Makefile.
 I'm guessing that dual compiling approach is easier than binary
 patching approach because some modification of xenivt.S doesn't
 correspond to single instruction. Yes I agree that we can go for
 either way according to upstream favor.  

Yes, it is there already.
When we implement pv_ops, I assume we define the APIs and ask
future kernel patches to follow them too (not conflict with those
APIs). So we have to define clear clobber registers in those MACROs,
and then modify much of the original Linux IVT code to provide those
clobber registers effectively. Current XenLinux provides one solution,
but I see 2 issues:
1: The coding style is not as good as the original IVT code.
For example:
#ifdef CONFIG_XEN
mov r24=r8
mov r8=r18
;;
(p10)   XEN_HYPER_ITC_I
;;
(p11)   XEN_HYPER_ITC_D
;;
mov r8=r24
;;
#else
Saving/restoring R8 like this in each replacement (MACRO)
is not well tuned. We probably need a big IVT code change
to avoid the frequent save/restore in each MACRO.

This needs many effort. Of course taking shortcut before

into upstream.

2: We are not using the function pointers which pv_ops wants.
But this can be avoided if we use dual IVT. That is a very high
level kind of pv_ops (the hypervisor provides the whole IVT table),
not what normal pv_ops addresses (a low level instruction API). But
anyway, I love the idea too, if the upstream guys like it too.

 
 
 Another problem I want to raise is about pv_ops calling convention.
 Unlike X86 where stack is mostly available, IPF ASM code such as
 IVT entrance doesn't invoke stack, so I think we have to define
 static registers only pv_ops & stacked registers pv_ops like PAL.
 
 With respect to hypervisor ABI, we have already differentiate them.
 ia64 specific HYPERPRIVOPS as static registers convention and
 normal xen hypercall as stacked registers convention.

Yes, hyperpriv is doing something similar, so I think people won't
resist this much.

 
 
 For most ASM code (ivt), it has to use static-registers-only pv_ops.
 We need to carefully define the clobber registers used and do
 manual modification to the Linux ivt.S. Dual IVT table or binary
 patching is preferred for performance.
 
 Stacked register pv_ops could follow C convention and it is less
 performance critical, so probably no need to do dynamic patching.
 
 I'm guessing one important exception is masking/unmasking interrupts.
 i.e. ssm/rsm psr.i. Anyway we will see during our merge effort.

If it is called from C, I wouldn't say it is critical, because it is a
slow path in the native OS too.
But later on, we can add more after the Linux community takes it.
This is the advantage of pv_ops, going back to when people argued about
ABI-level abstraction versus API-level abstraction at the very beginning,
when VMware raised their VMI spec.
The API approach allows ongoing improvement :)

Thx, eddie



RE: [Xen-ia64-devel] paravirt_ops and its alternatives

2008-02-05 Thread Dong, Eddie
Isaku Yamahata wrote:
 On Tue, Feb 05, 2008 at 10:17:10PM +0800, Dong, Eddie wrote:
  1: The coding style is not as good as original IVT code.
 
 I have to agree with you here.
 
 
  For example:
 #ifdef CONFIG_XEN
 mov r24=r8
 mov r8=r18
 ;;
 (p10)   XEN_HYPER_ITC_I
 ;;
 (p11)   XEN_HYPER_ITC_D
 ;;
 mov r8=r24
 ;;
 #else
  This kind of save/restore of r8 in each replacement (macro)
  is not well tuned. We probably need a big IVT code change
  to avoid the frequent save/restore in each macro.
 
  This needs a lot of effort. Of course taking shortcut before
 
 into upstream.
 
 Yes, such register value save/restore is suboptimal.

Another question from me is why we use r8/r9 for in/out parameters
in the Xen static hypercall. This forces us to save/restore r8/r9
using bank 0 registers. The static PAL call doesn't use r8/r9, so should we?
Especially since pv_ops itself is Xen neutral.


 I'm guessing such overhead is relatively small compared to the
 hyperprivops overhead which issues break instruction. 

Yes, the overhead is mostly unobservable; it is mainly a coding style or
code quality concern. I assume the Linux guys are much more paranoid in
pursuing the best.

 So presumably for reducing such overhead, it is necessary to replace
 those break instructions with fast hyperprivops using gate page. Such
 optimization would be the next step after upstream merge though. 

Yes, this could be a future effort. Actually this is not pv_ops work,
but xen wrapper work.

Let me create another thread to discuss compile-time dual IVT table vs.
single IVT table.
thx, eddie



[Xen-ia64-devel] per hypervisor IVT table vs. global IVT table

2008-02-05 Thread Dong, Eddie
All:
If we use a single IVT table, the pv_ops code will look like:

ALT0:
breg0 = pv_ops base
breg0 += offset for this pv_ops
breg1 = return address;
br  breg0.  /* pv_ops clobbered breg0/breg1 */

That means we have to use 2 branch registers as clobber registers.

Or we can use a technique like the X86 hypercall page and copy those
hooks to a common page to avoid breg0. This makes ALT0 the same as
ALT1 below.

If we use a per-hypervisor IVT table at compile time, we could do:

ALT1:
#define ASM_READ_IVR
#if XEN
breg1 = return address
br pv_ops_api_readivr
#endif
When pv_ops_api_readivr is hooked, it runs read_ivr_code.

Or we can just do:

ALT2:
#define ASM_READ_IVR
#ifdef XEN
read_ivr_code
#endif

ALT1 is more like X86 pv_ops, where some initialization code installs
the hooks. ALT2 saves an additional branch register, and thus probably
needs less change to the Linux IVT code.
With the former approach, binary patching can patch the ALT1 code back
to the ALT2 solution to avoid the indirect call cost, if we follow the
same approach as X86.
Which ALT should we pursue first?
thx, eddie
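
To make the trade-off between ALT1 and ALT2 concrete, here is a minimal user-space C sketch, not the actual IVT assembly. All names (pv_cpu_ops, native_read_ivr, xen_read_ivr, pv_init_xen) and the returned values are hypothetical placeholders. ALT1 goes through a hook that init code fills in; ALT2 selects the implementation at compile time and needs no indirection.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for "read the interrupt vector register". */
static uint64_t native_read_ivr(void) { return 0x30; } /* pretend: mov from cr.ivr */
static uint64_t xen_read_ivr(void)    { return 0xe9; } /* pretend: hyperprivop     */

/* ALT1 style: a hook table filled in by initialization code.
 * Every use costs an indirect call (an extra branch register on ia64). */
struct pv_cpu_ops {
    uint64_t (*read_ivr)(void);
};

static struct pv_cpu_ops pv_cpu_ops = { .read_ivr = native_read_ivr };

static void pv_init_xen(void)
{
    pv_cpu_ops.read_ivr = xen_read_ivr;
}

static uint64_t alt1_read_ivr(void)
{
    assert(pv_cpu_ops.read_ivr);   /* the hook must be installed */
    return pv_cpu_ops.read_ivr();
}

/* ALT2 style: the implementation is chosen when the file is compiled,
 * so the same source is built once per hypervisor. */
#ifdef XEN
#define ASM_READ_IVR() xen_read_ivr()
#else
#define ASM_READ_IVR() native_read_ivr()
#endif

static uint64_t alt2_read_ivr(void)
{
    return ASM_READ_IVR();
}
```

In this sketch ALT1 can switch behaviour at run time with a single image, while ALT2 needs one compiled copy per target but pays nothing per call.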



RE: [Xen-ia64-devel] paravirt_ops and its alternatives

2008-02-04 Thread Dong, Eddie
Yang, Fred wrote:
 Alex Williamson wrote:
 On Mon, 2008-02-04 at 09:53 +0800, Dong, Eddie wrote:
 Yang, Fred wrote:
 Dong, Eddie wrote:
 Re-post it to warmup discussion in case people can't read PPT
 format,
 
 IVT is very performance sensitive for the native Linux, how about
 dual IVT tables alternative for CPU virtualization?  It would need
 maintenance effort but it would be much cleaner for the IA64 situation.
 -Fred
 
 Dual IVT table could be a nightmare for Tony, I guess. But yes we
 need to have more active discussion to kick it off.
 
Yes, two separate IVTs with 95+% of the code being the same would
 not be ideal.  I think we should aim for a single ivt.S that gets
 compiled a couple times with different options, once for native and
 again for each virtualization option.  It looks like more than half
 of the changes in xenivt.S could be easily converted to macros that
 could be switched by compile options.  Perhaps a pattern will emerge
 for the rest.
 If it is not necessary to stick with a single image and determine the
 code path at runtime, then macros with multiple compile passes that
 generate different PV or native images can possibly work. -Fred

Dual compile could be a good approach. Another alternative would be X86
pv_ops-like dynamic binary patching driven by compile-time hints. The latter
method also uses macros for the differing paths, but each macro
generates an entry in a special section holding the information for
runtime patching based on the pv_ops hook. (This is somewhat similar to
Yamahata's binary patch method, though that one addresses C-side code.)

With dynamic pv_ops patching, the native machine side will again see the
original single instruction plus some nops. So I guess the performance loss
will be very minor. Both approaches are very similar, and the choice could
be left to the Linux community's preference in future :)
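
As an illustration of "original single instruction + some nop", here is a small C sketch of a patch-site table applied to a byte buffer. It is a toy model under stated assumptions: struct patch_site, NOP_BYTE and apply_native_patches are hypothetical names, and real ia64 code is made of 128-bit bundles rather than bytes, so an actual patcher would work at slot or bundle granularity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOP_BYTE 0x00 /* hypothetical nop encoding for this toy byte stream */

/* One record per patchable site, as a macro could emit into a special section. */
struct patch_site {
    size_t offset;          /* where the replaceable sequence starts     */
    size_t len;             /* room reserved at the site                 */
    const uint8_t *native;  /* native instruction(s) to put back         */
    size_t native_len;      /* must fit into the reserved room           */
};

/* On bare hardware: overwrite each site with the native sequence and
 * pad the remaining room with nops, so no indirect call is left. */
static void apply_native_patches(uint8_t *text, size_t text_len,
                                 const struct patch_site *sites, size_t nsites)
{
    for (size_t i = 0; i < nsites; i++) {
        const struct patch_site *s = &sites[i];

        assert(s->native_len <= s->len);
        assert(s->offset + s->len <= text_len);

        memcpy(text + s->offset, s->native, s->native_len);
        memset(text + s->offset + s->native_len, NOP_BYTE,
               s->len - s->native_len);
    }
}
```

Bytes outside the recorded sites are left untouched, which is what keeps the native path close to the unmodified kernel.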


Another problem I want to raise is the pv_ops calling convention.
Unlike X86, where a stack is mostly available, IPF ASM code such as
the IVT entry doesn't use the stack, so I think we have to define
static-registers-only pv_ops and stacked-registers pv_ops, like PAL.

For most ASM code (ivt), it has to use static-registers-only pv_ops.
We need to carefully define the clobber registers used and do
manual modification to the Linux ivt.S. Dual IVT table or binary
patching is preferred for performance.

Stacked register pv_ops could follow C convention and it is less
performance critical, so probably no need to do dynamic patching.

more comments are welcome:)
Eddie



RE: [Xen-ia64-devel] RFC: Remove dead code?

2008-02-04 Thread Dong, Eddie
Alex Williamson wrote:
 On Fri, 2008-02-01 at 13:16 +0800, Dong, Eddie wrote:
 The following code is not used anymore, and I think it is also a
 legacy issue when people are debugging on SKI simulator.
 Should we simplify it?
 thx, eddie
 
Yes, I haven't run Xen on ski for a long time, and it's probably
 cruft that we don't want to push upstream.  Please send a sign-off
 for the below and I'll apply it.  Thanks,  
 
   Alex
 
Thanks, here it is.
Eddie




Remove the ski simulator related code, since it dates from the early
Xen development stage and is no longer necessary.

Signed-off-by: YaoZu (Eddie) Dong [EMAIL PROTECTED]


diff -r 0e62beb4c36a arch/ia64/xen/Makefile
--- a/arch/ia64/xen/Makefile  Fri Feb 01 09:33:32 2008 +0800
+++ b/arch/ia64/xen/Makefile  Fri Feb 01 12:59:24 2008 +0800
@@ -2,7 +2,7 @@
 # Makefile for Xen components
 #
 
-obj-y := hypercall.o xenivt.o xenentry.o xensetup.o xenpal.o xenhpski.o \
+obj-y := hypercall.o xenivt.o xenentry.o xensetup.o xenpal.o \
 hypervisor.o util.o xencomm.o xcom_hcall.o \
 xcom_privcmd.o xen_dma.o
 
diff -r 0e62beb4c36a arch/ia64/xen/xenhpski.c
--- a/arch/ia64/xen/xenhpski.c  Fri Feb 01 09:33:32 2008 +0800
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,19 +0,0 @@
-#include <linux/kernel.h>
-#include <asm/hypervisor.h>
-
-int
-running_on_sim(void)
-{
-   int i;
-   long cpuid[6];
-
-   for (i = 0; i < 5; ++i)
-       cpuid[i] = xen_get_cpuid(i);
-   if ((cpuid[0] & 0xff) != 'H') return 0;
-   if ((cpuid[3] & 0xff) != 0x4) return 0;
-   if (((cpuid[3] >> 8) & 0xff) != 0x0) return 0;
-   if (((cpuid[3] >> 16) & 0xff) != 0x0) return 0;
-   if (((cpuid[3] >> 24) & 0x7) != 0x7) return 0;
-   return 1;
-}
-
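
For readers who want to see what the removed running_on_sim() test amounts to, here is a stand-alone C sketch of the same byte-field checks on an array of CPUID register values. The function name cpuid_looks_like_ski is hypothetical, the values are passed in instead of being read with xen_get_cpuid(), and the sketch makes no claim about what each field means beyond the comparisons in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when the five CPUID values match the pattern the removed
 * code treated as "running on the simulator", 0 otherwise. */
static int cpuid_looks_like_ski(const uint64_t cpuid[5])
{
    assert(cpuid != NULL);

    if ((cpuid[0] & 0xff) != 'H')          return 0; /* low byte of CPUID[0]  */
    if ((cpuid[3] & 0xff) != 0x4)          return 0; /* bits 7..0 of CPUID[3] */
    if (((cpuid[3] >> 8) & 0xff) != 0x0)   return 0; /* bits 15..8            */
    if (((cpuid[3] >> 16) & 0xff) != 0x0)  return 0; /* bits 23..16           */
    if (((cpuid[3] >> 24) & 0x7) != 0x7)   return 0; /* bits 26..24           */
    return 1;
}
```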



RE: [Xen-ia64-devel] paravirt_ops and its alternatives

2008-02-03 Thread Dong, Eddie
Yang, Fred wrote:
 Dong, Eddie wrote:
 Re-post it to warmup discussion in case people can't read PPT format,
 
 IVT is very performance sensitive for the native Linux, how about
 dual IVT tables alternative for CPU virtualization?  It would need
 maintenance effort but it would be much cleaner for the IA64 situation.
 -Fred  

Dual IVT table could be a nightmare for Tony, I guess. But yes we
need to have more active discussion to kick it off.

Tony:
I think this discussion shouldn't exclude the IA64 Linux
community (at least its active members). Would you like us to post
this kind of discussion to the IA64 community? Or do you have a list of
the people who are most interested?
I want to kick off some high-level discussion.
thx, eddie



[Xen-ia64-devel] RFC: Remove dead code?

2008-01-31 Thread Dong, Eddie
The following code is not used anymore, and I think it is a leftover
from when people were debugging on the SKI simulator.
Should we remove it?
thx, eddie



diff -r 0e62beb4c36a arch/ia64/xen/Makefile
--- a/arch/ia64/xen/Makefile  Fri Feb 01 09:33:32 2008 +0800
+++ b/arch/ia64/xen/Makefile  Fri Feb 01 12:59:24 2008 +0800
@@ -2,7 +2,7 @@
 # Makefile for Xen components
 #
 
-obj-y := hypercall.o xenivt.o xenentry.o xensetup.o xenpal.o xenhpski.o \
+obj-y := hypercall.o xenivt.o xenentry.o xensetup.o xenpal.o \
 hypervisor.o util.o xencomm.o xcom_hcall.o \
 xcom_privcmd.o xen_dma.o
 
diff -r 0e62beb4c36a arch/ia64/xen/xenhpski.c
--- a/arch/ia64/xen/xenhpski.c  Fri Feb 01 09:33:32 2008 +0800
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,19 +0,0 @@
-#include <linux/kernel.h>
-#include <asm/hypervisor.h>
-
-int
-running_on_sim(void)
-{
-   int i;
-   long cpuid[6];
-
-   for (i = 0; i < 5; ++i)
-       cpuid[i] = xen_get_cpuid(i);
-   if ((cpuid[0] & 0xff) != 'H') return 0;
-   if ((cpuid[3] & 0xff) != 0x4) return 0;
-   if (((cpuid[3] >> 8) & 0xff) != 0x0) return 0;
-   if (((cpuid[3] >> 16) & 0xff) != 0x0) return 0;
-   if (((cpuid[3] >> 24) & 0x7) != 0x7) return 0;
-   return 1;
-}
-



RE: [Xen-ia64-devel] paravirt_ops and its alternatives

2008-01-30 Thread Dong, Eddie
Isaku Yamahata wrote:
 On Thu, Jan 31, 2008 at 08:21:51AM +0800, Dong, Eddie wrote:
 Alex  All:
 
 Hi Eddie.
 First I'd like to make one thing clear. The goal is to merge the
 xenLinux/ia64 modifications into the upstream kernel, and hence reduce
 maintenance cost etc. And you want to discuss how to do it. Is this
 correct?

Mmm, kind of a mix. I want to know the gap, and then see how we
can cross it :)

 
 Now I'm forward porting the domU code to 2.6.24. (In fact to 2.6.24-rc,
 but I'm going to rebase to the 2.6.24 release.) I haven't got it working
 yet.

This work is still useful. If we decide to go with paravirt_ops
eventually, we can refer to what you have in the 2.6.24 kernel and replace
it piece by piece with paravirt_ops, which will reduce the debug effort
on the latest kernel.

 I'm planning to post it as a single jumbo patch once I get it work.
 
 To make our collaboration effective, we should have some kind of
 repository for that purpose. What kind of repository is best? 
 Considering upstream merge, having our modification as patch queues
 might be easy. But should we also have git or hg repo to track our
 change?  

Yes, Alex will work on this I think.

 
 
  Here is a gap analysis for paravirt_ops, can you all comment?
  In summary we have 4 categories of jobs:
  1: CPU paravirt_ops including MMU, timer and interrupt
  2: Xen hooks
  3: irq chip paravirt_ops, xen irq chip or vSAPIC?
  4: dma for driver domain
 
  My understanding is that the effort is almost similar for each part,
 while all the various alternatives such as pre-virtualization, binary
 patching (privify) or even unmodified Linux as dom0 only save part of
 the #1 effort, which means less than 25% effort saving. Do we really want
 a temporary solution for a saving of under 25%?
  So I would suggest we go with paravirt_ops, which is the Linux
 community direction, to avoid resource fragmentation.
  The writeup is very much a draft and I am planning to spend more time
 on investigation; comments are welcome.
 
 As you probably know, Linux/ia64 already has the machine vector
 framework, so that much basic functionality like the dma api is called
 indirectly. So it would be wise to utilize machine vectors at first, and
 in fact we have already defined a xen machine vector, which is due to Alex
 Williamson. If there were something unsuited to machine vectors,
 then we could introduce pv_ops. Anyway this is only an
 implementation detail and a matter of what we call it.
 Conceptually they are the same.
 
 About CPU virtualization.
 Last year I wrote a patch which does binary patching like x86
 paravirt_alt, and I called it the paravirt_alt patch.
 But I'm not sure about paravirtualized hand-written assembly code.
 I'm afraid Linux people may dislike such code duplication.
 Yes, it's possible to use the binary patch technique somehow; however, it
 inevitably makes the hand-written assembly code less readable to
 some extent.

Yes. One issue in my mind is that binary patching can't solve the
high-level virtualization issues in Xen today, such as the xen irqchip and
dma patches. It could work for domU, but it is very difficult for dom0 or a
driver domain without these.

The X86 side has the xen irqchip in Linux upstream today, so it should be ok
for us to reuse it, since XenLinux-IA64 was the same as X86 in this area
before paravirt_ops.

The Redhat guys are working on dma support. I think we can rely on them to
push it upstream, and then we implement the IA64-specific things with the
same concept.

 
 To detect the environment, mov from cpuid can't be used in the PV case
 because it isn't a privileged instruction. In the VT-i environment cpuid
 can be hooked, though.
 Currently we check only the privilege level at which the kernel is running.
 Possibly a more sophisticated way is necessary to allow another pv
 technology.

Actually the paravirt_ops version of X86 Linux doesn't detect this. Instead,
it puts special initialization code in the Linux ELF image and lets the
domain builder find it and start from that point, to make sure it is running
on top of the Xen hypervisor.
For VMI, there is new code in the Linux startup sequence to check
whether the VMI support ROM exists.

For us, we can use the same mechanism right now. Jeremy is working on
dom0 detection support. Anyway, that is not a big issue.

CC Jeremy and Keir in case they are not on this mailing list.
thx, eddie



[Xen-ia64-devel] MINSTATE_PHYS?

2008-01-30 Thread Dong, Eddie
I did a quick grep and found that MINSTATE_PHYS is never defined in xenlinux;
only the Xen MCA code defines it. Did I miss anything?


diff -r 71a415f9179b arch/ia64/xen/xenminstate.h
--- a/arch/ia64/xen/xenminstate.h   Fri Jan 18 14:20:59 2008 -0700
+++ b/arch/ia64/xen/xenminstate.h   Thu Jan 31 15:08:42 2008 +0800
@@ -66,12 +66,6 @@
 # define MINSTATE_GET_CURRENT(reg) mov reg=IA64_KR(CURRENT)
 # define MINSTATE_START_SAVE_MIN   MINSTATE_START_SAVE_MIN_VIRT
 # define MINSTATE_END_SAVE_MIN MINSTATE_END_SAVE_MIN_VIRT
-#endif
-
-#ifdef MINSTATE_PHYS
-# define MINSTATE_GET_CURRENT(reg) mov reg=IA64_KR(CURRENT);; tpa reg=reg
-# define MINSTATE_START_SAVE_MIN   MINSTATE_START_SAVE_MIN_PHYS
-# define MINSTATE_END_SAVE_MIN MINSTATE_END_SAVE_MIN_PHYS
 #endif

 /*



RE: [Xen-ia64-devel] domU address space

2008-01-25 Thread Dong, Eddie
Do we have checks when inserting a guest TLB entry for a PV domain? It seems
not. If a guest in domU inserts a TLB entry with a hypervisor VA, the TLB
entry on the machine side may be misused by the hypervisor. It should be
fixable :)
Eddie
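
A check of this kind could look roughly like the C sketch below. It is only an illustration: HV_VA_START, HV_VA_END and guest_tlb_insert_allowed are hypothetical names, and the address range is a placeholder, not the real Xen/ia64 layout.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder range reserved for the hypervisor (end is exclusive). */
#define HV_VA_START 0xf000000000000000ULL
#define HV_VA_END   0xf100000000000000ULL

/* Returns 1 if a guest translation for the page containing 'va'
 * (page size 2^page_shift) stays outside the hypervisor range. */
static int guest_tlb_insert_allowed(uint64_t va, unsigned page_shift)
{
    assert(page_shift >= 12 && page_shift < 64);

    uint64_t size  = 1ULL << page_shift;
    uint64_t start = va & ~(size - 1);   /* align down to the page      */
    uint64_t last  = start + (size - 1); /* cannot wrap: start is aligned */

    return last < HV_VA_START || start >= HV_VA_END;
}
```

A rejected insert would then be reflected to the guest as a fault rather than installed in the machine TLB.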

Kouya Shimura wrote:
 In addition, It seems that PV domain can use an unimplemented VA
 address except xen area. 
 Ideally xen should check it and reflect the unimplemented address
 fault to the guest. But it sounds overkill. 
 
 Isaku Yamahata writes:
 On Thu, Jan 24, 2008 at 09:28:39AM +0800, Dong, Eddie wrote:
 Alex  All:
 First of all, please forgive me: I was away from Xen/IA64 for quite a
 long time, and I haven't fully caught up yet.
 In the very early days of Xen/IA64, as I remember, the address
 isolation between the guest (domU) and the hypervisor was not solved.
 Although the guest PAL can report fewer VA bits, it just assumed the pv
 guest won't touch the hypervisor address space, i.e. that it will strictly
 follow the VA address bits reported by PAL. Is this solved now?
 
 Yes. (Possibly there might be bugs, though.) In the paravirtualized
 domain case, the PV domain runs in ring 2 (or ring 1, depending
 on the compile-time configuration), and the xen area is protected
 by privilege level. In the VTi domain case, it's protected by psr.vm =
 1.
 
 --
 yamahata
 




RE: [Xen-ia64-devel] Time for hybrid virtualization?

2008-01-21 Thread Dong, Eddie
[EMAIL PROTECTED] wrote:
 Quoting Dong, Eddie [EMAIL PROTECTED]:
 Not sure if anybody ever tried to run Xen/IA64 VMM in Xen/IA64 HVM
 guest? It may not be already there, but looks like not that far.
 
 I tried in the past, but it doesn't work out of the box.  You of
 course can't run bare Xen within VTi, but a modified version can run. 

Why?  The goal of hardware virtualization is to provide architecturally
equivalent virtualization. We may see the gap.

 
 It may be used to solve the compatibility issues if we move the root VMM
 to a new VT-i dom0 based solution.
 
 But performances are not good.

Yes, we need to see the gap too, but I think the additional degradation
won't be big.

Eddie




RE: single software TLB vs. multiple software TLBs (was RE: [Xen-ia64-devel] [PATCH] [Resend]Enable hash vtlb)

2006-04-13 Thread Dong, Eddie
Alex:
 It is much clearer now that multiple software TLB support is a functional
must, for at least the following reasons:
1: We should merge the VTI and para domain vMMU code together to reduce
future maintenance effort. In that case multiple software TLB support for MMIO
is a must for the VTI domain and thus a must for Xen/IA64.
2: A para domain needs to capture guest MMIO accesses too, in case a native
driver in the guest detects its existence. We definitely should not crash the
system in that case, so multiple software TLB support is a must there too.
3: Multiple software TLBs provide flexibility for guest huge TLB
translations.

Anthony's patch is ready to support all of the above as a functionally complete
solution, and so far I haven't seen anybody object to multiple software TLB
support. Can you check it in now as a build option? The performance difference
of 1-2 percent should be a secondary consideration for now.

Thanks very much!
Eddie


Tristan Gingold wrote:
 Le Mercredi 12 Avril 2006 16:51, Dong, Eddie a écrit :
 Tristan Gingold wrote:
 At every tlb miss time, you can get guest translation from software
 TLB (not from VHPT). You actually don't need to care about VHPT
 entries no matter it is all there or nothing there.
 Ok, it is clear now.
 
 Tristan.



RE: [Xen-ia64-devel] [PATCH][RFC][TAKE4] the P2M/VP patches

2006-04-12 Thread Dong, Eddie
Congratulations! 
That is why Kevin and I advocated many times before to suggest p2m
translation (p!=m) :-)
Can we also share the free beer?
Eddie

Magenheimer, Dan (HP Labs Fort Collins) wrote:
 I was also able to get networking working with Isaku's patches
 and Alex's.  Hooray!  For the last eight months, I have gulped
 as I told people that Xen/ia64 doesn't support networking.
 No longer!
 
 Domo arigato, Yamahata-san!  Free beer (or sake) for you at
 the next summit!
 
 Dan
 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf
 Of Williamson, Alex (Linux Kernel Dev)
 Sent: Monday, April 10, 2006 2:08 PM
 To: Isaku Yamahata
 Cc: xen-ia64-devel@lists.xensource.com
 Subject: Re: [Xen-ia64-devel] [PATCH][RFC][TAKE4] the P2M/VP patches
 
 On Mon, 2006-04-10 at 10:51 -0600, Alex Williamson wrote:
 On Fri, 2006-04-07 at 13:16 +0900, Isaku Yamahata wrote:
 
 9512:f5d0a531cb58_dom0_vp_model_xen_part.patch
 
 
I'm having trouble with the legacy VGA memory descriptor section
 of this patch.
 
I managed to get my system booting with the patch below (2-way,
 w/  1GB RAM).  Networking works, yeah!  The main changes here are
 that I removed the fabricated MDT entries describing the legacy VGA
 space, added EFI_ACPI_RECLAIM_MEMORY to the memory types mapped, and
 sorted the
 resulting memory descriptor table.  I also included the hack to avoid
 calling assign_domain_mmio_page() for large MMIO ranges.  Minor nit,
 we're still subtracting IA64_GRANULE_SIZE from the MDT entry for
 conventional memory, but we're not adding in the granule at the end
 of memory like we used to. 
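
The "sorted the resulting memory descriptor table" step can be illustrated in user space with qsort(). This is only a sketch: struct mem_desc is a simplified, hypothetical stand-in for efi_memory_desc_t, and the kernel-side patch uses the in-kernel sort() from linux/sort.h rather than qsort().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for an EFI memory descriptor (hypothetical layout). */
struct mem_desc {
    uint32_t type;
    uint64_t phys_addr;
    uint64_t num_pages;
};

/* Order descriptors by ascending physical start address. */
static int mem_desc_cmp(const void *a, const void *b)
{
    const struct mem_desc *x = a, *y = b;

    if (x->phys_addr > y->phys_addr)
        return 1;
    if (x->phys_addr < y->phys_addr)
        return -1;
    return 0;
}

static void sort_mem_descs(struct mem_desc *tbl, size_t n)
{
    assert(tbl != NULL || n == 0);
    if (n > 1)
        qsort(tbl, n, sizeof(*tbl), mem_desc_cmp);
}
```

Comparing with explicit greater-than and less-than tests, rather than subtracting the addresses, avoids truncating a 64-bit difference into an int.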
 
I also had to make a change to the -xen kernel which is not shown
 here.  The sal_cache_flush_check() appears to be causing us
 some trouble
 again with the P2M/VP patches (MCA'd on my system), so I commented
 out the call to it in arch/ia64/kernel/sal.c:ia64_sal_init().
 Thanks, 
 
  Alex
 
 --
 Alex Williamson HP Linux  Open Source
 Lab 
 
 --- xen/xen/arch/ia64/xen/dom_fw.c   2006-04-10
 13:17:31.0 -0600
 +++ xen/xen/arch/ia64/xen/dom_fw.c   2006-04-10
 13:15:21.0 -0600
 @@ -10,6 +10,7 @@
  #include <asm/pgalloc.h>
 
  #include <linux/efi.h>
 +#include <linux/sort.h>
  #include <asm/io.h>
  #include <asm/pal.h>
  #include <asm/sal.h>
 @@ -600,9 +601,14 @@
      u64 end = start + (md->num_pages << EFI_PAGE_SHIFT);
 
      if (md->type == EFI_MEMORY_MAPPED_IO ||
 -        md->type == EFI_MEMORY_MAPPED_IO_PORT_SPACE)
 +        md->type == EFI_MEMORY_MAPPED_IO_PORT_SPACE) {
 +
 +        if (md->type == EFI_MEMORY_MAPPED_IO &&
 +            ((md->num_pages << EFI_PAGE_SHIFT) > 0x1UL))
 +            return 0;
 +
          paddr = assign_domain_mmio_page(d, start, end - start);
 -    else
 +    } else
          paddr = assign_domain_mach_page(d, start, end - start);
  #else
      paddr = md->phys_addr;
 @@ -610,6 +616,7 @@
 
      BUG_ON(md->type != EFI_RUNTIME_SERVICES_CODE &&
             md->type != EFI_RUNTIME_SERVICES_DATA &&
 +           md->type != EFI_ACPI_RECLAIM_MEMORY &&
             md->type != EFI_MEMORY_MAPPED_IO &&
             md->type != EFI_MEMORY_MAPPED_IO_PORT_SPACE);
 
 @@ -626,6 +633,18 @@
  return 0;
  }
 
 +static int
 +efi_mdt_cmp(const void *a, const void *b)
 +{
 +    const efi_memory_desc_t *x = a, *y = b;
 +
 +    if (x->phys_addr > y->phys_addr)
 +        return 1;
 +    if (x->phys_addr < y->phys_addr)
 +        return -1;
 +    return 0;
 +}
 +
  static struct ia64_boot_param *
  dom_fw_init (struct domain *d, const char *args, int arglen,
 char *fw_mem, int fw_mem_size)
  {
 @@ -834,6 +853,7 @@
  /* simulate 1MB free memory at physical address zero */
 
 MAKE_MD(EFI_LOADER_DATA,EFI_MEMORY_WB,0*MB,1*MB, 0);//XXX  #else
 +#if 0
  //XXX dom0 should use VGA?
  #define VGA_RAM_START   0xb8000
  #define VGA_RAM_END 0xc
 @@ -852,6 +872,7 @@
  pcolour_map_end = pcolour_map + VGA_CMAPSZ * 8;
  MAKE_MD(EFI_LOADER_DATA, EFI_MEMORY_WB, 0 * MB,
 pvga_start, 1);
  MAKE_MD(EFI_LOADER_DATA, EFI_MEMORY_WB,
 pcolour_map_end, 1 * MB, 1);
 +#endif /* 0 */
  #endif
  /* hypercall patches live here, masquerade as
 reserved PAL memory */
 
 MAKE_MD(EFI_PAL_CODE,EFI_MEMORY_WB,HYPERCALL_START,HYPERCALL_END,
  0); @@ -890,6 +911,8 @@ // for ACPI table.
  efi_memmap_walk_type(EFI_RUNTIME_SERVICES_DATA,
   dom_fw_dom0_passthrough, arg);
 +efi_memmap_walk_type(EFI_ACPI_RECLAIM_MEMORY,
 + dom_fw_dom0_passthrough, arg);
  efi_memmap_walk_type(EFI_MEMORY_MAPPED_IO,
   dom_fw_dom0_passthrough, arg);
  efi_memmap_walk_type(EFI_MEMORY_MAPPED_IO_PORT_SPACE,
  @@ -902,8 +925,10 @@ #ifndef CONFIG_XEN_IA64_DOM0_VP
  MAKE_MD(EFI_LOADER_DATA,EFI_MEMORY_WB,0*MB,1*MB, 1);
#else
 +#if 0
  MAKE_MD(EFI_LOADER_DATA,EFI_MEMORY_WB, 0 * MB,
  

[Xen-ia64-devel] vIRQ design brief

2006-03-14 Thread Dong, Eddie
All:
This is the draft design of the IRQ virtualization, comments are
appreciated.
Thx,eddie

Xen/IA64 interrupt virtualization


* Introduction
This document targets xen/ia64 developers and provides a design overview of
interrupt virtualization: what the guest IOSAPIC looks like and how the
machine IOSAPIC is used in the hypervisor.


* Terminology
(Not formal definition, just for better understanding)

PIRQ:  Physical IRQ generated by a partitioned device, vector 0-255 on X86

VIRQ:  Dynamic IRQ that is purely virtual, vector 256-511 on X86

IPI:   Inter processor IRQ
VIPI:

MMIO:  Memory Mapped IO

Event channel:



* Background

How Xen/X86 handles callback and event channel:
  In the Xen environment, a para-guest registers its callback/safe callback entry
with the hypervisor for batch delivery of events to the guest. When a guest has
pending events (shared bitmap), guest execution turns to
the pre-registered callback function (evtchn_do_upcall), just as when an
interrupt happens on a native system. This control transfer can be disabled by
another shared variable, evtchn_upcall_mask. In this way guest software
can disable the upcall when it needs to.
Within evtchn_do_upcall, the events are dispatched, i.e. it calls do_IRQ() or
evtchn_device_upcall().
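
The upcall behaviour described above can be sketched in plain C as follows. This is a single-threaded toy, not the real evtchn_do_upcall: struct shared_info_sketch, do_upcall_sketch and record_event are hypothetical names, only 64 event ports are modelled, and the real code needs atomic bit operations on memory shared with the hypervisor.

```c
#include <assert.h>
#include <stdint.h>

#define NR_EVENTS 64

/* Toy version of the state shared between hypervisor and guest. */
struct shared_info_sketch {
    uint64_t evtchn_pending;      /* one bit per pending event port   */
    uint8_t  evtchn_upcall_mask;  /* non-zero: upcalls are disabled   */
};

typedef void (*event_handler_t)(unsigned port);

/* Simple handler used for demonstration: remember the dispatch order. */
static unsigned handled_log[NR_EVENTS];
static unsigned handled_count;

static void record_event(unsigned port)
{
    if (handled_count < NR_EVENTS)
        handled_log[handled_count++] = port;
}

/* Dispatch every pending event, lowest port first.
 * Returns the number of events handled (0 while masked). */
static unsigned do_upcall_sketch(struct shared_info_sketch *s,
                                 event_handler_t handler)
{
    unsigned n = 0;

    assert(s && handler);
    if (s->evtchn_upcall_mask)
        return 0;

    while (s->evtchn_pending) {
        unsigned port = 0;

        while (!((s->evtchn_pending >> port) & 1))
            port++;
        s->evtchn_pending &= ~(1ULL << port); /* clear before handling */
        handler(port);
        n++;
    }
    return n;
}
```

While the mask is set, pending bits simply accumulate and are delivered in one batch on the next unmasked upcall.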

Current IA64 approach for callback:
  Current Xen/IA64 uses a pseudo physical IRQ to indicate that events are
active, and dispatches them in that pseudo IRQ handler. At the Xen summit we all
agreed to implement the callback/safe callback mechanism to avoid potential
bugs, and Intel is working on that now.

How X86 Xenlinux handles IRQs:
   Guest IRQs, including PIRQ, VIRQ, IPI and the interdomain communication
channel, are all bound to event channels, i.e. they are all carried
by event channels.

   At initialization time, the guest needs to initialize the IO_APIC hardware
based on knowledge presented by firmware, and eventually registers a purely
virtual pirq_type as hw_interrupt_type instead of ioapic_level_type
and ioapic_edge_type.

   At run time, pirq_type is in effect and does purely event channel based
operations. For example, irq_desc->handler->ack (which becomes ack_pirq) masks
the corresponding event channel (no hypercall). irq_desc->handler->end
(which becomes end_pirq) unmasks the corresponding event channel and
may notify xen through a hypercall (PHYSDEVOP_IRQ_UNMASK_NOTIFY)
to call the xen irq_desc->handler->end. The latter may signal EOI
in the hypervisor (for IO_APIC, it is unmask_IO_APIC_irq).

 
Difference between pirq_type and ioapic_level_type/ioapic_edge_type:
 The initialization-time services of these 2 types are similar, i.e.
   startup/shutdown and enable/disable are the same, and both may need to
   access machine resources.
 But the runtime services, i.e. ack/end, are quite different. pirq_type
   mainly accesses event channel related shared memory for mask/unmask, but
   ioapic_level_type/ioapic_edge_type needs to access machine IOSAPIC
   resources. For example, ack_edge_ioapic_irq and ack_edge_ioapic_vector
   need to mask the APIC resource and ack the APIC.
 Another difference is that with the event channel approach, the
   hw_interrupt_type, i.e. pirq_type, works for both level and edge
   triggered IRQs.


When Xen receives PHYSDEVOP_IRQ_UNMASK_NOTIFY (coming from the guest pirq_type.end):
pirq_guest_unmask()
if ( --irq_desc->action->in_flight == 0 ) {
    irq_desc->handler->end();   // EOI
}
Done;



Machine IRQ delivery in Xen/X86
  The code flow of xen IRQ delivery (IRQ belongs to guests):
  A machine IRQ happens -> xen -> do_IRQ() of xen.
irq_desc->handler->ack();   // same as Linux, operates on the real resource
__do_IRQ_guest()
for each bound guest {
    send_guest_pirq();
    irq_desc->action->in_flight++;
}
Done;

   send_guest_pirq():
 Set the pending event channel bit (shared evtchn_pending) for the target
   processor. On an SMP system, when the target processor is running, a
   machine IPI will be sent to it (evtchn_notify).
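
The in_flight accounting that ties delivery to the final EOI can be modelled with a few lines of C. This is a simplified sketch with hypothetical names (struct guest_irq_sketch, deliver_to_guests, guest_unmask_notify); it ignores locking and the actual hardware accesses and only counts how often an EOI would be issued.

```c
#include <assert.h>

/* Toy model of one machine IRQ shared by several guests. */
struct guest_irq_sketch {
    unsigned nr_guests;  /* guests bound to this IRQ                 */
    unsigned in_flight;  /* deliveries not yet acknowledged by guests */
    unsigned eoi_count;  /* how many times the hypervisor issued EOI  */
};

/* Hypervisor side of delivery: one event per bound guest. */
static void deliver_to_guests(struct guest_irq_sketch *irq)
{
    for (unsigned g = 0; g < irq->nr_guests; g++)
        irq->in_flight++;   /* stands for send_guest_pirq() + in_flight++ */
}

/* Hypervisor side of the unmask notification from one guest:
 * only the last outstanding guest triggers the real EOI. */
static void guest_unmask_notify(struct guest_irq_sketch *irq)
{
    assert(irq->in_flight > 0);
    if (--irq->in_flight == 0)
        irq->eoi_count++;   /* stands for irq_desc->handler->end() */
}
```

With two guests sharing a line, the first notification leaves the IRQ pending at the hardware level and only the second one completes it.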

When xen returns to the guest
 Before restore_all_guest, if VCPUINFO_upcall_mask=0,
   i.e. evtchn_upcall_mask = 0, and there is a pending event channel,
   Xen creates a bounce_frame on the guest stack that is similar to an
   exception frame, and guest control then goes to the callback entry.





*Xen/IA64 IRQ virtualization design

1: Hypervisor owns machine IOSAPIC/LSAPIC exclusively.
   This makes IRQ sharing between driver domains much easier as there
is no contention from domains. 

 
2: Machine IRQ delivery in Xen/IA64:
 
   The basic logic is exactly the same as in Linux/IA64:
   An IRQ happens -> IVT+0x3000 -> ia64_handle_irq()
   while (IRQ exists) {
       vector = CR.IVR;
       mask IRQ using TPR;
       __do_IRQ();
       unmask IRQ using TPR;
       issue CR.EOI;
   }
   
   A slight difference is __do_IRQ. Linux calls do_IRQ here, whereas
Xen merges do_IRQ and __do_IRQ together and uses the name
__do_IRQ.   --- Reuse
   
   do_IRQ does the following (API in Xen/arch/x86/irq.c); the code
sequence is the same and is explained in more detail here:

RE: [Xen-ia64-devel] vIOSAPIC and IRQs delivery

2006-03-13 Thread Dong, Eddie
Tristan Gingold wrote:
 To me, it's also likely to imagine a mixed style: multiple IOSAPICs
 provided by the platform with total irq lines less than number of
 interrupt sources. In that case, people may partition interrupt
 domains, but finally still with irq line sharing even within local
 domain if it's electronically wired together. :-)
 I really think this is pure theory.
 
 IRQ lines were shared when hardware was costly: IBM-PC of 1981.
 It is also less performant (we have to check all drivers) and less
 sure: what about badly written drivers or fault tolerance?
 

Do you mean you have changed your mind about supporting shared IRQs?
Previously you agreed to support them in Xen/IA64.

 Now IOSAPIC are cheap.  I don't know any ia64 vendor sharing IRQs on
 PCI bus. Maybe I am wrong, but until now it is true.
 
 Furthermore, PCI-e has no more IRQ lines.  IRQ are now in-band
 messages (MSI). Thus sharing IRQs is not the future too.

Isn't the FUJITSU PRIMEQUEST an example, as mentioned by Kenji Kaneshige?

 
 Tristan.
 
Eddie



RE: [Xen-ia64-devel] Event channel vs current scheme speed [wasvIOSAPIC and IRQs delivery]

2006-03-13 Thread Dong, Eddie
Tristan Gingold wrote:
 Le Jeudi 09 Mars 2006 21:02, Tian, Kevin a écrit :
 Anyway, good discussion by far though still some way to go for
 consensus. :-) 
 
 Maybe we want to look at this from another way - fairness. [...]
 Regarding current model, there seems to be an issue about fairness
 between physical interrupts and xen events. Taking current 0xE9 for
 example, it's lower than timer but higher than all external device
 interrupts. This means xen events will always preempt device
 interrupts in this case, which is unfair and not what we want.
 To my understanding, this is also true for x86.
 With event channel, real physical IRQs use events 0-255, while Xen
 events use events 256-511.
 
 So what is the difference ?

The difference is that with event channels, the IRQ vector itself (PIRQ 0-255
and VIRQ 256-511) doesn't participate in prioritization; the event channel
does. There is a map between event channels and IRQs in evtchn.c.
With the event channel solution, all guest physical IRQs are injected/reflected
through event channels instead of the vLSAPIC. Event channels are a must, as
VBD/VNIF and the Control Panel use them, unless you rewrite all of those, and I
don't think you want to go that way. If the callback itself is built on a
pseudo IRQ (0xE9) in the vLSAPIC, then every event channel has priority 0xE9 in
the vLSAPIC. That is a problem, as some event channels need a higher priority
and some a lower one compared with other guest device IRQs.

So the solution is to eliminate the vLSAPIC and let all guest PIRQs go through
event channels. In the meantime, the callback must go through a so-called
upcall function (callback function).
Hope this answers your question.
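
To illustrate the map between event channels and IRQs mentioned above, here is a minimal sketch. It is not the code in evtchn.c: the table names, the port count and the helper functions are assumptions made up for this example. Only the 0-255 / 256-511 split comes from the discussion.

```c
#include <stddef.h>

#define NR_PIRQS  256                    /* IRQ 0-255: physical IRQs */
#define NR_VIRQS  256                    /* IRQ 256-511: Xen virtual IRQs */
#define NR_IRQS   (NR_PIRQS + NR_VIRQS)
#define NR_PORTS  1024                   /* number of event channel ports (assumed) */

/* Two lookup tables kept in step with each other; -1 means "not bound". */
static int irq_of_port[NR_PORTS];
static int port_of_irq[NR_IRQS];

static void irq_map_init(void)
{
    for (size_t i = 0; i < NR_PORTS; i++)
        irq_of_port[i] = -1;
    for (size_t i = 0; i < NR_IRQS; i++)
        port_of_irq[i] = -1;
}

/* A physical IRQ lives in the lower half of the IRQ number space. */
static int irq_is_physical(int irq)
{
    return irq >= 0 && irq < NR_PIRQS;
}

/* Bind one IRQ to one event channel port; fails if either is in use. */
static int irq_bind_port(int irq, int port)
{
    if (irq < 0 || irq >= NR_IRQS || port < 0 || port >= NR_PORTS)
        return -1;
    if (port_of_irq[irq] != -1 || irq_of_port[port] != -1)
        return -1;
    port_of_irq[irq] = port;
    irq_of_port[port] = irq;
    return 0;
}
```

With such a map, any ordering decision is made on the port side; the IRQ number is only a name the guest uses to find its handler.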

 
 Tristan.

Eddie



RE: [Xen-ia64-devel] [PATCH] [RESEND] domU destroy page ref counter

2006-03-13 Thread Dong, Eddie
Really a good job!
A minor suggestion for the next step: we may add a simple compile option in
the Makefile or some .h file to be able to choose between the 1/3 byte swap
and the 1/2 byte swap. Some people think that the 1/2 byte swap may
have better hash locality.
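
A minimal sketch of what such a compile option could look like, assuming the mangling is a plain exchange of two bytes of the 32-bit value. The macro RID_SWAP_BYTES_1_2 and the function names are hypothetical and do not come from the Xen tree; the assembly paths that do the same swap would need the same switch.

```c
#include <stdint.h>

/* Hypothetical build-time switch: pass -DRID_SWAP_BYTES_1_2 to swap
 * bytes 1 and 2; without it, bytes 1 and 3 are swapped. */
#ifdef RID_SWAP_BYTES_1_2
#define RID_SWAP_OTHER_BYTE 2
#else
#define RID_SWAP_OTHER_BYTE 3
#endif

/* Exchange byte a and byte b of v (byte 0 is the least significant). */
static inline uint32_t swap_bytes(uint32_t v, unsigned a, unsigned b)
{
    uint32_t byte_a = (v >> (8 * a)) & 0xffu;
    uint32_t byte_b = (v >> (8 * b)) & 0xffu;

    v &= ~((0xffu << (8 * a)) | (0xffu << (8 * b)));
    return v | (byte_a << (8 * b)) | (byte_b << (8 * a));
}

/* Mangle a value with the variant selected at compile time. */
static inline uint32_t mangle_rid(uint32_t val)
{
    return swap_bytes(val, 1, RID_SWAP_OTHER_BYTE);
}
```

Because a byte swap is its own inverse, the same function also undoes the mangling, so a benchmark build only has to flip the one macro.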

Eddie.



RE: [Xen-ia64-devel] [PATCH] [RESEND] domU destroy page ref counter

2006-03-13 Thread Dong, Eddie
Yes, I remember your suggestion. That is why I suggest adding a compile
option, so that somebody can start formal benchmark measurements :-)
Eddie

Magenheimer, Dan (HP Labs Fort Collins) wrote:
 Rid mangling change has been discussed many times on the list,
 most recently:
 
 http://lists.xensource.com/archives/html/xen-ia64-devel/2005-11/msg00282.html
 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Xu,
 Anthony Sent: Monday, March 13, 2006 6:37 PM
 To: Dong, Eddie; Tian, Kevin; Akio Takebe; Masaki Kanno;
 xen-ia64-devel@lists.xensource.com
 Subject: RE: [Xen-ia64-devel] [PATCH] [RESEND] domU destroy  page
 ref counter 
 
 From: Dong, Eddie
 Sent: March 13, 2006 22:12
 A minor suggestion for next in my mind is that we may add a simple
 COMPILE option in Makefile or some .h file to be able to choice 1/3
 byte swap or 1/2 byte swap. People has some thoughts that 1/2 byte
 swap may have better hash locality. 
 
 Eddie.
 I second Eddie,
 
 I have some observations about this.
 Usually guest applications use almost the same address space; the only
 difference is the rid. What I observed was that if the lowest 17 bits of
 the rid are the same, the hash address is the same. If we swap bytes 1/3,
 applications that use the same address space but different rids may have
 the same hash address in a majority of situations, which may make some
 collision chains very long.
 
 These are just some observations; I don't mean the 1/2 byte swap is
 better than the 1/3 byte swap. I think we need to add a compile option to
 get benchmark data first, and then make the decision.
 
 It's obviously not a big task, but it deserves doing. One thing we need
 to pay extra attention to is that the rid byte swap is done in assembly
 code in some fast hyperprivops.
 
 Thanks,
 Anthony
 


RE: [Xen-ia64-devel] Event channel vs current scheme speed [was vIOSAPIC and IRQs delivery]

2006-03-10 Thread Dong, Eddie
 I agree the current model has implicit priorities.
 
 But I am a little bit skeptical about the priority argument.  As far as
 I understand, in Xen or in Linux first asked is first priority.

Yes, here we see 3 concerns about priority:
1: When multiple IRQs arrive at the same time, the higher-priority one gets
serviced earlier on native hardware. With Xen event channels, a
higher-priority IRQ can have a higher-priority event channel. So they are
basically the same.

2: A lower-priority IRQ arrives first, followed by a higher-priority IRQ.
In this case, the situation is relatively complicated. If the interval
between them is big enough, yes, the first one asked is serviced first. If the
interval is small, then the later one may preempt the first one. In a virtual
machine environment, time is virtualized, so whichever one is serviced first,
there is no correctness issue (you can think of the virtual interval as
possibly big or possibly small).

3: A higher-priority IRQ arrives first, followed by a lower-priority IRQ.
In this case, the higher IRQ must be serviced earlier than the lower one. The
Xen event channel code searches for the highest-priority IRQ and services it.
At service time, the callback is masked but events can still be set. So a
lower IRQ can't interrupt a higher one. The semantics are guaranteed.
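
Case 3 can be sketched as follows. This is only a model of the behaviour described above, not the Xen implementation: the structure, the per-port priority table and the function names are assumptions made for this example.

```c
#include <stdint.h>

#define NR_PORTS 64

struct evtchn_state {
    uint64_t pending;          /* bit n set: port n has an event waiting */
    int      upcall_masked;    /* nonzero while the guest callback runs */
    uint8_t  prio[NR_PORTS];   /* larger value = more urgent (assumed) */
};

/* An event can always be recorded, even while the callback is masked. */
static void raise_event(struct evtchn_state *s, unsigned port)
{
    s->pending |= UINT64_C(1) << port;
}

/* Most urgent pending port, or -1 if none is pending or the callback is masked. */
static int pick_event(const struct evtchn_state *s)
{
    int best = -1;

    if (s->upcall_masked)
        return -1;
    for (unsigned p = 0; p < NR_PORTS; p++) {
        if (!(s->pending & (UINT64_C(1) << p)))
            continue;
        if (best < 0 || s->prio[p] > s->prio[best])
            best = (int)p;
    }
    return best;
}

/* Start servicing the most urgent event and mask further callbacks. */
static int begin_service(struct evtchn_state *s)
{
    int port = pick_event(s);

    if (port >= 0) {
        s->pending &= ~(UINT64_C(1) << port);
        s->upcall_masked = 1;
    }
    return port;
}

/* Servicing is finished; pending events may be delivered again. */
static void end_service(struct evtchn_state *s)
{
    s->upcall_masked = 0;
}
```

An event raised between begin_service() and end_service() stays in the pending word and is picked up by the next begin_service(), which is the "can still be set but cannot interrupt" behaviour.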

 
 All in all, above long context is just one factor that I view to
 choose the proper mechanism. :-)
 If you are that worried about priorities, we may find a solution in the
 current scheme.  I'd just like to understand why priorities are that
 important. 

These are all corner cases that we must consider for a product, but in
early development we can take a shortcut, like using a pseudo IRQ for the
event channel here, to let the whole project move ahead. And this is what we
discussed at the Xen Summit: people (Dan, Ian, Keir, Jun) had no objection to
the potential-issue concerns (for example mask/unmask support and the priority
issue) and agreed to take this as a next step. The PPC guys also use a pseudo
physical IRQ for event channels, as I remember. Their community is much
smaller than ours now and their development also lags behind IA64.
This is why we need to clean up now, as the callback-based event channel
approach is already at production stage. Making a new mechanism
carries high risk.
BTW, even with this strong event channel mechanism in Xen, we sometimes
saw bugs on xen-devel, such as a deadlock in a VMX SMP system that Xin
root-caused before the new year. But anyway there are very few now.

Eddie



RE: [Xen-ia64-devel] vIOSAPIC and IRQs delivery

2006-03-08 Thread Dong, Eddie
Tristan:
One more thing: our proposal is to work out an ideal
design for IRQ virtualization. We don't want to spend a lot of time
arguing about each detail here. I have found several limitations in the
previous patch and I'd like to suggest that we work together to get to the
ideal design. Let us complete the event channel based design first; then
people can compare and comment. Will you contribute to that effort too?
Then you can fight between your left hand and right hand, like letting a
Republican debate for the Democrats or vice versa :-)
The current solution (dom0 owns the IOSAPIC and the event channel is built
on a pseudo physical IRQ) can serve us for a while before a well
considered solution comes out.
BTW, please refer to xen-devel: Keir confirmed that io_apic.c on x86
is only used at initialization time; it is no longer necessary at run
time.
See my comments inline too.
thx,eddie



Tristan Gingold wrote:
  The event channel model will in some cases request a real IOSAPIC
 operation based on the type it is bound to. The software stack layer
 is very clear:  1: guest PIRQ (top), 2: event channel (middle),
  3: machine IRQ (bottom).
  BTW, event channel is a pure software design, there is no
 architecture dependency here.
 I don't wholly agree.  The callback entry is written in assembly, and
 seems to have tricks.

The callback is one of the mechanisms that the event channel can be carried
on. Using a pseudo physical IRQ, as in current xen/ia64, is another
alternative.
This is not related to the callback.

 Then let us see where the previous patch need to improve.
 1: IRQ sharing is not supported. This feature, especially for huge
 Iron like Itanium, is a must.
 I agree.  However we won't reach this problem now as device drivers
 do not exist yet.
We are designing this patch to solve the driver domain problem raised at the
Xen Summit.
If not reaching the problem yet is the reason not to do that, why do we need
this change?
Letting dom0 own the IOSAPIC is pretty simple and robust. Remember our goals
are:
1: Support driver domains.
2: Driver domains may share IRQ lines.

 
 2: Sharing the machine IOSAPIC resource among multiple guests introduces
 many dangerous situations. Example:
  If DeviceA in DomX and DeviceB in DomY share IRQn, and DomX
 handles the DeviceA IRQ (IRQn),
  take the example of a function in the patch like mask_irq:
  s1:spin_lock_irqsave(&iosapic_lock, flags);
  s2:xen_iosapic_write () // write RTE to disable the IRQ in this line
  s3:spin_unlock_irqrestore(&iosapic_lock, flags);
  Suppose DomX is switched out at s3, and DeviceB fires an IRQ
 at that time. Due to the
 disable in the RTE, DomY can never respond to the IRQ until DomX gets
 executed again and enables the RTE.
  This doesn't make sense to me.
 Neither for me.
 However my patch do not allow this behavior: once an IRQ is allocated
 by a domain, it can't be modified by another one.  Again I agree this
 is far from perfect and using an in_flight mechanism is better.
No, I don't think in_flight can help with this, as domX is masking the
machine RTE.
The point is that all those real IOSAPIC resources should be owned by Xen: no
partitioning, no sharing.

 
 3: Another major issue is that there is no easy way in future to add
 IRQ sharing support base on that patch. That is why I want to let
 hypervisor own IOSAPIC exclusively, and guest are purely based on
 software mechanism: Event channel. 
 I don't think IRQ sharing requires event channel.  This can also be
 done using current IRQ delivery.

I don't know what "IRQ delivery" means.

 
 4: More new hypercall is introduced and more call to hypervisor.
 Only physdev_op hypercall is added, but it is also used in x86 to set
 up IOAPIC.  You can't avoid it.

Initialization time is OK whatever the approach; runtime is what is critical.
I saw a lot of hypercalls for RTE writes.

 
 Additional calls to the hypervisor are for reading or writing IVR, EOI
 and TPR. I really think this is fast using hyper-privop.
 
 The current ia64 model is well tested too and seems efficient too
 (according to Dan measures).
 
 Yes, Xen/IA64 can say having undergone some level of test although
 domU is still not that stable.
 Maybe because domU do not have pirqs :-)
 
 But vIOSAPIC is totally new for VMs and is not well tested.
 Whatever we do Xen will control IOSAPICs.  For sure my patch is not
 well tested, but simple enough.

We should not take this risk for a patch with a one-month lifecycle.

 
  On the other hand, the event channel based approach is well tested
 in Xen with real deployment by customer.
 Correct, but it won't drag and drop onto ia64.

No, all the code on the para-guest side is Xen common code; you don't need to
drag and drop.
And what is more, the patch based on event channels will be smaller than your
previous patch,
i.e. less modification to xenlinux.
BTW, due to virtual driver support, all the Xen event channel related
files are already
imported into xen/ia64. The evtchn.c file contains all the guest IRQ
virtualization code.
You don't need to add new 

RE: [Xen-ia64-devel] SMP-g design notes

2006-03-08 Thread Dong, Eddie
Tian, Kevin wrote:
 
 Yes, emulation of ptc.ga will be more complex than other emulations.
 However simply talking about the claim above, IPI is a necessity IMO,
 if another vcpu is running on another LP at the emulation. Though we
 may add some lazy flush 
 later.
 
Agreed! I would like to see Xen implement this kind of IPI for global
purge. And I'd like to see a whole design for this issue in the near future
so that we can comment :-)
Eddie



RE: [Xen-ia64-devel] vIOSAPIC and IRQs delivery

2006-03-07 Thread Dong, Eddie
Tristan Gingold wrote:
 On Tuesday 07 March 2006 00:34, Dong, Eddie wrote:
 Magenheimer, Dan (HP Labs Fort Collins) wrote:
 Hi Tristan --
 
 Do you have any more design information?  I'm not very
 familiar with the x86 implementation but is it your intent
 for it to be (nearly) identical?  What would be different?
 
 The difference is whether the guest OS (para Xen) should still access the
 IOSAPIC MMIO port. If the guest OS keeps accessing the machine
 IOSAPIC MMIO address, multiple driver domains sharing the same IRQ is a
 potential problem. The design, in my opinion, is that the hypervisor owns
 the machine IOSAPIC resource exclusively, including reading IVR and
 issuing CR.EOI. All the guests work with a purely virtual IOSAPIC
 or virtual IO_APIC (it actually doesn't matter to the guest).
 [Note that IVR and CR.EOI are LSAPIC stuff.]
So should we use a new term, virtual IRQ or interrupt virtualization?
Both LSAPIC and IOSAPIC need to be handled in vIRQ.
BTW, the RTE is still accessed by the para-guest in the previous patch :-)
A write to the machine RTE from one domain will
impact the correctness of another domain if they share the same
IRQ line.

 
 Would all hardware I/O interrupts require queueing by
 Xen in an event channel?  This seems like it could be
 a potential high overhead performance issue.
 There are two things:
 * delivery of IRQs through event channel.  I am not sure about
 performance impact (should be almost the same).  I am sure about
 linux modification impact (new files added, interrupt low-level
 handling completely modified). 
I don't see many Linux modifications here, as most of these files are
already in Xen.
You can find them if you compile an x86 Xen; see linux/arch/xen/kernel/**. All
the event channel related files are there, including the PIRQ dispatching.
In some sense, the whole IOSAPIC.c file is no longer a must.

 
 * Use of callback for event channel (instead of an IRQ).
   I suppose it should be slightly faster.  I suppose this is required
 (for speed reasons) if we deliver IRQs through event-channel.
 
 Mmm, I have a different opinion here. With all guest physical IRQs
 queued by the Xen event channel through a bitmap that is shared with the
 para-guest, the guest OS no longer needs to access IVR and EOI,
 which means we don't need to trap into the hypervisor. Checking the
 bitmap is definitely faster than reading IVR, so
 performance is actually improved.
 I really think this is not that obvious due to hyper-privop and
 hyper-reflexion.
This is basically the difference between a hypercall and using shared memory.
It is hard to quantify, but the benefit is clear, as this code is
executed frequently, especially
in a driver domain where there are a lot of IRQs.

 Please start (maybe using some mails we have exchanged).  I will
 complete if necessary.
Yes, I have sent you some drafts.
Eddie




RE: [Xen-ia64-devel] vIOSAPIC and IRQs delivery

2006-03-06 Thread Dong, Eddie
Magenheimer, Dan (HP Labs Fort Collins) wrote:
 Hi Tristan --
 
 Do you have any more design information?  I'm not very
 familiar with the x86 implementation but is it your intent
 for it to be (nearly) identical?  What would be different?
The difference is whether the guest OS (para Xen) should still access the
IOSAPIC MMIO port.
If the guest OS keeps accessing the machine IOSAPIC MMIO address,
multiple driver domains sharing the same IRQ is a potential problem. The
design, in my opinion, is that the hypervisor owns the machine IOSAPIC resource
exclusively, including reading IVR and issuing CR.EOI. All the guests
work with a purely virtual IOSAPIC or virtual IO_APIC (it actually doesn't
matter to the guest).

 
 Would all hardware I/O interrupts require queueing by
 Xen in an event channel?  This seems like it could be
 a potential high overhead performance issue.
Mmm, I have a different opinion here. With all guest physical IRQs queued
by the Xen event channel through a bitmap that is shared with the para-guest,
the guest OS no longer needs to access IVR and EOI, which means we don't
need to trap into the hypervisor. Checking the bitmap is definitely faster
than reading IVR, so performance is actually
improved.
In the meantime, we don't need to spend time re-designing the vIOSAPIC;
it could be the same as the x86 vIO_APIC (90%). Definitely somebody needs to
write down the vIO_APIC design :-)
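
The shared bitmap idea can be sketched like this. It is a simplified model, not the Xen shared_info layout: the structure, the single 64-bit pending word and the function names are assumptions for illustration. The point is that the guest only touches shared memory and never reads IVR or writes EOI.

```c
#include <stdatomic.h>
#include <stdint.h>

/* One word of pending bits living in a page shared by hypervisor and guest. */
struct shared_page {
    _Atomic uint64_t evtchn_pending;
};

/* Hypervisor side: mark a port as pending. */
static void post_event(struct shared_page *sp, unsigned port)
{
    atomic_fetch_or(&sp->evtchn_pending, UINT64_C(1) << port);
}

/* Guest side: atomically claim the lowest pending port, or return -1.
 * No trap into the hypervisor is involved. */
static int take_next_port(struct shared_page *sp)
{
    uint64_t old = atomic_load(&sp->evtchn_pending);

    while (old != 0) {
        uint64_t lowest = old & (~old + 1);   /* isolate the lowest set bit */

        if (atomic_compare_exchange_weak(&sp->evtchn_pending, &old,
                                         old & ~lowest)) {
            int port = 0;

            while (!(lowest & 1)) {
                lowest >>= 1;
                port++;
            }
            return port;
        }
        /* The failed compare-exchange reloaded old; try again. */
    }
    return -1;
}
```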

Tristan or I can do that. Tristan?

 
 Perhaps a design document (or at least a few paragraphs)
 would be useful for the developers on the list.
 
Yes.
 Thanks,
 Dan
 
Eddie



RE: [Xen-ia64-devel] CONFIG_DOMAIN0_CONTIGUOUS in domain.c

2006-02-28 Thread Dong, Eddie
Dan:
I guess you misunderstood here.
We definitely need to fix this bug first, as a pre-cleanup patch, because the
#undef path doesn't work. With this patch, everything keeps the same
functionality. It is not necessary to combine it with the VP+DMA patches;
that makes things much more complicated.
Eddie

-Original Message-
From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:[EMAIL PROTECTED] 
Sent: March 1, 2006 7:36
To: Dong, Eddie; Tian, Kevin; Isaku Yamahata
Cc: xen-ia64-devel@lists.xensource.com
Subject: RE: [Xen-ia64-devel] CONFIG_DOMAIN0_CONTIGUOUS in domain.c

This isn't a performance issue.  I don't think domain0/U
will function correctly with CONFIG_...CONTIGUOUS undef'd
until all of Isaku's necessary VP+DMA changes (in Xen,
Xenlinux, Xen drivers, and possibly tools) are complete.

 -Original Message-
 From: Dong, Eddie [mailto:[EMAIL PROTECTED] 
 Sent: Tuesday, February 28, 2006 4:19 PM
 To: Magenheimer, Dan (HP Labs Fort Collins); Tian, Kevin; 
 Isaku Yamahata
 Cc: xen-ia64-devel@lists.xensource.com
 Subject: RE: [Xen-ia64-devel] CONFIG_DOMAIN0_CONTIGUOUS in domain.c
 
 Magenheimer, Dan (HP Labs Fort Collins) wrote:
  to VP.  HOWEVER... it may be possible and desirable
  for much of Isaku's work to support both VP and P==M.
  For non-I/O code, CONFIG_DOMAIN0_CONTIGUOUS could be
  used (or possibly renamed) to select VP or P==M at
  compile-time, at least until the conversion to VP+DMA
  is complete.  This would allow at least some of Isaku's
 Since we will eventually remove this code, adding a compile
 option now is OK IMO. But I think the default should be
 #undef'ed by some pre-cleanup patch now so that people can
 find issues earlier, if there are any.
 #undef'ing this supports both p==m and p!=m, while
 #define'ing it supports only p==m. Yes, maybe we will see a
 0.5% performance degradation with #undef, but this is a
 functional must as we all move toward p!=m :-(
 After the whole p2m/VP patch comes out, we can then do more
 performance tuning :-)
 Eddie
 



RE: [Xen-ia64-devel] SMP guest: first boot

2006-02-27 Thread Dong, Eddie
Congratulations too!
Now more SMP host issues can be found, right? Bravo Tristan!
thx,eddie

Alex Williamson wrote:
 On Mon, 2006-02-27 at 14:01 +0100, Tristan Gingold wrote:
 Hi,
 
 this is my first dom0 boot using 2 cpus:
 
Nice work Tristan!




RE: [Xen-ia64-devel] CONFIG_DOMAIN0_CONTIGUOUS in domain.c

2006-02-27 Thread Dong, Eddie

Isaku Yamahata wrote:
 Hi.
 
 I think that construct_dom0() is broken for CONFIG_DOMAIN0_CONTIGUOUS.
 I had to modify it heavily to boot dom0 with P2M/VP model.
 
 For example
 construct_dom0()
 ...
 memcpy(__va(pinitrd_start),initrd_start,initrd_len);
 This memcpy() assumes p==m.
 
 
 I thought that CONFIG_DOMAIN0_CONTIGUOUS option was introduced
 at the early development stage and it remained just because
 no one removed it. However I don't know its history.
 
Looks like that is true :-)
So, I guess you will provide a patch to remove that stuff as a
pre-cleanup, am I right? I support this, as others can then do more
testing based on it.
thx,eddie



RE: [Xen-ia64-devel] [PATCH] SMP_HOST: Alloc vhpt from domheap

2006-02-26 Thread Dong, Eddie
First of all, Anthony's previous patch is good enough to check in. The memory
used for the VHPT is allocated from the dom heap but doesn't belong to any
specific domain. With a future per-VP VHPT option, yes, we should account this
memory to the domain memory size.

Secondly, yes, as Alex pointed out, we may redesign the xenheap size, but it
looks like we can defer this until later. Something like the x86 m2p
table may be needed on the IA64 side in the future, but I am not sure. If we
need it, it should be in the xenheap, and it is a relatively big memory chunk.
On the other hand, the xenheap is mapped by a single TR to save precious TLB
resources, so we can only choose among 16MB, 64MB and 256MB, which the IA64
architecture supports.
Probably 16MB is too small :-)
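
As a small illustration of that constraint, the sketch below picks the smallest of the three sizes named above that covers a given need. The function name and the idea of passing the need in bytes are assumptions for this example; it says nothing about how Xen actually configures the TR.

```c
#include <stdint.h>

#define MB (UINT64_C(1) << 20)

/* Smallest single-TR mapping size (of the three discussed) that covers
 * needed_bytes, or 0 if even the largest one is too small. */
static uint64_t xenheap_tr_size(uint64_t needed_bytes)
{
    static const uint64_t candidates[] = { 16 * MB, 64 * MB, 256 * MB };

    for (unsigned i = 0; i < sizeof candidates / sizeof candidates[0]; i++) {
        if (needed_bytes <= candidates[i])
            return candidates[i];
    }
    return 0;
}
```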
Eddie



Xu, Anthony wrote:
 From: Isaku Yamahata
 Sent: February 27, 2006 13:18
 struct domain->max_pages is used for two purposes currently.
 a) to account pages allocated for a domain.
   (by xen/common/page_alloc.c)
 b) maximal pseudo physical address.
   (e.g. lookup_domain_mpa() in xen/arch/ia64/domain.c and others)
 
 This patch breaks b). Something needs to be adjusted.
 Maybe it is needed to add a new member to struct arch_domain for b)
 and to compensate max_pages at domain construction.
 
 Good catch!
 domain->max_pages should be the number of memory pages allocated to the
 domain; for instance, if a domain has 512M of memory,
 domain->max_pages should be 512M/16K. VHPTs are allocated from the
 domheap, but not from a designated domain, because the first parameter is
 NULL, so domain->max_pages and domain->tot_pages will not be
 affected. So it seems not to break a) or b). Yes, you can use two variables,
 one representing the domain's memory pages and one the pages used by this
 domain; the latter includes the former.
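
The 512M/16K figure above works out to 32768 pages. A tiny sketch of that arithmetic, assuming 16K pages; the macro value and function name are chosen for this example only:

```c
#include <stdint.h>

#define PAGE_SHIFT 14   /* 16K pages, as in the example above (assumed) */

/* Number of whole pages in a memory size given in bytes. */
static uint64_t bytes_to_pages(uint64_t bytes)
{
    return bytes >> PAGE_SHIFT;
}
```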
 
 
 What do you think about accounting the pages which are used
 for struct arch_domain->mm?
 Please see pgtable_quick_list_alloc() in xen/arch/ia64/xen/xenmisc.c.
 
 It's the same issue as above; it is better for the P2M table to be
 allocated from the domheap with the first parameter NULL instead of from
 the xenheap. Since you are doing the P2M task, you can fix this at the same
 time.
 
 
 Thanks.
 


[Xen-ia64-devel] CONFIG_DOMAIN0_CONTIGUOUS in domain.c

2006-02-26 Thread Dong, Eddie



All:
I am not sure if somebody has tested the path with CONFIG_DOMAIN0_CONTIGUOUS
in xen/arch/ia64/xen/domain.c disabled. As this is a must for the coming
p2m/vp patches, please reply if you have. If not, then it is an urgent task
now to either remove the code or disable the configuration by
default.

Thanks,eddie


RE: [Xen-ia64-devel] SMP guest and itc

2006-02-14 Thread Dong, Eddie
Alex:
Can't exposing a formula, agreed cooperatively between the domain and the
hypervisor, solve the problem?
E.g.: guest_ITC = host_ITC * factor + offset; if the host ITC is
accurate enough.
Or guest_ITC = host_high_precision_timer_count * factor + offset; if
the hypervisor wants to use another high-precision platform timer like HPET.
Implementing these in xenlinux is quite easy IMO.
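
A minimal sketch of how a guest could evaluate such a formula with integer arithmetic only. The structure and field names are assumptions for this example, and the factor is represented as a fixed-point value (mul / 2^shift), which is one possible choice and not something the proposal specifies.

```c
#include <stdint.h>

/* Parameters the hypervisor would publish to the guest (hypothetical layout). */
struct itc_params {
    uint32_t mul;      /* factor = mul / 2^shift */
    uint32_t shift;    /* 0..32 */
    int64_t  offset;   /* added after scaling */
};

/* guest_ITC = host_count * factor + offset, without 128-bit arithmetic. */
static uint64_t guest_itc(const struct itc_params *p, uint64_t host_count)
{
    uint64_t hi = host_count >> 32;
    uint64_t lo = host_count & UINT32_MAX;

    /* (hi * 2^32 + lo) * mul / 2^shift, split so the high part stays exact. */
    uint64_t scaled = ((hi * p->mul) << (32 - p->shift))
                    + ((lo * p->mul) >> p->shift);

    return scaled + (uint64_t)p->offset;
}
```

The same function covers both variants of the formula: host_count can be the host ITC or the count of some other platform timer, as long as the parameters match that source.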

One more thing for time virtualization is to implement
virtualization of other timer resources, like the ACPI timer and HPET, that
may be used optionally.
Eddie

Alex Williamson wrote:
 On Tue, 2006-02-14 at 09:26 +0800, Dong, Eddie wrote:
 
 Base on my understanding, the ITC drift between different processor
 after fixup done in Linux or Xen today is less than 100ns. So I think
 that is not a big issue as if we guarantee the guest doesn't  see
 backward time. (As VP migration usually take longer than 100ns)
 
 Hi Eddie,
 
   This is why I suggest pre-sync'd ITC are probably sufficient for
 now. However, we will need to support systems where the ITC between
 processors drift.  In such a case, we either need to expose a better
 time source to the guest (perhaps a paravirtualized time interpolator)
 or fabricate ITC values for the guest which make the ITCs appear
 synchronized.  The latter feels like it could be a bottleneck.
 
 For the gettimeofday() concern, I don't agree. Because even we
 support full virtualization, an paravirtualized guest can still get
 guest ITC quickly by exposing the formula to guest or accessing
 share memory (X86 use share memory).
 
Have a look at kernel/timer.c:time_interpolator_get_counter(). 
 ITCs cannot be perfectly synchronized, therefore all ITC time
 interpolators have jitter.  We ensure that we never see time go
 backwards by keeping track of the last cycle count we returned. 
 Storing this cycle counter requires a cmpxchg.  As soon as we get
 multiple CPUs doing gettimeofday(), we get contention in the cmpxchg.
 Multiple CPUs doing gettimeofday() simultaneously can potentially
 cause us to read the ITC many, many, many times.  If each read causes
 a trap into xen, the performance implications could be severe.  This
 is why larger systems provide HPETs or other similar platform time
 sources.  The ITC based time interpolator does not scale well to
 large SMP systems. 
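
The "never go backwards" bookkeeping described above can be sketched with C11 atomics. This is a simplified model, not the kernel's time_interpolator_get_counter(): the raw reading is passed in as a parameter here, whereas the loop described above takes a fresh counter reading on every retry, which is where the repeated ITC reads under contention come from.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Last cycle count handed out to any caller, shared by all CPUs. */
static _Atomic uint64_t last_cycle;

/* Return a cycle count that is never smaller than one returned earlier. */
static uint64_t monotonic_cycles(uint64_t now)
{
    uint64_t last = atomic_load(&last_cycle);

    for (;;) {
        if (now <= last)
            return last;       /* this reading is behind: reuse the newest value */
        if (atomic_compare_exchange_weak(&last_cycle, &last, now))
            return now;        /* our reading is now the newest one */
        /* Another CPU updated last_cycle first; last was reloaded, retry. */
    }
}
```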
 
  Anyway, gettimeofday is not frequently accessed, so it is not a
 big cake:-)
 
I disagree and note that Linux/ia64 implements fsys_gettimeofday to
 avoid even entering C code to make a fast gettimeofday call.  Thanks,
 
   Alex




RE: [Xen-ia64-devel] IOSAPIC virtualisation

2006-02-09 Thread Dong, Eddie
Tristan Gingold wrote:
 On Friday 03 February 2006 17:13, Alex Williamson wrote:
 On Fri, 2006-02-03 at 09:33 +0100, Tristan Gingold wrote: [...]
I agree that we can't hit this problem right now, but it's easy to
 fix and would be one less thing we might miss when we do enable
 driver domains.  It looks the block of code to mask the vector could
 be copied identically into the section to unmask the vector with the
 appropriate s/mask_vec/unmask_vec and setting of the rte values.  I
 guess it keeps catching my eye because the mask and unmask are not
 symmetric.  Thanks, Hi, 
 
 I have slightly modified the patch so that it looks almost symmetric.
 
 Thanks,
 Tristan.

Tristan:
Great work! And sorry, I haven't found time to go through it all.
A quick question: why do we need to do vcpu_wake() immediately after
IRQ injection?
In the Xen design, this API is mainly used for VP pause/unpause and manual
operations, where it is reasonable to disturb/bypass the scheduler decision.
This disturbance is costly, as the scheduler triggered at the next time tick
will go back to its normal decision tree, which probably means preemption of
the dom0 quantum.
What x86 does is to wait for the scheduler to make the decision.
I know the original code also does it this way, but it is not an
architectural requirement. Rather it is a shortcut in the previous
implementation, and I think it is time to revise it now.


+xen_reflect_interrupt_to_domains (ia64_vector vector)
+{
+    struct iosapic_intr_info *info = &iosapic_intr_info[vector];
+    struct iosapic_rte_info *rte;
+    int res = 1;
+
+    list_for_each_entry(rte, &info->rtes, rte_list) {
+        if (rte->vcpu != NULL) {
+            if (rte->vcpu == VCPU_XEN)
+                res = 0;
+            else {
+                /* printf ("Send %d to vcpu as %d\n",
+                   vector, rte->vec); */
+                /* FIXME: vcpus should be really
+                   interrupted.  This should currently works
+                   because only domain0 receive interrupts and
+                   domain0 runs on CPU#0, which receives all
+                   the interrupts... */
+                vcpu_pend_interrupt(rte->vcpu, rte->vcpu_vec);
+                vcpu_wake(rte->vcpu);
+            }

Some other minor comments:
1: +#define VCPU_XEN ((struct vcpu *)1) looks strange to me.
Furthermore, I'd like to put a bit in the RTE indicating ownership
of the IRQ. Is there anything else you considered?
2: Similar to #1, checking the IRQ vector (if (vector ==
IA64_TIMER_VECTOR || vector == IA64_IPI_VECTOR)) in the following code is too
hardcoded. Today we only have 2 IRQs in the hypervisor, but actually we need
more, such as platform management interrupts like the ones Alex mentioned
previously for hotplug and thermal sensor IRQs. So we don't want to see a long
list of checks here.
My suggestion is to adopt a mechanism similar to x86's, i.e. like
__do_IRQ_guest in arch/x86/irq.c. The detailed implementation can be
architecture dependent: x86 uses desc->status & IRQ_GUEST, but we may not.
Anyway, keeping the capability that a machine IRQ may be bound to
multiple guests, as x86 does today, is better, and it is not so difficult. You
may also be able to reuse some code there :-)
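
To make the suggestion concrete, here is a rough sketch of a per-IRQ binding list that replaces the hardcoded vector check. It is not the x86 __do_IRQ_guest code; every structure and function name below is hypothetical, and the pending counter merely stands in for whatever injection mechanism is used.

```c
#include <stddef.h>

#define MAX_GUESTS_PER_IRQ 4   /* arbitrary limit for this sketch */

/* Stand-in for a guest vcpu; only counts IRQs pended to it. */
struct guest_vcpu {
    unsigned pending_irqs;
};

/* Per machine IRQ: the guests bound to it. An empty list means the
 * hypervisor handles the IRQ itself (timer, IPI, platform events...). */
struct irq_binding {
    size_t nr_guests;
    struct guest_vcpu *guest[MAX_GUESTS_PER_IRQ];
};

/* Add one more guest to a (possibly shared) machine IRQ. */
static int irq_bind_guest(struct irq_binding *b, struct guest_vcpu *g)
{
    if (b->nr_guests == MAX_GUESTS_PER_IRQ)
        return -1;
    b->guest[b->nr_guests++] = g;
    return 0;
}

/* Reflect the IRQ to every bound guest. Returns how many guests were
 * notified; 0 tells the caller that the hypervisor owns this IRQ. */
static size_t irq_reflect(const struct irq_binding *b)
{
    for (size_t i = 0; i < b->nr_guests; i++)
        b->guest[i]->pending_irqs++;
    return b->nr_guests;
}
```

Ownership then follows from the binding list alone, so neither a special VCPU_XEN pointer nor a growing list of vector comparisons is needed in this model.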


 xen_do_IRQ(ia64_vector vector)
 {
-    if (vector != IA64_TIMER_VECTOR && vector != IA64_IPI_VECTOR) {
-        extern void vcpu_pend_interrupt(void *, int);
+    struct vcpu *vcpu;
+    ia64_vector v;
+
+    /*  Do not reflect special interrupts.  */
+    if (vector == IA64_TIMER_VECTOR || vector == IA64_IPI_VECTOR)
+        return 0;
+
+

Thx,eddie



RE: [Xen-ia64-devel] IOSAPIC virtualisation

2006-02-09 Thread Dong, Eddie
Tristan Gingold wrote:
 Yes I have just copied from the original code.
 However, we should also take IPI into consideration (unless we go
 directly to event channel).
Can you explain more about the IPI stuff? I don't have the context.
 Anyway, keep the capability that a machine IRQ
 may be bound to multiple guest like X86 did today is better and it
 is not so difficult. you may also be able to reuse some code there
 :-) 
 To be added on my TODO list, since we can't trigger such a case or
 test it now.
Mmm, I would suggest we come up with a full solution and hold this patch for
a while. Your previous patch lets the hypervisor own the IOAPIC, but it is
still not the Xen solution. Sharing a physical IRQ among multiple driver
domains is a normal case for level-triggered IRQs. More importantly, the x86
solution for handling physical IRQs is pretty clean and beautiful, so why not
reuse the code?
Keir, please correct me if I have misunderstood the x86
IOAPIC virtualization policy.

Yes, we don't meet this situation now because driver domains are not there
yet, but aren't we implementing this patch to solve the future driver
domain issue?

If this is supported, then other issues, like how to indicate the
ownership of an IRQ, disappear too.

Thx,eddie



RE: [Xen-ia64-devel] [PATCH] Make VTIdomain boot again

2006-02-08 Thread Dong, Eddie
Dan and all:
This is a bug related to the event channel
callback/failsafe callback task that we decided on at the Xen Summit but have
not started yet. When that task is completed, the same mechanism as on x86
should be implemented, and the soft-IRQ will only happen when returning to the
guest, as on x86. (See arch/x86/x86_32/entry.S: in ret_from_intr,
test_all_events only happens when returning to the guest.)
I believe there are other issues not found yet, due to our
very early workarounds in IA64. Saving time at the
beginning (a shortcut) usually costs us more later :-(

Eddie

Xu, Anthony wrote:
 Hi All,
 
 Since the merge from xen-unstable, there is a small window between
 bvt_do_schedule and context_switch in function __enter_schedule where
 interrupts are enabled.
 
 Consider the following scenario:
 1. A VTI domain accesses legacy IO; the VMM gets control, sets the
 VTI domain to blocked status and calls __enter_schedule to yield to the
 scheduler and wait for QEMU in domain0 to handle the IO request.
 2. A timer interrupt arrives in the above window and triggers the
 schedule timer; then, in the irq_exit function, the VMM does
 soft_irq, which in turn invokes __enter_schedule. Thus
 __enter_schedule is reentered in the VMM, which is not correct.
 
 So the root cause is that __enter_schedule is reentered.
 The correct way is to do soft_irq just before the VMM returns
 to the guest, just as in native Linux soft-irq is done just before
 Linux returns to the application. But in the current implementation
 soft-irq is done in the irq_exit function.
 
 The reason why xenU can boot is that
 xenU is always runnable, so it is never deleted from the runqueue;
 even though __enter_schedule is reentered, no issue appears. A
 VTI domain, however, is set to blocked status when it does an IO
 operation and is deleted from the runqueue, which crashes the whole
 system.
 
 This patch is just a workaround: it makes sure that in irq_exit
 soft_irq is done only when the VMM is not in a nested interrupt
 situation.
 
 I strongly suggest that soft-irq be done in the ia64_leave_kernel path,
 just like the native Linux kernel.
 
 Any comments?
 
 Thanks,
 -Anthony
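
The re-entry described above, and the effect of deferring soft-irq to the return-to-guest path, can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every name below (in_vmm, schedule_sim, irq_exit_sim, return_to_guest_sim, run_scenario) is hypothetical and none of it is the actual Xen/ia64 code or the posted patch. The "guarded" flag stands in for the workaround's check that the interrupt did not arrive while the VMM itself was running.

```c
#include <assert.h>

/* Hypothetical model state; these are NOT real Xen symbols. */
static int in_vmm;          /* nonzero while hypervisor code runs after a guest trap */
static int in_schedule;     /* nonzero while the scheduler body is executing */
static int softirq_pending; /* deferred work queued by an interrupt handler */
static int reentries;       /* how many times the scheduler was entered while active */

static void schedule_sim(int guarded, int irq_in_window);

/* Running soft-irq work ends up calling the scheduler. */
static void do_softirq_sim(int guarded)
{
    softirq_pending = 0;
    schedule_sim(guarded, 0);
}

/* End of interrupt handling: either run soft-irq now or defer it. */
static void irq_exit_sim(int guarded)
{
    if (!softirq_pending)
        return;
    if (guarded && in_vmm)
        return;             /* interrupted the VMM itself: leave work pending */
    do_softirq_sim(guarded);
}

static void timer_interrupt_sim(int guarded)
{
    softirq_pending = 1;    /* the schedule timer fired */
    irq_exit_sim(guarded);
}

static void schedule_sim(int guarded, int irq_in_window)
{
    if (in_schedule) {      /* the situation the mail calls incorrect */
        reentries++;
        return;
    }
    in_schedule = 1;
    if (irq_in_window)      /* window where interrupts are enabled */
        timer_interrupt_sim(guarded);
    in_schedule = 0;
}

/* Path back to the guest: the safe place to run deferred soft-irq work. */
static void return_to_guest_sim(int guarded)
{
    assert(!in_schedule);
    in_vmm = 0;
    if (softirq_pending)
        do_softirq_sim(guarded);
}

/* Replays the scenario from the mail and reports scheduler re-entries. */
int run_scenario(int guarded)
{
    in_vmm = in_schedule = softirq_pending = reentries = 0;
    in_vmm = 1;                   /* guest IO access trapped into the VMM */
    schedule_sim(guarded, 1);     /* domain blocks; timer fires in the window */
    return_to_guest_sim(guarded);
    return reentries;
}
```

In this model run_scenario(0) reports one re-entry of the scheduler, while run_scenario(1) reports none because the pending work is picked up on the way back to the guest instead.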




RE: [Xen-ia64-devel] Meeting Summary taken from Xen-ia64 Next Steps Discussion during Xen Summit

2006-01-20 Thread Dong, Eddie
Thanks for the summary and one more thing:
We all agreed that the mechanism for sending event channel notifications
should eventually use callback/failsafe instead of a pseudo physical IRQ,
as the latter has some potential issues.
Actually the same issues arise on the PPC side too.
Thanks to Dan, Ian and Keir for these great results.
Eddie

Yang, Fred wrote:
 Xen-ia64 Community members,
 
 Following is the agreement/summary from the Xen/ia64 Next Steps
 Discussion session held during the Xen Summit on January 17, 2006.  The
 items in
 http://www.xensource.com/files/xs0106_ia64_nextsteps_disc.pdf were
 fully discussed.
 
 The work session attendees agreed on the following actions in order to
 move Xen/ia64 to the next stage, and the efforts will be started
 immediately.
 
 1. Physical Memory support for Domain0
  * The PPC port has a similar P2M issue to Xen-ia64
  * The group agreed P2M is the route to take; the detailed
    implementation can fall between the P2M & VP approaches, so as to
    change XenLinux as little as possible
  * Merging P2M into mainline code may cause Xen-ia64-unstable to
    be buggy or unstable for a period of time.
    Since this is a must-have feature, we should merge the code
    and get the community to work together to stabilize the system
  * Fujitsu has been looking into this effort and will contribute
    to it
  * Enabling P2M for Domain0 is a must for Xen-ia64 and should be
    done early to enable VBD/VNIF & driver domain work to come
 2. Memory enhancement for page reference count
  * This can possibly cause stability issues and affect domain
    destroy
  * This item is a must for Xen-ia64
 3. Virtual Interrupt Controller to let Xen own the physical IOSAPIC
  * This can help to address SMP guests for para-domains, and is
    a must for driver domains
  * A must item for Xen-ia64
 4. VTLB/VHPT SMP Support
  * To support the next step, SMP guest support, the hash VTLB and
    the same VHPT model should be adopted
  * A patch to extend hash VTLB/VHPT to hook up para-virtualized
    domains should be added.
    The code should be built with an option to pull the
    original VHPT model back in, per future performance tuning needs
  * A must item for Xen/ia64 to get to SMP guest support
 5. Reboot/Destroy Domains
  * A must item after page reference count is done
 6. Hypercalls
  * Can be adapted per P2M and future needs
 7. Timer Virtualization
  * This is definitely worth doing to help performance
 
 This summary lists the major efforts to overhaul the overall Xen/ia64
 infrastructure; we welcome developers to contribute to these efforts.
 
 -Fred Yang
 




RE: [Xen-ia64-devel] [PATCH] implemented vcpu_ptc_l()

2005-12-06 Thread Dong, Eddie
Isaku Yamahata wrote:
 On Tue, Dec 06, 2005 at 01:31:03PM +0800, Dong, Eddie wrote:
 
 I disabled CONFIG_SMP manually because ski doesn't support smp.
 (And I enabled some configs related to ski.)
 para linux with CONFIG_SMP=n uses ptc.l.
That makes sense.

 
 Should I enable CONFIG_SMP?
 If so, does following patch make sense?
I think ptc doesn't need to purge TR, but the 1-entry TLB also needs to
be purged.
I mean vcpu->arch.dtlb and vcpu->arch.itlb.

Thanks,eddie



RE: [Xen-ia64-devel] [PATCH] implemented vcpu_ptc_l()

2005-12-05 Thread Dong, Eddie
Yamahata san:
I thought para-virtualized Linux does not use ptc; does it on ski?
2 comments:
a: Is there any reason to purge TR?
b: I believe there is a single SW TLB entry in the non-VTI implementation;
if you do that, you'd better check this too.
Eddie


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Isaku Yamahata
Sent: December 6, 2005 11:21
To: xen-ia64-devel@lists.xensource.com
Subject: [Xen-ia64-devel] [PATCH] implemented vcpu_ptc_l()


Hi.
I implemented vcpu_ptc_l(), which is needed to boot dom0 on ski.
Is there any reason why it hasn't been implemented?
I didn't see any difficulty in implementing it. Am I missing anything?

Signed-off-by: Isaku Yamahata [EMAIL PROTECTED]

--
diff -r c4a86ad93e49 xen/arch/ia64/linux-xen/tlb.c
--- a/xen/arch/ia64/linux-xen/tlb.c Thu Dec  1 18:21:59 2005 +0900
+++ b/xen/arch/ia64/linux-xen/tlb.c Tue Dec  6 12:13:48 2005 +0900
@@ -110,6 +110,15 @@
 }
 
 void
+ia64_local_tlb_purge (unsigned long start, unsigned long end, unsigned long nbits)
+{
+   do {
+       ia64_ptcl(start, (nbits << 2));
+       start += (1UL << nbits);
+   } while (start < end);
+}
+
+void
 local_flush_tlb_all (void)
 {
    unsigned long i, j, flags, count0, count1, stride0, stride1, addr;
diff -r c4a86ad93e49 xen/arch/ia64/xen/vcpu.c
--- a/xen/arch/ia64/xen/vcpu.c  Thu Dec  1 18:21:59 2005 +0900
+++ b/xen/arch/ia64/xen/vcpu.c  Tue Dec  6 12:13:48 2005 +0900
@@ -1827,8 +1827,20 @@
 
 IA64FAULT vcpu_ptc_l(VCPU *vcpu, UINT64 vadr, UINT64 addr_range)
 {
-   printk("vcpu_ptc_l: called, not implemented yet\n");
-   return IA64_ILLOP_FAULT;
+   extern void ia64_local_tlb_purge (unsigned long start, unsigned long end, unsigned long nbits);
+
+   //XXX FIXME: validate not flushing Xen addresses
+   if (IS_VMM_ADDRESS(vadr)) {
+   return IA64_ILLOP_FAULT;
+   }
+   
+#ifdef VHPT_GLOBAL
+   vhpt_flush_address(vadr, addr_range);
+#endif
+   ia64_local_tlb_purge(vadr, vadr + addr_range, PAGE_SHIFT);
+   vcpu_purge_tr_entry(PSCBX(vcpu,dtlb));
+   vcpu_purge_tr_entry(PSCBX(vcpu,itlb));
+   return IA64_NO_FAULT;
 }
 
 // At privlvl=0, fc performs no access rights or protection key checks, while



-- 
yamahata
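
For readers following the purge loop in the patch, here is a minimal user-space sketch of the same stride-by-page walk. It is an illustration only: ptcl_sim is a hypothetical stub that counts calls instead of issuing the ptc.l instruction, local_tlb_purge_sim is a hypothetical name, and the comment about where the page-size field sits in the operand reflects my understanding of the (nbits << 2) encoding rather than anything stated in the thread.

```c
#include <assert.h>

/* Hypothetical bookkeeping so the walk can be observed without hardware. */
static unsigned long purge_count;   /* number of simulated purges issued */
static unsigned long last_purged;   /* address passed to the most recent purge */

/* Stand-in for the ptc.l wrapper; the real one executes an instruction. */
static void ptcl_sim(unsigned long addr, unsigned long size_field)
{
    (void)size_field;               /* the model ignores the encoded size */
    last_purged = addr;
    purge_count++;
}

/*
 * Walks [start, end) in steps of 2^nbits bytes, issuing one purge per step.
 * Returns how many purges were issued.
 */
unsigned long local_tlb_purge_sim(unsigned long start, unsigned long end,
                                  unsigned long nbits)
{
    assert(nbits < 8 * sizeof(unsigned long));
    purge_count = 0;
    do {
        /* Assumption: log2(page size) is encoded starting at bit 2. */
        ptcl_sim(start, nbits << 2);
        start += 1UL << nbits;
    } while (start < end);
    return purge_count;
}
```

Because the loop is a do/while, at least one purge is issued even when start equals end; a range of four 16KB pages (nbits = 14) issues four purges.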
