Re: [kvm-devel] [PATCH/RFC 3/4]Introduce account modifiers mechanism

2007-08-19 Thread Avi Kivity

Laurent Vivier wrote:

Avi Kivity wrote:
[...]
  

The normal user/system accounting has the same issue, no?  Whereever we
happen to land (kernel or user) gets the whole tick.

So I think it is okay to have the same limitation for guest time.




So this is how it looks like.
PATCH 1 and 2 are always a prerequisite.

  



+   tmp = cputime_to_cputime64(cputime);
+   if (p-flags  PF_VCPU) {
+   p-utime = cputime_add(p-utime, cputime);
+   p-gtime = cputime_add(p-gtime, cputime);
+
+   cpustat-guest = cputime64_add(cpustat-guest, tmp);
+   cpustat-user = cputime64_add(cpustat-user, tmp);
+
+   p-flags = ~PF_VCPU;
+
+   return;
+   }
+


Where did CONFIG_GUEST_ACCOUNTING go?


--
error compiling committee.c: too many arguments to function

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [kvm-devel] [PATCH/RFC 3/4]Introduce account modifiers mechanism

2007-08-19 Thread Avi Kivity

Laurent Vivier wrote:

So I think it is okay to have the same limitation for guest time.



OK, so we can go back to my first patch.
Who can decide to introduce this into the kernel ?
  


The sched.c parts are best merged by Ingo, and I can carry the kvm 
parts.  Alternatively, I can carry the entire patchset if Ingo acks it.


--
error compiling committee.c: too many arguments to function

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [kvm-devel] [PATCH/RFC 3/4]Introduce account modifiers mechanism

2007-08-17 Thread Avi Kivity
Laurent Vivier wrote:
 
 - remove PATCH 3, and add in task_struct a ktime vtime where we accumulate
 guest time (by calling something like guest_enter() and guest_exit() from 
 the
 virtualization engine), and when in account_system_time() we have cputime 
 vtime we substrate vtime from cputime and add vtime to user time and guest 
 time.
 But doing like this we freeze in kernel/sched.c the link between system 
 time,
 user time and guest time (i.e. system time = system time - vtime, user time 
 =
 user time + vtime and guest time = guest time + vtime).
   
 Actually, I think we can set a per-cpu in_guest flag for the scheduler
 code, which then knows to add the tick to the guest time.  That seems
 the simplest possible solution.

 lguest or kvm would set the flag before running the guest (which is done
 with preempt disabled or using preemption hooks), and reset it
 afterwards.

 Thoughts?
 

 It was my first attempt (except I didn't have a per-cpu flag, but a per-task
 flag), it's not visible but I love simplicity... ;-)

 A KVM VCPU is stopped by preemption, so when we enter in scheduler we have
 exited from VCPU and thus this flags is off (so we account 0 to the guest). 
 What
 I did then is set the flag on when we enter in the VCPU, and
 account_system_time() sets the flag off when it adds this timeslice to 
 cpustat
 (and compute correctly guest, user, system time). But I didn't like this idea
 because all code executed after we entered in the VCPU is accounted to the 
 guest
 until we have an account_system_time() and I suppose we can have real system
 time in this part. And I guess a VCPU can be less than 1 ms (unit of cputime) 
 in
 a timeslice.

 So ? What's best ?
   

The normal user/system accounting has the same issue, no?  Whereever we
happen to land (kernel or user) gets the whole tick.

So I think it is okay to have the same limitation for guest time.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [kvm-devel] [PATCH/RFC 3/4]Introduce account modifiers mechanism

2007-08-17 Thread Laurent Vivier
Avi Kivity wrote:
 Laurent Vivier wrote:
 
 - remove PATCH 3, and add in task_struct a ktime vtime where we 
 accumulate
 guest time (by calling something like guest_enter() and guest_exit() from 
 the
 virtualization engine), and when in account_system_time() we have cputime 
 vtime we substrate vtime from cputime and add vtime to user time and guest 
 time.
 But doing like this we freeze in kernel/sched.c the link between system 
 time,
 user time and guest time (i.e. system time = system time - vtime, user 
 time =
 user time + vtime and guest time = guest time + vtime).
   
 Actually, I think we can set a per-cpu in_guest flag for the scheduler
 code, which then knows to add the tick to the guest time.  That seems
 the simplest possible solution.

 lguest or kvm would set the flag before running the guest (which is done
 with preempt disabled or using preemption hooks), and reset it
 afterwards.

 Thoughts?
 
 It was my first attempt (except I didn't have a per-cpu flag, but a per-task
 flag), it's not visible but I love simplicity... ;-)

 A KVM VCPU is stopped by preemption, so when we enter in scheduler we have
 exited from VCPU and thus this flags is off (so we account 0 to the guest). 
 What
 I did then is set the flag on when we enter in the VCPU, and
 account_system_time() sets the flag off when it adds this timeslice to 
 cpustat
 (and compute correctly guest, user, system time). But I didn't like this idea
 because all code executed after we entered in the VCPU is accounted to the 
 guest
 until we have an account_system_time() and I suppose we can have real system
 time in this part. And I guess a VCPU can be less than 1 ms (unit of 
 cputime) in
 a timeslice.

 So ? What's best ?
   
 
 The normal user/system accounting has the same issue, no?  Whereever we
 happen to land (kernel or user) gets the whole tick.

Yes... but perhaps I should rewrite this too ;-)

 So I think it is okay to have the same limitation for guest time.

OK, so we can go back to my first patch.
Who can decide to introduce this into the kernel ?

Laurent
-- 
- [EMAIL PROTECTED]  --
  Software is hard - Donald Knuth



signature.asc
Description: OpenPGP digital signature
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization

Re: [kvm-devel] [PATCH/RFC 3/4]Introduce account modifiers mechanism

2007-08-17 Thread Christian Borntraeger
Am Freitag, 17. August 2007 schrieb Laurent Vivier:
  The normal user/system accounting has the same issue, no?  Whereever we
  happen to land (kernel or user) gets the whole tick.
 
 Yes... but perhaps I should rewrite this too ;-)

If you look further, you will see, that this was actually rewritten in 2.6.12 
and thats why we have cputime_t. The infrastructure is currently only used by 
s390 and ppc64. On s390 we use our cpu timer to get the current time on each 
syscall/irq/context switch etc to get precise accounting data for 
system/user/irq/softirq/nice. We also get steal time (this is interesting for 
the guest: how much of my cpu was actually not available because the 
hypervisor scheduled me away). I dont know enough about other architectures  
to say which one could exploit this infrastructure as well.

The current git tree is somewhat broken for CONFIG_VIRT_CPU_ACCOUTING due to 
the accouting changes introduced by CFS - we work on this.

If you are interested in the cputime stuff, you can have a look at 
arch/s390/kernel/vtime.c (account_system_vtime, account_vtime) as well as:

http://git.kernel.org/?p=linux/kernel/git/torvalds/old-2.6-bkcvs.git;a=commit;h=6be7071fdd321c206b1ee7a3e534782f25572830
for the first introduction and
http://git.kernel.org/?p=linux/kernel/git/torvalds/old-2.6-bkcvs.git;a=commit;h=db59f519d37f80683f3611927f6b5c3bdfc0f9e6
for the s390 exploitation.

Christian
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [kvm-devel] [PATCH/RFC 3/4]Introduce account modifiers mechanism

2007-08-17 Thread Laurent Vivier
Avi Kivity wrote:
[...]
 
 The normal user/system accounting has the same issue, no?  Whereever we
 happen to land (kernel or user) gets the whole tick.
 
 So I think it is okay to have the same limitation for guest time.
 

So this is how it looks like.
PATCH 1 and 2 are always a prerequisite.

Laurent
Index: kvm/include/linux/sched.h
===
--- kvm.orig/include/linux/sched.h  2007-08-17 15:07:02.0 +0200
+++ kvm/include/linux/sched.h   2007-08-17 15:08:19.0 +0200
@@ -1310,6 +1310,7 @@
 #define PF_STARTING0x0002  /* being created */
 #define PF_EXITING 0x0004  /* getting shut down */
 #define PF_EXITPIDONE  0x0008  /* pi exit done on shut down */
+#define PF_VCPU0x0010  /* I'm a virtual CPU */
 #define PF_FORKNOEXEC  0x0040  /* forked but didn't exec */
 #define PF_SUPERPRIV   0x0100  /* used super-user privileges */
 #define PF_DUMPCORE0x0200  /* dumped core */
Index: kvm/kernel/sched.c
===
--- kvm.orig/kernel/sched.c 2007-08-17 14:42:43.0 +0200
+++ kvm/kernel/sched.c  2007-08-17 15:16:20.0 +0200
@@ -3246,10 +3246,22 @@
struct rq *rq = this_rq();
cputime64_t tmp;
 
+   tmp = cputime_to_cputime64(cputime);
+   if (p-flags  PF_VCPU) {
+   p-utime = cputime_add(p-utime, cputime);
+   p-gtime = cputime_add(p-gtime, cputime);
+
+   cpustat-guest = cputime64_add(cpustat-guest, tmp);
+   cpustat-user = cputime64_add(cpustat-user, tmp);
+
+   p-flags = ~PF_VCPU;
+
+   return;
+   }
+
p-stime = cputime_add(p-stime, cputime);
 
/* Add system time to cpustat. */
-   tmp = cputime_to_cputime64(cputime);
if (hardirq_count() - hardirq_offset)
cpustat-irq = cputime64_add(cpustat-irq, tmp);
else if (softirq_count())
Index: kvm/drivers/kvm/kvm.h
===
--- kvm.orig/drivers/kvm/kvm.h  2007-08-17 15:26:16.0 +0200
+++ kvm/drivers/kvm/kvm.h   2007-08-17 15:29:46.0 +0200
@@ -589,6 +589,19 @@
 
 int kvm_hypercall(struct kvm_vcpu *vcpu, struct kvm_run *run);
 
+#ifndef PF_VCPU
+#define PF_VCPU0   /* no kernel support */
+#endif
+
+static inline void kvm_guest_enter(void)
+{
+   current-flags |= PF_VCPU;
+}
+
+static inline void kvm_guest_exit(void)
+{
+}
+
 static inline int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
 u32 error_code)
 {
Index: kvm/drivers/kvm/svm.c
===
--- kvm.orig/drivers/kvm/svm.c  2007-08-17 15:26:16.0 +0200
+++ kvm/drivers/kvm/svm.c   2007-08-17 15:27:03.0 +0200
@@ -1404,6 +1404,7 @@
clgi();
 
vcpu-guest_mode = 1;
+   kvm_guest_enter();
if (vcpu-requests)
if (test_and_clear_bit(KVM_TLB_FLUSH, vcpu-requests))
svm_flush_tlb(vcpu);
@@ -1536,6 +1537,7 @@
 #endif
: cc, memory );
 
+   kvm_guest_exit();
vcpu-guest_mode = 0;
 
if (vcpu-fpu_active) {
Index: kvm/drivers/kvm/vmx.c
===
--- kvm.orig/drivers/kvm/vmx.c  2007-08-17 15:26:16.0 +0200
+++ kvm/drivers/kvm/vmx.c   2007-08-17 15:27:45.0 +0200
@@ -2078,6 +2078,7 @@
local_irq_disable();
 
vcpu-guest_mode = 1;
+   kvm_guest_enter();
if (vcpu-requests)
if (test_and_clear_bit(KVM_TLB_FLUSH, vcpu-requests))
vmx_flush_tlb(vcpu);
@@ -2198,6 +2199,7 @@
[cr2]i(offsetof(struct kvm_vcpu, cr2))
  : cc, memory );
 
+   kvm_guest_exit();
vcpu-guest_mode = 0;
local_irq_enable();
 
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization