Re: [kvm-devel] [PATCH/RFC 3/4]Introduce account modifiers mechanism
Laurent Vivier wrote: Avi Kivity wrote: [...] The normal user/system accounting has the same issue, no? Whereever we happen to land (kernel or user) gets the whole tick. So I think it is okay to have the same limitation for guest time. So this is how it looks like. PATCH 1 and 2 are always a prerequisite. + tmp = cputime_to_cputime64(cputime); + if (p-flags PF_VCPU) { + p-utime = cputime_add(p-utime, cputime); + p-gtime = cputime_add(p-gtime, cputime); + + cpustat-guest = cputime64_add(cpustat-guest, tmp); + cpustat-user = cputime64_add(cpustat-user, tmp); + + p-flags = ~PF_VCPU; + + return; + } + Where did CONFIG_GUEST_ACCOUNTING go? -- error compiling committee.c: too many arguments to function ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
Re: [kvm-devel] [PATCH/RFC 3/4]Introduce account modifiers mechanism
Laurent Vivier wrote: So I think it is okay to have the same limitation for guest time. OK, so we can go back to my first patch. Who can decide to introduce this into the kernel ? The sched.c parts are best merged by Ingo, and I can carry the kvm parts. Alternatively, I can carry the entire patchset if Ingo acks it. -- error compiling committee.c: too many arguments to function ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
Re: [kvm-devel] [PATCH/RFC 3/4]Introduce account modifiers mechanism
Laurent Vivier wrote: - remove PATCH 3, and add in task_struct a ktime vtime where we accumulate guest time (by calling something like guest_enter() and guest_exit() from the virtualization engine), and when in account_system_time() we have cputime vtime we substrate vtime from cputime and add vtime to user time and guest time. But doing like this we freeze in kernel/sched.c the link between system time, user time and guest time (i.e. system time = system time - vtime, user time = user time + vtime and guest time = guest time + vtime). Actually, I think we can set a per-cpu in_guest flag for the scheduler code, which then knows to add the tick to the guest time. That seems the simplest possible solution. lguest or kvm would set the flag before running the guest (which is done with preempt disabled or using preemption hooks), and reset it afterwards. Thoughts? It was my first attempt (except I didn't have a per-cpu flag, but a per-task flag), it's not visible but I love simplicity... ;-) A KVM VCPU is stopped by preemption, so when we enter in scheduler we have exited from VCPU and thus this flags is off (so we account 0 to the guest). What I did then is set the flag on when we enter in the VCPU, and account_system_time() sets the flag off when it adds this timeslice to cpustat (and compute correctly guest, user, system time). But I didn't like this idea because all code executed after we entered in the VCPU is accounted to the guest until we have an account_system_time() and I suppose we can have real system time in this part. And I guess a VCPU can be less than 1 ms (unit of cputime) in a timeslice. So ? What's best ? The normal user/system accounting has the same issue, no? Whereever we happen to land (kernel or user) gets the whole tick. So I think it is okay to have the same limitation for guest time. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
Re: [kvm-devel] [PATCH/RFC 3/4]Introduce account modifiers mechanism
Avi Kivity wrote: Laurent Vivier wrote: - remove PATCH 3, and add in task_struct a ktime vtime where we accumulate guest time (by calling something like guest_enter() and guest_exit() from the virtualization engine), and when in account_system_time() we have cputime vtime we substrate vtime from cputime and add vtime to user time and guest time. But doing like this we freeze in kernel/sched.c the link between system time, user time and guest time (i.e. system time = system time - vtime, user time = user time + vtime and guest time = guest time + vtime). Actually, I think we can set a per-cpu in_guest flag for the scheduler code, which then knows to add the tick to the guest time. That seems the simplest possible solution. lguest or kvm would set the flag before running the guest (which is done with preempt disabled or using preemption hooks), and reset it afterwards. Thoughts? It was my first attempt (except I didn't have a per-cpu flag, but a per-task flag), it's not visible but I love simplicity... ;-) A KVM VCPU is stopped by preemption, so when we enter in scheduler we have exited from VCPU and thus this flags is off (so we account 0 to the guest). What I did then is set the flag on when we enter in the VCPU, and account_system_time() sets the flag off when it adds this timeslice to cpustat (and compute correctly guest, user, system time). But I didn't like this idea because all code executed after we entered in the VCPU is accounted to the guest until we have an account_system_time() and I suppose we can have real system time in this part. And I guess a VCPU can be less than 1 ms (unit of cputime) in a timeslice. So ? What's best ? The normal user/system accounting has the same issue, no? Whereever we happen to land (kernel or user) gets the whole tick. Yes... but perhaps I should rewrite this too ;-) So I think it is okay to have the same limitation for guest time. OK, so we can go back to my first patch. Who can decide to introduce this into the kernel ? Laurent -- - [EMAIL PROTECTED] -- Software is hard - Donald Knuth signature.asc Description: OpenPGP digital signature ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
Re: [kvm-devel] [PATCH/RFC 3/4]Introduce account modifiers mechanism
Am Freitag, 17. August 2007 schrieb Laurent Vivier: The normal user/system accounting has the same issue, no? Whereever we happen to land (kernel or user) gets the whole tick. Yes... but perhaps I should rewrite this too ;-) If you look further, you will see, that this was actually rewritten in 2.6.12 and thats why we have cputime_t. The infrastructure is currently only used by s390 and ppc64. On s390 we use our cpu timer to get the current time on each syscall/irq/context switch etc to get precise accounting data for system/user/irq/softirq/nice. We also get steal time (this is interesting for the guest: how much of my cpu was actually not available because the hypervisor scheduled me away). I dont know enough about other architectures to say which one could exploit this infrastructure as well. The current git tree is somewhat broken for CONFIG_VIRT_CPU_ACCOUTING due to the accouting changes introduced by CFS - we work on this. If you are interested in the cputime stuff, you can have a look at arch/s390/kernel/vtime.c (account_system_vtime, account_vtime) as well as: http://git.kernel.org/?p=linux/kernel/git/torvalds/old-2.6-bkcvs.git;a=commit;h=6be7071fdd321c206b1ee7a3e534782f25572830 for the first introduction and http://git.kernel.org/?p=linux/kernel/git/torvalds/old-2.6-bkcvs.git;a=commit;h=db59f519d37f80683f3611927f6b5c3bdfc0f9e6 for the s390 exploitation. Christian ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
Re: [kvm-devel] [PATCH/RFC 3/4]Introduce account modifiers mechanism
Avi Kivity wrote: [...] The normal user/system accounting has the same issue, no? Whereever we happen to land (kernel or user) gets the whole tick. So I think it is okay to have the same limitation for guest time. So this is how it looks like. PATCH 1 and 2 are always a prerequisite. Laurent Index: kvm/include/linux/sched.h === --- kvm.orig/include/linux/sched.h 2007-08-17 15:07:02.0 +0200 +++ kvm/include/linux/sched.h 2007-08-17 15:08:19.0 +0200 @@ -1310,6 +1310,7 @@ #define PF_STARTING0x0002 /* being created */ #define PF_EXITING 0x0004 /* getting shut down */ #define PF_EXITPIDONE 0x0008 /* pi exit done on shut down */ +#define PF_VCPU0x0010 /* I'm a virtual CPU */ #define PF_FORKNOEXEC 0x0040 /* forked but didn't exec */ #define PF_SUPERPRIV 0x0100 /* used super-user privileges */ #define PF_DUMPCORE0x0200 /* dumped core */ Index: kvm/kernel/sched.c === --- kvm.orig/kernel/sched.c 2007-08-17 14:42:43.0 +0200 +++ kvm/kernel/sched.c 2007-08-17 15:16:20.0 +0200 @@ -3246,10 +3246,22 @@ struct rq *rq = this_rq(); cputime64_t tmp; + tmp = cputime_to_cputime64(cputime); + if (p-flags PF_VCPU) { + p-utime = cputime_add(p-utime, cputime); + p-gtime = cputime_add(p-gtime, cputime); + + cpustat-guest = cputime64_add(cpustat-guest, tmp); + cpustat-user = cputime64_add(cpustat-user, tmp); + + p-flags = ~PF_VCPU; + + return; + } + p-stime = cputime_add(p-stime, cputime); /* Add system time to cpustat. */ - tmp = cputime_to_cputime64(cputime); if (hardirq_count() - hardirq_offset) cpustat-irq = cputime64_add(cpustat-irq, tmp); else if (softirq_count()) Index: kvm/drivers/kvm/kvm.h === --- kvm.orig/drivers/kvm/kvm.h 2007-08-17 15:26:16.0 +0200 +++ kvm/drivers/kvm/kvm.h 2007-08-17 15:29:46.0 +0200 @@ -589,6 +589,19 @@ int kvm_hypercall(struct kvm_vcpu *vcpu, struct kvm_run *run); +#ifndef PF_VCPU +#define PF_VCPU0 /* no kernel support */ +#endif + +static inline void kvm_guest_enter(void) +{ + current-flags |= PF_VCPU; +} + +static inline void kvm_guest_exit(void) +{ +} + static inline int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva, u32 error_code) { Index: kvm/drivers/kvm/svm.c === --- kvm.orig/drivers/kvm/svm.c 2007-08-17 15:26:16.0 +0200 +++ kvm/drivers/kvm/svm.c 2007-08-17 15:27:03.0 +0200 @@ -1404,6 +1404,7 @@ clgi(); vcpu-guest_mode = 1; + kvm_guest_enter(); if (vcpu-requests) if (test_and_clear_bit(KVM_TLB_FLUSH, vcpu-requests)) svm_flush_tlb(vcpu); @@ -1536,6 +1537,7 @@ #endif : cc, memory ); + kvm_guest_exit(); vcpu-guest_mode = 0; if (vcpu-fpu_active) { Index: kvm/drivers/kvm/vmx.c === --- kvm.orig/drivers/kvm/vmx.c 2007-08-17 15:26:16.0 +0200 +++ kvm/drivers/kvm/vmx.c 2007-08-17 15:27:45.0 +0200 @@ -2078,6 +2078,7 @@ local_irq_disable(); vcpu-guest_mode = 1; + kvm_guest_enter(); if (vcpu-requests) if (test_and_clear_bit(KVM_TLB_FLUSH, vcpu-requests)) vmx_flush_tlb(vcpu); @@ -2198,6 +2199,7 @@ [cr2]i(offsetof(struct kvm_vcpu, cr2)) : cc, memory ); + kvm_guest_exit(); vcpu-guest_mode = 0; local_irq_enable(); ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization