Re: KVM: MMU: Tracking guest writes through EPT entries ?
On Mon, Sep 3, 2012 at 1:11 AM, Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com wrote:
On 09/03/2012 10:09 AM, Hugo wrote:
On Sun, Sep 2, 2012 at 8:29 AM, Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com wrote:
On 09/01/2012 05:30 AM, Hui Lin (Hugo) wrote:
On Thu, Aug 30, 2012 at 9:54 PM, Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com wrote:
On 08/31/2012 02:59 AM, Hugo wrote:
On Thu, Aug 30, 2012 at 5:22 AM, Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com wrote:
On 08/28/2012 11:30 AM, Felix wrote:
Xiao Guangrong xiaoguangrong at linux.vnet.ibm.com writes:
On 07/31/2012 01:18 AM, Sunil wrote:

Hello List,

I am a KVM newbie and studying the KVM mmu code.

On an existing guest, I am trying to track all guest writes by marking page table entries as read-only in the EPT [I am using an Intel machine with vmx and ept support]. It looks like EPT support re-uses the shadow page table (SPT) code and hence some of the SPT routines.

I was thinking of the approach below: use pte_list_walk() to traverse the list of sptes and use mmu_spte_update() to flip the PT_WRITABLE_MASK flag. But the SPTEs are not all on a single list; they are on separate lists (based on gfn, page level, memory_slot). So, would recording all the faulted guest GFNs and then using the above method work?

There are two ways to write-protect all sptes:
- use kvm_mmu_slot_remove_write_access() on all memslots
- walk the shadow page cache to get the shadow pages in the highest level (level = 4 on EPT), then write-protect its entries.

If you just want to do it for the specified gfn, you can use rmap_write_protect().

Just inquisitive, what is your purpose? :)

--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo at vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Hi, Guangrong,

I have done similar things to what Sunil did, simply for study purposes. However, I found some very weird situations. Basically, in the guest VM, I allocate a chunk of memory (one page in size) in a user-level program. Through a guest kernel module and my self-defined hypercall, I pass the gva of this memory to kvm. Then I try different methods in the hypercall handler to write-protect this page of memory. As you can see, I want to write-protect it through EPT instead of write-protecting it in the guest page tables.

1. I use kvm_mmu_gva_to_gpa_read to translate the gva into a gpa. Based on the function kvm_mmu_get_spte_hierarchy(vcpu, gpa, spte[4]), I changed the code to read sptep (the pointer to the spte) instead of spte, so I can modify the spte corresponding to this gpa. What I observe is that if I modify spte[0] (I think this is the lowest-level page table entry in the EPT; I can successfully modify it, as the changes are reflected in the result of calling kvm_mmu_get_spte_hierarchy again), my user-level program in the VM can still write to this page.

In this post, you mentioned "the shadow pages in the highest level (level = 4 on EPT)"; I don't understand this part. Does this mean I have to modify spte[3] instead of spte[0]? I just tried modifying spte[1] and spte[3]; both can cause a vmexit. So I am totally confused about the meaning of "level" as used in the shadow page table and its relation to the shadow page table. Can you help me to understand this?

2. As suggested by this post, I also used rmap_write_protect() to write-protect this page. With kvm_mmu_get_spte_hierarchy(vcpu, gpa, spte[4]), I can still see that spte[0] gives me a result like xx005, which means that the function was called successfully. But I can still write to this page.

I even tried the function kvm_age_hva() to remove this spte; this gives me 0 for spte[0], but I can still write to this page. So I am further confused about the level used in the shadow page?

kvm_mmu_get_spte_hierarchy gets sptes outside mmu-lock; you can hold spin_lock(&vcpu->kvm->mmu_lock) and use for_each_shadow_entry instead.

And, after the change, did you flush all tlbs?

I do apply the lock in my code and I do flush the tlb.

If it still does not work, please post your code.

Here is my code. The modifications are made in x86/x86.c; KVM_HC_HL_EPTPER is my hypercall number.

Method 1:

    int kvm_emulate_hypercall(struct kvm_vcpu *vcpu){
        case KVM_HC_HL_EPTPER :
            /* This method is not working */
            localGpa = kvm_mmu_gva_to_gpa_read(vcpu, a0, &localEx);
            if(localGpa == UNMAPPED_GVA){
                printk("read is not correct\n");
                return -KVM_ENOSYS;
            }
            hl_kvm_mmu_update_spte(vcpu, localGpa, 5);
            hl_result = kvm_mmu_get_spte_hierarchy(vcpu, localGpa, hl_sptes);
            printk("after changes return result is %d , gpa: %llx sptes: %llx , %llx , %llx , %llx \n", hl_result,
Re: KVM: MMU: Tracking guest writes through EPT entries ?
On 08/28/2012 11:30 AM, Felix wrote:
Xiao Guangrong xiaoguangrong at linux.vnet.ibm.com writes:
On 07/31/2012 01:18 AM, Sunil wrote:

Hello List,

I am a KVM newbie and studying the KVM mmu code.

On an existing guest, I am trying to track all guest writes by marking page table entries as read-only in the EPT [I am using an Intel machine with vmx and ept support]. It looks like EPT support re-uses the shadow page table (SPT) code and hence some of the SPT routines.

I was thinking of the approach below: use pte_list_walk() to traverse the list of sptes and use mmu_spte_update() to flip the PT_WRITABLE_MASK flag. But the SPTEs are not all on a single list; they are on separate lists (based on gfn, page level, memory_slot). So, would recording all the faulted guest GFNs and then using the above method work?

There are two ways to write-protect all sptes:
- use kvm_mmu_slot_remove_write_access() on all memslots
- walk the shadow page cache to get the shadow pages in the highest level (level = 4 on EPT), then write-protect its entries.

If you just want to do it for the specified gfn, you can use rmap_write_protect().

Just inquisitive, what is your purpose? :)

Hi, Guangrong,

I have done similar things to what Sunil did, simply for study purposes. However, I found some very weird situations. Basically, in the guest VM, I allocate a chunk of memory (one page in size) in a user-level program. Through a guest kernel module and my self-defined hypercall, I pass the gva of this memory to kvm. Then I try different methods in the hypercall handler to write-protect this page of memory. As you can see, I want to write-protect it through EPT instead of write-protecting it in the guest page tables.

1. I use kvm_mmu_gva_to_gpa_read to translate the gva into a gpa. Based on the function kvm_mmu_get_spte_hierarchy(vcpu, gpa, spte[4]), I changed the code to read sptep (the pointer to the spte) instead of spte, so I can modify the spte corresponding to this gpa. What I observe is that if I modify spte[0] (I think this is the lowest-level page table entry in the EPT; I can successfully modify it, as the changes are reflected in the result of calling kvm_mmu_get_spte_hierarchy again), my user-level program in the VM can still write to this page.

In this post, you mentioned "the shadow pages in the highest level (level = 4 on EPT)"; I don't understand this part. Does this mean I have to modify spte[3] instead of spte[0]? I just tried modifying spte[1] and spte[3]; both can cause a vmexit. So I am totally confused about the meaning of "level" as used in the shadow page table and its relation to the shadow page table. Can you help me to understand this?

2. As suggested by this post, I also used rmap_write_protect() to write-protect this page. With kvm_mmu_get_spte_hierarchy(vcpu, gpa, spte[4]), I can still see that spte[0] gives me a result like xx005, which means that the function was called successfully. But I can still write to this page.

I even tried the function kvm_age_hva() to remove this spte; this gives me 0 for spte[0], but I can still write to this page. So I am further confused about the level used in the shadow page?

kvm_mmu_get_spte_hierarchy gets sptes outside mmu-lock; you can hold spin_lock(&vcpu->kvm->mmu_lock) and use for_each_shadow_entry instead.

And, after the change, did you flush all tlbs?

If it still does not work, please post your code.
Re: KVM: MMU: Tracking guest writes through EPT entries ?
On Thu, Aug 30, 2012 at 5:22 AM, Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com wrote:
On 08/28/2012 11:30 AM, Felix wrote:
Xiao Guangrong xiaoguangrong at linux.vnet.ibm.com writes:
On 07/31/2012 01:18 AM, Sunil wrote:

Hello List,

I am a KVM newbie and studying the KVM mmu code.

On an existing guest, I am trying to track all guest writes by marking page table entries as read-only in the EPT [I am using an Intel machine with vmx and ept support]. It looks like EPT support re-uses the shadow page table (SPT) code and hence some of the SPT routines.

I was thinking of the approach below: use pte_list_walk() to traverse the list of sptes and use mmu_spte_update() to flip the PT_WRITABLE_MASK flag. But the SPTEs are not all on a single list; they are on separate lists (based on gfn, page level, memory_slot). So, would recording all the faulted guest GFNs and then using the above method work?

There are two ways to write-protect all sptes:
- use kvm_mmu_slot_remove_write_access() on all memslots
- walk the shadow page cache to get the shadow pages in the highest level (level = 4 on EPT), then write-protect its entries.

If you just want to do it for the specified gfn, you can use rmap_write_protect().

Just inquisitive, what is your purpose? :)

Hi, Guangrong,

I have done similar things to what Sunil did, simply for study purposes. However, I found some very weird situations. Basically, in the guest VM, I allocate a chunk of memory (one page in size) in a user-level program. Through a guest kernel module and my self-defined hypercall, I pass the gva of this memory to kvm. Then I try different methods in the hypercall handler to write-protect this page of memory. As you can see, I want to write-protect it through EPT instead of write-protecting it in the guest page tables.

1. I use kvm_mmu_gva_to_gpa_read to translate the gva into a gpa. Based on the function kvm_mmu_get_spte_hierarchy(vcpu, gpa, spte[4]), I changed the code to read sptep (the pointer to the spte) instead of spte, so I can modify the spte corresponding to this gpa. What I observe is that if I modify spte[0] (I think this is the lowest-level page table entry in the EPT; I can successfully modify it, as the changes are reflected in the result of calling kvm_mmu_get_spte_hierarchy again), my user-level program in the VM can still write to this page.

In this post, you mentioned "the shadow pages in the highest level (level = 4 on EPT)"; I don't understand this part. Does this mean I have to modify spte[3] instead of spte[0]? I just tried modifying spte[1] and spte[3]; both can cause a vmexit. So I am totally confused about the meaning of "level" as used in the shadow page table and its relation to the shadow page table. Can you help me to understand this?

2. As suggested by this post, I also used rmap_write_protect() to write-protect this page. With kvm_mmu_get_spte_hierarchy(vcpu, gpa, spte[4]), I can still see that spte[0] gives me a result like xx005, which means that the function was called successfully. But I can still write to this page.

I even tried the function kvm_age_hva() to remove this spte; this gives me 0 for spte[0], but I can still write to this page. So I am further confused about the level used in the shadow page?

kvm_mmu_get_spte_hierarchy gets sptes outside mmu-lock; you can hold spin_lock(&vcpu->kvm->mmu_lock) and use for_each_shadow_entry instead.

And, after the change, did you flush all tlbs?

I do apply the lock in my code and I do flush the tlb.

If it still does not work, please post your code.

Here is my code. The modifications are made in x86/x86.c; KVM_HC_HL_EPTPER is my hypercall number.

Method 1:

    int kvm_emulate_hypercall(struct kvm_vcpu *vcpu){
        case KVM_HC_HL_EPTPER :
            /* This method is not working */
            localGpa = kvm_mmu_gva_to_gpa_read(vcpu, a0, &localEx);
            if(localGpa == UNMAPPED_GVA){
                printk("read is not correct\n");
                return -KVM_ENOSYS;
            }
            hl_kvm_mmu_update_spte(vcpu, localGpa, 5);
            hl_result = kvm_mmu_get_spte_hierarchy(vcpu, localGpa, hl_sptes);
            printk("after changes return result is %d , gpa: %llx sptes: %llx , %llx , %llx , %llx \n",
                   hl_result, localGpa, hl_sptes[0], hl_sptes[1], hl_sptes[2], hl_sptes[3]);
            kvm_flush_remote_tlbs(vcpu->kvm);
            ...
    }

The function hl_kvm_mmu_update_spte is defined as

    int hl_kvm_mmu_update_spte(struct kvm_vcpu *vcpu, u64 addr, u64 mask)
    {
        struct kvm_shadow_walk_iterator iterator;
        int nr_sptes = 0;
        u64 sptes[4];
        u64* sptep[4];
        u64 localMask =
Re: KVM: MMU: Tracking guest writes through EPT entries ?
On Tue, 2012-07-31 at 14:53 -0400, Sunil Agham wrote:
On Mon, Jul 30, 2012 at 10:49 PM, Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com wrote:
On 07/31/2012 01:18 AM, Sunil wrote:

Hello List,

I am a KVM newbie and studying the KVM mmu code.

On an existing guest, I am trying to track all guest writes by marking page table entries as read-only in the EPT [I am using an Intel machine with vmx and ept support]. It looks like EPT support re-uses the shadow page table (SPT) code and hence some of the SPT routines.

I was thinking of the approach below: use pte_list_walk() to traverse the list of sptes and use mmu_spte_update() to flip the PT_WRITABLE_MASK flag. But the SPTEs are not all on a single list; they are on separate lists (based on gfn, page level, memory_slot). So, would recording all the faulted guest GFNs and then using the above method work?

There are two ways to write-protect all sptes:
- use kvm_mmu_slot_remove_write_access() on all memslots
- walk the shadow page cache to get the shadow pages in the highest level (level = 4 on EPT), then write-protect its entries.

If you just want to do it for the specified gfn, you can use rmap_write_protect().

Just inquisitive, what is your purpose? :)

Thanks Xiao! Just getting hands-on with virtualization hardware. Trying to preserve guest state after migration.

This is actually a very common technique for tracking guest pfn accesses: triggering false page faults by write-protecting the page tables. I used a similar approach to compute guest working sets and build miss rate curves.

- Davidlohr
Re: KVM: MMU: Tracking guest writes through EPT entries ?
On 08/31/2012 02:59 AM, Hugo wrote:
On Thu, Aug 30, 2012 at 5:22 AM, Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com wrote:
On 08/28/2012 11:30 AM, Felix wrote:
Xiao Guangrong xiaoguangrong at linux.vnet.ibm.com writes:
On 07/31/2012 01:18 AM, Sunil wrote:

Hello List,

I am a KVM newbie and studying the KVM mmu code. On an existing guest, I am trying to track all guest writes by marking the page table entries read-only in EPT [ I am using an Intel machine with vmx and ept support ]. It looks like EPT support re-uses the shadow page table (SPT) code and hence some of the SPT routines. I was thinking of the possible approach below.

Use pte_list_walk() to traverse the list of sptes and use mmu_spte_update() to flip the PT_WRITABLE_MASK flag. But the SPTEs are not all on a single list; they are on separate lists (based on gfn, page level, memory_slot). So, would recording all the faulted guest GFNs and then using the above method work?

There are two ways to write-protect all sptes:
- use kvm_mmu_slot_remove_write_access() on all memslots
- walk the shadow page cache to get the shadow pages in the highest level (level = 4 on EPT), then write-protect its entries.

If you just want to do it for the specified gfn, you can use rmap_write_protect().

Just inquisitive, what is your purpose? :)

Hi, Guangrong,

I have done similar things like Sunil did. Simply for study purpose. However, I found some very weird situations. Basically, in the guest vm, I allocate a chunk of memory (with size of a page) in a user level program. Through a guest kernel level module and my self defined hypercall, I pass the gva of this memory to kvm. Then I try different methods in the hypercall handler to write protect this page of memory. You can see that I want to write protect it through EPT instead of write protecting it in the guest page tables.

1. I use kvm_mmu_gva_to_gpa_read to translate the gva into gpa. Based on the function kvm_mmu_get_spte_hierarchy(vcpu, gpa, spte[4]), I changed the code to read sptep (the pointer to the spte) instead of spte, so I can modify the spte corresponding to this gpa. What I observe is that if I modify spte[0] (I think this is the lowest level page table entry in the EPT table; I can successfully modify it, as the changes are reflected in the result of calling kvm_mmu_get_spte_hierarchy again), my user level program in the vm can still write to this page.

In your post, you mentioned (the shadow pages in the highest level (level = 4 on EPT)); I don't understand this part. Does this mean I have to modify spte[3] instead of spte[0]? I just tried modifying spte[1] and spte[3]; both cause a vmexit. So I am totally confused about the meaning of level used in the shadow page table and its relation to the shadow page table. Can you help me to understand this?

2. As suggested by this post, I also used rmap_write_protect() to write protect this page. With kvm_mmu_get_spte_hierarchy(vcpu, gpa, spte[4]), I can still see that spte[0] gives me a result like xx005, which means that the function is called successfully. But I can still write to this page. I even tried the function kvm_age_hva() to remove this spte; this gives me 0 for spte[0], but I can still write to this page. So I am further confused about the level used in the shadow page?

kvm_mmu_get_spte_hierarchy reads the sptes without holding mmu-lock; you can hold spin_lock(&vcpu->kvm->mmu_lock) and use for_each_shadow_entry instead.

And, after the change, did you flush all tlbs?

I do apply the lock in my code and I do flush the tlb.

If it does not work, please post your code.

Here is my code. The modifications are made in x86/x86.c. KVM_HC_HL_EPTPER is my hypercall number.
Method 1:

int kvm_emulate_hypercall(struct kvm_vcpu *vcpu){
    case KVM_HC_HL_EPTPER :
        /* This method is not working */
        localGpa = kvm_mmu_gva_to_gpa_read(vcpu, a0, localEx);
        if(localGpa == UNMAPPED_GVA){
            printk("read is not correct\n");
            return -KVM_ENOSYS;
        }
        hl_kvm_mmu_update_spte(vcpu, localGpa, 5);
        hl_result = kvm_mmu_get_spte_hierarchy(vcpu, localGpa, hl_sptes);
        printk("after changes return result is %d , gpa: %llx sptes: %llx , %llx , %llx , %llx \n", hl_result, localGpa, hl_sptes[0], hl_sptes[1], hl_sptes[2], hl_sptes[3]);
        kvm_flush_remote_tlbs(vcpu->kvm);
        ...
}

The function hl_kvm_mmu_update_spte is defined as

int hl_kvm_mmu_update_spte(struct kvm_vcpu *vcpu, u64 addr, u64 mask) {
    struct kvm_shadow_walk_iterator iterator;
    int nr_sptes = 0;
Re: KVM: MMU: Tracking guest writes through EPT entries ?
Xiao Guangrong xiaoguangrong at linux.vnet.ibm.com writes:
On 07/31/2012 01:18 AM, Sunil wrote:

Hello List,

I am a KVM newbie and studying the KVM mmu code. On an existing guest, I am trying to track all guest writes by marking the page table entries read-only in EPT [ I am using an Intel machine with vmx and ept support ]. It looks like EPT support re-uses the shadow page table (SPT) code and hence some of the SPT routines. I was thinking of the possible approach below.

Use pte_list_walk() to traverse the list of sptes and use mmu_spte_update() to flip the PT_WRITABLE_MASK flag. But the SPTEs are not all on a single list; they are on separate lists (based on gfn, page level, memory_slot). So, would recording all the faulted guest GFNs and then using the above method work?

There are two ways to write-protect all sptes:
- use kvm_mmu_slot_remove_write_access() on all memslots
- walk the shadow page cache to get the shadow pages in the highest level (level = 4 on EPT), then write-protect its entries.

If you just want to do it for the specified gfn, you can use rmap_write_protect().

Just inquisitive, what is your purpose? :)

Hi, Guangrong,

I have done similar things like Sunil did. Simply for study purpose. However, I found some very weird situations. Basically, in the guest vm, I allocate a chunk of memory (with size of a page) in a user level program. Through a guest kernel level module and my self defined hypercall, I pass the gva of this memory to kvm. Then I try different methods in the hypercall handler to write protect this page of memory. You can see that I want to write protect it through EPT instead of write protecting it in the guest page tables.

1. I use kvm_mmu_gva_to_gpa_read to translate the gva into gpa. Based on the function kvm_mmu_get_spte_hierarchy(vcpu, gpa, spte[4]), I changed the code to read sptep (the pointer to the spte) instead of spte, so I can modify the spte corresponding to this gpa. What I observe is that if I modify spte[0] (I think this is the lowest level page table entry in the EPT table; I can successfully modify it, as the changes are reflected in the result of calling kvm_mmu_get_spte_hierarchy again), my user level program in the vm can still write to this page.

In your post, you mentioned (the shadow pages in the highest level (level = 4 on EPT)); I don't understand this part. Does this mean I have to modify spte[3] instead of spte[0]? I just tried modifying spte[1] and spte[3]; both cause a vmexit. So I am totally confused about the meaning of level used in the shadow page table and its relation to the shadow page table. Can you help me to understand this?

2. As suggested by this post, I also used rmap_write_protect() to write protect this page. With kvm_mmu_get_spte_hierarchy(vcpu, gpa, spte[4]), I can still see that spte[0] gives me a result like xx005, which means that the function is called successfully. But I can still write to this page. I even tried the function kvm_age_hva() to remove this spte; this gives me 0 for spte[0], but I can still write to this page. So I am further confused about the level used in the shadow page?

Thanks, I really appreciate your reply.

Felix
Re: KVM: MMU: Tracking guest writes through EPT entries
Hi,

I have done similar things to those posted in http://article.gmane.org/gmane.comp.emulators.kvm.devel/95342/match=tracking+guest+writes+ept . However, I found some very weird situations. Basically, in the guest vm, I allocate a chunk of memory (with size of a page) in a user level program. Through a guest kernel level module and my self defined hypercall, I pass the gva of this memory to kvm. Then I try different methods in the hypercall handler to write protect this page of memory. You can see that I want to write protect it through EPT instead of write protecting it in the guest page tables.

1. I use kvm_mmu_gva_to_gpa_read to translate the gva into gpa. Based on the function kvm_mmu_get_spte_hierarchy(vcpu, gpa, spte[4]), I changed the code to read sptep (the pointer to the spte) instead of spte, so I can modify the spte corresponding to this gpa. What I observe is that if I modify spte[0] (I think this is the lowest level page table entry in the EPT table; I can successfully modify it, as the changes are reflected in the result of calling kvm_mmu_get_spte_hierarchy again), my user level program in the vm can still write to this page.

That post mentions (the shadow pages in the highest level (level = 4 on EPT)); I don't understand this part. Does this mean I have to modify spte[3] instead of spte[0]? I just tried modifying spte[1] and spte[3]; both cause a vmexit. So I am totally confused about the meaning of level used in the shadow page table and its relation to the shadow page table. Can you help me to understand this?

2. As suggested by this post, I also used rmap_write_protect() to write protect this page. With kvm_mmu_get_spte_hierarchy(vcpu, gpa, spte[4]), I can still see that spte[0] gives me results like xx005, which means that the function is called successfully and the write permission bit is cleared in the pte. But I can still write to this page. I even tried the function kvm_age_hva() to remove this spte; this gives me 0 for spte[0], but I can still write to this page. So I am further confused about the level used in the shadow page?

Thanks, I really appreciate your reply.

Hugo
Re: KVM: MMU: Tracking guest writes through EPT entries ?
On Mon, Jul 30, 2012 at 10:49 PM, Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com wrote:
On 07/31/2012 01:18 AM, Sunil wrote:

Hello List,

I am a KVM newbie and studying the KVM mmu code. On an existing guest, I am trying to track all guest writes by marking the page table entries read-only in EPT [ I am using an Intel machine with vmx and ept support ]. It looks like EPT support re-uses the shadow page table (SPT) code and hence some of the SPT routines. I was thinking of the possible approach below.

Use pte_list_walk() to traverse the list of sptes and use mmu_spte_update() to flip the PT_WRITABLE_MASK flag. But the SPTEs are not all on a single list; they are on separate lists (based on gfn, page level, memory_slot). So, would recording all the faulted guest GFNs and then using the above method work?

There are two ways to write-protect all sptes:
- use kvm_mmu_slot_remove_write_access() on all memslots
- walk the shadow page cache to get the shadow pages in the highest level (level = 4 on EPT), then write-protect its entries.

If you just want to do it for the specified gfn, you can use rmap_write_protect().

Just inquisitive, what is your purpose? :)

Thanks Xiao ! Just hands on with virtualization hardware. Trying to preserve guest state after migration.

-- Sunl
Re: KVM: MMU: Tracking guest writes through EPT entries ?
On 07/31/2012 01:18 AM, Sunil wrote:

Hello List,

I am a KVM newbie and studying the KVM mmu code. On an existing guest, I am trying to track all guest writes by marking the page table entries read-only in EPT [ I am using an Intel machine with vmx and ept support ]. It looks like EPT support re-uses the shadow page table (SPT) code and hence some of the SPT routines. I was thinking of the possible approach below.

Use pte_list_walk() to traverse the list of sptes and use mmu_spte_update() to flip the PT_WRITABLE_MASK flag. But the SPTEs are not all on a single list; they are on separate lists (based on gfn, page level, memory_slot). So, would recording all the faulted guest GFNs and then using the above method work?

There are two ways to write-protect all sptes:
- use kvm_mmu_slot_remove_write_access() on all memslots
- walk the shadow page cache to get the shadow pages in the highest level (level = 4 on EPT), then write-protect its entries.

If you just want to do it for the specified gfn, you can use rmap_write_protect().

Just inquisitive, what is your purpose? :)