Re: [XEN PATCH v1 1/1] x86/domctl: add gva_to_gfn command

2023-03-21 Thread Ковалёв Сергей

Thanks to all for the suggestions and notes.

Though, as Andrew Cooper noticed, the current approach is oversimplified.
And as Tamas K Lengyel noticed, the effect could be negligible and some
OS-specific logic would need to be present.

So for today we can drop the patch.

20.03.2023 19:32, Ковалёв Сергей wrote:

The gva_to_gfn command is used for fast address translation in the LibVMI
project. With such a command it is possible to perform address translation
in a single call instead of a series of queries to read every page table.

Thanks to Dmitry Isaykin for involvement.

Signed-off-by: Sergey Kovalev 

---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: "Roger Pau Monné" 
Cc: Wei Liu 
Cc: George Dunlap 
Cc: Julien Grall 
Cc: Stefano Stabellini 
Cc: Tamas K Lengyel 
Cc: xen-devel@lists.xenproject.org
---

---
  xen/arch/x86/domctl.c   | 17 +
  xen/include/public/domctl.h | 13 +
  2 files changed, 30 insertions(+)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 2118fcad5d..0c9706ea0a 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1364,6 +1364,23 @@ long arch_do_domctl(
  copyback = true;
  break;

+    case XEN_DOMCTL_gva_to_gfn:
+    {
+    uint64_t ga = domctl->u.gva_to_gfn.addr;
+    uint64_t cr3 = domctl->u.gva_to_gfn.cr3;
+    struct vcpu* v = d->vcpu[0];
+    uint32_t pfec = PFEC_page_present;
+    unsigned int page_order;
+
+    uint64_t gfn = paging_ga_to_gfn_cr3(v, cr3, ga, &pfec,
+                                        &page_order);
+    domctl->u.gva_to_gfn.addr = gfn;
+    domctl->u.gva_to_gfn.page_order = page_order;
+    if ( __copy_to_guest(u_domctl, domctl, 1) )
+    ret = -EFAULT;
+
+    break;
+    }
+
  default:
  ret = -ENOSYS;
  break;
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 51be28c3de..628dfc68fd 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -948,6 +948,17 @@ struct xen_domctl_paging_mempool {
  uint64_aligned_t size; /* Size in bytes. */
  };

+/*
+ * XEN_DOMCTL_gva_to_gfn.
+ *
+ * Translate a guest virtual address to a guest physical address.
+ */
+struct xen_domctl_gva_to_gfn {
+    uint64_aligned_t addr;
+    uint64_aligned_t cr3;
+    uint64_aligned_t page_order;
+};
+
  #if defined(__i386__) || defined(__x86_64__)
  struct xen_domctl_vcpu_msr {
  uint32_t index;
@@ -1278,6 +1289,7 @@ struct xen_domctl {
  #define XEN_DOMCTL_vmtrace_op    84
  #define XEN_DOMCTL_get_paging_mempool_size   85
  #define XEN_DOMCTL_set_paging_mempool_size   86
+#define XEN_DOMCTL_gva_to_gfn    87
  #define XEN_DOMCTL_gdbsx_guestmemio    1000
  #define XEN_DOMCTL_gdbsx_pausevcpu 1001
  #define XEN_DOMCTL_gdbsx_unpausevcpu   1002
@@ -1340,6 +1352,7 @@ struct xen_domctl {
  struct xen_domctl_vuart_op  vuart_op;
  struct xen_domctl_vmtrace_op    vmtrace_op;
  struct xen_domctl_paging_mempool    paging_mempool;
+    struct xen_domctl_gva_to_gfn    gva_to_gfn;
  uint8_t pad[128];
  } u;
  };


--
Best regards,
Sergey Kovalev




Re: [XEN PATCH v1 1/1] x86/domctl: add gva_to_gfn command

2023-03-21 Thread Ковалёв Сергей




21.03.2023 2:34, Tamas K Lengyel wrote:



On Mon, Mar 20, 2023 at 3:23 PM Ковалёв Сергей <va...@list.ru> wrote:

 >
 >
 >
 > 21.03.2023 1:51, Tamas K Lengyel wrote:
 > >
 > >
 > > On Mon, Mar 20, 2023 at 12:32 PM Ковалёв Сергей <va...@list.ru> wrote:
 > >  >
 > >  > The gva_to_gfn command is used for fast address translation in the
 > >  > LibVMI project. With such a command it is possible to perform address
 > >  > translation in a single call instead of a series of queries to read
 > >  > every page table.
 > >
 > > You have a couple assumptions here:
 > >   - Xen will always have a direct map of the entire guest memory - there
 > > are already plans to move away from that. Without that this approach
 > > won't have any advantage over doing the same mapping by LibVMI
 >
 > Thanks! I didn't know about the plan. Though I use this patch
 > back ported into 4.16.
 >
 > >   - LibVMI has to map every page for each page table for every lookup -
 > > you have to do that only for the first, afterwards the pages on which
 > > the pagetable is are kept in a cache and subsequent lookups would be
 > > actually faster then having to do this domctl since you can keep being
 > > in the same process instead of having to jump to Xen.
 >
 > Yes. I know about the page cache. But I have faced with several issues
 > with cache like this one https://github.com/libvmi/libvmi/pull/1058 .
 > So I had to disable the cache.

The issue you linked to is an issue with a stale v2p cache, which is a 
virtual TLB. The cache I talked about is the page cache, which is just 
maintaining a list of the pages that were accessed by LibVMI for future 
accesses. You can have one and not the other (i.e. ./configure
--disable-address-cache --enable-page-cache).


Tamas


Thanks. I know about the page cache, though I'm not familiar with
it closely enough.

As far as I understand, at the moment the page cache implementation in
LibVMI looks like this:
1. Call sequence: vmi_read > vmi_read_page > driver_read_page >
   xen_read_page > memory_cache_insert ..> get_memory_data >
   xen_get_memory > xen_get_memory_pfn > xc_map_foreign_range
2. This is perfectly valid as long as the guest OS keeps the page there,
   and the physical pages are always there.
3. To renew the cache, the "age_limit" counter is used.
4. In the Xen driver implementation in LibVMI, "age_limit" is disabled.
5. It is also possible to invalidate the cache with "xen_write" or
   "vmi_pagecache_flush", but neither is used.
6. The other way to keep the cache from growing too big is the cache size
   limit: on overflow, half of the cache is dropped on every insert.

So the only thing we need to know is a valid mapping of a guest
virtual address to a guest physical address.

And the slow paths are:
1. The first traversal of a new page table set, e.g. for a new process.
2. A new subset of page tables for a known process.
3. Subsequent page accesses after the cache is cleared on size overflow.

Am I right?
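
To make the cost concrete, one such slow-path lookup is roughly the
following 4-level walk (a simplified sketch against LibVMI's public read
API; 64-bit 4-level paging only, huge pages, PAE and access rights ignored):

```
#include <libvmi/libvmi.h>

/* Simplified x86-64 4-level walk: one physical read per paging level.
 * Huge pages, PAE and permission checks are deliberately left out. */
static addr_t walk_v2p(vmi_instance_t vmi, addr_t dtb, addr_t va)
{
    static const int shift[] = { 39, 30, 21, 12 };
    uint64_t entry = dtb;

    for ( int i = 0; i < 4; i++ )
    {
        addr_t table = entry & 0x000ffffffffff000ULL;      /* bits 12..51 */

        if ( vmi_read_64_pa(vmi, table + (((va >> shift[i]) & 0x1ff) << 3),
                            &entry) != VMI_SUCCESS ||
             !(entry & 1) )                                /* present bit */
            return 0;
    }

    return (entry & 0x000ffffffffff000ULL) | (va & 0xfff);
}
```

Each of those four reads either hits the page cache or has to map a foreign
page, and the latter is what makes the paths above slow.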

The main idea behind the patch (a rough tools-side sketch follows):
1. The very first translation would be faster when done with the hypercall.
2. For subsequent calls the v2p translation cache could be used (as in my
   current work on LibVMI).
3. To avoid errors from a stale cache, the v2p cache could be invalidated
   on every event (VMI_FLUSH_RATE = 1).
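
The sketch below shows how the call could be wrapped on the tools side.
It is purely illustrative: the wrapper name and its placement next to the
other libxc domctl helpers are my assumption; only the xen_domctl_gva_to_gfn
fields come from the patch itself.

```
/* Hypothetical helper; it would sit next to the other domctl wrappers
 * inside libxc, where do_domctl() is available.  Returns 0 on success. */
int xc_gva_to_gfn(xc_interface *xch, uint32_t domid, uint64_t cr3,
                  uint64_t gva, uint64_t *gfn, uint64_t *page_order)
{
    struct xen_domctl domctl = {
        .cmd = XEN_DOMCTL_gva_to_gfn,
        .domain = domid,
        .u.gva_to_gfn = {
            .addr = gva,
            .cr3  = cr3,
        },
    };
    int rc = do_domctl(xch, &domctl);

    if ( !rc )
    {
        *gfn = domctl.u.gva_to_gfn.addr;
        *page_order = domctl.u.gva_to_gfn.page_order;
    }

    return rc;
}
```

LibVMI's Xen driver would then call such a wrapper instead of walking the
tables itself, with the v2p cache on top and VMI_FLUSH_RATE = 1 to
invalidate that cache on every event.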

--
Best regards,
Sergey Kovalev




Re: [XEN PATCH v1 1/1] x86/domctl: add gva_to_gfn command

2023-03-20 Thread Ковалёв Сергей




20.03.2023 22:07, Andrew Cooper wrote:

On 20/03/2023 4:32 pm, Ковалёв Сергей wrote:

The gva_to_gfn command is used for fast address translation in the LibVMI
project. With such a command it is possible to perform address translation
in a single call instead of a series of queries to read every page table.

Thanks to Dmitry Isaykin for involvement.

Signed-off-by: Sergey Kovalev 


I fully appreciate why you want this hypercall, and I've said several
times that libvmi wants something better than it has, but...


diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 2118fcad5d..0c9706ea0a 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1364,6 +1364,23 @@ long arch_do_domctl(
  copyback = true;
  break;

+    case XEN_DOMCTL_gva_to_gfn:
+    {
+    uint64_t ga = domctl->u.gva_to_gfn.addr;
+    uint64_t cr3 = domctl->u.gva_to_gfn.cr3;
+    struct vcpu* v = d->vcpu[0];


... this isn't safe if you happen to issue this hypercall too early in a
domain's lifecycle.

If nothing else, you want to do a domain_vcpu() check and return -ENOENT
in the failure case.


Thanks!
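
So something like this instead of dereferencing d->vcpu[0] directly, I assume:

```
    struct vcpu *v = domain_vcpu(d, 0);

    if ( !v )
    {
        ret = -ENOENT;
        break;
    }
```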



More generally, issuing the hypercall under vcpu0 isn't necessarily
correct.  It is common for all vCPUs to have equivalent paging settings,
but e.g. Xen transiently disables CR4.CET and CR0.WP in order to make
self-modifying code changes.

Furthermore, the setting of CR4.{PAE,PSE} determines reserved bits, so
you can't even ignore the access rights and hope that the translation
works out correctly.


Thanks! I didn't think about such things earlier. I should think
about this more carefully now.



Ideally we'd have a pagewalk algorithm which didn't require taking a
vcpu, and instead just took a set of paging configuration, but it is all
chronically entangled right now.



Do you mean to add a new implementation of "paging_ga_to_gfn_cr3"?


I think, at a minimum, you need to take a vcpu_id as an input, but I
suspect to make this a usable API you want an altp2m view id too.
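
Just to sketch the rough shape (the fields, types and padding here are only
illustrative, not a worked-out proposal):

```
struct xen_domctl_gva_to_gfn {
    /* IN */
    uint64_aligned_t addr;
    uint64_aligned_t cr3;
    uint32_t vcpu_id;      /* whose paging configuration to use */
    uint16_t altp2m_idx;   /* which altp2m view to translate against */
    uint16_t pad;
    /* OUT */
    uint64_aligned_t gfn;
    uint32_t page_order;
    uint32_t pad2;
};
```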



Why should we consider altp2m while translating a guest virtual address
to a guest physical one?


Also, I'm pretty sure this is only safe for a paused vCPU.  If the vCPU
isn't paused, then there's a TOCTOU race in the pagewalk code when
inspecting control registers.



Thanks! Should we pause the domain?
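
E.g. would pausing just the target vCPU around the walk be enough, something
like this (assuming vcpu_pause() is the right tool here)?

```
    vcpu_pause(v);
    gfn = paging_ga_to_gfn_cr3(v, cr3, ga, &pfec, &page_order);
    vcpu_unpause(v);
```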


+    uint32_t pfec = PFEC_page_present;
+    unsigned int page_order;
+
+    uint64_t gfn = paging_ga_to_gfn_cr3(v, cr3, ga, &pfec,
+                                        &page_order);
+    domctl->u.gva_to_gfn.addr = gfn;
+    domctl->u.gva_to_gfn.page_order = page_order;


page_order is only not stack rubble if gfn is different to INVALID_GFN.



Sorry, but I don't understand "is only not stack rubble". Do you mean
that I should initialize "page_order" when defining it?
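
Or should the copy-back be guarded on the result instead, something like:

```
    domctl->u.gva_to_gfn.addr = gfn;
    domctl->u.gva_to_gfn.page_order =
        (gfn != gfn_x(INVALID_GFN)) ? page_order : 0;
```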


+    if ( __copy_to_guest(u_domctl, domctl, 1) )
+    ret = -EFAULT;


You want to restrict this to just the gva_to_gfn sub-portion.  No point
copying back more than necessary.

~Andrew


Thanks a lot!
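
I guess that means something like this (if __copy_field_to_guest() is the
right helper here):

```
    if ( __copy_field_to_guest(u_domctl, domctl, u.gva_to_gfn) )
        ret = -EFAULT;
```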

--
Best regards,
Sergey Kovalev




Re: [XEN PATCH v1 1/1] x86/domctl: add gva_to_gfn command

2023-03-20 Thread Ковалёв Сергей




21.03.2023 1:51, Tamas K Lengyel wrote:



On Mon, Mar 20, 2023 at 12:32 PM Ковалёв Сергей <va...@list.ru> wrote:

 >
 > The gva_to_gfn command is used for fast address translation in the LibVMI
 > project. With such a command it is possible to perform address translation
 > in a single call instead of a series of queries to read every page table.

You have a couple assumptions here:
  - Xen will always have a direct map of the entire guest memory - there 
are already plans to move away from that. Without that this approach 
won't have any advantage over doing the same mapping by LibVMI


Thanks! I didn't know about the plan. Though I use this patch
backported to 4.16.

  - LibVMI has to map every page for each page table for every lookup - 
you have to do that only for the first, afterwards the pages on which 
the pagetable is are kept in a cache and subsequent lookups would be 
actually faster than having to do this domctl since you can keep being 
in the same process instead of having to jump to Xen.


Yes. I know about the page cache. But I have faced several issues
with the cache, like this one: https://github.com/libvmi/libvmi/pull/1058 .
So I had to disable the cache.



With these perspectives in mind I don't think this would be a useful 
addition. Please prove me wrong with performance numbers and a specific 
use-case that warrants adding this and how you plan to introduce it into 
LibVMI without causing performance regression to all other use-cases.


I will send you a PR for LibVMI in a day or two. I don't have any
performance numbers at the moment; I sent this patch to share my current
work as soon as possible.

To prevent regressions in other use-cases we could add a configure option.
Thanks for making me notice that!



Tamas



--
Best regards,
Sergey Kovalev.




[XEN PATCH v1 1/1] x86/domctl: add gva_to_gfn command

2023-03-20 Thread Ковалёв Сергей

The gva_to_gfn command is used for fast address translation in the LibVMI
project. With such a command it is possible to perform address translation
in a single call instead of a series of queries to read every page table.

Thanks to Dmitry Isaykin for involvement.

Signed-off-by: Sergey Kovalev 

---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: "Roger Pau Monné" 
Cc: Wei Liu 
Cc: George Dunlap 
Cc: Julien Grall 
Cc: Stefano Stabellini 
Cc: Tamas K Lengyel 
Cc: xen-devel@lists.xenproject.org
---

---
 xen/arch/x86/domctl.c   | 17 +
 xen/include/public/domctl.h | 13 +
 2 files changed, 30 insertions(+)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 2118fcad5d..0c9706ea0a 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1364,6 +1364,23 @@ long arch_do_domctl(
 copyback = true;
 break;

+    case XEN_DOMCTL_gva_to_gfn:
+    {
+    uint64_t ga = domctl->u.gva_to_gfn.addr;
+    uint64_t cr3 = domctl->u.gva_to_gfn.cr3;
+    struct vcpu* v = d->vcpu[0];
+    uint32_t pfec = PFEC_page_present;
+    unsigned int page_order;
+
+    uint64_t gfn = paging_ga_to_gfn_cr3(v, cr3, ga, &pfec,
+                                        &page_order);
+    domctl->u.gva_to_gfn.addr = gfn;
+    domctl->u.gva_to_gfn.page_order = page_order;
+    if ( __copy_to_guest(u_domctl, domctl, 1) )
+    ret = -EFAULT;
+
+    break;
+    }
+
 default:
 ret = -ENOSYS;
 break;
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 51be28c3de..628dfc68fd 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -948,6 +948,17 @@ struct xen_domctl_paging_mempool {
 uint64_aligned_t size; /* Size in bytes. */
 };

+/*
+ * XEN_DOMCTL_gva_to_gfn.
+ *
+ * Translate a guest virtual address to a guest physical address.
+ */
+struct xen_domctl_gva_to_gfn {
+    uint64_aligned_t addr;
+    uint64_aligned_t cr3;
+    uint64_aligned_t page_order;
+};
+
 #if defined(__i386__) || defined(__x86_64__)
 struct xen_domctl_vcpu_msr {
 uint32_t index;
@@ -1278,6 +1289,7 @@ struct xen_domctl {
 #define XEN_DOMCTL_vmtrace_op    84
 #define XEN_DOMCTL_get_paging_mempool_size   85
 #define XEN_DOMCTL_set_paging_mempool_size   86
+#define XEN_DOMCTL_gva_to_gfn    87
 #define XEN_DOMCTL_gdbsx_guestmemio    1000
 #define XEN_DOMCTL_gdbsx_pausevcpu 1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu   1002
@@ -1340,6 +1352,7 @@ struct xen_domctl {
 struct xen_domctl_vuart_op  vuart_op;
 struct xen_domctl_vmtrace_op    vmtrace_op;
 struct xen_domctl_paging_mempool    paging_mempool;
+    struct xen_domctl_gva_to_gfn    gva_to_gfn;
 uint8_t pad[128];
 } u;
 };
--
2.38.1




Xen Kdump analysis with crash utility

2023-01-24 Thread Ковалёв Сергей

Hello,

I'm trying to start using Kdump in my Xen 4.16 setup with Ubuntu
18.04.6 (5.4.0-137-generic).


I was able to load the "dump-capture kernel" with kexec-tools and collect
a crash dump with makedumpfile like this:

```
makedumpfile -E -X -d 0 /proc/vmcore /var/crash/dump
```

This dump file could be used to analyze Dom0 panics.

Though I have some issues while analyzing the dump file for the Xen kernel:
```
 ~/src/crash/crash --hyper ~/xen-syms-dbg/usr/lib/debug/xen-syms 
/var/crash/202301241536/dump.202301241536


crash 8.0.2++
...
GNU gdb (GDB) 10.2
...
crash: invalid kernel virtual address: 1ef8  type: "fill_pcpu_struct"
WARNING: cannot fill pcpu_struct.

crash: cannot read cpu_info.
```

As far as I know, the crash utility developer community doesn't actively
support Xen. From
https://github.com/crash-utility/crash/issues/21#issuecomment-330847410 :

```
I cannot help you with Xen-related issues because Red Hat stopped releasing
Xen kernels several years ago (RHEL5 was the last Red Hat kernel that contained
a Xen kernel).  Since then, ongoing Xen kernel support in the crash utility
has been maintained by engineers who work for other distributions that still
offer Xen kernels.
```

Does anybody use kdump to analyze Xen crashes? Could anybody share some
tips and tricks for using crash or other tools with such dumps?


Thanks a lot.
--
Best regards,
Sergey Kovalev




Re: [Xen-devel] [XEN PATCH v1 1/1] x86/vm_event: add fast single step

2019-12-17 Thread Ковалёв Сергей
Andrew, Tamas, thank you very much. I will improve the patch.

December 17, 2019 3:13:42 PM UTC, Andrew Cooper wrote:
>On 17/12/2019 15:10, Tamas K Lengyel wrote:
>> On Tue, Dec 17, 2019 at 8:08 AM Tamas K Lengyel wrote:
>>> On Tue, Dec 17, 2019 at 7:48 AM Andrew Cooper wrote:
 On 17/12/2019 14:40, Sergey Kovalev wrote:
> On break point event eight context switches occur.
>
> With fast single step it is possible to shorten path for two context
> switches and gain 35% speed-up.
>
> Was tested on Debian branch of Xen 4.12. See at:
>
>https://github.com/skvl/xen/tree/debian/knorrie/4.12/fast-singlestep
>
> Rebased on master:
> https://github.com/skvl/xen/tree/fast-singlestep
>
> Signed-off-by: Sergey Kovalev 
35% looks like a good number, but what is "fast single step"?  All this
appears to be is plumbing to cause an altp2m switch on single step.
>>> Yes, a better explanation would be much needed here and I'm not 100%
>>> sure it correctly implements what I think it tries to.
>>>
>>> This is my interpretation of what the idea is: when using DRAKVUF (or
>>> another system using altp2m with shadow pages similar to what I
>>> describe in
>>> https://xenproject.org/2016/04/13/stealthy-monitoring-with-xen-altp2m),
>>> after a breakpoint is hit the system switches to the default
>>> unrestricted altp2m view with singlestep enabled. When the singlestep
>>> traps to Xen another vm_event is sent to the monitor agent, which then
>>> normally disables singlestepping and switches the altp2m view back to
>>> the restricted view. This patch looks like its short-circuiting that
>>> last part so that it doesn't need to send the vm_event out for the
>>> singlestep event and should switch back to the restricted view in Xen
>>> automatically. It's a nice optimization. But what seems to be missing
>>> is the altp2m switch itself.
>> Never mind, p2m_altp2m_check does the altp2m switch as well, so this
>> patch implements what I described above. Please update the patch
>> message to be more descriptive (you can copy my description from
>> above).
>
>Also please read CODING_STYLE in the root of the xen repository.  The
>important ones you need to fix are spaces in "if ( ... )" statements,
>and binary operators on the end of the first line rather than the
>beginning of the continuation.
>
>~Andrew

-- 
Sorry for my brevity; sent from K-9 Mail.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel