Re: Number of planes from fourcc code

2018-09-13 Thread Oleksandr Andrushchenko

On 09/13/2018 02:46 PM, Hans Verkuil wrote:

On 09/13/18 13:29, Oleksandr Andrushchenko wrote:

Hi, all!

Is there a way in V4L2 to get the number of planes from a fourcc code?
Specifically, I need the number of planes for a given pixel format
expressed as a V4L2_PIX_FMT_* value.

Sadly not. It's part of the documentation for the formats, but there
is no naming scheme through which you can deduce this or even helper
functions for it.

Ok, then I'll probably try to explain what I want to do:
I am implementing a Xen frontend driver for a para-virtual camera
protocol [1] which can support multiple pixel formats. Thus, the driver
will support both single- and multi-planar formats. For that I need to
implement both format enumerators:
    .vidioc_enum_fmt_vid_cap
    .vidioc_enum_fmt_vid_cap_mplane

and for .vidioc_enum_fmt_vid_cap I have to filter the multi-planar
formats out of the supported pixel formats. So I hoped I could use some
helper (like the one DRM provides) to get the number of planes for a
given pixel format.

So it seems I'll have to code a table similar to the one DRM has,
covering the various V4L2 pixel formats, to get num_planes...
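For illustration, roughly what I have in mind, just a sketch (the
xen_camera_* names are made up and the table only needs to cover the
formats the driver actually supports):

#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/videodev2.h>

struct xen_camera_fmt {
	u32 fourcc;
	u8 num_planes;
};

static const struct xen_camera_fmt supported_fmts[] = {
	{ V4L2_PIX_FMT_YUYV,    1 },
	{ V4L2_PIX_FMT_NV12,    1 },	/* contiguous NV12 */
	{ V4L2_PIX_FMT_NV12M,   2 },	/* NV12 with separately allocated planes */
	{ V4L2_PIX_FMT_YUV420M, 3 },
};

/* Return the number of memory planes for fourcc, or -EINVAL if unsupported. */
static int xen_camera_num_planes(u32 fourcc)
{
	int i;

	for (i = 0; i < ARRAY_SIZE(supported_fmts); i++)
		if (supported_fmts[i].fourcc == fourcc)
			return supported_fmts[i].num_planes;

	return -EINVAL;
}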


I think the main reason why this never happened is that drivers tend to
have custom code for this anyway.

I have proposed in the past that some of this information be exposed
via VIDIOC_ENUM_FMT, but it never got traction.


I know that DRM has such a helper [1], but I am not quite sure
if I can call it with a V4L2_PIX_FMT_* value as an argument to get what
I need. I am a bit confused here because there are different fourcc
definitions for DRM [2] and V4L2 [3].

I know. Each subsystem has traditionally been assigning fourccs independently.
In all fairness, this seems to be the case for fourccs throughout the whole
industry.

Regards,

Hans


Thank you,

Oleksandr

[1]
https://elixir.bootlin.com/linux/v4.19-rc3/source/drivers/gpu/drm/drm_fourcc.c#L199

[2]
https://elixir.bootlin.com/linux/v4.19-rc3/source/include/uapi/drm/drm_fourcc.h

[3]
https://elixir.bootlin.com/linux/v4.19-rc3/source/include/uapi/linux/videodev2.h


[1] https://patchwork.kernel.org/patch/10595259/


Number of planes from fourcc code

2018-09-13 Thread Oleksandr Andrushchenko

Hi, all!

Is there a way in V4L2 to get the number of planes from a fourcc code?
Specifically, I need the number of planes for a given pixel format
expressed as a V4L2_PIX_FMT_* value.

I know that DRM has such a helper [1], but I am not quite sure
if I can call it with a V4L2_PIX_FMT_* value as an argument to get what
I need. I am a bit confused here because there are different fourcc
definitions for DRM [2] and V4L2 [3].

Thank you,

Oleksandr

[1] 
https://elixir.bootlin.com/linux/v4.19-rc3/source/drivers/gpu/drm/drm_fourcc.c#L199


[2] 
https://elixir.bootlin.com/linux/v4.19-rc3/source/include/uapi/drm/drm_fourcc.h


[3] 
https://elixir.bootlin.com/linux/v4.19-rc3/source/include/uapi/linux/videodev2.h




Re: [Xen-devel] [PATCH v2 5/9] xen/gntdev: Allow mappings for DMA buffers

2018-06-11 Thread Oleksandr Andrushchenko

On 06/08/2018 10:21 PM, Boris Ostrovsky wrote:

On 06/08/2018 01:59 PM, Stefano Stabellini wrote:

     @@ -325,6 +401,14 @@ static int map_grant_pages(struct grant_map *map)
     map->unmap_ops[i].handle = map->map_ops[i].handle;
     if (use_ptemod)
     map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
+#ifdef CONFIG_XEN_GRANT_DMA_ALLOC
+    else if (map->dma_vaddr) {
+    unsigned long mfn;
+
+    mfn = __pfn_to_mfn(page_to_pfn(map->pages[i]));

Not pfn_to_mfn()?

I'd love to, but pfn_to_mfn is only defined for x86, not ARM: [1]
and [2]
Thus,

drivers/xen/gntdev.c:408:10: error: implicit declaration of function
‘pfn_to_mfn’ [-Werror=implicit-function-declaration]
   mfn = pfn_to_mfn(page_to_pfn(map->pages[i]));

So, I'll keep __pfn_to_mfn

How will this work on non-PV x86?

So, you mean I need:
#ifdef CONFIG_X86
mfn = pfn_to_mfn(page_to_pfn(map->pages[i]));
#else
mfn = __pfn_to_mfn(page_to_pfn(map->pages[i]));
#endif


I'd rather fix it in the ARM code. Stefano, why does ARM use the
underscored version?

Do you want me to add one more patch to wrap __pfn_to_mfn
with a static inline for ARM? e.g.
static inline ...pfn_to_mfn(...)
{
     __pfn_to_mfn();
}

A Xen on ARM guest doesn't actually know the mfns behind its own
pseudo-physical pages. This is why we stopped using pfn_to_mfn and
started using pfn_to_bfn instead, which will generally return "pfn",
unless the page is a foreign grant. See include/xen/arm/page.h.
pfn_to_bfn was also introduced on x86. For example, see the usage of
pfn_to_bfn in drivers/xen/swiotlb-xen.c. Otherwise, if you don't care
about other mapped grants, you can just use pfn_to_gfn, that always
returns pfn.


I think then this code needs to use pfn_to_bfn().

Ok
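So the hunk above would become something like this (just a sketch, not
tested; the dev_bus_addr line is my reading of the part of the hunk that
is not quoted here):

+#ifdef CONFIG_XEN_GRANT_DMA_ALLOC
+	else if (map->dma_vaddr) {
+		unsigned long bfn;
+
+		/* bfn equals pfn unless the page is a mapped foreign grant */
+		bfn = pfn_to_bfn(page_to_pfn(map->pages[i]));
+		map->unmap_ops[i].dev_bus_addr = __pfn_to_phys(bfn);
+	}
+#endif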




Also, for your information, we support different page granularities in
Linux as a Xen guest, see the comment at include/xen/arm/page.h:

   /*
    * The pseudo-physical frame (pfn) used in all the helpers is always based
    * on Xen page granularity (i.e 4KB).
    *
    * A Linux page may be split across multiple non-contiguous Xen page so we
    * have to keep track with frame based on 4KB page granularity.
    *
    * PV drivers should never make a direct usage of those helpers (particularly
    * pfn_to_gfn and gfn_to_pfn).
    */

A Linux page could be 64K, but a Xen page is always 4K. A granted page
is also 4K. We have helpers to take into account the offsets to map
multiple Xen grants in a single Linux page, see for example
drivers/xen/grant-table.c:gnttab_foreach_grant. Most PV drivers have
been converted to be able to work with 64K pages correctly, but if I
remember correctly gntdev.c is the only remaining driver that doesn't
support 64K pages yet, so you don't have to deal with it if you don't
want to.
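For reference, a sketch (not from this series) of how a driver iterates
the 4K Xen grants backing one Linux page with the existing helper; the
callback runs once per 4K sub-page even when PAGE_SIZE is 64K:

#include <linux/mm.h>
#include <xen/grant_table.h>
#include <xen/page.h>

static void count_one_grant(unsigned long gfn, unsigned int offset,
			    unsigned int len, void *data)
{
	unsigned int *grants = data;

	/* A real driver would fill in a grant request for gfn here. */
	(*grants)++;
}

static unsigned int grants_for_linux_page(struct page *page)
{
	unsigned int grants = 0;

	gnttab_foreach_grant_in_range(page, 0, PAGE_SIZE,
				      count_one_grant, &grants);

	return grants;	/* PAGE_SIZE / XEN_PAGE_SIZE, e.g. 16 for 64K pages */
}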


I believe somewhere in this series there is a test for PAGE_SIZE vs.
XEN_PAGE_SIZE. Right, Oleksandr?

Not in gntdev. You might have seen this in xen-drmfront/xen-sndfront,
but I didn't touch gntdev for that. Do you want me to add yet another patch
in the series to check for that?

Thanks for the explanation.

-boris

Thank you,
Oleksandr


[PATCH 0/8] xen: dma-buf support for grant device

2018-05-25 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

This work is in response to my previous attempt to introduce a Xen/DRM
zero-copy driver [1] to enable the Linux dma-buf API [2] for Xen based
frontends/backends. There is also an existing hyper_dmabuf approach
available [3] which, if reworked to utilize the proposed solution,
can greatly benefit as well.

An RFC for this series was published and discussed [9]; the comments have been addressed.

The original rationale behind this work was to enable zero-copy
use-cases while working with the Xen para-virtual display driver [4]:
when the Xen PV DRM frontend driver is used, the backend has to copy
the contents of the display buffers (filled by the frontend's
user-space) into buffers allocated on the backend side.
Given the size of display buffers and the frame rate, this may result
in significant and unneeded data bus occupation and performance loss.

The helper driver [4] allows implementing zero-copy use-cases
when using the Xen para-virtualized frontend display driver by
providing a DRM/KMS helper driver running on the backend side.
It utilizes the PRIME buffer API (implemented on top of Linux dma-buf)
to share the frontend's buffers with physical device drivers on the
backend side:

 - a dumb buffer created on the backend side can be shared
   with the Xen PV frontend driver, so the frontend directly writes
   into the backend domain's memory (into the buffer exported from
   the DRM/KMS driver of a physical display device)
 - a dumb buffer allocated by the frontend can be imported
   into a physical device's DRM/KMS driver, again avoiding any
   copying

Finally, it was discussed and decided ([1], [5]) that it is worth
implementing such use-cases via an extension of the existing Xen gntdev
driver instead of introducing a new DRM-specific driver.
Please note that the dma-buf support is Linux only,
as dma-buf is a Linux only thing.

Now to the proposed solution. The changes to the existing Xen drivers
in the Linux kernel fall into 2 categories:
1. DMA-able memory buffer allocation and increasing/decreasing the
   memory reservation of the pages of such a buffer.
   This is required if we are about to share the dma-buf with hardware
   that requires the buffer to be allocated with the dma_alloc_xxx API.
   (It is still possible to allocate a dma-buf from any system memory,
   e.g. system pages.)
2. Extension of the gntdev driver to enable it to import/export dma-bufs.

The first four patches are in preparation for Xen dma-buf support,
but I consider them usable regardless of the dma-buf use-case,
e.g. other frontend/backend kernel modules may also benefit from them
for better code reuse:
   0001-xen-grant-table-Make-set-clear-page-private-code-sha.patch
   0002-xen-balloon-Move-common-memory-reservation-routines-.patch
   0003-xen-grant-table-Allow-allocating-buffers-suitable-fo.patch
   0004-xen-gntdev-Allow-mappings-for-DMA-buffers.patch

The next three patches are Xen implementation of dma-buf as part of
the grant device:
   0005-xen-gntdev-Add-initial-support-for-dma-buf-UAPI.patch
   0006-xen-gntdev-Implement-dma-buf-export-functionality.patch
   0007-xen-gntdev-Implement-dma-buf-import-functionality.patch

The last patch makes the Xen dma-buf API available for in-kernel use:
  0008-xen-gntdev-Expose-gntdev-s-dma-buf-API-for-in-kernel.patch

The corresponding libxengnttab changes are available at [6].

All the above was tested with the display backend [7] and its accompanying
helper library [8] on a Renesas ARM64 based board.

*To all the communities*: I would like to ask you to review the proposed
solution and give feedback on it, so I can improve it and send the final
patches for review (this is still work in progress, but enough to start
discussing the implementation).


Thank you in advance,
Oleksandr Andrushchenko

[1] https://lists.freedesktop.org/archives/dri-devel/2018-April/173163.html
[2] 
https://elixir.bootlin.com/linux/v4.17-rc5/source/Documentation/driver-api/dma-buf.rst
[3] https://lists.xenproject.org/archives/html/xen-devel/2018-02/msg01202.html
[4] https://cgit.freedesktop.org/drm/drm-misc/tree/drivers/gpu/drm/xen
[5] https://patchwork.kernel.org/patch/10279681/
[6] https://github.com/andr2000/xen/tree/xen_dma_buf_v1
[7] https://github.com/andr2000/displ_be/tree/xen_dma_buf_v1
[8] https://github.com/andr2000/libxenbe/tree/xen_dma_buf_v1
[9] https://lkml.org/lkml/2018/5/17/215

Oleksandr Andrushchenko (8):
  xen/grant-table: Make set/clear page private code shared
  xen/balloon: Move common memory reservation routines to a module
  xen/grant-table: Allow allocating buffers suitable for DMA
  xen/gntdev: Allow mappings for DMA buffers
  xen/gntdev: Add initial support for dma-buf UAPI
  xen/gntdev: Implement dma-buf export functionality
  xen/gntdev: Implement dma-buf import functionality
  xen/gntdev: Expose gntdev's dma-buf API for in-kernel use

 drivers/xen/Kconfig   |   23 +
 drivers/xen/Makefile  |1 +
 drivers/xen/balloon.c 

[PATCH 3/8] xen/grant-table: Allow allocating buffers suitable for DMA

2018-05-25 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

Extend the grant table module API to allow allocating buffers that can
be used for DMA operations and mapping foreign grant references
on top of those.
The resulting buffer is similar to the one allocated by the balloon
driver in that proper memory reservation is made
({increase|decrease}_reservation and VA mappings are updated if needed).
This is useful for sharing foreign buffers with HW drivers which
cannot work with scattered buffers provided by the balloon driver,
but require DMA-able memory instead.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>
---
 drivers/xen/Kconfig   |  13 
 drivers/xen/grant-table.c | 124 ++
 include/xen/grant_table.h |  25 
 3 files changed, 162 insertions(+)
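For context, a minimal usage sketch of the API added below (a
hypothetical caller; the struct fields are taken from the hunks, error
handling is trimmed and the buffer is later freed with
gnttab_dma_free_pages()):

#include <linux/device.h>
#include <xen/grant_table.h>

static int alloc_dma_grant_buffer(struct device *dev, int nr_pages,
				  struct page **pages)
{
	struct gnttab_dma_alloc_args args = {
		.dev = dev,
		.coherent = true,	/* false requests write-combine memory */
		.nr_pages = nr_pages,
		.pages = pages,
	};
	int ret;

	ret = gnttab_dma_alloc_pages(&args);
	if (ret < 0)
		return ret;

	/* args.vaddr and args.dev_bus_addr now describe the buffer. */
	return 0;
}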

diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index e5d0c28372ea..3431fe210624 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -161,6 +161,19 @@ config XEN_GRANT_DEV_ALLOC
  to other domains. This can be used to implement frontend drivers
  or as part of an inter-domain shared memory channel.
 
+config XEN_GRANT_DMA_ALLOC
+   bool "Allow allocating DMA capable buffers with grant reference module"
+   depends on XEN
+   help
+ Extends the grant table module API to allow allocating DMA capable
+ buffers and mapping foreign grant references on top of those.
+ The resulting buffer is similar to one allocated by the balloon
+ driver in that proper memory reservation is made
+ ({increase|decrease}_reservation and VA mappings updated if needed).
+ This is useful for sharing foreign buffers with HW drivers which
+ cannot work with scattered buffers provided by the balloon driver,
+ but require DMA-able memory instead.
+
 config SWIOTLB_XEN
def_bool y
select SWIOTLB
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index d7488226e1f2..06fe6e7f639c 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -45,6 +45,9 @@
 #include 
 #include 
 #include 
+#ifdef CONFIG_XEN_GRANT_DMA_ALLOC
+#include <linux/dma-mapping.h>
+#endif
 
 #include 
 #include 
@@ -57,6 +60,7 @@
 #ifdef CONFIG_X86
 #include 
 #endif
+#include <xen/mem_reservation.h>
 #include 
 #include 
 
@@ -811,6 +815,82 @@ int gnttab_alloc_pages(int nr_pages, struct page **pages)
 }
 EXPORT_SYMBOL(gnttab_alloc_pages);
 
+#ifdef CONFIG_XEN_GRANT_DMA_ALLOC
+/**
+ * gnttab_dma_alloc_pages - alloc DMAable pages suitable for grant mapping into
+ * @args: arguments to the function
+ */
+int gnttab_dma_alloc_pages(struct gnttab_dma_alloc_args *args)
+{
+   unsigned long pfn, start_pfn;
+   xen_pfn_t *frames;
+   size_t size;
+   int i, ret;
+
+   frames = kcalloc(args->nr_pages, sizeof(*frames), GFP_KERNEL);
+   if (!frames)
+   return -ENOMEM;
+
+   size = args->nr_pages << PAGE_SHIFT;
+   if (args->coherent)
+   args->vaddr = dma_alloc_coherent(args->dev, size,
+    &args->dev_bus_addr,
+GFP_KERNEL | __GFP_NOWARN);
+   else
+   args->vaddr = dma_alloc_wc(args->dev, size,
+  &args->dev_bus_addr,
+  GFP_KERNEL | __GFP_NOWARN);
+   if (!args->vaddr) {
+   pr_err("Failed to allocate DMA buffer of size %zu\n", size);
+   ret = -ENOMEM;
+   goto fail_free_frames;
+   }
+
+   start_pfn = __phys_to_pfn(args->dev_bus_addr);
+   for (pfn = start_pfn, i = 0; pfn < start_pfn + args->nr_pages;
+   pfn++, i++) {
+   struct page *page = pfn_to_page(pfn);
+
+   args->pages[i] = page;
+   frames[i] = xen_page_to_gfn(page);
+   xenmem_reservation_scrub_page(page);
+   }
+
+   xenmem_reservation_va_mapping_reset(args->nr_pages, args->pages);
+
+   ret = xenmem_reservation_decrease(args->nr_pages, frames);
+   if (ret != args->nr_pages) {
+   pr_err("Failed to decrease reservation for DMA buffer\n");
+   xenmem_reservation_increase(ret, frames);
+   ret = -EFAULT;
+   goto fail_free_dma;
+   }
+
+   ret = gnttab_pages_set_private(args->nr_pages, args->pages);
+   if (ret < 0)
+   goto fail_clear_private;
+
+   kfree(frames);
+   return 0;
+
+fail_clear_private:
+   gnttab_pages_clear_private(args->nr_pages, args->pages);
+fail_free_dma:
+   xenmem_reservation_va_mapping_update(args->nr_pages, args->pages,
+frames);
+   if (args->coherent)
+   dma_free_coherent(args->dev, size,
+

[PATCH 1/8] xen/grant-table: Make set/clear page private code shared

2018-05-25 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

Make set/clear page private code shared and accessible to
other kernel modules which can re-use these instead of open-coding.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>
---
 drivers/xen/grant-table.c | 54 +--
 include/xen/grant_table.h |  3 +++
 2 files changed, 38 insertions(+), 19 deletions(-)
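For context, a sketch of how another module might reuse the helpers
split out below (a hypothetical caller, with the pages allocated
elsewhere):

#include <linux/mm.h>
#include <xen/grant_table.h>

/* Mark externally allocated pages so they can be used for grant mapping. */
static int prepare_pages_for_grant_map(int nr_pages, struct page **pages)
{
	int ret;

	ret = gnttab_pages_set_private(nr_pages, pages);
	if (ret < 0)
		/* Undo whatever private data was already set. */
		gnttab_pages_clear_private(nr_pages, pages);

	return ret;
}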

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 27be107d6480..d7488226e1f2 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -769,29 +769,18 @@ void gnttab_free_auto_xlat_frames(void)
 }
 EXPORT_SYMBOL_GPL(gnttab_free_auto_xlat_frames);
 
-/**
- * gnttab_alloc_pages - alloc pages suitable for grant mapping into
- * @nr_pages: number of pages to alloc
- * @pages: returns the pages
- */
-int gnttab_alloc_pages(int nr_pages, struct page **pages)
+int gnttab_pages_set_private(int nr_pages, struct page **pages)
 {
int i;
-   int ret;
-
-   ret = alloc_xenballooned_pages(nr_pages, pages);
-   if (ret < 0)
-   return ret;
 
for (i = 0; i < nr_pages; i++) {
 #if BITS_PER_LONG < 64
struct xen_page_foreign *foreign;
 
foreign = kzalloc(sizeof(*foreign), GFP_KERNEL);
-   if (!foreign) {
-   gnttab_free_pages(nr_pages, pages);
+   if (!foreign)
return -ENOMEM;
-   }
+
set_page_private(pages[i], (unsigned long)foreign);
 #endif
SetPagePrivate(pages[i]);
@@ -799,14 +788,30 @@ int gnttab_alloc_pages(int nr_pages, struct page **pages)
 
return 0;
 }
-EXPORT_SYMBOL(gnttab_alloc_pages);
+EXPORT_SYMBOL(gnttab_pages_set_private);
 
 /**
- * gnttab_free_pages - free pages allocated by gnttab_alloc_pages()
- * @nr_pages; number of pages to free
- * @pages: the pages
+ * gnttab_alloc_pages - alloc pages suitable for grant mapping into
+ * @nr_pages: number of pages to alloc
+ * @pages: returns the pages
  */
-void gnttab_free_pages(int nr_pages, struct page **pages)
+int gnttab_alloc_pages(int nr_pages, struct page **pages)
+{
+   int ret;
+
+   ret = alloc_xenballooned_pages(nr_pages, pages);
+   if (ret < 0)
+   return ret;
+
+   ret = gnttab_pages_set_private(nr_pages, pages);
+   if (ret < 0)
+   gnttab_free_pages(nr_pages, pages);
+
+   return ret;
+}
+EXPORT_SYMBOL(gnttab_alloc_pages);
+
+void gnttab_pages_clear_private(int nr_pages, struct page **pages)
 {
int i;
 
@@ -818,6 +823,17 @@ void gnttab_free_pages(int nr_pages, struct page **pages)
ClearPagePrivate(pages[i]);
}
}
+}
+EXPORT_SYMBOL(gnttab_pages_clear_private);
+
+/**
+ * gnttab_free_pages - free pages allocated by gnttab_alloc_pages()
+ * @nr_pages; number of pages to free
+ * @pages: the pages
+ */
+void gnttab_free_pages(int nr_pages, struct page **pages)
+{
+   gnttab_pages_clear_private(nr_pages, pages);
free_xenballooned_pages(nr_pages, pages);
 }
 EXPORT_SYMBOL(gnttab_free_pages);
diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index 2e37741f6b8d..de03f2542bb7 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -198,6 +198,9 @@ void gnttab_free_auto_xlat_frames(void);
 int gnttab_alloc_pages(int nr_pages, struct page **pages);
 void gnttab_free_pages(int nr_pages, struct page **pages);
 
+int gnttab_pages_set_private(int nr_pages, struct page **pages);
+void gnttab_pages_clear_private(int nr_pages, struct page **pages);
+
 int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
struct gnttab_map_grant_ref *kmap_ops,
struct page **pages, unsigned int count);
-- 
2.17.0



[PATCH 2/8] xen/balloon: Move common memory reservation routines to a module

2018-05-25 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

The memory {increase|decrease}_reservation and VA mapping update/reset
code used in the balloon driver can be made common, so other drivers can
also re-use the same functionality without open-coding.
Create a dedicated module for the shared code and export the
corresponding symbols for other kernel modules.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>
---
 drivers/xen/Makefile  |   1 +
 drivers/xen/balloon.c |  71 ++
 drivers/xen/mem-reservation.c | 134 ++
 include/xen/mem_reservation.h |  29 
 4 files changed, 170 insertions(+), 65 deletions(-)
 create mode 100644 drivers/xen/mem-reservation.c
 create mode 100644 include/xen/mem_reservation.h
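For context, a sketch of how another driver might use the helpers
exported by this patch (names taken from the hunks below; frames[] is
provided by the caller and sized for nr_pages):

#include <linux/mm.h>
#include <xen/mem_reservation.h>	/* header added by this patch */
#include <xen/page.h>

/* Hand nr_pages of our own pages back to Xen, keeping the struct pages. */
static int give_pages_to_xen(int nr_pages, struct page **pages,
			     xen_pfn_t *frames)
{
	int i, ret;

	for (i = 0; i < nr_pages; i++) {
		frames[i] = xen_page_to_gfn(pages[i]);
		xenmem_reservation_scrub_page(pages[i]);
	}

	xenmem_reservation_va_mapping_reset(nr_pages, pages);

	ret = xenmem_reservation_decrease(nr_pages, frames);
	if (ret != nr_pages) {
		/* Roll back whatever was actually decreased. */
		xenmem_reservation_increase(ret, frames);
		xenmem_reservation_va_mapping_update(nr_pages, pages, frames);
		return -EFAULT;
	}

	return 0;
}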

diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 451e833f5931..3c87b0c3aca6 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -2,6 +2,7 @@
 obj-$(CONFIG_HOTPLUG_CPU)  += cpu_hotplug.o
 obj-$(CONFIG_X86)  += fallback.o
 obj-y  += grant-table.o features.o balloon.o manage.o preempt.o time.o
+obj-y  += mem-reservation.o
 obj-y  += events/
 obj-y  += xenbus/
 
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 065f0b607373..57b482d67a3a 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -71,6 +71,7 @@
 #include 
 #include 
 #include 
+#include <xen/mem_reservation.h>
 
 static int xen_hotplug_unpopulated;
 
@@ -157,13 +158,6 @@ static DECLARE_DELAYED_WORK(balloon_worker, balloon_process);
 #define GFP_BALLOON \
(GFP_HIGHUSER | __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC)
 
-static void scrub_page(struct page *page)
-{
-#ifdef CONFIG_XEN_SCRUB_PAGES
-   clear_highpage(page);
-#endif
-}
-
 /* balloon_append: add the given page to the balloon. */
 static void __balloon_append(struct page *page)
 {
@@ -463,11 +457,6 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
int rc;
unsigned long i;
struct page   *page;
-   struct xen_memory_reservation reservation = {
-   .address_bits = 0,
-   .extent_order = EXTENT_ORDER,
-   .domid= DOMID_SELF
-   };
 
if (nr_pages > ARRAY_SIZE(frame_list))
nr_pages = ARRAY_SIZE(frame_list);
@@ -486,9 +475,7 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
page = balloon_next_page(page);
}
 
-   set_xen_guest_handle(reservation.extent_start, frame_list);
-   reservation.nr_extents = nr_pages;
-   rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
+   rc = xenmem_reservation_increase(nr_pages, frame_list);
if (rc <= 0)
return BP_EAGAIN;
 
@@ -496,29 +483,7 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
page = balloon_retrieve(false);
BUG_ON(page == NULL);
 
-#ifdef CONFIG_XEN_HAVE_PVMMU
-   /*
-* We don't support PV MMU when Linux and Xen is using
-* different page granularity.
-*/
-   BUILD_BUG_ON(XEN_PAGE_SIZE != PAGE_SIZE);
-
-   if (!xen_feature(XENFEAT_auto_translated_physmap)) {
-   unsigned long pfn = page_to_pfn(page);
-
-   set_phys_to_machine(pfn, frame_list[i]);
-
-   /* Link back into the page tables if not highmem. */
-   if (!PageHighMem(page)) {
-   int ret;
-   ret = HYPERVISOR_update_va_mapping(
-   (unsigned long)__va(pfn << PAGE_SHIFT),
-   mfn_pte(frame_list[i], PAGE_KERNEL),
-   0);
-   BUG_ON(ret);
-   }
-   }
-#endif
+   xenmem_reservation_va_mapping_update(1, &page, &frame_list[i]);
 
/* Relinquish the page back to the allocator. */
free_reserved_page(page);
@@ -535,11 +500,6 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
unsigned long i;
struct page *page, *tmp;
int ret;
-   struct xen_memory_reservation reservation = {
-   .address_bits = 0,
-   .extent_order = EXTENT_ORDER,
-   .domid= DOMID_SELF
-   };
LIST_HEAD(pages);
 
if (nr_pages > ARRAY_SIZE(frame_list))
@@ -553,7 +513,7 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
break;
}
adjust_managed_page_count(page, -1);
-   scrub_page(page);
+   xenmem_reservation_scrub_page(page);
list_add(&page->lru, &pages);
}
 
@@ -575,25 +535,8 @@ static enum bp_state decr

[PATCH 4/8] xen/gntdev: Allow mappings for DMA buffers

2018-05-25 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

Allow mappings for DMA backed buffers if the grant table module
supports them: this extends the grant device to not only map buffers
made of balloon pages, but also buffers allocated with dma_alloc_xxx.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>
---
 drivers/xen/gntdev.c  | 100 +-
 include/uapi/xen/gntdev.h |  15 ++
 2 files changed, 113 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index bd56653b9bbc..640a579f42ea 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -37,6 +37,9 @@
 #include 
 #include 
 #include 
+#ifdef CONFIG_XEN_GRANT_DMA_ALLOC
+#include 
+#endif
 
 #include 
 #include 
@@ -72,6 +75,11 @@ struct gntdev_priv {
struct mutex lock;
struct mm_struct *mm;
struct mmu_notifier mn;
+
+#ifdef CONFIG_XEN_GRANT_DMA_ALLOC
+   /* Device for which DMA memory is allocated. */
+   struct device *dma_dev;
+#endif
 };
 
 struct unmap_notify {
@@ -96,10 +104,28 @@ struct grant_map {
struct gnttab_unmap_grant_ref *kunmap_ops;
struct page **pages;
unsigned long pages_vm_start;
+
+#ifdef CONFIG_XEN_GRANT_DMA_ALLOC
+   /*
+* If dmabuf_vaddr is not NULL then this mapping is backed by DMA
+* capable memory.
+*/
+
+   /* Device for which DMA memory is allocated. */
+   struct device *dma_dev;
+   /* Flags used to create this DMA buffer: GNTDEV_DMABUF_FLAG_XXX. */
+   bool dma_flags;
+   /* Virtual/CPU address of the DMA buffer. */
+   void *dma_vaddr;
+   /* Bus address of the DMA buffer. */
+   dma_addr_t dma_bus_addr;
+#endif
 };
 
 static int unmap_grant_pages(struct grant_map *map, int offset, int pages);
 
+static struct miscdevice gntdev_miscdev;
+
 /* -- */
 
 static void gntdev_print_maps(struct gntdev_priv *priv,
@@ -121,8 +147,26 @@ static void gntdev_free_map(struct grant_map *map)
if (map == NULL)
return;
 
+#ifdef CONFIG_XEN_GRANT_DMA_ALLOC
+   if (map->dma_vaddr) {
+   struct gnttab_dma_alloc_args args;
+
+   args.dev = map->dma_dev;
+   args.coherent = map->dma_flags & GNTDEV_DMA_FLAG_COHERENT;
+   args.nr_pages = map->count;
+   args.pages = map->pages;
+   args.vaddr = map->dma_vaddr;
+   args.dev_bus_addr = map->dma_bus_addr;
+
+   gnttab_dma_free_pages(&args);
+   } else if (map->pages) {
+   gnttab_free_pages(map->count, map->pages);
+   }
+#else
if (map->pages)
gnttab_free_pages(map->count, map->pages);
+#endif
+
kfree(map->pages);
kfree(map->grants);
kfree(map->map_ops);
@@ -132,7 +176,8 @@ static void gntdev_free_map(struct grant_map *map)
kfree(map);
 }
 
-static struct grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count)
+static struct grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
+ int dma_flags)
 {
struct grant_map *add;
int i;
@@ -155,8 +200,37 @@ static struct grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count)
NULL == add->pages)
goto err;
 
+#ifdef CONFIG_XEN_GRANT_DMA_ALLOC
+   add->dma_flags = dma_flags;
+
+   /*
+* Check if this mapping is requested to be backed
+* by a DMA buffer.
+*/
+   if (dma_flags & (GNTDEV_DMA_FLAG_WC | GNTDEV_DMA_FLAG_COHERENT)) {
+   struct gnttab_dma_alloc_args args;
+
+   /* Remember the device, so we can free DMA memory. */
+   add->dma_dev = priv->dma_dev;
+
+   args.dev = priv->dma_dev;
+   args.coherent = dma_flags & GNTDEV_DMA_FLAG_COHERENT;
+   args.nr_pages = count;
+   args.pages = add->pages;
+
+   if (gnttab_dma_alloc_pages(&args))
+   goto err;
+
+   add->dma_vaddr = args.vaddr;
+   add->dma_bus_addr = args.dev_bus_addr;
+   } else {
+   if (gnttab_alloc_pages(count, add->pages))
+   goto err;
+   }
+#else
if (gnttab_alloc_pages(count, add->pages))
goto err;
+#endif
 
for (i = 0; i < count; i++) {
add->map_ops[i].handle = -1;
@@ -323,8 +397,19 @@ static int map_grant_pages(struct grant_map *map)
}
 
map->unmap_ops[i].handle = map->map_ops[i].handle;
+#ifdef CONFIG_XEN_GRANT_DMA_ALLOC
+   if (use_ptemod) {
+   map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
+   } else if (map->dm

[PATCH 6/8] xen/gntdev: Implement dma-buf export functionality

2018-05-25 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

1. Create a dma-buf from grant references provided by the foreign
   domain. By default the dma-buf is backed by system memory pages, but
   by providing GNTDEV_DMA_FLAG_XXX flags it can also be created
   as a DMA write-combine/coherent buffer, e.g. allocated with the
   corresponding dma_alloc_xxx API.
   Export the resulting buffer as a new dma-buf.

2. Implement waiting for the dma-buf to be released: block until the
   dma-buf with the file descriptor provided is released.
   If the buffer is not released within the provided time-out, a
   -ETIMEDOUT error is returned. If the buffer with the file descriptor
   does not exist or has already been released, then -ENOENT is returned.
   For valid file descriptors this must not be treated as an error.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>
---
 drivers/xen/gntdev.c | 478 ++-
 1 file changed, 476 insertions(+), 2 deletions(-)
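For context, a sketch of the wait pattern described in item 2 above (not
the exact code of this patch; the release path is expected to call
complete() on the wait object's completion defined in the hunks below):

#include <linux/completion.h>
#include <linux/errno.h>
#include <linux/jiffies.h>
#include <linux/types.h>

static int wait_for_release(struct completion *release_done, u32 wait_to_ms)
{
	/* Block until the exporter signals release or the time-out expires. */
	if (wait_for_completion_timeout(release_done,
					msecs_to_jiffies(wait_to_ms)) == 0)
		return -ETIMEDOUT;

	return 0;
}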

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 9e450622af1a..52abc6cd5846 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -4,6 +4,8 @@
  * Device for accessing (in user-space) pages that have been granted by other
  * domains.
  *
+ * DMA buffer implementation is based on drivers/gpu/drm/drm_prime.c.
+ *
  * Copyright (c) 2006-2007, D G Murray.
  *   (c) 2009 Gerd Hoffmann <kra...@redhat.com>
  *   (c) 2018 Oleksandr Andrushchenko, EPAM Systems Inc.
@@ -41,6 +43,9 @@
 #ifdef CONFIG_XEN_GRANT_DMA_ALLOC
 #include 
 #endif
+#ifdef CONFIG_XEN_GNTDEV_DMABUF
+#include <linux/dma-buf.h>
+#endif
 
 #include 
 #include 
@@ -81,6 +86,17 @@ struct gntdev_priv {
/* Device for which DMA memory is allocated. */
struct device *dma_dev;
 #endif
+
+#ifdef CONFIG_XEN_GNTDEV_DMABUF
+   /* Private data of the hyper DMA buffers. */
+
+   /* List of exported DMA buffers. */
+   struct list_head dmabuf_exp_list;
+   /* List of wait objects. */
+   struct list_head dmabuf_exp_wait_list;
+   /* This is the lock which protects dma_buf_xxx lists. */
+   struct mutex dmabuf_lock;
+#endif
 };
 
 struct unmap_notify {
@@ -125,12 +141,38 @@ struct grant_map {
 
 #ifdef CONFIG_XEN_GNTDEV_DMABUF
 struct xen_dmabuf {
+   struct gntdev_priv *priv;
+   struct dma_buf *dmabuf;
+   struct list_head next;
+   int fd;
+
union {
+   struct {
+   /* Exported buffers are reference counted. */
+   struct kref refcount;
+   struct grant_map *map;
+   } exp;
struct {
/* Granted references of the imported buffer. */
grant_ref_t *refs;
} imp;
} u;
+
+   /* Number of pages this buffer has. */
+   int nr_pages;
+   /* Pages of this buffer. */
+   struct page **pages;
+};
+
+struct xen_dmabuf_wait_obj {
+   struct list_head next;
+   struct xen_dmabuf *xen_dmabuf;
+   struct completion completion;
+};
+
+struct xen_dmabuf_attachment {
+   struct sg_table *sgt;
+   enum dma_data_direction dir;
 };
 #endif
 
@@ -320,6 +362,16 @@ static void gntdev_put_map(struct gntdev_priv *priv, struct grant_map *map)
gntdev_free_map(map);
 }
 
+#ifdef CONFIG_XEN_GNTDEV_DMABUF
+static void gntdev_remove_map(struct gntdev_priv *priv, struct grant_map *map)
+{
+   mutex_lock(&priv->lock);
+   list_del(&map->next);
+   gntdev_put_map(NULL /* already removed */, map);
+   mutex_unlock(&priv->lock);
+}
+#endif
+
 /* -- */
 
 static int find_grant_ptes(pte_t *pte, pgtable_t token,
@@ -628,6 +680,12 @@ static int gntdev_open(struct inode *inode, struct file *flip)
INIT_LIST_HEAD(&priv->freeable_maps);
mutex_init(&priv->lock);
 
+#ifdef CONFIG_XEN_GNTDEV_DMABUF
+   mutex_init(&priv->dmabuf_lock);
+   INIT_LIST_HEAD(&priv->dmabuf_exp_list);
+   INIT_LIST_HEAD(&priv->dmabuf_exp_wait_list);
+#endif
+
if (use_ptemod) {
priv->mm = get_task_mm(current);
if (!priv->mm) {
@@ -1053,17 +,433 @@ static long gntdev_ioctl_grant_copy(struct gntdev_priv *priv, void __user *u)
 /* DMA buffer export support. */
 /* -- */
 
+/* -- */
+/* Implementation of wait for exported DMA buffer to be released. */
+/* -- */
+
+static void dmabuf_exp_release(struct kref *kref);
+
+static struct xen_dmabuf_wait_obj *
+dmabuf_exp_wait_obj_new(struct gntdev_priv *priv,
+   struct xen_dmabuf *xen_dmabuf)
+{
+   struct xen_dmabuf_wait_obj *obj;
+
+   obj = kzalloc(sizeof(*obj), GFP_KERNEL);

[PATCH 5/8] xen/gntdev: Add initial support for dma-buf UAPI

2018-05-25 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

Add UAPI and IOCTLs for the dma-buf grant device driver extension:
the extension allows userspace processes and kernel modules to
use the Xen backed dma-buf implementation. With this extension, grant
references to the pages of an imported dma-buf can be exported
for use by another domain, and grant references coming from a foreign
domain can be converted into a local dma-buf for local export.
Implement basic initialization and stubs for Xen DMA buffer
support.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>
---
 drivers/xen/Kconfig   |  10 +++
 drivers/xen/gntdev.c  | 148 ++
 include/uapi/xen/gntdev.h |  91 +++
 3 files changed, 249 insertions(+)

diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 3431fe210624..eaf63a2c7ae6 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -152,6 +152,16 @@ config XEN_GNTDEV
help
  Allows userspace processes to use grants.
 
+config XEN_GNTDEV_DMABUF
+   bool "Add support for dma-buf grant access device driver extension"
+   depends on XEN_GNTDEV && XEN_GRANT_DMA_ALLOC
+   help
+ Allows userspace processes and kernel modules to use Xen backed
+ dma-buf implementation. With this extension grant references to
+ the pages of an imported dma-buf can be exported for other domain
+ use and grant references coming from a foreign domain can be
+ converted into a local dma-buf for local export.
+
 config XEN_GRANT_DEV_ALLOC
tristate "User-space grant reference allocator driver"
depends on XEN
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 640a579f42ea..9e450622af1a 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -6,6 +6,7 @@
  *
  * Copyright (c) 2006-2007, D G Murray.
  *   (c) 2009 Gerd Hoffmann <kra...@redhat.com>
+ *   (c) 2018 Oleksandr Andrushchenko, EPAM Systems Inc.
  *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -122,6 +123,17 @@ struct grant_map {
 #endif
 };
 
+#ifdef CONFIG_XEN_GNTDEV_DMABUF
+struct xen_dmabuf {
+   union {
+   struct {
+   /* Granted references of the imported buffer. */
+   grant_ref_t *refs;
+   } imp;
+   } u;
+};
+#endif
+
 static int unmap_grant_pages(struct grant_map *map, int offset, int pages);
 
 static struct miscdevice gntdev_miscdev;
@@ -1036,6 +1048,128 @@ static long gntdev_ioctl_grant_copy(struct gntdev_priv *priv, void __user *u)
return ret;
 }
 
+#ifdef CONFIG_XEN_GNTDEV_DMABUF
+/* -- */
+/* DMA buffer export support. */
+/* -- */
+
+static int dmabuf_exp_wait_released(struct gntdev_priv *priv, int fd,
+   int wait_to_ms)
+{
+   return -ETIMEDOUT;
+}
+
+static int dmabuf_exp_from_refs(struct gntdev_priv *priv, int flags,
+   int count, u32 domid, u32 *refs, u32 *fd)
+{
+   *fd = -1;
+   return -EINVAL;
+}
+
+/* -- */
+/* DMA buffer import support. */
+/* -- */
+
+static int dmabuf_imp_release(struct gntdev_priv *priv, u32 fd)
+{
+   return 0;
+}
+
+static struct xen_dmabuf *
+dmabuf_imp_to_refs(struct gntdev_priv *priv, int fd, int count, int domid)
+{
+   return ERR_PTR(-ENOMEM);
+}
+
+/* -- */
+/* DMA buffer IOCTL support.  */
+/* -- */
+
+static long
+gntdev_ioctl_dmabuf_exp_from_refs(struct gntdev_priv *priv,
+ struct ioctl_gntdev_dmabuf_exp_from_refs __user *u)
+{
+   struct ioctl_gntdev_dmabuf_exp_from_refs op;
+   u32 *refs;
+   long ret;
+
+   if (copy_from_user(&op, u, sizeof(op)) != 0)
+   return -EFAULT;
+
+   refs = kcalloc(op.count, sizeof(*refs), GFP_KERNEL);
+   if (!refs)
+   return -ENOMEM;
+
+   if (copy_from_user(refs, u->refs, sizeof(*refs) * op.count) != 0) {
+   ret = -EFAULT;
+   goto out;
+   }
+
+   ret = dmabuf_exp_from_refs(priv, op.flags, op.count,
+  op.domid, refs, &op.fd);
+   if (ret)
+   goto out;
+
+   if (copy_to_user(u, &op, sizeof(op)) != 0)
+   ret = -EFAULT;
+
+out:
+   kfree(refs);
+   return ret;
+}
+
+static long
+gntdev_ioctl_

[PATCH 7/8] xen/gntdev: Implement dma-buf import functionality

2018-05-25 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

1. Import a dma-buf with the file descriptor provided and export
   granted references to the pages of that dma-buf into the array
   of grant references.

2. Add API to close all references to an imported buffer, so it can be
   released by the owner. This is only valid for buffers created with
   IOCTL_GNTDEV_DMABUF_IMP_TO_REFS.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>
---
 drivers/xen/gntdev.c | 237 ++-
 1 file changed, 234 insertions(+), 3 deletions(-)
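For context, the rough shape of the import path using the standard
dma-buf calls (a sketch, not the exact code of this patch, which is
truncated below): get the dma-buf by fd, attach our device, map the
attachment to obtain an sg_table, then grant the backing pages to the
foreign domain.

#include <linux/dma-buf.h>
#include <linux/dma-direction.h>
#include <linux/err.h>

static struct sg_table *
import_and_map(struct device *dev, int fd, struct dma_buf **dmabuf_out,
	       struct dma_buf_attachment **attach_out)
{
	struct dma_buf *dmabuf;
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;

	dmabuf = dma_buf_get(fd);
	if (IS_ERR(dmabuf))
		return ERR_CAST(dmabuf);

	attach = dma_buf_attach(dmabuf, dev);
	if (IS_ERR(attach)) {
		dma_buf_put(dmabuf);
		return ERR_CAST(attach);
	}

	sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
	if (IS_ERR(sgt)) {
		dma_buf_detach(dmabuf, attach);
		dma_buf_put(dmabuf);
		return sgt;
	}

	*dmabuf_out = dmabuf;
	*attach_out = attach;
	return sgt;
}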

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 52abc6cd5846..d8b6168f2cd9 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -71,6 +71,17 @@ static atomic_t pages_mapped = ATOMIC_INIT(0);
 static int use_ptemod;
 #define populate_freeable_maps use_ptemod
 
+#ifdef CONFIG_XEN_GNTDEV_DMABUF
+#ifndef GRANT_INVALID_REF
+/*
+ * Note on usage of grant reference 0 as invalid grant reference:
+ * grant reference 0 is valid, but never exposed to a driver,
+ * because of the fact it is already in use/reserved by the PV console.
+ */
+#define GRANT_INVALID_REF  0
+#endif
+#endif
+
 struct gntdev_priv {
/* maps with visible offsets in the file descriptor */
struct list_head maps;
@@ -94,6 +105,8 @@ struct gntdev_priv {
struct list_head dmabuf_exp_list;
/* List of wait objects. */
struct list_head dmabuf_exp_wait_list;
+   /* List of imported DMA buffers. */
+   struct list_head dmabuf_imp_list;
/* This is the lock which protects dma_buf_xxx lists. */
struct mutex dmabuf_lock;
 #endif
@@ -155,6 +168,10 @@ struct xen_dmabuf {
struct {
/* Granted references of the imported buffer. */
grant_ref_t *refs;
+   /* Scatter-gather table of the imported buffer. */
+   struct sg_table *sgt;
+   /* dma-buf attachment of the imported buffer. */
+   struct dma_buf_attachment *attach;
} imp;
} u;
 
@@ -684,6 +701,7 @@ static int gntdev_open(struct inode *inode, struct file *flip)
mutex_init(&priv->dmabuf_lock);
INIT_LIST_HEAD(&priv->dmabuf_exp_list);
INIT_LIST_HEAD(&priv->dmabuf_exp_wait_list);
+   INIT_LIST_HEAD(&priv->dmabuf_imp_list);
 #endif
 
if (use_ptemod) {
@@ -1544,15 +1562,228 @@ static int dmabuf_exp_from_refs(struct gntdev_priv *priv, int flags,
 /* DMA buffer import support. */
 /* -- */
 
-static int dmabuf_imp_release(struct gntdev_priv *priv, u32 fd)
+static int
+dmabuf_imp_grant_foreign_access(struct page **pages, u32 *refs,
+   int count, int domid)
 {
-   return 0;
+   grant_ref_t priv_gref_head;
+   int i, ret;
+
+   ret = gnttab_alloc_grant_references(count, &priv_gref_head);
+   if (ret < 0) {
+   pr_err("Cannot allocate grant references, ret %d\n", ret);
+   return ret;
+   }
+
+   for (i = 0; i < count; i++) {
+   int cur_ref;
+
+   cur_ref = gnttab_claim_grant_reference(&priv_gref_head);
+   if (cur_ref < 0) {
+   ret = cur_ref;
+   pr_err("Cannot claim grant reference, ret %d\n", ret);
+   goto out;
+   }
+
+   gnttab_grant_foreign_access_ref(cur_ref, domid,
+   xen_page_to_gfn(pages[i]), 0);
+   refs[i] = cur_ref;
+   }
+
+   ret = 0;
+
+out:
+   gnttab_free_grant_references(priv_gref_head);
+   return ret;
+}
+
+static void dmabuf_imp_end_foreign_access(u32 *refs, int count)
+{
+   int i;
+
+   for (i = 0; i < count; i++)
+   if (refs[i] != GRANT_INVALID_REF)
+   gnttab_end_foreign_access(refs[i], 0, 0UL);
+}
+
+static void dmabuf_imp_free_storage(struct xen_dmabuf *xen_dmabuf)
+{
+   kfree(xen_dmabuf->pages);
+   kfree(xen_dmabuf->u.imp.refs);
+   kfree(xen_dmabuf);
+}
+
+static struct xen_dmabuf *dmabuf_imp_alloc_storage(int count)
+{
+   struct xen_dmabuf *xen_dmabuf;
+   int i;
+
+   xen_dmabuf = kzalloc(sizeof(*xen_dmabuf), GFP_KERNEL);
+   if (!xen_dmabuf)
+   goto fail;
+
+   xen_dmabuf->u.imp.refs = kcalloc(count,
+sizeof(xen_dmabuf->u.imp.refs[0]),
+GFP_KERNEL);
+   if (!xen_dmabuf->u.imp.refs)
+   goto fail;
+
+   xen_dmabuf->pages = kcalloc(count,
+   sizeof(xen_dmabuf->pages[0]),
+   GFP_KERNEL);
+   if (!xen_dmabuf->pages)
+   goto 

[PATCH 8/8] xen/gntdev: Expose gntdev's dma-buf API for in-kernel use

2018-05-25 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

Allow creating a grant device context for use by kernel modules which
require the functionality provided by gntdev. Export symbols for the
dma-buf API provided by the module.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>
---
 drivers/xen/gntdev.c| 116 ++--
 include/xen/grant_dev.h |  37 +
 2 files changed, 113 insertions(+), 40 deletions(-)
 create mode 100644 include/xen/grant_dev.h
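For context, a sketch of a hypothetical in-kernel user of the exported
context API (names taken from the hunks below; the fd argument refers to
a dma-buf previously exported via the gntdev dma-buf API):

#include <linux/err.h>
#include <xen/grant_dev.h>	/* header added by this patch */

static int wait_for_exported_buf(struct device *dev, int fd)
{
	struct gntdev_priv *priv;
	int ret;

	priv = gntdev_alloc_context(dev);
	if (IS_ERR(priv))
		return PTR_ERR(priv);

	/* Wait up to one second for the exported dma-buf to be released. */
	ret = gntdev_dmabuf_exp_wait_released(priv, fd, 1000);

	gntdev_free_context(priv);
	return ret;
}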

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index d8b6168f2cd9..912056f3e909 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -684,14 +684,33 @@ static const struct mmu_notifier_ops gntdev_mmu_ops = {
 
 /* -- */
 
-static int gntdev_open(struct inode *inode, struct file *flip)
+void gntdev_free_context(struct gntdev_priv *priv)
+{
+   struct grant_map *map;
+
+   pr_debug("priv %p\n", priv);
+
+   mutex_lock(&priv->lock);
+   while (!list_empty(&priv->maps)) {
+   map = list_entry(priv->maps.next, struct grant_map, next);
+   list_del(&map->next);
+   gntdev_put_map(NULL /* already removed */, map);
+   }
+   WARN_ON(!list_empty(&priv->freeable_maps));
+
+   mutex_unlock(&priv->lock);
+
+   kfree(priv);
+}
+EXPORT_SYMBOL(gntdev_free_context);
+
+struct gntdev_priv *gntdev_alloc_context(struct device *dev)
 {
struct gntdev_priv *priv;
-   int ret = 0;
 
priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv)
-   return -ENOMEM;
+   return ERR_PTR(-ENOMEM);
 
INIT_LIST_HEAD(&priv->maps);
INIT_LIST_HEAD(&priv->freeable_maps);
@@ -704,6 +723,32 @@ static int gntdev_open(struct inode *inode, struct file *flip)
INIT_LIST_HEAD(&priv->dmabuf_imp_list);
 #endif
 
+#ifdef CONFIG_XEN_GRANT_DMA_ALLOC
+   priv->dma_dev = dev;
+
+   /*
+* The device is not spawn from a device tree, so arch_setup_dma_ops
+* is not called, thus leaving the device with dummy DMA ops.
+* Fix this call of_dma_configure() with a NULL node to set
+* default DMA ops.
+*/
+   of_dma_configure(priv->dma_dev, NULL);
+#endif
+   pr_debug("priv %p\n", priv);
+
+   return priv;
+}
+EXPORT_SYMBOL(gntdev_alloc_context);
+
+static int gntdev_open(struct inode *inode, struct file *flip)
+{
+   struct gntdev_priv *priv;
+   int ret = 0;
+
+   priv = gntdev_alloc_context(gntdev_miscdev.this_device);
+   if (IS_ERR(priv))
+   return PTR_ERR(priv);
+
if (use_ptemod) {
priv->mm = get_task_mm(current);
if (!priv->mm) {
@@ -716,23 +761,11 @@ static int gntdev_open(struct inode *inode, struct file *flip)
}
 
if (ret) {
-   kfree(priv);
+   gntdev_free_context(priv);
return ret;
}
 
flip->private_data = priv;
-#ifdef CONFIG_XEN_GRANT_DMA_ALLOC
-   priv->dma_dev = gntdev_miscdev.this_device;
-
-   /*
-* The device is not spawn from a device tree, so arch_setup_dma_ops
-* is not called, thus leaving the device with dummy DMA ops.
-* Fix this call of_dma_configure() with a NULL node to set
-* default DMA ops.
-*/
-   of_dma_configure(priv->dma_dev, NULL);
-#endif
-   pr_debug("priv %p\n", priv);
 
return 0;
 }
@@ -740,22 +773,11 @@ static int gntdev_open(struct inode *inode, struct file *flip)
 static int gntdev_release(struct inode *inode, struct file *flip)
 {
struct gntdev_priv *priv = flip->private_data;
-   struct grant_map *map;
-
-   pr_debug("priv %p\n", priv);
-
-   mutex_lock(&priv->lock);
-   while (!list_empty(&priv->maps)) {
-   map = list_entry(priv->maps.next, struct grant_map, next);
-   list_del(&map->next);
-   gntdev_put_map(NULL /* already removed */, map);
-   }
-   WARN_ON(!list_empty(&priv->freeable_maps));
-   mutex_unlock(&priv->lock);
 
if (use_ptemod)
mmu_notifier_unregister(&priv->mn, priv->mm);
-   kfree(priv);
+
+   gntdev_free_context(priv);
return 0;
 }
 
@@ -1210,7 +1232,7 @@ dmabuf_exp_wait_obj_get_by_fd(struct gntdev_priv *priv, int fd)
return ret;
 }
 
-static int dmabuf_exp_wait_released(struct gntdev_priv *priv, int fd,
+int gntdev_dmabuf_exp_wait_released(struct gntdev_priv *priv, int fd,
int wait_to_ms)
 {
struct xen_dmabuf *xen_dmabuf;
@@ -1242,6 +1264,7 @@ static int dmabuf_exp_wait_released(struct gntdev_priv *priv, int fd,
dmabuf_exp_wait_obj_free(priv, obj);
return ret;
 }
+EXPORT_SYMBOL(gntdev_dmabuf_exp_wait_released);
 
 /* ---

Re: [Xen-devel] [RFC 1/3] xen/balloon: Allow allocating DMA buffers

2018-05-22 Thread Oleksandr Andrushchenko

On 05/22/2018 09:02 PM, Boris Ostrovsky wrote:

On 05/22/2018 11:00 AM, Oleksandr Andrushchenko wrote:

On 05/22/2018 05:33 PM, Boris Ostrovsky wrote:

On 05/22/2018 01:55 AM, Oleksandr Andrushchenko wrote:

On 05/21/2018 11:36 PM, Boris Ostrovsky wrote:

On 05/21/2018 03:13 PM, Oleksandr Andrushchenko wrote:

On 05/21/2018 09:53 PM, Boris Ostrovsky wrote:

On 05/21/2018 01:32 PM, Oleksandr Andrushchenko wrote:

On 05/21/2018 07:35 PM, Boris Ostrovsky wrote:

On 05/21/2018 01:40 AM, Oleksandr Andrushchenko wrote:

On 05/19/2018 01:04 AM, Boris Ostrovsky wrote:

On 05/17/2018 04:26 AM, Oleksandr Andrushchenko wrote:

From: Oleksandr Andrushchenko
<oleksandr_andrushche...@epam.com>

A commit message would be useful.

Sure, v1 will have it

Signed-off-by: Oleksandr Andrushchenko
<oleksandr_andrushche...@epam.com>

   for (i = 0; i < nr_pages; i++) {
-    page = alloc_page(gfp);
-    if (page == NULL) {
-    nr_pages = i;
-    state = BP_EAGAIN;
-    break;
+    if (ext_pages) {
+    page = ext_pages[i];
+    } else {
+    page = alloc_page(gfp);
+    if (page == NULL) {
+    nr_pages = i;
+    state = BP_EAGAIN;
+    break;
+    }
   }
   scrub_page(page);
   list_add(&page->lru, &pages);
@@ -529,7 +565,7 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
   i = 0;
   list_for_each_entry_safe(page, tmp, &pages, lru) {
   /* XENMEM_decrease_reservation requires a GFN */
-    frame_list[i++] = xen_page_to_gfn(page);
+    frames[i++] = xen_page_to_gfn(page);
     #ifdef CONFIG_XEN_HAVE_PVMMU
   /*
@@ -552,18 +588,22 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
   #endif
   list_del(&page->lru);
   -    balloon_append(page);
+    if (!ext_pages)
+    balloon_append(page);

So what you are proposing is not really ballooning. You are just
piggybacking on existing interfaces, aren't you?

Sort of. Basically I need to {increase|decrease}_reservation, not
actually
allocating ballooned pages.
Do you think I can simply EXPORT_SYMBOL for
{increase|decrease}_reservation?
Any other suggestion?

I am actually wondering how much of that code you end up reusing.
You
pretty much create new code paths in both routines and common code
ends
up being essentially the hypercall.

Well, I hoped that it would be easier to maintain if I modify
existing
code
to support both use-cases, but I am also ok to create new
routines if
this
seems to be reasonable - please let me know

  So the question is --- would it make
sense to do all of this separately from the balloon driver?

This can be done, but which driver will host this code then? If we
move from
the balloon driver, then this could go to either gntdev or
grant-table.
What's your preference?

A separate module?
Is there any use for this feature outside of your zero-copy DRM
driver?

Intel's hyper dma-buf (Dongwon/Matt CC'ed), V4L/GPU at least.

At the time I tried to upstream zcopy driver it was discussed and
decided that
it would be better if I remove all DRM specific code and move it to
Xen drivers.
Thus, this RFC.

But it can also be implemented as a dedicated Xen dma-buf driver
which
will have all the
code from this RFC + a bit more (char/misc device handling at least).
This will also require a dedicated user-space library, just like
libxengnttab.so
for gntdev (now I have all new IOCTLs covered there).

If the idea of a dedicated Xen dma-buf driver seems to be more
attractive we
can work toward this solution. BTW, I do support this idea, but
was not
sure if Xen community accepts yet another driver which duplicates
quite some code
of the existing gntdev/balloon/grant-table. And now after this RFC I
hope that all cons
and pros of both dedicated driver and gntdev/balloon/grant-table
extension are
clearly seen and we can make a decision.

IIRC the objection to a separate module was in the context of the
gntdev discussion, because (among other things) people didn't want to
have yet another file in /dev/xen/

Here we are talking about (a new) balloon-like module which doesn't
create any new user-visible interfaces. And as for duplicating code
---
as I said, I am not convinced there is much of duplication.

I might even argue that we should add a new config option for this
module.

I am not quite sure I am fully following you here: so, you suggest
that we have balloon.c unchanged, but instead create a new
module (namely a file under the same folder as balloon.c, e.g.
dma-buf-reservation.c) and move those {increase|decrease}_reservation
routines (specific to dma-buf) to that new file? And make it selectable
via Kconfig? If so, then how about the changes to grant-table and
gntdev?
Those will look inconsistent then.

Inconsistent with what? The changes to grant code will also be under 

Re: [Xen-devel] [RFC 1/3] xen/balloon: Allow allocating DMA buffers

2018-05-22 Thread Oleksandr Andrushchenko

On 05/22/2018 05:33 PM, Boris Ostrovsky wrote:

On 05/22/2018 01:55 AM, Oleksandr Andrushchenko wrote:

On 05/21/2018 11:36 PM, Boris Ostrovsky wrote:

On 05/21/2018 03:13 PM, Oleksandr Andrushchenko wrote:

On 05/21/2018 09:53 PM, Boris Ostrovsky wrote:

On 05/21/2018 01:32 PM, Oleksandr Andrushchenko wrote:

On 05/21/2018 07:35 PM, Boris Ostrovsky wrote:

On 05/21/2018 01:40 AM, Oleksandr Andrushchenko wrote:

On 05/19/2018 01:04 AM, Boris Ostrovsky wrote:

On 05/17/2018 04:26 AM, Oleksandr Andrushchenko wrote:

From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

A commit message would be useful.

Sure, v1 will have it

Signed-off-by: Oleksandr Andrushchenko
<oleksandr_andrushche...@epam.com>

  for (i = 0; i < nr_pages; i++) {
-    page = alloc_page(gfp);
-    if (page == NULL) {
-    nr_pages = i;
-    state = BP_EAGAIN;
-    break;
+    if (ext_pages) {
+    page = ext_pages[i];
+    } else {
+    page = alloc_page(gfp);
+    if (page == NULL) {
+    nr_pages = i;
+    state = BP_EAGAIN;
+    break;
+    }
  }
  scrub_page(page);
  list_add(&page->lru, &pages);
@@ -529,7 +565,7 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
  i = 0;
  list_for_each_entry_safe(page, tmp, &pages, lru) {
  /* XENMEM_decrease_reservation requires a GFN */
-    frame_list[i++] = xen_page_to_gfn(page);
+    frames[i++] = xen_page_to_gfn(page);
    #ifdef CONFIG_XEN_HAVE_PVMMU
  /*
@@ -552,18 +588,22 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
  #endif
  list_del(&page->lru);
  -    balloon_append(page);
+    if (!ext_pages)
+    balloon_append(page);

So what you are proposing is not really ballooning. You are just
piggybacking on existing interfaces, aren't you?

Sort of. Basically I need to {increase|decrease}_reservation, not
actually
allocating ballooned pages.
Do you think I can simply EXPORT_SYMBOL for
{increase|decrease}_reservation?
Any other suggestion?

I am actually wondering how much of that code you end up reusing.
You
pretty much create new code paths in both routines and common code
ends
up being essentially the hypercall.

Well, I hoped that it would be easier to maintain if I modify
existing
code
to support both use-cases, but I am also ok to create new routines if
this
seems to be reasonable - please let me know

     So the question is --- would it make
sense to do all of this separately from the balloon driver?

This can be done, but which driver will host this code then? If we
move from
the balloon driver, then this could go to either gntdev or
grant-table.
What's your preference?

A separate module?
Is there any use for this feature outside of your zero-copy DRM
driver?

Intel's hyper dma-buf (Dongwon/Matt CC'ed), V4L/GPU at least.

At the time I tried to upstream zcopy driver it was discussed and
decided that
it would be better if I remove all DRM specific code and move it to
Xen drivers.
Thus, this RFC.

But it can also be implemented as a dedicated Xen dma-buf driver which
will have all the
code from this RFC + a bit more (char/misc device handling at least).
This will also require a dedicated user-space library, just like
libxengnttab.so
for gntdev (now I have all new IOCTLs covered there).

If the idea of a dedicated Xen dma-buf driver seems to be more
attractive we
can work toward this solution. BTW, I do support this idea, but was not
sure if Xen community accepts yet another driver which duplicates
quite some code
of the existing gntdev/balloon/grant-table. And now after this RFC I
hope that all cons
and pros of both dedicated driver and gntdev/balloon/grant-table
extension are
clearly seen and we can make a decision.

IIRC the objection to a separate module was in the context of the
gntdev discussion, because (among other things) people didn't want to have
yet another file in /dev/xen/

Here we are talking about (a new) balloon-like module which doesn't
create any new user-visible interfaces. And as for duplicating code ---
as I said, I am not convinced there is much of duplication.

I might even argue that we should add a new config option for this
module.

I am not quite sure I am fully following you here: so, you suggest
that we have balloon.c unchanged, but instead create a new
module (namely a file under the same folder as balloon.c, e.g.
dma-buf-reservation.c) and move those {increase|decrease}_reservation
routines (specific to dma-buf) to that new file? And make it selectable
via Kconfig? If so, then how about the changes to grant-table and gntdev?
Those will look inconsistent then.

Inconsistent with what? The changes to grant code will also be under the
new config option.

Ah, ok.

Option 1. We will have Kconfig option which will cover dma-buf
changes in balloon, g

Re: [Xen-devel][RFC 3/3] xen/gntdev: Add support for Linux dma buffers

2018-05-21 Thread Oleksandr Andrushchenko

On 05/22/2018 12:31 AM, Dongwon Kim wrote:

Still need more time to review the whole code changes

Take your time, I just wanted to make sure that all interested parties
are in the discussion, so we all finally have what we all want, not
a thing covering only my use-cases

  but I noticed one thing.

We've been using the term "hyper_dmabuf" for a hypervisor-agnostic Linux dmabuf
solution and we are planning to call any of our future solutions for other
hypervisors the same name. So having the same name for this xen-specific
structure or the functions you implemented is confusing. Would you change it to
something else, like... "xen_"?

Np, will rename


On Thu, May 17, 2018 at 11:26:04AM +0300, Oleksandr Andrushchenko wrote:

From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>
---
  drivers/xen/gntdev.c  | 954 +-
  include/uapi/xen/gntdev.h | 101 
  include/xen/gntdev_exp.h  |  23 +
  3 files changed, 1066 insertions(+), 12 deletions(-)
  create mode 100644 include/xen/gntdev_exp.h

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 9510f228efe9..0ee88e193362 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -4,6 +4,8 @@
   * Device for accessing (in user-space) pages that have been granted by other
   * domains.
   *
+ * DMA buffer implementation is based on drivers/gpu/drm/drm_prime.c.
+ *
   * Copyright (c) 2006-2007, D G Murray.
   *   (c) 2009 Gerd Hoffmann <kra...@redhat.com>
   *
@@ -37,6 +39,9 @@
  #include 
  #include 
  
+#include 

+#include 
+
  #include 
  #include 
  #include 
@@ -61,16 +66,39 @@ static atomic_t pages_mapped = ATOMIC_INIT(0);
  static int use_ptemod;
  #define populate_freeable_maps use_ptemod
  
+#ifndef GRANT_INVALID_REF

+/*
+ * Note on usage of grant reference 0 as invalid grant reference:
+ * grant reference 0 is valid, but never exposed to a driver,
+ * because of the fact it is already in use/reserved by the PV console.
+ */
+#define GRANT_INVALID_REF  0
+#endif
+
  struct gntdev_priv {
/* maps with visible offsets in the file descriptor */
struct list_head maps;
/* maps that are not visible; will be freed on munmap.
 * Only populated if populate_freeable_maps == 1 */
struct list_head freeable_maps;
+   /* List of dma-bufs. */
+   struct list_head dma_bufs;
/* lock protects maps and freeable_maps */
struct mutex lock;
struct mm_struct *mm;
struct mmu_notifier mn;
+
+   /* Private data of the hyper DMA buffers. */
+
+   struct device *dev;
+   /* List of exported DMA buffers. */
+   struct list_head dmabuf_exp_list;
+   /* List of wait objects. */
+   struct list_head dmabuf_exp_wait_list;
+   /* List of imported DMA buffers. */
+   struct list_head dmabuf_imp_list;
+   /* This is the lock which protects dma_buf_xxx lists. */
+   struct mutex dmabuf_lock;
  };
  
  struct unmap_notify {

@@ -95,10 +123,65 @@ struct grant_map {
struct gnttab_unmap_grant_ref *kunmap_ops;
struct page **pages;
unsigned long pages_vm_start;
+
+   /*
+* All the fields starting with dmabuf_ are only valid if this
+* mapping is used for exporting a DMA buffer.
+* If dmabuf_vaddr is not NULL then this mapping is backed by DMA
+* capable memory.
+*/
+
+   /* Flags used to create this DMA buffer: GNTDEV_DMABUF_FLAG_XXX. */
+   bool dmabuf_flags;
+   /* Virtual/CPU address of the DMA buffer. */
+   void *dmabuf_vaddr;
+   /* Bus address of the DMA buffer. */
+   dma_addr_t dmabuf_bus_addr;
+};
+
+struct hyper_dmabuf {
+   struct gntdev_priv *priv;
+   struct dma_buf *dmabuf;
+   struct list_head next;
+   int fd;
+
+   union {
+   struct {
+   /* Exported buffers are reference counted. */
+   struct kref refcount;
+   struct grant_map *map;
+   } exp;
+   struct {
+   /* Granted references of the imported buffer. */
+   grant_ref_t *refs;
+   /* Scatter-gather table of the imported buffer. */
+   struct sg_table *sgt;
+   /* dma-buf attachment of the imported buffer. */
+   struct dma_buf_attachment *attach;
+   } imp;
+   } u;
+
+   /* Number of pages this buffer has. */
+   int nr_pages;
+   /* Pages of this buffer. */
+   struct page **pages;
+};
+
+struct hyper_dmabuf_wait_obj {
+   struct list_head next;
+   struct hyper_dmabuf *hyper_dmabuf;
+   struct completion completion;
+};
+
+struct hyper_dambuf_attachment {

minor typo: dam->dma (same thing in other places as well.)

sure, thanks



+   struct sg_ta
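
As an illustration of the exporter side described by the structures above, a
hedged sketch of how a gntdev-backed exporter might hand a dma-buf fd to user
space through the standard dma-buf core; the ops table, priv pointer and
function name are placeholders, not the RFC's actual code:

#include <linux/dma-buf.h>
#include <linux/err.h>
#include <linux/fcntl.h>
#include <linux/mm.h>

/* "ops" and "priv" stand in for the exporter's dma_buf_ops and private data. */
static int example_export_fd(const struct dma_buf_ops *ops, void *priv,
			     int nr_pages)
{
	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
	struct dma_buf *dmabuf;
	int fd;

	exp_info.ops = ops;
	exp_info.size = (size_t)nr_pages << PAGE_SHIFT;
	exp_info.flags = O_RDWR;
	exp_info.priv = priv;

	dmabuf = dma_buf_export(&exp_info);
	if (IS_ERR(dmabuf))
		return PTR_ERR(dmabuf);

	/* The fd is what finally gets returned to user space. */
	fd = dma_buf_fd(dmabuf, O_CLOEXEC);
	if (fd < 0)
		dma_buf_put(dmabuf);
	return fd;
}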

Re: [Xen-devel] [RFC 1/3] xen/balloon: Allow allocating DMA buffers

2018-05-21 Thread Oleksandr Andrushchenko

On 05/21/2018 11:36 PM, Boris Ostrovsky wrote:

On 05/21/2018 03:13 PM, Oleksandr Andrushchenko wrote:

On 05/21/2018 09:53 PM, Boris Ostrovsky wrote:

On 05/21/2018 01:32 PM, Oleksandr Andrushchenko wrote:

On 05/21/2018 07:35 PM, Boris Ostrovsky wrote:

On 05/21/2018 01:40 AM, Oleksandr Andrushchenko wrote:

On 05/19/2018 01:04 AM, Boris Ostrovsky wrote:

On 05/17/2018 04:26 AM, Oleksandr Andrushchenko wrote:

From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

A commit message would be useful.

Sure, v1 will have it

Signed-off-by: Oleksandr Andrushchenko
<oleksandr_andrushche...@epam.com>

     for (i = 0; i < nr_pages; i++) {
-    page = alloc_page(gfp);
-    if (page == NULL) {
-    nr_pages = i;
-    state = BP_EAGAIN;
-    break;
+    if (ext_pages) {
+    page = ext_pages[i];
+    } else {
+    page = alloc_page(gfp);
+    if (page == NULL) {
+    nr_pages = i;
+    state = BP_EAGAIN;
+    break;
+    }
     }
     scrub_page(page);
     list_add(&page->lru, &pages);
@@ -529,7 +565,7 @@ static enum bp_state
decrease_reservation(unsigned long nr_pages, gfp_t gfp)
     i = 0;
     list_for_each_entry_safe(page, tmp, &pages, lru) {
     /* XENMEM_decrease_reservation requires a GFN */
-    frame_list[i++] = xen_page_to_gfn(page);
+    frames[i++] = xen_page_to_gfn(page);
       #ifdef CONFIG_XEN_HAVE_PVMMU
     /*
@@ -552,18 +588,22 @@ static enum bp_state
decrease_reservation(unsigned long nr_pages, gfp_t gfp)
     #endif
     list_del(&page->lru);
     -    balloon_append(page);
+    if (!ext_pages)
+    balloon_append(page);

So what you are proposing is not really ballooning. You are just
piggybacking on existing interfaces, aren't you?

Sort of. Basically I need to {increase|decrease}_reservation, not
actually
allocating ballooned pages.
Do you think I can simply EXPORT_SYMBOL for
{increase|decrease}_reservation?
Any other suggestion?

I am actually wondering how much of that code you end up reusing. You
pretty much create new code paths in both routines and common code
ends
up being essentially the hypercall.

Well, I hoped that it would be easier to maintain if I modify existing
code
to support both use-cases, but I am also ok to create new routines if
this
seems to be reasonable - please let me know

    So the question is --- would it make
sense to do all of this separately from the balloon driver?

This can be done, but which driver will host this code then? If we
move from
the balloon driver, then this could go to either gntdev or grant-table.
What's your preference?

A separate module?
Is there any use for this feature outside of your zero-copy DRM driver?

Intel's hyper dma-buf (Dongwon/Matt CC'ed), V4L/GPU at least.

At the time I tried to upstream zcopy driver it was discussed and
decided that
it would be better if I remove all DRM specific code and move it to
Xen drivers.
Thus, this RFC.

But it can also be implemented as a dedicated Xen dma-buf driver which
will have all the
code from this RFC + a bit more (char/misc device handling at least).
This will also require a dedicated user-space library, just like
libxengnttab.so
for gntdev (now I have all new IOCTLs covered there).

If the idea of a dedicated Xen dma-buf driver seems to be more
attractive we
can work toward this solution. BTW, I do support this idea, but was not
sure if Xen community accepts yet another driver which duplicates
quite some code
of the existing gntdev/balloon/grant-table. And now after this RFC I
hope that all cons
and pros of both dedicated driver and gntdev/balloon/grant-table
extension are
clearly seen and we can make a decision.


IIRC the objection for a separate module was in the context of gntdev
was discussion, because (among other things) people didn't want to have
yet another file in /dev/xen/

Here we are talking about (a new) balloon-like module which doesn't
create any new user-visible interfaces. And as for duplicating code ---
as I said, I am not convinced there is much of duplication.

I might even argue that we should add a new config option for this module.

I am not quite sure I am fully following you here: so, you suggest
that we have balloon.c unchanged, but instead create a new
module (namely a file under the same folder as balloon.c, e.g.
dma-buf-reservation.c) and move those {increase|decrease}_reservation
routines (specific to dma-buf) to that new file? And make it selectable
via Kconfig? If so, then how about the changes to grant-table and gntdev?
Those will look inconsistent then.

If you suggest a new kernel driver module:
IMO, there is nothing bad if we create a dedicated kernel module
(driver) for Xen dma-buf handling selectable under Kconfig option.
Yes, this will create a yet another device under /dev/xen,
but most people will never see it if we set Kconfig to default to &qu
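
To make the "common code ends up being essentially the hypercall" point
concrete, a hedged sketch of what the reservation core of such a separate
module might reduce to; the names are illustrative, and the PV-specific
P2M/VA mapping updates shown in the patch are omitted here:

#include <xen/interface/memory.h>
#include <xen/interface/xen.h>
#include <asm/xen/hypercall.h>

/* Balloon out (hand back to Xen) the frames of caller-provided pages. */
static int example_reservation_decrease(int count, xen_pfn_t *frames)
{
	struct xen_memory_reservation reservation = {
		.address_bits = 0,
		.extent_order = 0,
		.domid        = DOMID_SELF
	};

	set_xen_guest_handle(reservation.extent_start, frames);
	reservation.nr_extents = count;
	/* Returns the number of extents actually released. */
	return HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);
}

/* Balloon the frames back in before the pages are reused by the kernel. */
static int example_reservation_increase(int count, xen_pfn_t *frames)
{
	struct xen_memory_reservation reservation = {
		.address_bits = 0,
		.extent_order = 0,
		.domid        = DOMID_SELF
	};

	set_xen_guest_handle(reservation.extent_start, frames);
	reservation.nr_extents = count;
	return HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
}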

Re: [Xen-devel] [RFC 1/3] xen/balloon: Allow allocating DMA buffers

2018-05-21 Thread Oleksandr Andrushchenko

On 05/21/2018 09:53 PM, Boris Ostrovsky wrote:

On 05/21/2018 01:32 PM, Oleksandr Andrushchenko wrote:

On 05/21/2018 07:35 PM, Boris Ostrovsky wrote:

On 05/21/2018 01:40 AM, Oleksandr Andrushchenko wrote:

On 05/19/2018 01:04 AM, Boris Ostrovsky wrote:

On 05/17/2018 04:26 AM, Oleksandr Andrushchenko wrote:

From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

A commit message would be useful.

Sure, v1 will have it

Signed-off-by: Oleksandr Andrushchenko
<oleksandr_andrushche...@epam.com>

    for (i = 0; i < nr_pages; i++) {
-    page = alloc_page(gfp);
-    if (page == NULL) {
-    nr_pages = i;
-    state = BP_EAGAIN;
-    break;
+    if (ext_pages) {
+    page = ext_pages[i];
+    } else {
+    page = alloc_page(gfp);
+    if (page == NULL) {
+    nr_pages = i;
+    state = BP_EAGAIN;
+    break;
+    }
    }
    scrub_page(page);
     list_add(&page->lru, &pages);
@@ -529,7 +565,7 @@ static enum bp_state
decrease_reservation(unsigned long nr_pages, gfp_t gfp)
    i = 0;
     list_for_each_entry_safe(page, tmp, &pages, lru) {
    /* XENMEM_decrease_reservation requires a GFN */
-    frame_list[i++] = xen_page_to_gfn(page);
+    frames[i++] = xen_page_to_gfn(page);
      #ifdef CONFIG_XEN_HAVE_PVMMU
    /*
@@ -552,18 +588,22 @@ static enum bp_state
decrease_reservation(unsigned long nr_pages, gfp_t gfp)
    #endif
     list_del(&page->lru);
    -    balloon_append(page);
+    if (!ext_pages)
+    balloon_append(page);

So what you are proposing is not really ballooning. You are just
piggybacking on existing interfaces, aren't you?

Sort of. Basically I need to {increase|decrease}_reservation, not
actually
allocating ballooned pages.
Do you think I can simply EXPORT_SYMBOL for
{increase|decrease}_reservation?
Any other suggestion?

I am actually wondering how much of that code you end up reusing. You
pretty much create new code paths in both routines and common code ends
up being essentially the hypercall.

Well, I hoped that it would be easier to maintain if I modify existing
code
to support both use-cases, but I am also ok to create new routines if
this
seems to be reasonable - please let me know

   So the question is --- would it make
sense to do all of this separately from the balloon driver?

This can be done, but which driver will host this code then? If we
move from
the balloon driver, then this could go to either gntdev or grant-table.
What's your preference?

A separate module?



Is there any use for this feature outside of your zero-copy DRM driver?

Intel's hyper dma-buf (Dongwon/Matt CC'ed), V4L/GPU at least.

At the time I tried to upstream zcopy driver it was discussed and 
decided that
it would be better if I remove all DRM specific code and move it to Xen 
drivers.

Thus, this RFC.

But it can also be implemented as a dedicated Xen dma-buf driver which 
will have all the

code from this RFC + a bit more (char/misc device handling at least).
This will also require a dedicated user-space library, just like 
libxengnttab.so

for gntdev (now I have all new IOCTLs covered there).

If the idea of a dedicated Xen dma-buf driver seems to be more attractive we
can work toward this solution. BTW, I do support this idea, but was not
sure if Xen community accepts yet another driver which duplicates quite 
some code
of the existing gntdev/balloon/grant-table. And now after this RFC I 
hope that all cons
and pros of both dedicated driver and gntdev/balloon/grant-table 
extension are

clearly seen and we can make a decision.



-boris

Thank you,
Oleksandr
[1] https://lists.freedesktop.org/archives/dri-devel/2018-April/173163.html


Re: [Xen-devel] [RFC 1/3] xen/balloon: Allow allocating DMA buffers

2018-05-21 Thread Oleksandr Andrushchenko

On 05/21/2018 07:35 PM, Boris Ostrovsky wrote:

On 05/21/2018 01:40 AM, Oleksandr Andrushchenko wrote:

On 05/19/2018 01:04 AM, Boris Ostrovsky wrote:

On 05/17/2018 04:26 AM, Oleksandr Andrushchenko wrote:

From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

A commit message would be useful.

Sure, v1 will have it

Signed-off-by: Oleksandr Andrushchenko
<oleksandr_andrushche...@epam.com>

   for (i = 0; i < nr_pages; i++) {
-    page = alloc_page(gfp);
-    if (page == NULL) {
-    nr_pages = i;
-    state = BP_EAGAIN;
-    break;
+    if (ext_pages) {
+    page = ext_pages[i];
+    } else {
+    page = alloc_page(gfp);
+    if (page == NULL) {
+    nr_pages = i;
+    state = BP_EAGAIN;
+    break;
+    }
   }
   scrub_page(page);
    list_add(&page->lru, &pages);
@@ -529,7 +565,7 @@ static enum bp_state
decrease_reservation(unsigned long nr_pages, gfp_t gfp)
   i = 0;
    list_for_each_entry_safe(page, tmp, &pages, lru) {
   /* XENMEM_decrease_reservation requires a GFN */
-    frame_list[i++] = xen_page_to_gfn(page);
+    frames[i++] = xen_page_to_gfn(page);
     #ifdef CONFIG_XEN_HAVE_PVMMU
   /*
@@ -552,18 +588,22 @@ static enum bp_state
decrease_reservation(unsigned long nr_pages, gfp_t gfp)
   #endif
    list_del(&page->lru);
   -    balloon_append(page);
+    if (!ext_pages)
+    balloon_append(page);

So what you are proposing is not really ballooning. You are just
piggybacking on existing interfaces, aren't you?

Sort of. Basically I need to {increase|decrease}_reservation, not
actually
allocating ballooned pages.
Do you think I can simply EXPORT_SYMBOL for
{increase|decrease}_reservation?
Any other suggestion?


I am actually wondering how much of that code you end up reusing. You
pretty much create new code paths in both routines and common code ends
up being essentially the hypercall.

Well, I hoped that it would be easier to maintain if I modify existing code
to support both use-cases, but I am also ok to create new routines if this
seems to be reasonable - please let me know

  So the question is --- would it make
sense to do all of this separately from the balloon driver?

This can be done, but which driver will host this code then? If we move from
the balloon driver, then this could go to either gntdev or grant-table.
What's your preference?


-boris

Thank you,
Oleksandr


Re: [Xen-devel][RFC 2/3] xen/grant-table: Extend API to work with DMA buffers

2018-05-20 Thread Oleksandr Andrushchenko

On 05/19/2018 01:19 AM, Boris Ostrovsky wrote:

On 05/17/2018 04:26 AM, Oleksandr Andrushchenko wrote:

From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>
---
  drivers/xen/grant-table.c | 49 +++
  include/xen/grant_table.h |  7 ++
  2 files changed, 56 insertions(+)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index bb36b1e1dbcc..c27bcc420575 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -729,6 +729,55 @@ void gnttab_free_pages(int nr_pages, struct page **pages)
  }
  EXPORT_SYMBOL(gnttab_free_pages);
  
+int gnttab_dma_alloc_pages(struct device *dev, bool coherent,

+  int nr_pages, struct page **pages,
+  void **vaddr, dma_addr_t *dev_bus_addr)
+{
+   int i;
+   int ret;
+
+   ret = alloc_dma_xenballooned_pages(dev, coherent, nr_pages, pages,
+  vaddr, dev_bus_addr);
+   if (ret < 0)
+   return ret;
+
+   for (i = 0; i < nr_pages; i++) {
+#if BITS_PER_LONG < 64
+   struct xen_page_foreign *foreign;
+
+   foreign = kzalloc(sizeof(*foreign), GFP_KERNEL);
+   if (!foreign) {
+   gnttab_dma_free_pages(dev, flags, nr_pages, pages,
+ *vaddr, *dev_bus_addr);
+   return -ENOMEM;
+   }
+   set_page_private(pages[i], (unsigned long)foreign);
+#endif
+   SetPagePrivate(pages[i]);
+   }
+   return 0;
+}
+EXPORT_SYMBOL(gnttab_dma_alloc_pages);
+
+void gnttab_dma_free_pages(struct device *dev, bool coherent,
+  int nr_pages, struct page **pages,
+  void *vaddr, dma_addr_t dev_bus_addr)
+{
+   int i;
+
+   for (i = 0; i < nr_pages; i++) {
+   if (PagePrivate(pages[i])) {
+#if BITS_PER_LONG < 64
+   kfree((void *)page_private(pages[i]));
+#endif
+   ClearPagePrivate(pages[i]);
+   }
+   }
+   free_dma_xenballooned_pages(dev, coherent, nr_pages, pages,
+   vaddr, dev_bus_addr);
+}
+EXPORT_SYMBOL(gnttab_dma_free_pages);


Given that these routines look almost exactly like their non-dma
counterparts I wonder whether common code could be factored out.

Yes, this can be done

-boris





+
  /* Handling of paged out grant targets (GNTST_eagain) */
  #define MAX_DELAY 256
  static inline void
diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index 34b1379f9777..20ee2b5ba965 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -195,6 +195,13 @@ void gnttab_free_auto_xlat_frames(void);
  int gnttab_alloc_pages(int nr_pages, struct page **pages);
  void gnttab_free_pages(int nr_pages, struct page **pages);
  
+int gnttab_dma_alloc_pages(struct device *dev, bool coherent,

+  int nr_pages, struct page **pages,
+  void **vaddr, dma_addr_t *dev_bus_addr);
+void gnttab_dma_free_pages(struct device *dev, bool coherent,
+  int nr_pages, struct page **pages,
+  void *vaddr, dma_addr_t dev_bus_addr);
+
  int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
struct gnttab_map_grant_ref *kmap_ops,
struct page **pages, unsigned int count);
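
A hedged sketch of the factoring suggested above: the PagePrivate/foreign
bookkeeping is what gnttab_alloc_pages() and the proposed
gnttab_dma_alloc_pages() have in common, so it could live in helpers like
these (the names are illustrative, not from the series):

#include <linux/mm.h>
#include <linux/slab.h>
#include <xen/grant_table.h>
#include <xen/page.h>

static int example_pages_set_private(int nr_pages, struct page **pages)
{
	int i;

	for (i = 0; i < nr_pages; i++) {
#if BITS_PER_LONG < 64
		struct xen_page_foreign *foreign;

		foreign = kzalloc(sizeof(*foreign), GFP_KERNEL);
		if (!foreign)
			return -ENOMEM;	/* caller unwinds pages already set */
		set_page_private(pages[i], (unsigned long)foreign);
#endif
		SetPagePrivate(pages[i]);
	}
	return 0;
}

static void example_pages_clear_private(int nr_pages, struct page **pages)
{
	int i;

	for (i = 0; i < nr_pages; i++) {
		if (!PagePrivate(pages[i]))
			continue;
#if BITS_PER_LONG < 64
		kfree((void *)page_private(pages[i]));
#endif
		ClearPagePrivate(pages[i]);
	}
}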




Re: [Xen-devel] [RFC 1/3] xen/balloon: Allow allocating DMA buffers

2018-05-20 Thread Oleksandr Andrushchenko

On 05/19/2018 01:04 AM, Boris Ostrovsky wrote:

On 05/17/2018 04:26 AM, Oleksandr Andrushchenko wrote:

From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>


A commit message would be useful.

Sure, v1 will have it



Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

for (i = 0; i < nr_pages; i++) {
-   page = alloc_page(gfp);
-   if (page == NULL) {
-   nr_pages = i;
-   state = BP_EAGAIN;
-   break;
+   if (ext_pages) {
+   page = ext_pages[i];
+   } else {
+   page = alloc_page(gfp);
+   if (page == NULL) {
+   nr_pages = i;
+   state = BP_EAGAIN;
+   break;
+   }
}
scrub_page(page);
	list_add(&page->lru, &pages);
@@ -529,7 +565,7 @@ static enum bp_state decrease_reservation(unsigned long 
nr_pages, gfp_t gfp)
i = 0;
	list_for_each_entry_safe(page, tmp, &pages, lru) {
/* XENMEM_decrease_reservation requires a GFN */
-   frame_list[i++] = xen_page_to_gfn(page);
+   frames[i++] = xen_page_to_gfn(page);
  
  #ifdef CONFIG_XEN_HAVE_PVMMU

/*
@@ -552,18 +588,22 @@ static enum bp_state decrease_reservation(unsigned long 
nr_pages, gfp_t gfp)
  #endif
	list_del(&page->lru);
  
-		balloon_append(page);

+   if (!ext_pages)
+   balloon_append(page);


So what you are proposing is not really ballooning. You are just
piggybacking on existing interfaces, aren't you?

Sort of. Basically I need to {increase|decrease}_reservation, not actually
allocating ballooned pages.
Do you think I can simply EXPORT_SYMBOL for {increase|decrease}_reservation?
Any other suggestion?

-boris



Thank you,
Oleksandr


[Xen-devel][RFC 2/3] xen/grant-table: Extend API to work with DMA buffers

2018-05-17 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>
---
 drivers/xen/grant-table.c | 49 +++
 include/xen/grant_table.h |  7 ++
 2 files changed, 56 insertions(+)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index bb36b1e1dbcc..c27bcc420575 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -729,6 +729,55 @@ void gnttab_free_pages(int nr_pages, struct page **pages)
 }
 EXPORT_SYMBOL(gnttab_free_pages);
 
+int gnttab_dma_alloc_pages(struct device *dev, bool coherent,
+  int nr_pages, struct page **pages,
+  void **vaddr, dma_addr_t *dev_bus_addr)
+{
+   int i;
+   int ret;
+
+   ret = alloc_dma_xenballooned_pages(dev, coherent, nr_pages, pages,
+  vaddr, dev_bus_addr);
+   if (ret < 0)
+   return ret;
+
+   for (i = 0; i < nr_pages; i++) {
+#if BITS_PER_LONG < 64
+   struct xen_page_foreign *foreign;
+
+   foreign = kzalloc(sizeof(*foreign), GFP_KERNEL);
+   if (!foreign) {
+   gnttab_dma_free_pages(dev, flags, nr_pages, pages,
+ *vaddr, *dev_bus_addr);
+   return -ENOMEM;
+   }
+   set_page_private(pages[i], (unsigned long)foreign);
+#endif
+   SetPagePrivate(pages[i]);
+   }
+   return 0;
+}
+EXPORT_SYMBOL(gnttab_dma_alloc_pages);
+
+void gnttab_dma_free_pages(struct device *dev, bool coherent,
+  int nr_pages, struct page **pages,
+  void *vaddr, dma_addr_t dev_bus_addr)
+{
+   int i;
+
+   for (i = 0; i < nr_pages; i++) {
+   if (PagePrivate(pages[i])) {
+#if BITS_PER_LONG < 64
+   kfree((void *)page_private(pages[i]));
+#endif
+   ClearPagePrivate(pages[i]);
+   }
+   }
+   free_dma_xenballooned_pages(dev, coherent, nr_pages, pages,
+   vaddr, dev_bus_addr);
+}
+EXPORT_SYMBOL(gnttab_dma_free_pages);
+
 /* Handling of paged out grant targets (GNTST_eagain) */
 #define MAX_DELAY 256
 static inline void
diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index 34b1379f9777..20ee2b5ba965 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -195,6 +195,13 @@ void gnttab_free_auto_xlat_frames(void);
 int gnttab_alloc_pages(int nr_pages, struct page **pages);
 void gnttab_free_pages(int nr_pages, struct page **pages);
 
+int gnttab_dma_alloc_pages(struct device *dev, bool coherent,
+  int nr_pages, struct page **pages,
+  void **vaddr, dma_addr_t *dev_bus_addr);
+void gnttab_dma_free_pages(struct device *dev, bool coherent,
+  int nr_pages, struct page **pages,
+  void *vaddr, dma_addr_t dev_bus_addr);
+
 int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
struct gnttab_map_grant_ref *kmap_ops,
struct page **pages, unsigned int count);
-- 
2.17.0
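
For illustration, a hedged sketch (not part of the series) of how a caller
might combine the proposed gnttab_dma_alloc_pages() with the existing grant
API to share the resulting buffer; "otherend_id" and the unwinding are
placeholders:

#include <xen/grant_table.h>
#include <xen/page.h>

static int example_alloc_and_grant(struct device *dev, domid_t otherend_id,
				   int nr_pages, struct page **pages,
				   grant_ref_t *refs)
{
	void *vaddr;
	dma_addr_t bus_addr;
	int i, ret;

	/* coherent == true asks for dma_alloc_coherent() backed memory */
	ret = gnttab_dma_alloc_pages(dev, true, nr_pages, pages,
				     &vaddr, &bus_addr);
	if (ret < 0)
		return ret;

	for (i = 0; i < nr_pages; i++) {
		ret = gnttab_grant_foreign_access(otherend_id,
						  xen_page_to_gfn(pages[i]),
						  0 /* read/write */);
		if (ret < 0)
			goto fail;
		refs[i] = ret;
	}
	return 0;

fail:
	while (--i >= 0)
		gnttab_end_foreign_access(refs[i], 0, 0UL);
	gnttab_dma_free_pages(dev, true, nr_pages, pages, vaddr, bus_addr);
	return ret;
}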



[Xen-devel][RFC 1/3] xen/balloon: Allow allocating DMA buffers

2018-05-17 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>
---
 drivers/xen/balloon.c | 214 +++---
 drivers/xen/xen-balloon.c |   2 +
 include/xen/balloon.h |  11 +-
 3 files changed, 188 insertions(+), 39 deletions(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index e4db19e88ab1..e3a145aa9f29 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -415,8 +415,10 @@ static bool balloon_is_inflated(void)
return balloon_stats.balloon_low || balloon_stats.balloon_high;
 }
 
-static enum bp_state increase_reservation(unsigned long nr_pages)
+static enum bp_state increase_reservation(unsigned long nr_pages,
+ struct page **ext_pages)
 {
+   enum bp_state ret = BP_DONE;
int rc;
unsigned long i;
struct page   *page;
@@ -425,32 +427,49 @@ static enum bp_state increase_reservation(unsigned long 
nr_pages)
.extent_order = EXTENT_ORDER,
.domid= DOMID_SELF
};
+   xen_pfn_t *frames;
 
-   if (nr_pages > ARRAY_SIZE(frame_list))
-   nr_pages = ARRAY_SIZE(frame_list);
+   if (nr_pages > ARRAY_SIZE(frame_list)) {
+   frames = kcalloc(nr_pages, sizeof(xen_pfn_t), GFP_KERNEL);
+   if (!frames)
+   return BP_ECANCELED;
+   } else {
+   frames = frame_list;
+   }
 
-   page = list_first_entry_or_null(&ballooned_pages, struct page, lru);
-   for (i = 0; i < nr_pages; i++) {
-   if (!page) {
-   nr_pages = i;
-   break;
-   }
+   /* XENMEM_populate_physmap requires a PFN based on Xen
+* granularity.
+*/
+   if (ext_pages) {
+   for (i = 0; i < nr_pages; i++)
+   frames[i] = page_to_xen_pfn(ext_pages[i]);
+   } else {
+   page = list_first_entry_or_null(&ballooned_pages,
+   struct page, lru);
+   for (i = 0; i < nr_pages; i++) {
+   if (!page) {
+   nr_pages = i;
+   break;
+   }
 
-   /* XENMEM_populate_physmap requires a PFN based on Xen
-* granularity.
-*/
-   frame_list[i] = page_to_xen_pfn(page);
-   page = balloon_next_page(page);
+   frames[i] = page_to_xen_pfn(page);
+   page = balloon_next_page(page);
+   }
}
 
-   set_xen_guest_handle(reservation.extent_start, frame_list);
+   set_xen_guest_handle(reservation.extent_start, frames);
reservation.nr_extents = nr_pages;
	rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
-   if (rc <= 0)
-   return BP_EAGAIN;
+   if (rc <= 0) {
+   ret = BP_EAGAIN;
+   goto out;
+   }
 
for (i = 0; i < rc; i++) {
-   page = balloon_retrieve(false);
+   if (ext_pages)
+   page = ext_pages[i];
+   else
+   page = balloon_retrieve(false);
BUG_ON(page == NULL);
 
 #ifdef CONFIG_XEN_HAVE_PVMMU
@@ -463,14 +482,14 @@ static enum bp_state increase_reservation(unsigned long 
nr_pages)
if (!xen_feature(XENFEAT_auto_translated_physmap)) {
unsigned long pfn = page_to_pfn(page);
 
-   set_phys_to_machine(pfn, frame_list[i]);
+   set_phys_to_machine(pfn, frames[i]);
 
/* Link back into the page tables if not highmem. */
if (!PageHighMem(page)) {
int ret;
ret = HYPERVISOR_update_va_mapping(
(unsigned long)__va(pfn << 
PAGE_SHIFT),
-   mfn_pte(frame_list[i], 
PAGE_KERNEL),
+   mfn_pte(frames[i], PAGE_KERNEL),
0);
BUG_ON(ret);
}
@@ -478,15 +497,22 @@ static enum bp_state increase_reservation(unsigned long 
nr_pages)
 #endif
 
/* Relinquish the page back to the allocator. */
-   __free_reserved_page(page);
+   if (!ext_pages)
+   __free_reserved_page(page);
}
 
-   balloon_stats.current_pages += rc;
+   if (!ext_pages)
+   balloon_stats.current_pages += rc;
 
-   return BP_DONE;
+out:
+   if (frames != frame_list)
+   kfree(frames);
+
+   return ret;
 }
 
-static enum bp_state decrease_reservation(unsigned long 

[Xen-devel][RFC 3/3] xen/gntdev: Add support for Linux dma buffers

2018-05-17 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>
---
 drivers/xen/gntdev.c  | 954 +-
 include/uapi/xen/gntdev.h | 101 
 include/xen/gntdev_exp.h  |  23 +
 3 files changed, 1066 insertions(+), 12 deletions(-)
 create mode 100644 include/xen/gntdev_exp.h

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 9510f228efe9..0ee88e193362 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -4,6 +4,8 @@
  * Device for accessing (in user-space) pages that have been granted by other
  * domains.
  *
+ * DMA buffer implementation is based on drivers/gpu/drm/drm_prime.c.
+ *
  * Copyright (c) 2006-2007, D G Murray.
  *   (c) 2009 Gerd Hoffmann <kra...@redhat.com>
  *
@@ -37,6 +39,9 @@
 #include 
 #include 
 
+#include 
+#include 
+
 #include 
 #include 
 #include 
@@ -61,16 +66,39 @@ static atomic_t pages_mapped = ATOMIC_INIT(0);
 static int use_ptemod;
 #define populate_freeable_maps use_ptemod
 
+#ifndef GRANT_INVALID_REF
+/*
+ * Note on usage of grant reference 0 as invalid grant reference:
+ * grant reference 0 is valid, but never exposed to a driver,
+ * because of the fact it is already in use/reserved by the PV console.
+ */
+#define GRANT_INVALID_REF  0
+#endif
+
 struct gntdev_priv {
/* maps with visible offsets in the file descriptor */
struct list_head maps;
/* maps that are not visible; will be freed on munmap.
 * Only populated if populate_freeable_maps == 1 */
struct list_head freeable_maps;
+   /* List of dma-bufs. */
+   struct list_head dma_bufs;
/* lock protects maps and freeable_maps */
struct mutex lock;
struct mm_struct *mm;
struct mmu_notifier mn;
+
+   /* Private data of the hyper DMA buffers. */
+
+   struct device *dev;
+   /* List of exported DMA buffers. */
+   struct list_head dmabuf_exp_list;
+   /* List of wait objects. */
+   struct list_head dmabuf_exp_wait_list;
+   /* List of imported DMA buffers. */
+   struct list_head dmabuf_imp_list;
+   /* This is the lock which protects dma_buf_xxx lists. */
+   struct mutex dmabuf_lock;
 };
 
 struct unmap_notify {
@@ -95,10 +123,65 @@ struct grant_map {
struct gnttab_unmap_grant_ref *kunmap_ops;
struct page **pages;
unsigned long pages_vm_start;
+
+   /*
+* All the fields starting with dmabuf_ are only valid if this
+* mapping is used for exporting a DMA buffer.
+* If dmabuf_vaddr is not NULL then this mapping is backed by DMA
+* capable memory.
+*/
+
+   /* Flags used to create this DMA buffer: GNTDEV_DMABUF_FLAG_XXX. */
+   bool dmabuf_flags;
+   /* Virtual/CPU address of the DMA buffer. */
+   void *dmabuf_vaddr;
+   /* Bus address of the DMA buffer. */
+   dma_addr_t dmabuf_bus_addr;
+};
+
+struct hyper_dmabuf {
+   struct gntdev_priv *priv;
+   struct dma_buf *dmabuf;
+   struct list_head next;
+   int fd;
+
+   union {
+   struct {
+   /* Exported buffers are reference counted. */
+   struct kref refcount;
+   struct grant_map *map;
+   } exp;
+   struct {
+   /* Granted references of the imported buffer. */
+   grant_ref_t *refs;
+   /* Scatter-gather table of the imported buffer. */
+   struct sg_table *sgt;
+   /* dma-buf attachment of the imported buffer. */
+   struct dma_buf_attachment *attach;
+   } imp;
+   } u;
+
+   /* Number of pages this buffer has. */
+   int nr_pages;
+   /* Pages of this buffer. */
+   struct page **pages;
+};
+
+struct hyper_dmabuf_wait_obj {
+   struct list_head next;
+   struct hyper_dmabuf *hyper_dmabuf;
+   struct completion completion;
+};
+
+struct hyper_dambuf_attachment {
+   struct sg_table *sgt;
+   enum dma_data_direction dir;
 };
 
 static int unmap_grant_pages(struct grant_map *map, int offset, int pages);
 
+static struct miscdevice gntdev_miscdev;
+
 /* -- */
 
 static void gntdev_print_maps(struct gntdev_priv *priv,
@@ -120,8 +203,17 @@ static void gntdev_free_map(struct grant_map *map)
if (map == NULL)
return;
 
-   if (map->pages)
+   if (map->dmabuf_vaddr) {
+   bool coherent = map->dmabuf_flags &
+   GNTDEV_DMABUF_FLAG_DMA_COHERENT;
+
+   gnttab_dma_free_pages(gntdev_miscdev.this_device,
+ coherent, map->count, map->pages,
+ map->dmabuf_vaddr, map->dmab
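
A hedged sketch of the import direction these structures describe: take a
dma-buf fd, attach it to the gntdev device, map it to an sg_table and grant
the underlying pages to the other domain. Teardown and status checks are
trimmed and the function name is a placeholder:

#include <linux/dma-buf.h>
#include <linux/dma-direction.h>
#include <linux/scatterlist.h>
#include <xen/grant_table.h>
#include <xen/page.h>

static int example_import_and_grant(struct device *dev, int fd,
				    domid_t otherend_id,
				    grant_ref_t *refs, int nr_refs)
{
	struct dma_buf *dmabuf;
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;
	struct sg_page_iter iter;
	int i = 0;

	dmabuf = dma_buf_get(fd);
	if (IS_ERR(dmabuf))
		return PTR_ERR(dmabuf);

	attach = dma_buf_attach(dmabuf, dev);
	if (IS_ERR(attach)) {
		dma_buf_put(dmabuf);
		return PTR_ERR(attach);
	}

	sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
	if (IS_ERR(sgt)) {
		dma_buf_detach(dmabuf, attach);
		dma_buf_put(dmabuf);
		return PTR_ERR(sgt);
	}

	/* Grant every backing page to the other domain (negative returns
	 * from gnttab_grant_foreign_access() should be handled). */
	for_each_sg_page(sgt->sgl, &iter, sgt->nents, 0) {
		if (i == nr_refs)
			break;
		refs[i++] = gnttab_grant_foreign_access(otherend_id,
				xen_page_to_gfn(sg_page_iter_page(&iter)), 0);
	}

	/* dmabuf/attach/sgt stay referenced for the import's lifetime. */
	return 0;
}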

[Xen-devel][RFC 0/3] dma-buf support for gntdev

2018-05-17 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>

This work is in response to my previous attempt to introduce Xen/DRM
zero-copy driver [1] to enable Linux dma-buf API [2] for Xen based
frontends/backends. There is also an existing hyper_dmabuf approach
available [3] which, if reworked to utilize the proposed solution,
can greatly benefit as well.

The original rationale behind this work was to enable zero-copying
use-cases while working with Xen para-virtual display driver [4]:
when using Xen PV DRM frontend driver then on backend side one will
need to do copying of display buffers' contents (filled by the
frontend's user-space) into buffers allocated at the backend side.
Taking into account the size of display buffers and frames per
second it may result in unneeded huge data bus occupation and
performance loss.

The helper driver [4] allows implementing zero-copying use-cases
when using Xen para-virtualized frontend display driver by implementing
a DRM/KMS helper driver running on backend's side.
It utilizes PRIME buffers API (implemented on top of Linux dma-buf)
to share frontend's buffers with physical device drivers on
backend's side:

 - a dumb buffer created on backend's side can be shared
   with the Xen PV frontend driver, so it directly writes
   into backend's domain memory (into the buffer exported from
   DRM/KMS driver of a physical display device)
 - a dumb buffer allocated by the frontend can be imported
   into physical device DRM/KMS driver, thus allowing to
   achieve no copying as well

Finally, it was discussed and decided ([1], [5]) that it is worth
implementing such use-cases via extension of the existing Xen gntdev
driver instead of introducing new DRM specific driver.
Please note, that the support of dma-buf is Linux only,
as dma-buf is a Linux only thing.

Now to the proposed solution. The changes  to the existing Xen drivers
in the Linux kernel fall into 2 categories:
1. DMA-able memory buffer allocation and ballooning in/out the pages
   of such a buffer.
   This is required if we are about to share dma-buf with the hardware
   that does require those to be allocated with dma_alloc_xxx API.
   (It is still possible to allocate a dma-buf from any other memory,
   e.g. system pages).
2. Extension of the gntdev driver to enable it to import/export dma-buf’s.

The first two patches in this series solve #1 and the last one is for #2.
The corresponding libxengnttab changes are available at [6].

All the above was tested with display backend [7] and its accompanying
helper library [8] on Renesas ARM64 based board.


*To all the communities*: I would like to ask you to review the proposed
solution and give feedback on it, so I can improve and send final
patches for review (this is still work in progress, but enough to start
discussing the implementation).


Thank you in advance,
Oleksandr Andrushchenko

[1] https://lists.freedesktop.org/archives/dri-devel/2018-April/173163.html
[2] 
https://elixir.bootlin.com/linux/v4.17-rc5/source/Documentation/driver-api/dma-buf.rst
[3] https://lists.xenproject.org/archives/html/xen-devel/2018-02/msg01202.html
[4] https://cgit.freedesktop.org/drm/drm-misc/tree/drivers/gpu/drm/xen
[5] https://patchwork.kernel.org/patch/10279681/
[6] https://github.com/andr2000/xen/tree/xen_dma_buf_v1
[7] https://github.com/andr2000/displ_be/tree/xen_dma_buf_v1
[8] https://github.com/andr2000/libxenbe/tree/xen_dma_buf_v1

Oleksandr Andrushchenko (3):
  xen/balloon: Allow allocating DMA buffers
  xen/grant-table: Extend API to work with DMA buffers
  xen/gntdev: Add support for Linux dma buffers

 drivers/xen/balloon.c | 214 +++--
 drivers/xen/gntdev.c  | 954 +-
 drivers/xen/grant-table.c |  49 ++
 drivers/xen/xen-balloon.c |   2 +
 include/uapi/xen/gntdev.h | 101 
 include/xen/balloon.h |  11 +-
 include/xen/gntdev_exp.h  |  23 +
 include/xen/grant_table.h |   7 +
 8 files changed, 1310 insertions(+), 51 deletions(-)
 create mode 100644 include/xen/gntdev_exp.h

-- 
2.17.0



Re: [RfC PATCH] Add udmabuf misc device

2018-04-16 Thread Oleksandr Andrushchenko

On 04/16/2018 12:32 PM, Daniel Vetter wrote:

On Mon, Apr 16, 2018 at 10:22 AM, Oleksandr Andrushchenko
<andr2...@gmail.com> wrote:

On 04/16/2018 10:43 AM, Daniel Vetter wrote:

On Mon, Apr 16, 2018 at 10:16:31AM +0300, Oleksandr Andrushchenko wrote:

On 04/13/2018 06:37 PM, Daniel Vetter wrote:

On Wed, Apr 11, 2018 at 08:59:32AM +0300, Oleksandr Andrushchenko wrote:

On 04/10/2018 08:26 PM, Dongwon Kim wrote:

On Tue, Apr 10, 2018 at 09:37:53AM +0300, Oleksandr Andrushchenko
wrote:

On 04/06/2018 09:57 PM, Dongwon Kim wrote:

On Fri, Apr 06, 2018 at 03:36:03PM +0300, Oleksandr Andrushchenko
wrote:

On 04/06/2018 02:57 PM, Gerd Hoffmann wrote:

  Hi,


I fail to see any common ground for xen-zcopy and udmabuf ...

Does the above mean you can assume that xen-zcopy and udmabuf
can co-exist as two different solutions?

Well, udmabuf route isn't fully clear yet, but yes.

See also gvt (intel vgpu), where the hypervisor interface is
abstracted
away into a separate kernel modules even though most of the actual
vgpu
emulation code is common.

Thank you for your input, I'm just trying to figure out
which of the three z-copy solutions intersect and how much

And what about hyper-dmabuf?

xen z-copy solution is pretty similar fundamentally to hyper_dmabuf
in terms of these core sharing feature:

1. the sharing process - import prime/dmabuf from the producer ->
extract
underlying pages and get those shared -> return references for
shared pages

Another thing is danvet was kind of against to the idea of importing
existing
dmabuf/prime buffer and forward it to the other domain due to
synchronization
issues. He proposed to make hyper_dmabuf only work as an exporter so
that it
can have a full control over the buffer. I think we need to talk about
this
further as well.

Yes, I saw this. But this limits the use-cases so much.
For instance, running Android as a Guest (which uses ION to allocate
buffers) means that finally HW composer will import dma-buf into
the DRM driver. Then, in case of xen-front for example, it needs to be
shared with the backend (Host side). Of course, we can change
user-space
to make xen-front allocate the buffers (make it exporter), but what we
try
to avoid is to change user-space which in normal world would have
remain
unchanged otherwise.
So, I do think we have to support this use-case and just have to
understand
the complexity.

Erm, why do you need importer capability for this use-case?

guest1 -> ION -> xen-front -> hypervisor -> guest 2 -> xen-zcopy exposes
that dma-buf -> import to the real display hw

No where in this chain do you need xen-zcopy to be able to import a
dma-buf (within linux, it needs to import a bunch of pages from the
hypervisor).

Now if your plan is to use xen-zcopy in the guest1 instead of xen-front,
then you indeed need to import.

This is the exact use-case I was referring to while saying
we need to import on Guest1 side. If hyper-dmabuf is so
generic that there is no xen-front in the picture, then
it needs to import a dma-buf, so it can be exported at Guest2 side.

But that imo doesn't make sense:
- xen-front gives you clearly defined flip events you can forward to the
 hypervisor. xen-zcopy would need to add that again.

xen-zcopy is a helper driver which doesn't handle page flips
and is not a KMS driver as one might think of: the DRM UAPI it uses is
just to export a dma-buf as a PRIME buffer, but that's it.
Flipping etc. is done by the backend [1], not xen-zcopy.

Same for
 hyperdmabuf (and really we're not going to shuffle struct dma_fence
over
 the wire in a generic fashion between hypervisor guests).

- xen-front already has the idea of pixel format for the buffer (and any
 other metadata). Again, xen-zcopy and hyperdmabuf lack that, would
need
 to add it shoehorned in somehow.

Again, here you are talking of something which is implemented in
Xen display backend, not xen-zcopy, e.g. display backend can
implement para-virtual display w/o xen-zcopy at all, but in this case
there is a memory copying for each frame. With the help of xen-zcopy
the backend feeds xen-front's buffers directly into Guest2 DRM/KMS or
Weston or whatever as xen-zcopy exports remote buffers as PRIME buffers,
thus no buffer copying is required.

Why do you need to copy on every frame for xen-front? In the above
pipeline, using xen-front I see 0 architectural reasons to have a copy
anywhere.

This seems to be the core of the confusion we're having here.

Ok, so I'll try to explain:
1. xen-front - produces a display buffer to be shown at Guest2
by the backend, shares its grant references with the backend
2. xen-front sends page flip event to the backend specifying the
buffer in question
3. Backend takes the shared buffer (which is only a buffer mapped into
backend's memory, it is not a dma-buf/PRIME one) and makes memcpy from
it to a local dumb/surface

Why do you even do that? The copying here I mean - why don't you just
di

Re: [RfC PATCH] Add udmabuf misc device

2018-04-16 Thread Oleksandr Andrushchenko

On 04/16/2018 10:43 AM, Daniel Vetter wrote:

On Mon, Apr 16, 2018 at 10:16:31AM +0300, Oleksandr Andrushchenko wrote:

On 04/13/2018 06:37 PM, Daniel Vetter wrote:

On Wed, Apr 11, 2018 at 08:59:32AM +0300, Oleksandr Andrushchenko wrote:

On 04/10/2018 08:26 PM, Dongwon Kim wrote:

On Tue, Apr 10, 2018 at 09:37:53AM +0300, Oleksandr Andrushchenko wrote:

On 04/06/2018 09:57 PM, Dongwon Kim wrote:

On Fri, Apr 06, 2018 at 03:36:03PM +0300, Oleksandr Andrushchenko wrote:

On 04/06/2018 02:57 PM, Gerd Hoffmann wrote:

 Hi,


I fail to see any common ground for xen-zcopy and udmabuf ...

Does the above mean you can assume that xen-zcopy and udmabuf
can co-exist as two different solutions?

Well, udmabuf route isn't fully clear yet, but yes.

See also gvt (intel vgpu), where the hypervisor interface is abstracted
away into a separate kernel modules even though most of the actual vgpu
emulation code is common.

Thank you for your input, I'm just trying to figure out
which of the three z-copy solutions intersect and how much

And what about hyper-dmabuf?

xen z-copy solution is pretty similar fundamentally to hyper_dmabuf
in terms of these core sharing feature:

1. the sharing process - import prime/dmabuf from the producer -> extract
underlying pages and get those shared -> return references for shared pages

Another thing is danvet was kind of against to the idea of importing existing
dmabuf/prime buffer and forward it to the other domain due to synchronization
issues. He proposed to make hyper_dmabuf only work as an exporter so that it
can have a full control over the buffer. I think we need to talk about this
further as well.

Yes, I saw this. But this limits the use-cases so much.
For instance, running Android as a Guest (which uses ION to allocate
buffers) means that finally HW composer will import dma-buf into
the DRM driver. Then, in case of xen-front for example, it needs to be
shared with the backend (Host side). Of course, we can change user-space
to make xen-front allocate the buffers (make it exporter), but what we try
to avoid is to change user-space which in normal world would have remain
unchanged otherwise.
So, I do think we have to support this use-case and just have to understand
the complexity.

Erm, why do you need importer capability for this use-case?

guest1 -> ION -> xen-front -> hypervisor -> guest 2 -> xen-zcopy exposes
that dma-buf -> import to the real display hw

No where in this chain do you need xen-zcopy to be able to import a
dma-buf (within linux, it needs to import a bunch of pages from the
hypervisor).

Now if your plan is to use xen-zcopy in the guest1 instead of xen-front,
then you indeed need to import.

This is the exact use-case I was referring to while saying
we need to import on Guest1 side. If hyper-dmabuf is so
generic that there is no xen-front in the picture, then
it needs to import a dma-buf, so it can be exported at Guest2 side.

   But that imo doesn't make sense:
- xen-front gives you clearly defined flip events you can forward to the
hypervisor. xen-zcopy would need to add that again.

xen-zcopy is a helper driver which doesn't handle page flips
and is not a KMS driver as one might think of: the DRM UAPI it uses is
just to export a dma-buf as a PRIME buffer, but that's it.
Flipping etc. is done by the backend [1], not xen-zcopy.

   Same for
hyperdmabuf (and really we're not going to shuffle struct dma_fence over
the wire in a generic fashion between hypervisor guests).

- xen-front already has the idea of pixel format for the buffer (and any
other metadata). Again, xen-zcopy and hyperdmabuf lack that, would need
to add it shoehorned in somehow.

Again, here you are talking of something which is implemented in
Xen display backend, not xen-zcopy, e.g. display backend can
implement para-virtual display w/o xen-zcopy at all, but in this case
there is a memory copying for each frame. With the help of xen-zcopy
the backend feeds xen-front's buffers directly into Guest2 DRM/KMS or
Weston or whatever as xen-zcopy exports remote buffers as PRIME buffers,
thus no buffer copying is required.

Why do you need to copy on every frame for xen-front? In the above
pipeline, using xen-front I see 0 architectural reasons to have a copy
anywhere.

This seems to be the core of the confusion we're having here.

Ok, so I'll try to explain:
1. xen-front - produces a display buffer to be shown at Guest2
by the backend, shares its grant references with the backend
2. xen-front sends page flip event to the backend specifying the
buffer in question
3. Backend takes the shared buffer (which is only a buffer mapped into
backend's memory, it is not a dma-buf/PRIME one) and makes memcpy from
it to a local dumb/surface
4. Backend flips that local dumb buffer/surface

If I have a xen-zcopy helper driver then I can avoid doing step 3):
1) 2) remain the same as above
3) Initially for a new display buffer, backend calls xen-zcopy to create
a

Re: [RfC PATCH] Add udmabuf misc device

2018-04-16 Thread Oleksandr Andrushchenko

On 04/13/2018 06:37 PM, Daniel Vetter wrote:

On Wed, Apr 11, 2018 at 08:59:32AM +0300, Oleksandr Andrushchenko wrote:

On 04/10/2018 08:26 PM, Dongwon Kim wrote:

On Tue, Apr 10, 2018 at 09:37:53AM +0300, Oleksandr Andrushchenko wrote:

On 04/06/2018 09:57 PM, Dongwon Kim wrote:

On Fri, Apr 06, 2018 at 03:36:03PM +0300, Oleksandr Andrushchenko wrote:

On 04/06/2018 02:57 PM, Gerd Hoffmann wrote:

Hi,


I fail to see any common ground for xen-zcopy and udmabuf ...

Does the above mean you can assume that xen-zcopy and udmabuf
can co-exist as two different solutions?

Well, udmabuf route isn't fully clear yet, but yes.

See also gvt (intel vgpu), where the hypervisor interface is abstracted
away into a separate kernel modules even though most of the actual vgpu
emulation code is common.

Thank you for your input, I'm just trying to figure out
which of the three z-copy solutions intersect and how much

And what about hyper-dmabuf?

xen z-copy solution is pretty similar fundamentally to hyper_dmabuf
in terms of these core sharing feature:

1. the sharing process - import prime/dmabuf from the producer -> extract
underlying pages and get those shared -> return references for shared pages

Another thing is danvet was kind of against to the idea of importing existing
dmabuf/prime buffer and forward it to the other domain due to synchronization
issues. He proposed to make hyper_dmabuf only work as an exporter so that it
can have a full control over the buffer. I think we need to talk about this
further as well.

Yes, I saw this. But this limits the use-cases so much.
For instance, running Android as a Guest (which uses ION to allocate
buffers) means that finally HW composer will import dma-buf into
the DRM driver. Then, in case of xen-front for example, it needs to be
shared with the backend (Host side). Of course, we can change user-space
to make xen-front allocate the buffers (make it exporter), but what we try
to avoid is to change user-space which in normal world would have remain
unchanged otherwise.
So, I do think we have to support this use-case and just have to understand
the complexity.

Erm, why do you need importer capability for this use-case?

guest1 -> ION -> xen-front -> hypervisor -> guest 2 -> xen-zcopy exposes
that dma-buf -> import to the real display hw

No where in this chain do you need xen-zcopy to be able to import a
dma-buf (within linux, it needs to import a bunch of pages from the
hypervisor).

Now if your plan is to use xen-zcopy in the guest1 instead of xen-front,
then you indeed need to import.

This is the exact use-case I was referring to while saying
we need to import on Guest1 side. If hyper-dmabuf is so
generic that there is no xen-front in the picture, then
it needs to import a dma-buf, so it can be exported at Guest2 side.

  But that imo doesn't make sense:
- xen-front gives you clearly defined flip events you can forward to the
   hypervisor. xen-zcopy would need to add that again.

xen-zcopy is a helper driver which doesn't handle page flips
and is not a KMS driver as one might think of: the DRM UAPI it uses is
just to export a dma-buf as a PRIME buffer, but that's it.
Flipping etc. is done by the backend [1], not xen-zcopy.

  Same for
   hyperdmabuf (and really we're not going to shuffle struct dma_fence over
   the wire in a generic fashion between hypervisor guests).

- xen-front already has the idea of pixel format for the buffer (and any
   other metadata). Again, xen-zcopy and hyperdmabuf lack that, would need
   to add it shoehorned in somehow.

Again, here you are talking of something which is implemented in
Xen display backend, not xen-zcopy, e.g. display backend can
implement para-virtual display w/o xen-zcopy at all, but in this case
there is a memory copying for each frame. With the help of xen-zcopy
the backend feeds xen-front's buffers directly into Guest2 DRM/KMS or
Weston or whatever as xen-zcopy exports remote buffers as PRIME buffers,
thus no buffer copying is required.


Ofc you won't be able to shovel sound or media stream data over to another
guest like this, but that's what you have xen-v4l and xen-sound or
whatever else for. Trying to make a new uapi, which means userspace must
be changed for all the different use-case, instead of reusing standard
linux driver uapi (which just happens to send the data to another
hypervisor guest instead of real hw) imo just doesn't make much sense.

Also, at least for the gpu subsystem: Any new uapi must have full
userspace available for it, see:

https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#open-source-userspace-requirements

Adding more uapi is definitely the most painful way to fix a use-case.
Personally I'd go as far and also change the xen-zcopy side on the
receiving guest to use some standard linux uapi. E.g. you could write an
output v4l driver to receive the frames from guest1.

So, we now know that xen-zcopy was not meant to handle page flips,
but to implement ne

Re: [RfC PATCH] Add udmabuf misc device

2018-04-11 Thread Oleksandr Andrushchenko

On 04/10/2018 08:26 PM, Dongwon Kim wrote:

On Tue, Apr 10, 2018 at 09:37:53AM +0300, Oleksandr Andrushchenko wrote:

On 04/06/2018 09:57 PM, Dongwon Kim wrote:

On Fri, Apr 06, 2018 at 03:36:03PM +0300, Oleksandr Andrushchenko wrote:

On 04/06/2018 02:57 PM, Gerd Hoffmann wrote:

   Hi,


I fail to see any common ground for xen-zcopy and udmabuf ...

Does the above mean you can assume that xen-zcopy and udmabuf
can co-exist as two different solutions?

Well, udmabuf route isn't fully clear yet, but yes.

See also gvt (intel vgpu), where the hypervisor interface is abstracted
away into a separate kernel modules even though most of the actual vgpu
emulation code is common.

Thank you for your input, I'm just trying to figure out
which of the three z-copy solutions intersect and how much

And what about hyper-dmabuf?

xen z-copy solution is pretty similar fundamentally to hyper_dmabuf
in terms of these core sharing feature:

1. the sharing process - import prime/dmabuf from the producer -> extract
underlying pages and get those shared -> return references for shared pages

Another thing is danvet was kind of against to the idea of importing existing
dmabuf/prime buffer and forward it to the other domain due to synchronization
issues. He proposed to make hyper_dmabuf only work as an exporter so that it
can have a full control over the buffer. I think we need to talk about this
further as well.

Yes, I saw this. But this limits the use-cases so much.
For instance, running Android as a Guest (which uses ION to allocate
buffers) means that finally HW composer will import dma-buf into
the DRM driver. Then, in case of xen-front for example, it needs to be
shared with the backend (Host side). Of course, we can change user-space
to make xen-front allocate the buffers (make it exporter), but what we try
to avoid is to change user-space which in normal world would have remain
unchanged otherwise.
So, I do think we have to support this use-case and just have to understand
the complexity.



danvet, can you comment on this topic?


2. the page sharing mechanism - it uses Xen-grant-table.

And to give you a quick summary of differences as far as I understand
between two implementations (please correct me if I am wrong, Oleksandr.)

1. xen-zcopy is DRM specific - can import only DRM prime buffer
while hyper_dmabuf can export any dmabuf regardless of originator

Well, this is true. And at the same time this is just a matter
of extending the API: xen-zcopy is a helper driver designed for
xen-front/back use-case, so this is why it only has DRM PRIME API

2. xen-zcopy doesn't seem to have dma-buf synchronization between two VMs
while (as danvet called it as remote dmabuf api sharing) hyper_dmabuf sends
out synchronization message to the exporting VM for synchronization.

This is true. Again, this is because of the use-cases it covers.
But having synchronization for a generic solution seems to be a good idea.

Yeah, understood xen-zcopy works ok with your use case. But I am just curious
if it is ok not to have any inter-domain synchronization in this sharing model.

The synchronization is done with displif protocol [1]

The buffer being shared is technically dma-buf and originator needs to be able
to keep track of it.

As I am working in DRM terms the tracking is done by the DRM core
for me for free. (This might be one of the reasons Daniel sees DRM
based implementation fit very good from code-reuse POV).



3. 1-level references - when using grant-table for sharing pages, there will
be same # of refs (each 8 byte)

To be precise, grant ref is 4 bytes

You are right. Thanks for correction.;)


as # of shared pages, which is passed to
the userspace to be shared with importing VM in case of xen-zcopy.

The reason for that is that xen-zcopy is a helper driver, e.g.
the grant references come from the display backend [1], which implements
Xen display protocol [2]. So, effectively the backend extracts references
from frontend's requests and passes those to xen-zcopy as an array
of refs.

  Compared
to this, hyper_dmabuf does multiple level addressing to generate only one
reference id that represents all shared pages.

In the protocol [2] only one reference to the gref directory is passed
between VMs
(and the gref directory is a single-linked list of shared pages containing
all
of the grefs of the buffer).

ok, good to know. I will look into its implementation in more details but is
this gref directory (chained grefs) something that can be used for any general
memory sharing use case or is it jsut for xen-display (in current code base)?

Not to mislead you: one grant ref is passed via displif protocol,
but the page it's referencing contains the rest of the grant refs.

As to if this can be used for any memory: yes. It is the same for
sndif and displif Xen protocols, but defined twice as strictly speaking
sndif and displif are two separate protocols.
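
To illustrate the gref directory layout described above, a hedged sketch based
on the page directory defined by the displif protocol: a single-linked list of
granted pages, each holding a gref to the next directory page plus as many
buffer grefs as fit:

#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/string.h>
#include <xen/grant_table.h>
#include <xen/interface/io/displif.h>

#define GREFS_PER_DIR_PAGE \
	((PAGE_SIZE - offsetof(struct xendispl_page_directory, gref)) / \
	 sizeof(grant_ref_t))

/* Fill already granted directory pages with the buffer's grant refs. */
static void example_fill_page_dir(struct xendispl_page_directory **dir_pages,
				  grant_ref_t *dir_grefs, int num_dir_pages,
				  grant_ref_t *buf_grefs, int num_buf_grefs)
{
	int i, left = num_buf_grefs;

	for (i = 0; i < num_dir_pages; i++) {
		struct xendispl_page_directory *dir = dir_pages[i];
		int cur = min_t(int, left, (int)GREFS_PER_DIR_PAGE);

		/* 0 terminates the list (grant ref 0 is never handed out) */
		dir->gref_dir_next_page =
			(i == num_dir_pages - 1) ? 0 : dir_grefs[i + 1];
		memcpy(dir->gref, buf_grefs + (num_buf_grefs - left),
		       cur * sizeof(grant_ref_t));
		left -= cur;
	}
}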

While reviewing your RFC v2 one of the comments I had [2] was that if we
can

Re: [RfC PATCH] Add udmabuf misc device

2018-04-10 Thread Oleksandr Andrushchenko

On 04/06/2018 09:57 PM, Dongwon Kim wrote:

On Fri, Apr 06, 2018 at 03:36:03PM +0300, Oleksandr Andrushchenko wrote:

On 04/06/2018 02:57 PM, Gerd Hoffmann wrote:

   Hi,


I fail to see any common ground for xen-zcopy and udmabuf ...

Does the above mean you can assume that xen-zcopy and udmabuf
can co-exist as two different solutions?

Well, udmabuf route isn't fully clear yet, but yes.

See also gvt (intel vgpu), where the hypervisor interface is abstracted
away into a separate kernel modules even though most of the actual vgpu
emulation code is common.

Thank you for your input, I'm just trying to figure out
which of the three z-copy solutions intersect and how much

And what about hyper-dmabuf?

xen z-copy solution is pretty similar fundamentally to hyper_dmabuf
in terms of these core sharing feature:

1. the sharing process - import prime/dmabuf from the producer -> extract
underlying pages and get those shared -> return references for shared pages

2. the page sharing mechanism - it uses Xen-grant-table.

And to give you a quick summary of differences as far as I understand
between two implementations (please correct me if I am wrong, Oleksandr.)

1. xen-zcopy is DRM specific - can import only DRM prime buffer
while hyper_dmabuf can export any dmabuf regardless of originator

Well, this is true. And at the same time this is just a matter
of extending the API: xen-zcopy is a helper driver designed for
xen-front/back use-case, so this is why it only has DRM PRIME API


2. xen-zcopy doesn't seem to have dma-buf synchronization between two VMs
while (as danvet called it as remote dmabuf api sharing) hyper_dmabuf sends
out synchronization message to the exporting VM for synchronization.

This is true. Again, this is because of the use-cases it covers.
But having synchronization for a generic solution seems to be a good idea.


3. 1-level references - when using grant-table for sharing pages, there will
be same # of refs (each 8 byte)

To be precise, grant ref is 4 bytes

as # of shared pages, which is passed to
the userspace to be shared with importing VM in case of xen-zcopy.

The reason for that is that xen-zcopy is a helper driver, e.g.
the grant references come from the display backend [1], which implements
Xen display protocol [2]. So, effectively the backend extracts references
from frontend's requests and passes those to xen-zcopy as an array
of refs.

  Compared
to this, hyper_dmabuf does multiple level addressing to generate only one
reference id that represents all shared pages.
In the protocol [2] only one reference to the gref directory is passed 
between VMs
(and the gref directory is a single-linked list of shared pages 
containing all

of the grefs of the buffer).



4. inter VM messaging (hype_dmabuf only) - hyper_dmabuf has inter-vm msg
communication defined for dmabuf synchronization and private data (meta
info that Matt Roper mentioned) exchange.

This is true, xen-zcopy has no means for inter VM sync and meta-data,
simply because it doesn't have any code for inter VM exchange in it,
e.g. the inter VM protocol is handled by the backend [1].


5. driver-to-driver notification (hyper_dmabuf only) - importing VM gets
notified when a new dmabuf is exported from the other VM - uevent can be optionally
generated when this happens.

6. structure - hyper_dmabuf is targetting to provide a generic solution for
inter-domain dmabuf sharing for most hypervisors, which is why it has two
layers as mattrope mentioned, front-end that contains standard API and backend
that is specific to hypervisor.

Again, xen-zcopy is decoupled from inter VM communication

No idea, didn't look at it in detail.

Looks pretty complex from a distant view.  Maybe because it tries to
build a communication framework using dma-bufs instead of a simple
dma-buf passing mechanism.

we started with simple dma-buf sharing but realized there are many
things we need to consider in real use-case, so we added communication
, notification and dma-buf synchronization then re-structured it to
front-end and back-end (this made things more complicated..) since Xen
was not our only target. Also, we thought passing the reference for the
buffer (hyper_dmabuf_id) is not secure so added uevent mechanism later.


Yes, I am looking at it now, trying to figure out the full story
and its implementation. BTW, Intel guys were about to share some
test application for hyper-dmabuf, maybe I have missed one.
It could probably better explain the use-cases and the complexity
they have in hyper-dmabuf.

One example is actually in github. If you want take a look at it, please
visit:

https://github.com/downor/linux_hyper_dmabuf_test/tree/xen/simple_export

Thank you, I'll have a look

Like xen-zcopy it seems to depend on the idea that the hypervisor
manages all memory it is easy for guests to share pages with the help of
the hypervisor.

So, for xen-zcopy we were not trying to make it generic,
it just solves display (dumb) zero-copying use-cases for Xen.

Re: [RfC PATCH] Add udmabuf misc device

2018-04-06 Thread Oleksandr Andrushchenko

On 04/06/2018 02:57 PM, Gerd Hoffmann wrote:

   Hi,


I fail to see any common ground for xen-zcopy and udmabuf ...

Does the above mean you can assume that xen-zcopy and udmabuf
can co-exist as two different solutions?

Well, udmabuf route isn't fully clear yet, but yes.

See also gvt (intel vgpu), where the hypervisor interface is abstracted
away into a separate kernel modules even though most of the actual vgpu
emulation code is common.

Thank you for your input, I'm just trying to figure out
which of the three z-copy solutions intersect and how much

And what about hyper-dmabuf?

No idea, didn't look at it in detail.

Looks pretty complex from a distant view.  Maybe because it tries to
build a communication framework using dma-bufs instead of a simple
dma-buf passing mechanism.

Yes, I am looking at it now, trying to figure out the full story
and its implementation. BTW, Intel guys were about to share some
test application for hyper-dmabuf, maybe I have missed one.
It could probably better explain the use-cases and the complexity
they have in hyper-dmabuf.


Like xen-zcopy it seems to depend on the idea that, because the hypervisor
manages all memory, it is easy for guests to share pages with the help of
the hypervisor.

So, for xen-zcopy we were not trying to make it generic,
it just solves display (dumb) zero-copying use-cases for Xen.
We implemented it as a DRM helper driver because we can't see any
other use-cases as of now.
For example, we also have Xen para-virtualized sound driver, but
its buffer memory usage is not comparable to what display wants
and it works somewhat differently (e.g. there is no "frame done"
event, so one can't tell when the sound buffer can be "flipped").
At the same time, we do not use virtio-gpu, so this could probably
be one more candidate for shared dma-bufs some day.

   Which simply isn't the case on kvm.

hyper-dmabuf and xen-zcopy could maybe share code, or hyper-dmabuf build
on top of xen-zcopy.

Hm, I can imagine that: xen-zcopy could be a library code for hyper-dmabuf
in terms of implementing all that page sharing fun in multiple directions,
e.g. Host->Guest, Guest->Host, Guest<->Guest.
But I'll let Matt and Dongwon to comment on that.



cheers,
   Gerd


Thank you,
Oleksandr

P.S. Sorry for making your original mail thread discuss things much
broader than your RFC...



Re: [RfC PATCH] Add udmabuf misc device

2018-04-06 Thread Oleksandr Andrushchenko

On 04/06/2018 12:07 PM, Gerd Hoffmann wrote:

I'm not sure we can create something which works on both kvm and xen.
The memory management model is quite different ...


On xen the hypervisor manages all memory.  Guests can allow other guests
to access specific pages (using grant tables).  In theory any guest <=>
guest communication is possible.  In practice it is mostly guest <=> dom0
because guests access their virtual hardware that way.  dom0 is the
privileged guest which owns any hardware not managed by xen itself.

Xen guests can ask the hypervisor to update the mapping of guest
physical pages.  They can balloon down (unmap and free pages).  They can
balloon up (ask the hypervisor to map fresh pages).  They can map pages
exported by other guests using grant tables.  xen-zcopy makes heavy use
of this.  It balloons down, to make room in the guest physical address
space, then goes map the exported pages there, finally composes a
dma-buf.


On kvm qemu manages all guest memory.  qemu also has all guest memory
mapped, so a grant-table like mechanism isn't needed to implement
virtual devices.  qemu can decide how it backs memory for the guest.
qemu propagates the guest memory map to the kvm driver in the linux
kernel.  kvm guests have some control over the guest memory map, for
example they can map pci bars wherever they want in their guest physical
address space by programming the base registers accordingly, but unlike
xen guests they can't ask the host to remap individual pages.

Due to qemu having all guest memory mapped, virtual devices are typically
designed to have the guest allocate resources, then notify the host
where they are located.  This is where the udmabuf idea comes from:
Guest tells the host (qemu) where the gem object is, and qemu then can
create a dmabuf backed by those pages to pass it on to other processes
such as the wayland display server.  Possibly even without the guest
explicitly asking for it, i.e. export the framebuffer placed by the
guest in the (virtual) vga pci memory bar as dma-buf.  And I can imagine
that this is useful outside virtualization too.
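
A minimal userspace sketch of that flow; the header name, device node,
ioctl and struct fields below are assumptions (loosely modelled on the
udmabuf UAPI), not something this RFC defines, and error handling is omitted:

#define _GNU_SOURCE
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/udmabuf.h>

/* Wrap a sealed memfd (the guest-visible memory) into a dma-buf fd. */
int create_udmabuf(size_t size)
{
    struct udmabuf_create create;
    int memfd, devfd, buffd;

    memfd = memfd_create("guest-fb", MFD_ALLOW_SEALING);
    ftruncate(memfd, size);
    fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);

    devfd = open("/dev/udmabuf", O_RDWR);
    memset(&create, 0, sizeof(create));
    create.memfd  = memfd;
    create.offset = 0;
    create.size   = size;
    buffd = ioctl(devfd, UDMABUF_CREATE, &create);  /* returns dma-buf fd */

    close(devfd);
    close(memfd);
    return buffd;   /* pass to e.g. the wayland display server */
}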


I fail to see any common ground for xen-zcopy and udmabuf ...

Does the above mean you can assume that xen-zcopy and udmabuf
can co-exist as two different solutions?
And what about hyper-dmabuf?

Thank you,
Oleksandr


Re: [RfC PATCH] Add udmabuf misc device

2018-04-06 Thread Oleksandr Andrushchenko

On 04/06/2018 12:07 PM, Gerd Hoffmann wrote:

   Hi,


   * The general interface should be able to express sharing from any
 guest:guest, not just guest:host.  Arbitrary G:G sharing might be
 something some hypervisors simply aren't able to support, but the
 userspace API itself shouldn't make assumptions or restrict that.  I
 think ideally the sharing API would include some kind of
 query_targets interface that would return a list of VM's that your
 current OS is allowed to share with; that list would depend on the
 policy established by the system integrator, but obviously wouldn't
 include targets that the hypervisor itself wouldn't be capable of
 handling.

Can you give a use-case for this? I mean that the system integrator
is the one who defines which guests/hosts talk to each other,
but querying means that it is possible that VMs have some sort
of discovery mechanism, so they can decide on their own whom
to connect to.

Note that vsock (created by vmware, these days also has a virtio
transport for kvm) started with support for both guest <=> host and
guest <=> guest support.  But later on guest <=> guest was dropped.
As far as I know the reasons were (a) lack of use cases and (b) security.

So, I likewise would know more details on the use cases you have in mind
here.  Unless we have a compelling use case here I'd suggest to drop the
guest <=> guest requirement as it makes the whole thing a lot more
complex.

This is exactly the use-case we have: in our setup Dom0 doesn't
own any HW at all and all the HW is passed into a dedicated
driver domain (DomD) which is still a guest domain.
Then, buffers are shared between two guests, for example,
DomD and DomA (Android guest)

   * The sharing API could be used to share multiple kinds of content in a
 single system.  The sharing sink driver running in the content
 producer's VM should accept some additional metadata that will be
 passed over to the target VM as well.

Not sure this should be part of hyper-dmabuf.  A dma-buf is nothing but
a block of data, period.  Therefore protocols with dma-buf support
(wayland for example) typically already send over metadata describing
the content, so duplicating that in hyper-dmabuf looks pointless.


1. We are targeting ARM and one of the major requirements for the buffer
sharing is the ability to allocate physically contiguous buffers, which gets
even more complicated for systems not backed with an IOMMU. So, for some
use-cases it is enough to make the buffers contiguous in terms of IPA and
sometimes those need to be contiguous in terms of PA.

Which pretty much implies the host must do the allocation.


2. For Xen we would love to see UAPI to create a dma-buf from grant
references provided, so we can use this generic solution to implement
zero-copying without breaking the existing Xen protocols. This can
probably be extended to other hypervisors as well.

I'm not sure we can create something which works on both kvm and xen.
The memory management model is quite different ...


On xen the hypervisor manages all memory.  Guests can allow other guests
to access specific pages (using grant tables).  In theory any guest <=>
guest communication is possible.  In practice it is mostly guest <=> dom0
because guests access their virtual hardware that way.  dom0 is the
privileged guest which owns any hardware not managed by xen itself.

Please see above for our setup with DomD, and Dom0 being
a generic ARMv8 domain with no HW

Xen guests can ask the hypervisor to update the mapping of guest
physical pages.  They can balloon down (unmap and free pages).  They can
balloon up (ask the hypervisor to map fresh pages).  They can map pages
exported by other guests using grant tables.  xen-zcopy makes heavy use
of this.  It balloons down, to make room in the guest physical address
space, then goes map the exported pages there, finally composes a
dma-buf.

This is what it does
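
In kernel API terms, that balloon-down + grant-map sequence is roughly the
following (a hedged sketch, not the actual xen-zcopy code; per-op status
checks and the unmap/cleanup path are omitted):

#include <linux/mm.h>
#include <xen/balloon.h>
#include <xen/grant_table.h>

static int map_foreign_grefs(domid_t otherend, grant_ref_t *grefs,
                             unsigned int nr, struct page **pages,
                             struct gnttab_map_grant_ref *map_ops)
{
    unsigned int i;
    int ret;

    /* Balloon down: make room in the guest physical address space. */
    ret = alloc_xenballooned_pages(nr, pages);
    if (ret)
        return ret;

    /* One map operation per grant reference exported by the other domain. */
    for (i = 0; i < nr; i++) {
        unsigned long addr =
            (unsigned long)pfn_to_kaddr(page_to_pfn(pages[i]));

        gnttab_set_map_op(&map_ops[i], addr, GNTMAP_host_map,
                          grefs[i], otherend);
    }

    /* Ask Xen to back the ballooned-out space with the exported pages. */
    ret = gnttab_map_refs(map_ops, NULL, pages, nr);
    if (ret) {
        free_xenballooned_pages(nr, pages);
        return ret;
    }

    /* pages[] can now back an sg_table and hence a dma-buf. */
    return 0;
}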


On kvm qemu manages all guest memory.  qemu also has all guest memory
mapped, so a grant-table like mechanism isn't needed to implement
virtual devices.  qemu can decide how it backs memory for the guest.
qemu propagates the guest memory map to the kvm driver in the linux
kernel.  kvm guests have some control over the guest memory map, for
example they can map pci bars wherever they want in their guest physical
address space by programming the base registers accordingly, but unlike
xen guests they can't ask the host to remap individual pages.

Due to qemu having all guest memory mapped, virtual devices are typically
designed to have the guest allocate resources, then notify the host
where they are located.  This is where the udmabuf idea comes from:
Guest tells the host (qemu) where the gem object is, and qemu then can
create a dmabuf backed by those pages to pass it on to other processes
such as the wayland display server.  Possibly even without the guest
explicitly asking for it, i.e. export the framebuffer placed by the

Re: [RfC PATCH] Add udmabuf misc device

2018-04-06 Thread Oleksandr Andrushchenko

I'm not super familiar with xen-zcopy


Let me describe the rationale and some implementation details of the Xen
zero-copy driver I posted recently [1].

The main requirement for us to implement such a helper driver was an ability
to avoid memory copying for large buffers in display use-cases. This is why
we only focused on DRM use-cases, not trying to implement something
generic. This is why the driver is somewhat coupled with Xen para-virtualized
DRM driver [2] by Xen para-virtual display protocol [3] grant references
sharing mechanism, e.g. backend receives an array of Xen grant references to
frontend's buffer pages. These grant references are then used to construct a
PRIME buffer. The same mechanism is used when backend shares a buffer with the
frontend, but in the other direction. More details on UAPI of the driver are
available at [1].
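
In DRM terms, "used to construct a PRIME buffer" means wrapping the mapped
pages into an sg_table for the PRIME helpers; a hedged sketch (struct and
function names here are illustrative, not the actual driver code):

#include <drm/drm_gem.h>
#include <drm/drm_prime.h>

struct zcopy_gem_object {
    struct drm_gem_object base;
    struct page **pages;        /* filled by the grant-mapping step */
    unsigned int num_pages;
};

static struct sg_table *
zcopy_gem_prime_get_sg_table(struct drm_gem_object *gem_obj)
{
    struct zcopy_gem_object *obj =
        container_of(gem_obj, struct zcopy_gem_object, base);

    /*
     * Called from the PRIME export path (drm_gem_prime_handle_to_fd and
     * friends): the page array is wrapped into an sg_table, no copying.
     */
    return drm_prime_pages_to_sg(obj->pages, obj->num_pages);
}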

So, when discussing a possibility to share dma-bufs in a generic way I would
also like to have the following considered:

1. We are targeting ARM and one of the major requirements for the buffer
sharing is the ability to allocate physically contiguous buffers, which gets
even more complicated for systems not backed with an IOMMU. So, for some
use-cases it is enough to make the buffers contiguous in terms of IPA and
sometimes those need to be contiguous in terms of PA.
(The use-case is that you use Wayland-DRM/KMS or share the buffer with
the driver implemented with DRM CMA helpers).

2. For Xen we would love to see UAPI to create a dma-buf from grant references
provided, so we can use this generic solution to implement zero-copying without
breaking the existing Xen protocols. This can probably be extended to other
hypervisors as well.
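
Purely to illustrate the shape of such a UAPI (everything below is
hypothetical; the structure, names and ioctl number are assumptions, not an
existing interface):

#include <linux/ioctl.h>
#include <linux/types.h>

/* Hypothetical "create a dma-buf from grant references" request. */
struct xen_dmabuf_from_refs {
    __u32 domid;    /* in: domain that granted the pages */
    __u32 count;    /* in: number of entries in refs[] */
    __u32 fd;       /* out: fd of the exported dma-buf */
    __u32 reserved;
    __u32 refs[1];  /* in: variable-size array of grant references */
};

#define XEN_IOC_DMABUF_FROM_REFS \
    _IOC(_IOC_NONE, 'G', 0x42, sizeof(struct xen_dmabuf_from_refs))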

Thank you,
Oleksandr Andrushchenko



  and udmabuf, but it sounds like
they're approaching similar problems from slightly different directions,
so we should make sure we can come up with something that satisfies
everyone's requirements.


Matt


On Wed, Mar 14, 2018 at 9:03 AM, Gerd Hoffmann <kra...@redhat.com> wrote:

   Hi,


Either mlock account (because it's mlocked de facto), and get_user_pages
won't do that for you.

Or you write the full-blown userptr implementation, including mmu_notifier
support (see i915 or amdgpu), but that also requires Christian Königs
latest ->invalidate_mapping RFC for dma-buf (since atm exporting userptr
buffers is a no-go).

I guess I'll look at mlock accounting for starters then.  Easier for
now, and leaves the door open to switch to userptr later as this should
be transparent to userspace.
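
(For reference, the RLIMIT_MEMLOCK bookkeeping other page-pinning drivers,
e.g. infiniband, used around this kernel version looks roughly like the
sketch below; it is not part of this RFC.)

#include <linux/capability.h>
#include <linux/mm.h>
#include <linux/sched/mm.h>
#include <linux/sched/signal.h>

/* Charge npages pinned pages against the caller's RLIMIT_MEMLOCK. */
static int account_pinned_pages(unsigned long npages)
{
    unsigned long locked, lock_limit;

    down_write(&current->mm->mmap_sem);
    locked = current->mm->pinned_vm + npages;
    lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
    if (locked > lock_limit && !capable(CAP_IPC_LOCK)) {
        up_write(&current->mm->mmap_sem);
        return -ENOMEM;
    }
    current->mm->pinned_vm = locked;
    up_write(&current->mm->mmap_sem);
    return 0;
}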


Known issue:  Driver API isn't complete yet.  Need add some flags, for
example to support read-only buffers.

dma-buf has no concept of read-only. I don't think we can even enforce
that (not many iommus can enforce this iirc), so pretty much need to
require r/w memory.

Ah, ok.  Just saw the 'write' arg for get_user_pages_fast and figured we
might support that, but if iommus can't handle that anyway it's
pointless indeed.


Cc: David Airlie <airl...@linux.ie>
Cc: Tomeu Vizoso <tomeu.viz...@collabora.com>
Signed-off-by: Gerd Hoffmann <kra...@redhat.com>

btw there's also the hyperdmabuf stuff from the xen folks, but imo their
solution of forwarding the entire dma-buf api is over the top. This here
looks _much_ better, pls cc all the hyperdmabuf people on your next
version.

Fun fact: googling for "hyperdmabuf" found me your mail and nothing else :-o
(Trying "hyper dmabuf" instead worked better then).

Yes, will cc them on the next version.  Not sure it'll help much on xen
though due to the memory management being very different.  Basically xen
owns the memory, not the kernel of the control domain (dom0), so
creating dmabufs for guest memory chunks isn't that simple ...

Also it's not clear whenever they really need guest -> guest exports or
guest -> dom0 exports.


Overall I like the idea, but too lazy to review.

Cool.  General comments on the idea were all I was looking for for the
moment.  Spare your review cycles for the next version ;)


Oh, some kselftests for this stuff would be lovely.

I'll look into it.

thanks,
   Gerd




--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[1] https://patchwork.freedesktop.org/series/40880/
[2] https://cgit.freedesktop.org/drm/drm-misc/commit/?id=c575b7eeb89f94356997abd62d6d5a0590e259b7
[3] https://elixir.bootlin.com/linux/v4.16-rc7/source/include/xen/interface/io/displif.h