Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-25 Thread Knut Omang
On Mon, 2017-04-24 at 10:14 -0600, Logan Gunthorpe wrote:
> 
> On 24/04/17 01:36 AM, Knut Omang wrote:
> > My first reflex when reading this thread was to think that this whole domain
> > lends it self excellently to testing via Qemu. Could it be that doing this 
> > in 
> > the opposite direction might be a safer approach in the long run even 
> > though 
> > (significant) more work up-front?
> 
> That's an interesting idea. We did do some very limited testing on qemu
> with one iteration of our work. However, it's difficult because there is
> no support for any RDMA devices which are a part of our primary use
> case. 

Yes, that's why I used 'significant'. One good thing is that given resources 
it can easily be done in parallel with other development, and will give 
additional
insight of some form.

> I also imagine it would be quite difficult to develop those models
> given the array of hardware that needs to be supported and the deep
> functional knowledge required to figure out appropriate restrictions.

>From my naive perspective it seems it need not even be a full model to get 
>some benefits,
just low level functionality tests with some instances of a
device that offers some MMIO space 'playground'.

Or maybe you can leverage some of the already implemented emulated devices in 
Qemu?

Knut

> 
> Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-24 Thread Knut Omang
On Mon, 2017-04-17 at 08:31 +1000, Benjamin Herrenschmidt wrote:
> On Sun, 2017-04-16 at 10:34 -0600, Logan Gunthorpe wrote:
> > 
> > On 16/04/17 09:53 AM, Dan Williams wrote:
> > > ZONE_DEVICE allows you to redirect via get_dev_pagemap() to retrieve
> > > context about the physical address in question. I'm thinking you can
> > > hang bus address translation data off of that structure. This seems
> > > vaguely similar to what HMM is doing.
> > 
> > Thanks! I didn't realize you had the infrastructure to look up a device
> > from a pfn/page. That would really come in handy for us.
> 
> It does indeed. I won't be able to play with that much for a few weeks
> (see my other email) so if you're going to tackle this while I'm away,
> can you work with Jerome to make sure you don't conflict with HMM ?
> 
> I really want a way for HMM to be able to layout struct pages over the
> GPU BARs rather than in "allocated free space" for the case where the
> BAR is big enough to cover all of the GPU memory.
> 
> In general, I'd like a simple & generic way for any driver to ask the
> core to layout DMA'ble struct pages over BAR space. I an not convinced
> this requires a "p2mem device" to be created on top of this though but
> that's a different discussion.
> 
> Of course the actual ability to perform the DMA mapping will be subject
> to various restrictions that will have to be implemented in the actual
> "dma_ops override" backend. We can have generic code to handle the case
> where devices reside on the same domain, which can deal with switch
> configuration etc... we will need to have iommu specific code to handle
> the case going through the fabric. 
> 
> Virtualization is a separate can of worms due to how qemu completely
> fakes the MMIO space, we can look into that later.

My first reflex when reading this thread was to think that this whole domain
lends it self excellently to testing via Qemu. Could it be that doing this in 
the opposite direction might be a safer approach in the long run even though 
(significant) more work up-front?

Eg. start by fixing/providing/documenting suitable model(s) 
for testing this in Qemu, then implement the patch set based 
on those models?

Thanks,
Knut

> 
> Cheers,
> Ben.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html