Re: [Xen-devel] [XenSummit 2017] Shared coprocessor framework followup

2017-08-16 Thread Andrii Anisov

Hello Edgar,

I'm just wondering if you have had a chance to play with SCF?
Please do not hesitate to send questions and comments. We are very
interested in feedback that helps make the framework better.


--

*Andrii Anisov*



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [XenSummit 2017] Shared coprocessor framework followup

2017-08-02 Thread Andrii Anisov


On 02.08.17 15:58, Edgar E. Iglesias wrote:
> Today it's an SMMUv2.

You would need to implement additional ops like [1].

> I don't necessarily think so. The context-switch would involve saving and
> restoring accelerator state as well as re-programming the PL.
> With allocate/release, we only need to re-program the PL.
>
> Saving the state of the PL might be tricky since we don't know how to
> (we don't know the details of the accelerator ahead of time).
> I guess we could let the guest that owns the accelerator
> save/restore the state somehow, but perhaps that brings us back
> in the direction of allocate/release semantics...

In this area, a context switch means to me the process of preparing a
coprocessor to start or continue serving another domain's tasks. What the
context switch physically means depends on the nature of the coprocessor
and on the use-cases, and it is up to the coprocessor platform driver how
ctx_switch_from() and ctx_switch_to() are implemented [2]. In theory, the
framework should be platform agnostic.
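
The split between a platform-agnostic core and driver-specific hooks could be sketched roughly as follows. This is only an illustration: apart from the hook names ctx_switch_from() and ctx_switch_to(), every type, field and function here is hypothetical and is not taken from the SCF sources in [2].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-domain view of a coprocessor (not the real SCF type). */
struct vcoproc {
    int domid;          /* owning domain */
    bool state_saved;   /* set once the driver has parked this context */
};

/* Hypothetical driver hooks, named after the ones mentioned above. */
struct coproc_ops {
    /* Stop serving 'prev'; return 0 on success, nonzero on failure. */
    int (*ctx_switch_from)(struct vcoproc *prev);
    /* Start or continue serving 'next'; return 0 on success. */
    int (*ctx_switch_to)(struct vcoproc *next);
};

struct coproc {
    const struct coproc_ops *ops;
    struct vcoproc *current;   /* context currently on the hardware */
};

/* Platform-agnostic core: it only sequences the two driver hooks. */
int coproc_switch(struct coproc *cp, struct vcoproc *next)
{
    int rc;

    assert(cp != NULL && cp->ops != NULL && next != NULL);

    if (cp->current == next)
        return 0;

    if (cp->current != NULL) {
        rc = cp->ops->ctx_switch_from(cp->current);
        if (rc != 0)
            return rc;          /* hardware keeps serving the old context */
    }

    rc = cp->ops->ctx_switch_to(next);
    if (rc == 0)
        cp->current = next;
    else
        cp->current = NULL;     /* nothing is loaded after a failed switch-in */
    return rc;
}

/* Example driver for a device with no state to save: bookkeeping only. */
static int demo_from(struct vcoproc *prev)
{
    prev->state_saved = true;
    return 0;
}

static int demo_to(struct vcoproc *next)
{
    next->state_saved = false;
    return 0;
}

const struct coproc_ops demo_ops = {
    .ctx_switch_from = demo_from,
    .ctx_switch_to   = demo_to,
};
```

What the two hooks do physically (saving registers, reloading firmware, reprogramming the PL) stays entirely inside the driver; the core never looks at it.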


BTW, you can find the most up-to-date sources here [3].

[1] https://github.com/xen-troops/xen/commit/a01f7ccf8bd5e9069f82c6ea6b92e2faca4920d9
[2] https://github.com/xen-troops/xen/blob/vgpu-dev/xen/arch/arm/coproc/coproc.h#L81
[3] https://github.com/xen-troops/xen/tree/vgpu-dev

--

*Andrii Anisov*





Re: [Xen-devel] [XenSummit 2017] Shared coprocessor framework followup

2017-08-02 Thread Edgar E. Iglesias
On Wed, Aug 02, 2017 at 02:07:17PM +0300, Andrii Anisov wrote:
> Hello Edgar,

Hi Andrii,

> 
> 
> On 01.08.17 20:13, Edgar E. Iglesias wrote:
> >>Are master ports behind IOMMU?
> >Yes, they are.
> What IOMMU IP is used?

Today it's an SMMUv2.


> 
> >>>It's possible to reprogram the configuration of the PL and swap 
> >>>accelerators in
> >>>and out on the fly. It's probably going to be too slow for trying to
> >>>context switch between guests
> >>So let us assume it is a FW-less resource we need to share between domains.
> >>Context switch will be reduced to mapping its MMIO (or passing MMIO
> >>accesses) to the next domain to serve, plus switching the IOMMU configuration.
> >>Yep, IOMMU matters.
> >OK. I think the PL is more like a firmware enabled resource.
> >The PL configuration needs to be loaded much like firmware
> >otherwise the accelerators can't change shape and all guests
> >must use the same kind.
> I understand this.
> But I understood you to mean that you would first try the same kind for
> all domains, because you assumed that reconfiguring would be too slow,
> which is actually debatable.

Aha, OK. What I meant was that it may be too slow for context-switching
at a micro-level. But with an allocate/release interface for batch
processing, I don't think it's too slow to reprogram the PL between
guests.

I agree that we need hard numbers on the PL programming before we rule
things out. I'll try to dig internally for some.


> >>>  so I think primarily we would be looking at
> >>>a way to, let's say, "allocate" and "release" the resources for batch use.
> >>Kind of voluntary preemption?
> >Right. That could be a start.
> >In the future perhaps it makes sense to context-switch.
> We still need the context switch to be done. The difference is that now it
> could be done only when the accelerator is not busy.

I don't necessarily think so. The context-switch would involve saving and
restoring accelerator state as well as re-programming the PL.
With allocate/release, we only need to re-program the PL.

Saving the state of the PL might be tricky since we don't know how to
(we don't know the details of the accelerator ahead of time).
I guess we could let the guest that owns the accelerator
save/restore the state somehow, but perhaps that brings us back
in the direction of allocate/release semantics...
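
One way to picture the guest-assisted variant is a lease that the hypervisor can ask to get back, with the owning guest saving its own accelerator state before it lets go. The sketch below is purely illustrative: pl_lease, pl_request_release(), pl_guest_batch_boundary() and the demo accelerator are hypothetical names and do not come from SCF or any Xen interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PL_NO_OWNER (-1)

/* Hypothetical lease on the PL shared between hypervisor and guests. */
struct pl_lease {
    int owner;               /* domain holding the PL, or PL_NO_OWNER */
    bool release_requested;  /* set by the hypervisor, cleared on release */
};

/* Guest-supplied hook that knows how to save this accelerator's state. */
typedef void (*pl_save_fn)(void *accel_state);

/* Hypervisor side: ask the current owner to give the PL back. */
void pl_request_release(struct pl_lease *l)
{
    assert(l != NULL);
    if (l->owner != PL_NO_OWNER)
        l->release_requested = true;
}

/*
 * Guest side: called between batches. If a release was requested, the
 * guest saves whatever state it cares about and then drops the lease.
 * Returns true if the lease was released.
 */
bool pl_guest_batch_boundary(struct pl_lease *l, int domid,
                             pl_save_fn save, void *accel_state)
{
    assert(l != NULL);
    if (l->owner != domid || !l->release_requested)
        return false;
    if (save != NULL)
        save(accel_state);
    l->owner = PL_NO_OWNER;
    l->release_requested = false;
    return true;
}

/* Example accelerator with a single register and a shadow copy of it. */
struct demo_accel {
    int reg;
    int saved_reg;
};

/* Example save hook: only the guest knows which state matters. */
void demo_save(void *p)
{
    struct demo_accel *a = p;
    a->saved_reg = a->reg;
}
```

Because the hypervisor never interprets the accelerator state, this ends up close to allocate/release: the only addition is the release request and the guest's own save step.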


> >>>If a guest cannot allocate an accelerator, it could fall back to emulation
> >>>or just to using SW libraries until an accelerator slot is available.
> >>What about the thing I called "an access emulation" [1]? From the domain's
> >>point of view it would be reflected in a delayed response (via IRQ or
> >>register polling) from an accelerator.
> >>
> >>I guess the concept described above is feasible even with current SCF code
> >>and will not take too much effort.
> >I'll have a look, thanks!
> Do not hesitate to contact us in case you need any help or clarification.

Thanks!
Edgar

> 
> 
> -- 
> 
> *Andrii Anisov*
> 



Re: [Xen-devel] [XenSummit 2017] Shared coprocessor framework followup

2017-08-02 Thread Andrii Anisov

Hello Edgar,


On 01.08.17 20:13, Edgar E. Iglesias wrote:
>> Are master ports behind IOMMU?
> Yes, they are.

What IOMMU IP is used?

>>> It's possible to reprogram the configuration of the PL and swap accelerators in
>>> and out on the fly. It's probably going to be too slow for trying to
>>> context switch between guests
>> So let us assume it is a FW-less resource we need to share between domains.
>> Context switch will be reduced to mapping its MMIO (or passing MMIO
>> accesses) to the next domain to serve, plus switching the IOMMU configuration.
>> Yep, IOMMU matters.
> OK. I think the PL is more like a firmware enabled resource.
> The PL configuration needs to be loaded much like firmware
> otherwise the accelerators can't change shape and all guests
> must use the same kind.

I understand this.
But I understood you to mean that you would first try the same kind for
all domains, because you assumed that reconfiguring would be too slow,
which is actually debatable.

>>> so I think primarily we would be looking at
>>> a way to, let's say, "allocate" and "release" the resources for batch use.
>> Kind of voluntary preemption?
> Right. That could be a start.
> In the future perhaps it makes sense to context-switch.

We still need the context switch to be done. The difference is that now
it could be done only when the accelerator is not busy.

>>> If a guest cannot allocate an accelerator, it could fall back to emulation
>>> or just to using SW libraries until an accelerator slot is available.
>> What about the thing I called "an access emulation" [1]? From the domain's
>> point of view it would be reflected in a delayed response (via IRQ or
>> register polling) from an accelerator.
>>
>> I guess the concept described above is feasible even with current SCF code
>> and will not take too much effort.
> I'll have a look, thanks!

Do not hesitate to contact us in case you need any help or clarification.


--

*Andrii Anisov*




Re: [Xen-devel] [XenSummit 2017] Shared coprocessor framework followup

2017-08-01 Thread Edgar E. Iglesias
On Tue, Aug 01, 2017 at 08:04:09PM +0300, Andrii Anisov wrote:
> Hello Edgar,
> 
> 
> On 01.08.17 17:56, Edgar E. Iglesias wrote:
> >On the PL, there's a chunk of programmable logic that allows you to
> >create your own custom accelerators or devices.
> >Some devices are tied to specific boards (e.g. when they depend on specific IO)
> >but others are not (for example memory to memory computational accelerators).
> >To communicate with these devices, they have memory slave and master ports
> >(for register accesses and for DMA). They also have interrupts both ways.
> Are master ports behind IOMMU?

Yes, they are.

> 
> >It's possible to reprogram the configuration of the PL and swap accelerators in
> >and out on the fly. It's probably going to be too slow for trying to
> >context switch between guests
> So let us assume it is a FW-less resource we need to share between domains.
> Context switch will be reduced to mapping its MMIO (or passing MMIO
> accesses) to the next domain to serve, plus switching the IOMMU configuration.
> Yep, IOMMU matters.

OK. I think the PL is more like a firmware enabled resource.
The PL configuration needs to be loaded much like firmware
otherwise the accelerators can't change shape and all guests
must use the same kind.

For example, one guest might want a crypto accelerator while
another might want some kind of machine-learning accelerator.

I think each guest may want to provide its own accelerator "config".


> >  so I think primarily we would be looking at
> >a way to, let's say, "allocate" and "release" the resources for batch use.
> 
> Kind of voluntary preemption?

Right. That could be a start.
In the future perhaps it makes sense to context-switch.

> 
> >If a guest cannot allocate an accelerator, it could fall back to emulation
> >or just to using SW libraries until an accelerator slot is available.
> What about the thing I called "an access emulation" [1]? From the domain's
> point of view it would be reflected in a delayed response (via IRQ or
> register polling) from an accelerator.
> 
> I guess the concept described above is feasible even with current SCF code
> and will not take too much effort.

I'll have a look, thanks!

Cheers,
Edgar

> 
> [1]
> https://lists.xenproject.org/archives/html/xen-devel/2016-11/msg01935.html
> 
> -- 
> 
> *Andrii Anisov*
> 



Re: [Xen-devel] [XenSummit 2017] Shared coprocessor framework followup

2017-08-01 Thread Andrii Anisov

Hello Edgar,


On 01.08.17 17:56, Edgar E. Iglesias wrote:
> On the PL, there's a chunk of programmable logic that allows you to
> create your own custom accelerators or devices.
> Some devices are tied to specific boards (e.g. when they depend on specific IO)
> but others are not (for example memory to memory computational accelerators).
> To communicate with these devices, they have memory slave and master ports
> (for register accesses and for DMA). They also have interrupts both ways.

Are master ports behind IOMMU?

> It's possible to reprogram the configuration of the PL and swap accelerators in
> and out on the fly. It's probably going to be too slow for trying to
> context switch between guests

So let us assume it is a FW-less resource we need to share between domains.
Context switch will be reduced to mapping its MMIO (or passing MMIO
accesses) to the next domain to serve, plus switching the IOMMU configuration.

Yep, IOMMU matters.
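
For a FW-less device, such a switch could be modelled along these lines. This is a toy model under stated assumptions, not SCF code: the structure, its fields, the dma_in_flight check and the ordering of the steps are all hypothetical, and the plain assignments stand in for the real stage-2 mapping and IOMMU reconfiguration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DOMID_NONE (-1)

/* Hypothetical model of a FW-less coprocessor; not an SCF structure. */
struct fwless_coproc {
    int mmio_owner;       /* domain that has the register window mapped */
    int iommu_domain;     /* domain whose page tables translate device DMA */
    bool dma_in_flight;   /* device still has outstanding transactions */
};

/*
 * Hand the device to next_domid. Returns 0 on success, -1 if the device
 * is still busy and the switch must be retried later.
 */
int fwless_ctx_switch(struct fwless_coproc *cp, int next_domid)
{
    assert(cp != NULL);

    if (cp->dma_in_flight)
        return -1;

    /* 1. Unmap the registers so the previous domain can no longer start work. */
    cp->mmio_owner = DOMID_NONE;

    /* 2. Point the IOMMU at the next domain's translation tables. */
    cp->iommu_domain = next_domid;

    /* 3. Map the registers into the next domain. */
    cp->mmio_owner = next_domid;

    return 0;
}
```

The IOMMU step is what keeps DMA issued on behalf of one domain from reaching another domain's memory, which is why the switch is refused while transactions are outstanding.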


> so I think primarily we would be looking at
> a way to, let's say, "allocate" and "release" the resources for batch use.

Kind of voluntary preemption?

> If a guest cannot allocate an accelerator, it could fall back to emulation
> or just to using SW libraries until an accelerator slot is available.

What about the thing I called "an access emulation" [1]? From the
domain's point of view it would be reflected in a delayed response (via
IRQ or register polling) from an accelerator.

I guess the concept described above is feasible even with current SCF
code and will not take too much effort.


[1] https://lists.xenproject.org/archives/html/xen-devel/2016-11/msg01935.html


--

*Andrii Anisov*




Re: [Xen-devel] [XenSummit 2017] Shared coprocessor framework followup

2017-08-01 Thread Edgar E. Iglesias
On Tue, Aug 01, 2017 at 02:52:22PM +0300, Andrii Anisov wrote:
> Dear Edgar,
> 
> 
> On 31.07.17 23:42, Edgar E. Iglesias wrote:
> >Yes I'm interested in this.
> It's good to hear at least one vote for the stuff :)
> 
> >  I'm not sure how much time I'll be able to contribute but at least I can 
> > review proposals and hopefully look at implementing a driver/backend that 
> > may be useful for our FPGA platforms.
> I really hope we can start from small things:
> - Could you please describe use-cases you have in your mind. What
> functionality do you expect? It is really important for us, we need some
> side view on the framework.
> - Also it would be good for us to have some view on the coprocessor (fpga?)
> you are going to share using SCF. How is it exposed to the system? Does it
> have mmio, ram, irq, iommu connection, whatever?

Hi,

I don't really have a defined list of requirements but I can do some initial
hand-waving :-)

On the ZynqMP we have two classes of HW (on the chip) that I think may benefit.
1. The Cortex-R5s
2. The Programmable Logic (FPGA)

The Cortex-R5s are two real-time co-processors that can be programmed.
They have local RAMs and control registers to reset, halt etc.
I'll leave these out for now, they are probably more similar to the
use-cases you had in mind.

On the PL, there's a chunk of programmable logic that allows you to
create your own custom accelerators or devices.
Some devices are tied to specific boards (e.g when they depend on specific IO)
but others are not (for example memory to memory computational accelerators).
To communicate with these devices, they have memory slave and master ports
(for register accesses and for DMA). They also have interrupts both ways.

It's possible to reprogram the configuration of the PL and swap accelerators in
and out on the fly. It's probably going to be too slow for trying to
context switch between guests so I think primarily we would be looking at
a way to, let's say, "allocate" and "release" the resources for batch use.

If a guest cannot allocate an accelerator, it could fall back to emulation
or just to using SW libraries until an accelerator slot is available.
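
As a rough illustration of allocate/release with a software fallback, the sketch below models a single accelerator slot. All names (pl_slot, pl_allocate(), pl_release(), guest_sum_bytes()) and the integer config id are hypothetical, and the assumption that the PL is reprogrammed only when the requested config differs from the loaded one is mine. The "hardware" path is simulated in software, since no real accelerator is involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PL_NO_OWNER (-1)

/* Hypothetical bookkeeping for one accelerator slot in the PL. */
struct pl_slot {
    int owner;            /* domain holding the slot, or PL_NO_OWNER */
    int loaded_config;    /* id of the accelerator currently programmed */
    unsigned reprograms;  /* how often the PL had to be reprogrammed */
};

/* Take the slot for a batch of work; false means "busy, try later". */
bool pl_allocate(struct pl_slot *s, int domid, int config)
{
    assert(s != NULL);
    if (s->owner != PL_NO_OWNER)
        return false;
    if (s->loaded_config != config) {
        /* Stand-in for the (slow) PL reprogramming step. */
        s->loaded_config = config;
        s->reprograms++;
    }
    s->owner = domid;
    return true;
}

/* Give the slot back once the batch is done. */
void pl_release(struct pl_slot *s, int domid)
{
    assert(s != NULL);
    if (s->owner == domid)
        s->owner = PL_NO_OWNER;
}

/* Software implementation of the example job: sum of all bytes. */
static uint32_t sum_bytes_sw(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/*
 * Guest-side wrapper: use the accelerator if it can be allocated,
 * otherwise fall back to software. *used_hw reports which path ran.
 */
uint32_t guest_sum_bytes(struct pl_slot *s, int domid, int config,
                         const uint8_t *buf, size_t len, bool *used_hw)
{
    uint32_t sum;

    *used_hw = pl_allocate(s, domid, config);
    /* No real hardware here, so both paths compute the result in software. */
    sum = sum_bytes_sw(buf, len);
    if (*used_hw)
        pl_release(s, domid);
    return sum;
}
```

The guest gets the same result either way; only latency differs, which is what makes the fallback transparent to the application.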

Best regards,
Edgar


> - Please comment on SCF configuration followup letter [1]
> 
> [1]
> https://lists.xenproject.org/archives/html/xen-devel/2017-07/msg02124.html
> 
> -- 
> 
> *Andrii Anisov*
> 
> 



Re: [Xen-devel] [XenSummit 2017] Shared coprocessor framework followup

2017-08-01 Thread Andrii Anisov

Dear Edgar,


On 31.07.17 23:42, Edgar E. Iglesias wrote:
> Yes I'm interested in this.

It's good to hear at least one vote for the stuff :)

> I'm not sure how much time I'll be able to contribute but at least I can
> review proposals and hopefully look at implementing a driver/backend that may
> be useful for our FPGA platforms.

I really hope we can start from small things:
- Could you please describe the use-cases you have in mind? What
functionality do you expect? It is really important for us, we need an
outside view on the framework.
- Also it would be good for us to have some view on the coprocessor
(fpga?) you are going to share using SCF. How is it exposed to the
system? Does it have mmio, ram, irq, iommu connection, whatever?

- Please comment on SCF configuration followup letter [1]

[1] https://lists.xenproject.org/archives/html/xen-devel/2017-07/msg02124.html


--

*Andrii Anisov*





Re: [Xen-devel] [XenSummit 2017] Shared coprocessor framework followup

2017-07-31 Thread Edgar E. Iglesias
On Tue, Jul 18, 2017 at 08:10:15PM +0300, Andrii Anisov wrote:
> Dear All,
> 
> During the developers summit a Shared Coprocessor Framework (SCF) concept
> was presented. Noticeable interest from the community emerged during
> discussions. So this is a call to all interested parties to collect
> feedback and set up a collaboration.

Hi Andrii!


> 
> There are several topics I would like to collect responses from the
> community:
> - Who is interested in SCF design, discussions, development, usage,
> etc.? Individuals or organizations.

Yes I'm interested in this. I'm not sure how much time I'll be able to
contribute but at least I can review proposals and hopefully look at
implementing a driver/backend that may be useful for our FPGA platforms.

Cheers,
Edgar


> - What devices (type of devices) are intended to be shared using SCF?
> - What are expected coprocessor sharing use-cases (i.e. DSP running
> different FW for different domains, etc).
> - Is anyone willing to take part in SCF design and development
> (core, API)?
> - Is anyone willing to implement their coprocessor support (driver)
> for SCF?
> 
> I look forward to hearing from you.
> 
> -- 
> 
> *Andrii Anisov*
> 
> 
> 