Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2018-02-12 Thread rainer.hochec...@onlinehome.de

Hi Christian,

We have vsync enabled, so we wait for the page flip and the renderer will block.

I prefer having control at the application level. As long as everything goes smoothly and you don't
push the system to its limits, all is fine. The fun part starts when you have to skip
decoding cycles or drop frames to catch up after falling behind, or when you play, e.g.,
60 fps material on a 30 Hz screen.

I want to make sure that decoding has finished before I schedule a frame for
rendering.

Regards,
Rainer


Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2018-02-12 Thread rainer.hochec...@onlinehome.de
Hi Christian,

For Kodi it is better to wait on the thread that does the decoding than later
on the render thread. That means calling it is desired.

Does vaSyncSurface block as stated by the docs?

Regards,
Rainer

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2018-02-12 Thread rainer.hochec...@onlinehome.de
Hi Christian,

In the end, something in the render pipeline has to wait until decoding is finished, or the
frame can't be rendered, and this will result in blocking the render thread.

Regards,
Rainer




Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2018-02-11 Thread Christian König

Hi Rainer,

Well, at the end of the pipeline the page flip waits for all previous
operations on a buffer to complete before displaying it. But even
that wait is asynchronous.

So as long as you don't wait for the page flip, your render thread won't
be blocked.

Using vaSyncSurface is only useful in two cases:
1. You want to access the data with the CPU. And even then it is not
necessary most of the time because of how Mesa is designed.
2. You need to limit how many jobs are in flight. Depending on how you
do resource management, that is necessary to prevent the application from
scheduling a massive number of jobs and running away with all the resources.
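
The second case can be sketched as a small ring of in-flight surfaces, where the application only blocks on the oldest job once the ring is full. This is a minimal sketch, not Kodi or Mesa code: `inflight_ring`, `MAX_IN_FLIGHT`, `inflight_submit` and the `sync_fn` callback are hypothetical names, and `record_sync` is only a stand-in so the example is self-contained. In a real VA-API client the callback would presumably wrap vaSyncSurface for the given surface.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IN_FLIGHT 4          /* hypothetical limit, tune per application */

/* Hypothetical callback: blocks until the given surface is idle.
 * In a VA-API client this would wrap vaSyncSurface(). */
typedef void (*sync_fn)(unsigned surface_id, void *user);

struct inflight_ring {
    unsigned ids[MAX_IN_FLIGHT]; /* surfaces with decode jobs still queued */
    size_t head;                 /* index of the oldest entry */
    size_t count;                /* number of valid entries */
};

/* Stand-in for a real sync call: only records which surface was waited on. */
static void record_sync(unsigned surface_id, void *user)
{
    *(unsigned *)user = surface_id;
}

/* Record a newly submitted decode job. If the ring is already full, wait
 * for the oldest job first, so at most MAX_IN_FLIGHT jobs are pending.
 * Returns 1 if the call had to wait on an older job, 0 otherwise. */
static int inflight_submit(struct inflight_ring *r, unsigned surface_id,
                           sync_fn sync, void *user)
{
    int waited = 0;

    assert(r->count <= MAX_IN_FLIGHT);
    if (r->count == MAX_IN_FLIGHT) {
        sync(r->ids[r->head], user);             /* the only CPU-side wait */
        r->head = (r->head + 1) % MAX_IN_FLIGHT;
        r->count--;
        waited = 1;
    }
    r->ids[(r->head + r->count) % MAX_IN_FLIGHT] = surface_id;
    r->count++;
    return waited;
}
```

With this shape the decode thread never waits while fewer than MAX_IN_FLIGHT jobs are queued, so the driver still sees upcoming work, and resource use stays bounded.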


Regards,
Christian.


Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2018-02-11 Thread Christian König

Hi Rainer,

The render thread doesn't wait either.

You see, when you dispatch some work, the AMD drivers always wait for
prerequisites in the background, not in the foreground.

The older radeon driver uses hardware semaphores for this, while amdgpu
has a GPU scheduler which handles that.

This is very important because when you hold back rendering work in the
application, the driver stack won't know about it and power management
starts to stutter.

It is not so important for 1080p because power management should be able to
compensate for the work peaks, but for 4K it is mandatory to let
the driver estimate future load.

We even discussed in our multimedia meeting whether we should limit 3D
power management when UVD/VCN decoding is active because of that
problem. But I'm not very keen on such workarounds because it is
really counterproductive for transcode use cases to have the 3D engine
idling at high clocks.
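
The background waiting described above can be illustrated with a toy model: submission returns at once, and a later job is held back until its prerequisite has been signalled. This is only a sketch of the idea and makes no claim about how the amdgpu scheduler or the radeon semaphores are implemented; every name in it (`toy_scheduler`, `toy_submit`, `toy_complete`, `toy_can_start`) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_JOBS 8

struct toy_job {
    int depends_on;  /* index of the prerequisite job, or -1 for none */
    bool done;       /* set when the (simulated) hardware finishes */
};

struct toy_scheduler {
    struct toy_job jobs[TOY_MAX_JOBS];
    size_t count;
};

/* Queue a job and return its index at once; the caller never blocks here. */
static int toy_submit(struct toy_scheduler *s, int depends_on)
{
    assert(s->count < TOY_MAX_JOBS);
    assert(depends_on < (int)s->count);  /* may only depend on earlier jobs */
    s->jobs[s->count].depends_on = depends_on;
    s->jobs[s->count].done = false;
    return (int)s->count++;
}

/* Simulated fence signal: the hardware reports that a job has finished. */
static void toy_complete(struct toy_scheduler *s, int job)
{
    assert(job >= 0 && (size_t)job < s->count);
    s->jobs[job].done = true;
}

/* The scheduler's background check: may this job start yet? */
static bool toy_can_start(const struct toy_scheduler *s, int job)
{
    assert(job >= 0 && (size_t)job < s->count);
    int dep = s->jobs[job].depends_on;
    return dep < 0 || s->jobs[dep].done;
}
```

In this model the application can queue a render job right after the decode job. Because both are visible to the scheduler early, the driver can see the upcoming load even though the render job does not start until the decode has completed.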


Regards,
Christian.



Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2018-02-11 Thread Christian König

On 11.02.2018 at 11:35, Philipp Kerling wrote:
> Hi,
>
> On Sun, 2018-02-11 at 11:02 +0100, Christian König wrote:

>> Actually the mpv approach is correct.
>>
>> Calling vaSyncSurface is unnecessary and undesired for AMD hardware
>> because it moves synchronization to the CPU while it should happen
>> on the GPU and/or GPU scheduler.
>>
>> E.g. our 3D pipeline can wait for hardware video decoding to finish
>> before starting the rendering. We even have some implementations
>> which allow the 3D pipeline to start when only the first half of the
>> picture is decoded etc.
>>
>> If we don't do this the 3D pipeline runs dry between frame decoding
>> which leads to problems with power management.
>
> Thanks for the explanation! To verify: For AMD hardware, all of this is
> taken care of implicitly when we use the DRMPRIME buffer from
> vaExportSurfaceHandle in EGL/GL? We don't need to wait for anything?

Yes, at least in theory that is exactly what should happen.

No idea if that was ever tested with VA-API, but VDPAU and VA-API as
well as OpenCL/OpenGL interop and DRI2/DRI3 buffer sharing should all
use exactly the same code path.

So if I'm not completely mistaken that should work as expected.

>> We should probably add a flag or bit or feature or something like
>> this to note that the application explicitly should NOT call
>> vaSyncSurface before exporting the surface.
>
> Doesn't this also depend on how the surface is going to be used? What
> happens if I mmap the dma-buf fd and copy from the buffer via CPU? Does
> this also sync implicitly at some point?


In this case the driver doesn't wait for previous activity on a buffer 
to end.


So before you access anything with the CPU you should indeed call 
vaSyncSurface.
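
The rule from this exchange can be written down as a small decision helper. This is only a sketch under the assumptions stated in this thread (AMD hardware, where the GPU side orders decode before rendering on its own); the enum and function names are hypothetical and not part of libva.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of who consumes an exported surface. */
enum surface_consumer {
    CONSUMER_GPU_IMPORT,  /* dma-buf imported into EGL/GL, read by the GPU */
    CONSUMER_CPU_MMAP     /* dma-buf fd mmap()ed and read by the CPU */
};

/* Returns true if the application should call vaSyncSurface() before
 * touching the buffer. On the GPU path the driver orders decode before
 * rendering itself; on the CPU path nothing waits on the application's
 * behalf, so an explicit sync is needed. */
static bool needs_va_sync(enum surface_consumer consumer)
{
    switch (consumer) {
    case CONSUMER_GPU_IMPORT:
        return false;
    case CONSUMER_CPU_MMAP:
        return true;
    }
    assert(!"unknown consumer");
    return true;  /* be conservative if the enum ever grows */
}
```

A caller would check `needs_va_sync()` once per exported surface and call vaSyncSurface only when it returns true, i.e. right before reading the mapping with the CPU.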


Christian.



> Regards,
> Philipp




Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2018-02-11 Thread Philipp Kerling
Hi,

On Sun, 2018-02-11 at 11:02 +0100, Christian König wrote:
> Actually the mpv approach is correct.
>
> Calling vaSyncSurface is unnecessary and undesired for AMD hardware
> because it moves synchronization to the CPU while it should happen
> on the GPU and/or GPU scheduler.
>
> E.g. our 3D pipeline can wait for hardware video decoding to finish
> before starting the rendering. We even have some implementations
> which allow the 3D pipeline to start when only the first half of the
> picture is decoded etc.
>
> If we don't do this the 3D pipeline runs dry between frame decoding
> which leads to problems with power management.
Thanks for the explanation! To verify: For AMD hardware, all of this is
taken care of implicitly when we use the DRMPRIME buffer from
vaExportSurfaceHandle in EGL/GL? We don't need to wait for anything?

> We should probably add a flag or bit or feature or something like
> this to note that the application explicitly should NOT call
> vaSyncSurface before exporting the surface.
Doesn't this also depend on how the surface is going to be used? What
happens if I mmap the dma-buf fd and copy from the buffer via CPU? Does
this also sync implicitly at some point?

Regards,
Philipp


Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2018-02-11 Thread Christian König

On 09.02.2018 at 21:35, Philipp Kerling wrote:

> So on to the question: Is this still the case, or has there been
> progress on implementing vaSyncSurface in mesa? In either case, do we
> need that support or does this syncing work implicitly somehow on AMD?
>
> I've noticed that mpv does not seem to call vaSyncSurface, although it
> technically should.


Actually the mpv approach is correct.

Calling vaSyncSurface is unnecessary and undesired for AMD hardware
because it moves synchronization to the CPU while it should happen on
the GPU and/or GPU scheduler.

E.g. our 3D pipeline can wait for hardware video decoding to finish
before starting the rendering. We even have some implementations which
allow the 3D pipeline to start when only the first half of the picture
is decoded etc.

If we don't do this, the 3D pipeline runs dry between frame decodes,
which leads to problems with power management.

We should probably add a flag or bit or feature or something like this
to note that the application explicitly should NOT call vaSyncSurface
before exporting the surface.


Regards,
Christian.





Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2018-02-10 Thread Philipp Kerling
Hi,

resurrecting this thread again since there's been some progress on the
Kodi side.

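> For the EGL part, see <https://github.com/01org/libva/pull/125>
> and <https://lists.freedesktop.org/archives/mesa-dev/2017-October/171246.html>.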
We recently started testing vaExportSurfaceHandle support, so we will
have this covered soon.

> I have been testing with mpv and ffmpeg; any thoughts from the
> Kodi point of view would be most welcome.
It generally works quite well, but we still have the unresolved
vaSyncSurface problem.
To recap: vaExportSurfaceHandle requires calling vaSyncSurface to make
sure that the decode is actually finished and the buffer is usable
before rendering the frame. However, vaSyncSurface was largely
unimplemented on mesa back then and it was unclear how to proceed with
regard to decode (VAAPI)/present (EGL+GL) synchronization.

So on to the question: Is this still the case, or has there been
progress on implementing vaSyncSurface in mesa? In either case, do we
need that support or does this syncing work implicitly somehow on AMD?

I've noticed that mpv does not seem to call vaSyncSurface, although it
technically should.

Best regards,
Philipp


Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2017-10-01 Thread Mark Thompson
On 28/09/17 04:41, Lukas Rusak wrote:
> What needs to be done?
> - vasyncsurface??
> - 10bit support in gbm
> - vaapi drm egl extensions

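For the EGL part, see <https://github.com/01org/libva/pull/125> and
<https://lists.freedesktop.org/archives/mesa-dev/2017-October/171246.html>.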

I have been testing with mpv and ffmpeg; any thoughts from the Kodi point of 
view would be most welcome.

Thanks,

- Mark


Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2017-09-28 Thread Lukas Rusak
Hello all,

I'm bumping this to layout some progress that has been made on our (Kodi)
side and hopefully we can create a road map for what needs to be done on
the amd side to get it working with our vaapi implementation.

I have merged the drm/kms display code into kodi. So in master currently is
a simple drm legacy mode. I have an open PR for getting drm atomic working.
It currently works on my intel graphics but has issues with other hardware
so I haven't merged it yet.

I have also split out glx code in Kodi so now it is possible to build
without vdpau and glx. So if we could get amd working with our vaapi
implementation then nvidia would be the only ones using vdpau and glx
anymore, the others are on egl.

Maybe things have changed now that I see the amd DC display code has been
queued up for kernel 4.15. Are there any changes in mesa master that may be
relevant to our cause?

What needs to be done?
- vaSyncSurface??
- 10bit support in gbm
- vaapi drm egl extensions

Anything else?

Thanks,
Lukas Rusak
On Sun, Jun 25, 2017 at 2:53 AM Peter Frühberger <
peter.fruehber...@gmail.com> wrote:

> Hi all,
>
> just as information:
> https://github.com/FernetMenta/kodi-agile/commit/ca8119b4e11a52415125af959f220b280f56ecae
> Rainer moved the specific parts of the buffer sharing into a separate
> infrastructure see the VAAPIEGL.cpp and VAAPIEGL.h in the above patch.
>
> This basically encapsulates the fourcc_code('R', '8', ' ', ' '); specific
> to intel / mesa and makes VAAPI.cpp - the decoder - more generic.
>
> That means an AMD implementation of above interface can now happen easily
> by implementing Init / Map / Unmap.
>
> Have a nice weekend
> Peter
>
> 2017-03-20 17:00 GMT+01:00 Marek Olšák :
>
>> On Sun, Mar 19, 2017 at 2:49 PM, Christian König
>>  wrote:
>> > Hi Peter,
>> >
>> > Adding Michel and Marek for the Mesa interop side and Harry for the
>> display
>> > side.
>> >
>> > How do you want us to display the decoded surfaces?
>> >
>> > Well to make a long story short: I don't have the slightest idea.
>> Ideally we
>> > would of the same handling as Intel so that you guys don't have anything
>> > vendor dependent in your code.
>> >
>> > The first step would be to get the VA-API DRM extension to work with
>> EGL. So
>> > that Kodi is able to export the YUV surfaces and import parts of them as
>> > separate R8/R16 or R8G8/R16G16 surfaces, right?
>> >
>> > What EGL/GL extension do you guys use to import the surfaces? Marek is
>> that
>> > stuff fully supported, e.g. do we also handle the offsets correctly?
>> I've
>> > added the backend code for this while doing VDPAU interop, but the
>> EGL/GL
>> > frontend code needs to handle it gracefully as well.
>>
>> Mesa/EGL imports an FD with an offset, but it always exports an FD
>> with offset=0 (the driver offset is ignored). It also always returns
>> num_planes = 1 on export, is that bad?
>>
>> Marek
>>
>
>
>
> --
>Key-ID: 0x1A995A9B
>keyserver: pgp.mit.edu
> ==
> Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B
>


Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2017-06-25 Thread Peter Frühberger
Hi all,

just for information:
https://github.com/FernetMenta/kodi-agile/commit/ca8119b4e11a52415125af959f220b280f56ecae
Rainer moved the specific parts of the buffer sharing into a separate
infrastructure; see VAAPIEGL.cpp and VAAPIEGL.h in the above patch.

This basically encapsulates the fourcc_code('R', '8', ' ', ' ') handling
specific to Intel / Mesa and makes VAAPI.cpp - the decoder - more generic.

That means an AMD implementation of the above interface can now happen easily
by implementing Init / Map / Unmap.
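
For readers unfamiliar with the macro mentioned above: fourcc_code packs four
ASCII characters into one 32-bit value, with the first character in the lowest
byte. The Python sketch below mirrors that packing for illustration only. The
constant names are assumed to correspond to the DRM_FORMAT_* entries in the
kernel's drm_fourcc.h, and fourcc_to_str is a hypothetical helper; nothing here
is taken from VAAPIEGL.cpp.

```python
import struct


def fourcc_code(a: str, b: str, c: str, d: str) -> int:
    """Pack four ASCII characters into a 32-bit code, first char in the low byte."""
    return ord(a) | (ord(b) << 8) | (ord(c) << 16) | (ord(d) << 24)


# Names assumed to match drm_fourcc.h; they are not taken from the Kodi patch.
DRM_FORMAT_R8 = fourcc_code('R', '8', ' ', ' ')      # one 8-bit channel
DRM_FORMAT_GR88 = fourcc_code('G', 'R', '8', '8')    # two 8-bit channels
DRM_FORMAT_R16 = fourcc_code('R', '1', '6', ' ')     # one 16-bit channel
DRM_FORMAT_GR1616 = fourcc_code('G', 'R', '3', '2')  # two 16-bit channels


def fourcc_to_str(code: int) -> str:
    """Recover the four characters from a packed code (hypothetical helper)."""
    return struct.pack('<I', code).decode('ascii')


if __name__ == "__main__":
    for name in ("DRM_FORMAT_R8", "DRM_FORMAT_GR88",
                 "DRM_FORMAT_R16", "DRM_FORMAT_GR1616"):
        code = globals()[name]
        print(f"{name}: 0x{code:08x} ({fourcc_to_str(code)!r})")
```

Keeping such codes behind a small Init / Map / Unmap interface means only that
one place has to know which per-plane formats a given driver accepts.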

Have a nice weekend
Peter

2017-03-20 17:00 GMT+01:00 Marek Olšák :

> On Sun, Mar 19, 2017 at 2:49 PM, Christian König
>  wrote:
> > Hi Peter,
> >
> > Adding Michel and Marek for the Mesa interop side and Harry for the
> display
> > side.
> >
> > How do you want us to display the decoded surfaces?
> >
> > Well to make a long story short: I don't have the slightest idea.
> Ideally we
> > would of the same handling as Intel so that you guys don't have anything
> > vendor dependent in your code.
> >
> > The first step would be to get the VA-API DRM extension to work with
> EGL. So
> > that Kodi is able to export the YUV surfaces and import parts of them as
> > separate R8/R16 or R8G8/R16G16 surfaces, right?
> >
> > What EGL/GL extension do you guys use to import the surfaces? Marek is
> that
> > stuff fully supported, e.g. do we also handle the offsets correctly? I've
> > added the backend code for this while doing VDPAU interop, but the EGL/GL
> > frontend code needs to handle it gracefully as well.
>
> Mesa/EGL imports an FD with an offset, but it always exports an FD
> with offset=0 (the driver offset is ignored). It also always returns
> num_planes = 1 on export, is that bad?
>
> Marek
>



-- 
   Key-ID: 0x1A995A9B
   keyserver: pgp.mit.edu
==
Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B


Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2017-03-20 Thread Peter Frühberger
Hi Christian,

we use it in the following way:
Depending on the surface format (NV12 vs. P010) we use:
https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1416


R8 and GR88

or alternatively:
https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1493

R16 and GR32

There is also the possibility to use BGRA, but this involves an internal copy
of the YUV surfaces in VAAPI and is therefore not well suited (more memory and
more load).

For both images, Y and UV, we use the eglCreateImageKHR extension followed
by glEGLImageTargetTexture2DOES.

See:
https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1262

On the VAAPI side:
VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME with either VA_RT_FORMAT_YUV420
or VA_FOURCC_P010 is used.

I think that method is quite generalizable and nothing is intel specific.
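
The two-image import described above can be sketched as a small Python model of
what each plane looks like to the importer. This is only an illustration under
loud assumptions: the buffer is linear (no tiling), the UV plane starts directly
after the Y rows, and both planes share one pitch. PlaneImport and
describe_planes are hypothetical names; a real application would take offsets
and pitches from what the driver reports rather than computing them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaneImport:
    fourcc: str   # DRM fourcc as text, e.g. "GR88"
    width: int    # in texels of this plane
    height: int
    offset: int   # byte offset of the plane inside the shared buffer
    pitch: int    # bytes per row


# Per-plane formats named in the message above: (Y plane, interleaved UV plane).
PLANE_FORMATS = {"NV12": ("R8  ", "GR88"), "P010": ("R16 ", "GR32")}


def describe_planes(surface: str, width: int, height: int, pitch: int):
    """Return (Y, UV) import descriptions for a linear, tightly stacked buffer.

    Assumption: the UV plane starts right after `height` rows of Y and shares
    the Y pitch. Real drivers report offsets and pitches themselves.
    """
    y_fmt, uv_fmt = PLANE_FORMATS[surface]
    y = PlaneImport(y_fmt, width, height, 0, pitch)
    # 4:2:0 subsampling: the UV plane has half the width and height (rounded up).
    uv = PlaneImport(uv_fmt, (width + 1) // 2, (height + 1) // 2,
                     pitch * height, pitch)
    return y, uv


if __name__ == "__main__":
    for plane in describe_planes("NV12", 1920, 1080, 1920):
        print(plane)
```

The non-zero offset of the UV plane is the reason the question about offset
handling on import and export matters for this approach.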

What do you think?

Best regards
Peter



2017-03-19 14:49 GMT+01:00 Christian König :

> Hi Peter,
>
> Adding Michel and Marek for the Mesa interop side and Harry for the
> display side.
>
> How do you want us to display the decoded surfaces?
>
> Well to make a long story short: I don't have the slightest idea. Ideally
> we would of the same handling as Intel so that you guys don't have anything
> vendor dependent in your code.
>
> The first step would be to get the VA-API DRM extension to work with EGL.
> So that Kodi is able to export the YUV surfaces and import parts of them as
> separate R8/R16 or R8G8/R16G16 surfaces, right?
>
> What EGL/GL extension do you guys use to import the surfaces? Marek is
> that stuff fully supported, e.g. do we also handle the offsets correctly?
> I've added the backend code for this while doing VDPAU interop, but the
> EGL/GL frontend code needs to handle it gracefully as well.
>
> The second step is then to teach our DC how to handle RGB surfaces with
> 10bit. I doubt the old code has support for that and we probably don't want
> to add it. So Harry can you comment on how far along we got with that in DC?
>
> Regards,
> Christian.
>
>
> Am 19.03.2017 um 13:26 schrieb Peter Frühberger:
>
> Hi Christian,
>
> thank you for your message. We are still wondering about the render part.
> How do you want us to display the decoded surfaces? Looking at mpv it seems
> it will only work via vaPutSurface and is therefore tight to X11. That
> means it's dependend on the visuals 8 bit only.
>
> We are working on a drm-only kodi and now ask ourselves: Is there a
> possibility to interop with a drm extension and eglCreateImage on AMD hw,
> too? With the intel only R32, R8 linux buf methods we are also running
> succesfully on MIR now, wayland would work the very same.
>
> Best regards
> Peter
>
>
>
> 2017-03-10 17:25 GMT+01:00 rainer.hochec...@onlinehome.de <
> rainer.hochec...@onlinehome.de>:
>
>> Hi Christian,
>>
>> I already removed the check for Intel in my dev branch. On startup
>> Kodi does a functional test if vaapi works. If the test passes, it is
>> availalbe
>> regarless of the underlying type of hardware/driver.
>>
>> Regards,
>> Rainer
>>
>> *Gesendet:* Mittwoch, 08. März 2017 um 13:29 Uhr
>> *Von:* "Christian König" 
>> *An:* mesa-dev@lists.freedesktop.org
>> *Cc:* rainer.hochec...@onlinehome.de, peter.fruehber...@gmail.com
>> *Betreff:* 10bit HEVC decoding for RadeonSI v2
>> Hi guys,
>>
>> I finally found time testing this and hammering out (hopefully) all the
>> remaining bugs. Playing a 10bit HEVC file through VAAPI with mpv/ffmpeg
>> git
>> master from about two days ago now works flawlessly and has only about
>> 15% CPU
>> load on one core on a Kaveri system.
>>
>> The VDPAU path should work as well, but NVidias implementation of this is
>> still
>> completely broken and so nobody enables it and we don't have a way to
>> test it.
>>
>> Rainer/Peter maybe you guys want to take a look and enable it in Kodi.
>>
>> The next logical step is to get our display code paths to be 10bit ready.
>>
>> Please review and comment,
>> Christian.
>>
>>
>
>
>
> --
>Key-ID: 0x1A995A9B
>keyserver: pgp.mit.edu
> ==
> Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B
>
>
>


-- 
   Key-ID: 0x1A995A9B
   keyserver: pgp.mit.edu
==
Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B


Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2017-03-20 Thread Peter Frühberger
Hi Christian,

thank you for your message. We are still wondering about the render part.
How do you want us to display the decoded surfaces? Looking at mpv, it seems
it will only work via vaPutSurface and is therefore tied to X11. That
means it depends on the visuals, which are 8 bit only.

We are working on a DRM-only Kodi and now ask ourselves: Is there a
possibility to interop with a DRM extension and eglCreateImage on AMD hw,
too? With the Intel-only R32, R8 linux buf methods we are also running
successfully on Mir now; Wayland would work the very same way.

Best regards
Peter



2017-03-10 17:25 GMT+01:00 rainer.hochec...@onlinehome.de <
rainer.hochec...@onlinehome.de>:

> Hi Christian,
>
> I already removed the check for Intel in my dev branch. On startup
> Kodi does a functional test if vaapi works. If the test passes, it is
> availalbe
> regarless of the underlying type of hardware/driver.
>
> Regards,
> Rainer
>
> *Gesendet:* Mittwoch, 08. März 2017 um 13:29 Uhr
> *Von:* "Christian König" 
> *An:* mesa-dev@lists.freedesktop.org
> *Cc:* rainer.hochec...@onlinehome.de, peter.fruehber...@gmail.com
> *Betreff:* 10bit HEVC decoding for RadeonSI v2
> Hi guys,
>
> I finally found time testing this and hammering out (hopefully) all the
> remaining bugs. Playing a 10bit HEVC file through VAAPI with mpv/ffmpeg git
> master from about two days ago now works flawlessly and has only about 15%
> CPU
> load on one core on a Kaveri system.
>
> The VDPAU path should work as well, but NVidias implementation of this is
> still
> completely broken and so nobody enables it and we don't have a way to test
> it.
>
> Rainer/Peter maybe you guys want to take a look and enable it in Kodi.
>
> The next logical step is to get our display code paths to be 10bit ready.
>
> Please review and comment,
> Christian.
>
>



-- 
   Key-ID: 0x1A995A9B
   keyserver: pgp.mit.edu
==
Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B


Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2017-03-20 Thread rainer.hochec...@onlinehome.de
> for example how does synchronization happen between the two APIs?

 

Right, VAAPI does not seem as mature as VDPAU in this regard. But Kodi's multithreading design copes with this: we call vaSyncSurface before feeding VPP and before mapping VA buffers to GL.

I suggest keeping it simple from a driver perspective and requiring applications to use vaSyncSurface.

 


Sent: Sunday, 19 March 2017 at 15:28
From: "Christian König"
To: "Peter Frühberger"
Cc: "rainer.hochec...@onlinehome.de", mesa-dev@lists.freedesktop.org, lru...@libreelec.tv, "Michel Dänzer", "Marek Olšák", "Wentland, Harry"
Subject: Re: 10bit HEVC decoding for RadeonSI v2





What do you think?

In general I think it might work, but the basic problem is the API design once more.

With VDPAU the application asks OpenGL to interop with VDPAU, and the two APIs can do all the handshaking internally.

With VA-API we have the application exporting buffers from VA-API and then importing the same buffer as two surfaces into OpenGL.

That leaves a whole bunch of open questions. For example, how does synchronization happen between the two APIs? The application (Kodi) probably doesn't want to wait for the decoding result before it uses the surface with OpenGL, and we don't have a way to sync between the two APIs here except for the handle.

The next problem is how we communicate the layout of data in the buffer. We have the format and the offset, but that assumes that you don't have any nasty kind of tiling mode applied here.

I think we can make that work for now (we aren't using tiling modes with UVD much anyway), but this is going to bite us again sooner or later. I'm going to put the whole thing on my todo list once more.

Regards,
Christian.

On 19.03.2017 at 15:06, Peter Frühberger wrote:


Hi Christian,
 

we use it the following way:

Dependend on the surface NV12 vs. P010 we use:

https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1416 

 

R8 and GR88

 

or alternatively:

https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1493

 

R16 and GR32

 

There is also a possibility to use BGRA, but this involves internal copy of the yuv surfaces in vaapi and is therefore not suited well (more memory and more load).

 

For both images Y and UV we use: eglCreateImageKHR extension follow by glEGLImageTargetTexture2DOES.

 

See: https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1262

 

On the VAAPI side:
VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME with either VA_RT_FORMAT_YUV420 or VA_FOURCC_P010 are used.

 

I think that method is quite generalizable and nothing is intel specific.

 

What do you think?

 

Best regards

Peter

 

 


 
2017-03-19 14:49 GMT+01:00 Christian König :



Hi Peter,

Adding Michel and Marek for the Mesa interop side and Harry for the display side.
 
How do you want us to display the decoded surfaces?
  Well to make a long story short: I don't have the slightest idea. Ideally we would of the same handling as Intel so that you guys don't have anything vendor dependent in your code.

The first step would be to get the VA-API DRM extension to work with EGL. So that Kodi is able to export the YUV surfaces and import parts of them as separate R8/R16 or R8G8/R16G16 surfaces, right?

What EGL/GL extension do you guys use to import the surfaces? Marek is that stuff fully supported, e.g. do we also handle the offsets correctly? I've added the backend code for this while doing VDPAU interop, but the EGL/GL frontend code needs to handle it gracefully as well.

The second step is then to teach our DC how to handle RGB surfaces with 10bit. I doubt the old code has support for that and we probably don't want to add it. So Harry can you comment on how far along we got with that in DC?

Regards,
Christian.



On 19.03.2017 at 13:26, Peter Frühberger wrote:






Hi Christian,
 

thank you for your message. We are still wondering about the render part. How do you want us to display the decoded surfaces? Looking at mpv it seems it will only work via vaPutSurface and is therefore tight to X11. That means it's dependend on the visuals 8 bit only.

 

We are working on a drm-only kodi and now ask ourselves: Is there a possibility to interop with a drm extension and eglCreateImage on AMD hw, too? With the intel only R32, R8 linux buf methods we are also running succesfully on MIR now, wayland would work the very same.

 

Best regards

Peter

 

 


 
2017-03-10 17:25 GMT+01:00 rainer.hochec...@onlinehome.de :





Hi Christian,

 

I already removed the check for Intel in my dev branch. On startup Kodi does a functional test of whether VAAPI works. If the test passes, it is available regardless of the underlying type of hardware/driver.

 

Regards,

Rainer

 

Sent: Wednesday, 08 March 2017 at 13:29

Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2017-03-20 Thread Marek Olšák
On Sun, Mar 19, 2017 at 2:49 PM, Christian König
 wrote:
> Hi Peter,
>
> Adding Michel and Marek for the Mesa interop side and Harry for the display
> side.
>
> How do you want us to display the decoded surfaces?
>
> Well to make a long story short: I don't have the slightest idea. Ideally we
> would of the same handling as Intel so that you guys don't have anything
> vendor dependent in your code.
>
> The first step would be to get the VA-API DRM extension to work with EGL. So
> that Kodi is able to export the YUV surfaces and import parts of them as
> separate R8/R16 or R8G8/R16G16 surfaces, right?
>
> What EGL/GL extension do you guys use to import the surfaces? Marek is that
> stuff fully supported, e.g. do we also handle the offsets correctly? I've
> added the backend code for this while doing VDPAU interop, but the EGL/GL
> frontend code needs to handle it gracefully as well.

Mesa/EGL imports an FD with an offset, but it always exports an FD
with offset=0 (the driver offset is ignored). It also always returns
num_planes = 1 on export, is that bad?

Marek


Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2017-03-20 Thread Alex Deucher
On Sun, Mar 19, 2017 at 9:49 AM, Christian König
 wrote:
> Hi Peter,
>
> Adding Michel and Marek for the Mesa interop side and Harry for the display
> side.
>
> How do you want us to display the decoded surfaces?
>
> Well to make a long story short: I don't have the slightest idea. Ideally we
> would of the same handling as Intel so that you guys don't have anything
> vendor dependent in your code.
>
> The first step would be to get the VA-API DRM extension to work with EGL. So
> that Kodi is able to export the YUV surfaces and import parts of them as
> separate R8/R16 or R8G8/R16G16 surfaces, right?
>
> What EGL/GL extension do you guys use to import the surfaces? Marek is that
> stuff fully supported, e.g. do we also handle the offsets correctly? I've
> added the backend code for this while doing VDPAU interop, but the EGL/GL
> frontend code needs to handle it gracefully as well.
>
> The second step is then to teach our DC how to handle RGB surfaces with
> 10bit. I doubt the old code has support for that and we probably don't want
> to add it. So Harry can you comment on how far along we got with that in DC?

DC supports 10 bit surfaces fine.  What's missing is support for 10
bit surfaces in GBM and glamor.  We have patches to do this from the
hybrid stack and Nicolai was working on upstreaming them.

Alex

>
> Regards,
> Christian.
>
> Am 19.03.2017 um 13:26 schrieb Peter Frühberger:
>
> Hi Christian,
>
> thank you for your message. We are still wondering about the render part.
> How do you want us to display the decoded surfaces? Looking at mpv it seems
> it will only work via vaPutSurface and is therefore tight to X11. That means
> it's dependend on the visuals 8 bit only.
>
> We are working on a drm-only kodi and now ask ourselves: Is there a
> possibility to interop with a drm extension and eglCreateImage on AMD hw,
> too? With the intel only R32, R8 linux buf methods we are also running
> succesfully on MIR now, wayland would work the very same.
>
> Best regards
> Peter
>
>
>
> 2017-03-10 17:25 GMT+01:00 rainer.hochec...@onlinehome.de
> :
>>
>> Hi Christian,
>>
>> I already removed the check for Intel in my dev branch. On startup
>> Kodi does a functional test if vaapi works. If the test passes, it is
>> availalbe
>> regarless of the underlying type of hardware/driver.
>>
>> Regards,
>> Rainer
>>
>> Gesendet: Mittwoch, 08. März 2017 um 13:29 Uhr
>> Von: "Christian König" 
>> An: mesa-dev@lists.freedesktop.org
>> Cc: rainer.hochec...@onlinehome.de, peter.fruehber...@gmail.com
>> Betreff: 10bit HEVC decoding for RadeonSI v2
>> Hi guys,
>>
>> I finally found time testing this and hammering out (hopefully) all the
>> remaining bugs. Playing a 10bit HEVC file through VAAPI with mpv/ffmpeg
>> git
>> master from about two days ago now works flawlessly and has only about 15%
>> CPU
>> load on one core on a Kaveri system.
>>
>> The VDPAU path should work as well, but NVidias implementation of this is
>> still
>> completely broken and so nobody enables it and we don't have a way to test
>> it.
>>
>> Rainer/Peter maybe you guys want to take a look and enable it in Kodi.
>>
>> The next logical step is to get our display code paths to be 10bit ready.
>>
>> Please review and comment,
>> Christian.
>>
>
>
>
>
> --
>Key-ID: 0x1A995A9B
>keyserver: pgp.mit.edu
> ==
> Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B
>
>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2017-03-19 Thread Christian König
> I suggest to keep it simple from a driver perspective and require
> applications to use vaSyncSurface

Currently our vaSyncSurface doesn't really do what the name suggests.
All we do is flush the command buffers, and that for good reasons.


Having the application wait for all decoding to complete before handing
off the surface to the post processing/display engine not only makes the
application trickier to write (fortunately you already solved that for
Kodi), but is also seriously bad for things like power management.


In other words, we do this heavy pipelining of work not only to gain
throughput, but also because the kernel driver and hardware need a good
idea of what is coming next. When the application waits for decoding to
finish before handing off post processing, we can't
make those predictions any more.
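
As a rough illustration of the throughput half of this argument only (not the
power-management half, and not how the driver or Kodi actually schedule work),
here is a toy two-stage timing model in Python. The function names and the
fixed per-frame costs are assumptions made purely for the sketch, and the
blocking case assumes that no work overlaps at all.

```python
def blocking_total_ms(frames: int, decode_ms: float, post_ms: float) -> float:
    """Total time when each frame is fully decoded before its post-processing
    is submitted, and the two stages never overlap."""
    return frames * (decode_ms + post_ms)


def pipelined_total_ms(frames: int, decode_ms: float, post_ms: float) -> float:
    """Total time when decode of frame n+1 overlaps post-processing of frame n."""
    if frames <= 0:
        return 0.0
    # Fill the pipeline once; afterwards the slower stage sets the pace.
    return decode_ms + post_ms + (frames - 1) * max(decode_ms, post_ms)


if __name__ == "__main__":
    for name, fn in (("blocking", blocking_total_ms),
                     ("pipelined", pipelined_total_ms)):
        print(f"{name:9s}: {fn(3, 4.0, 6.0):.1f} ms for 3 frames")
```

In this model the gap between the two totals is time in which one stage sits
idle with nothing queued behind it.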


We even discussed using some sort of hack to signal the kernel driver
during video decode not to drop below a certain power level to handle
such things, but for GL interop that would mean that we need to set an
environment variable for video decoding because we can't really
differentiate the use case from the driver side.


Regards,
Christian.

On 19.03.2017 at 15:44, rainer.hochec...@onlinehome.de wrote:

> for example how does synchronization happen between the two APIs?
Right, VAAPI does not seem as mature as VDPAU in this regard. But Kodi's
multithreading design copes with this: we call vaSyncSurface before
feeding VPP and before mapping VA buffers to GL.

I suggest keeping it simple from a driver perspective and requiring
applications to use vaSyncSurface.

*Sent:* Sunday, 19 March 2017 at 15:28
*From:* "Christian König"
*To:* "Peter Frühberger"
*Cc:* "rainer.hochec...@onlinehome.de", mesa-dev@lists.freedesktop.org,
lru...@libreelec.tv, "Michel Dänzer", "Marek Olšák", "Wentland, Harry"
*Subject:* Re: 10bit HEVC decoding for RadeonSI v2

What do you think?

In general I think it might work, but the basic problem is the API design
once more.

With VDPAU the application asks OpenGL to interop with VDPAU, and the two
APIs can do all the handshaking internally.

With VA-API we have the application exporting buffers from VA-API and then
importing the same buffer as two surfaces into OpenGL.

That leaves a whole bunch of open questions. For example, how does
synchronization happen between the two APIs? The application (Kodi)
probably doesn't want to wait for the decoding result before it uses
the surface with OpenGL, and we don't have a way to sync between the two
APIs here except for the handle.

The next problem is how we communicate the layout of data in the
buffer. We have the format and the offset, but that assumes that
you don't have any nasty kind of tiling mode applied here.

I think we can make that work for now (we aren't using tiling modes
with UVD much anyway), but this is going to bite us again sooner or
later. I'm going to put the whole thing on my todo list once more.


Regards,
Christian.

On 19.03.2017 at 15:06, Peter Frühberger wrote:

Hi Christian,
we use it the following way:
Dependend on the surface NV12 vs. P010 we use:

https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1416

R8 and GR88
or alternatively:

https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1493
R16 and GR32
There is also a possibility to use BGRA, but this involves
internal copy of the yuv surfaces in vaapi and is therefore not
suited well (more memory and more load).
For both images Y and UV we use: eglCreateImageKHR extension
follow by glEGLImageTargetTexture2DOES.
See:

https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1262
On the VAAPI side:
VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME with
either VA_RT_FORMAT_YUV420 or VA_FOURCC_P010 are used.
I think that method is quite generalizable and nothing is intel
specific.
What do you think?
Best regards
Peter
2017-03-19 14:49 GMT+01:00 Christian König <deathsim...@vodafone.de>:

Hi Peter,

Adding Michel and Marek for the Mesa interop side and Harry
for the display side.

How do you want us to display the decoded surfaces?

Well to make a long story short: I don't have the slightest
idea. Ideally we would of the same handling as Intel so that
you guys don't have anything vendor dependent in your code.

The first step would be to get the VA-API DRM extension to
work with EGL. So that Kodi is able to export the YUV surfaces
and import parts of them as separate R8/R16 or R8G8/R16G16
surfaces, right?

What EGL/GL extension do you guys use to import the surfaces?
Marek is that stuff fully supported, e.g. do we also h

Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2017-03-19 Thread Christian König

> What do you think?

In general I think it might work, but the basic problem is the API design
once more.

With VDPAU the application asks OpenGL to interop with VDPAU, and the two
APIs can do all the handshaking internally.

With VA-API we have the application exporting buffers from VA-API and then
importing the same buffer as two surfaces into OpenGL.

That leaves a whole bunch of open questions. For example, how does
synchronization happen between the two APIs? The application (Kodi)
probably doesn't want to wait for the decoding result before it uses
the surface with OpenGL, and we don't have a way to sync between the two
APIs here except for the handle.

The next problem is how we communicate the layout of data in the
buffer. We have the format and the offset, but that assumes that
you don't have any nasty kind of tiling mode applied here.

I think we can make that work for now (we aren't using tiling modes
with UVD much anyway), but this is going to bite us again sooner or
later. I'm going to put the whole thing on my todo list once more.


Regards,
Christian.

On 19.03.2017 at 15:06, Peter Frühberger wrote:

Hi Christian,

we use it the following way:
Dependend on the surface NV12 vs. P010 we use:
https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1416 



R8 and GR88

or alternatively:
https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1493

R16 and GR32

There is also a possibility to use BGRA, but this involves internal 
copy of the yuv surfaces in vaapi and is therefore not suited well 
(more memory and more load).


For both images Y and UV we use: eglCreateImageKHR extension follow 
by glEGLImageTargetTexture2DOES.


See: 
https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1262


On the VAAPI side:
VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME with either VA_RT_FORMAT_YUV420 
or VA_FOURCC_P010 are used.


I think that method is quite generalizable and nothing is intel specific.

What do you think?

Best regards
Peter



2017-03-19 14:49 GMT+01:00 Christian König:


Hi Peter,

Adding Michel and Marek for the Mesa interop side and Harry for
the display side.


How do you want us to display the decoded surfaces?

Well to make a long story short: I don't have the slightest idea.
Ideally we would of the same handling as Intel so that you guys
don't have anything vendor dependent in your code.

The first step would be to get the VA-API DRM extension to work
with EGL. So that Kodi is able to export the YUV surfaces and
import parts of them as separate R8/R16 or R8G8/R16G16 surfaces,
right?

What EGL/GL extension do you guys use to import the surfaces?
Marek is that stuff fully supported, e.g. do we also handle the
offsets correctly? I've added the backend code for this while
doing VDPAU interop, but the EGL/GL frontend code needs to handle
it gracefully as well.

The second step is then to teach our DC how to handle RGB surfaces
with 10bit. I doubt the old code has support for that and we
probably don't want to add it. So Harry can you comment on how far
along we got with that in DC?

Regards,
Christian.


On 19.03.2017 at 13:26, Peter Frühberger wrote:

Hi Christian,

thank you for your message. We are still wondering about the
render part. How do you want us to display the decoded surfaces?
Looking at mpv it seems it will only work via vaPutSurface and is
therefore tight to X11. That means it's dependend on the visuals
8 bit only.

We are working on a drm-only kodi and now ask ourselves: Is there
a possibility to interop with a drm extension and eglCreateImage
on AMD hw, too? With the intel only R32, R8 linux buf methods we
are also running succesfully on MIR now, wayland would work the
very same.

Best regards
Peter



2017-03-10 17:25 GMT+01:00 rainer.hochec...@onlinehome.de <rainer.hochec...@onlinehome.de>:

Hi Christian,
I already removed the check for Intel in my dev branch. On startup
Kodi does a functional test of whether VAAPI works. If the test
passes, it is available
regardless of the underlying type of hardware/driver.
Regards,
Rainer
*Sent:* Wednesday, 08 March 2017 at 13:29
*From:* "Christian König" <deathsim...@vodafone.de>
*To:* mesa-dev@lists.freedesktop.org
*Cc:* rainer.hochec...@onlinehome.de, peter.fruehber...@gmail.com
*Subject:* 10bit HEVC decoding for RadeonSI v2
Hi guys,

I finally found time testing 

Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2017-03-19 Thread Christian König

Hi Peter,

Adding Michel and Marek for the Mesa interop side and Harry for the 
display side.



> How do you want us to display the decoded surfaces?

Well, to make a long story short: I don't have the slightest idea.
Ideally we would have the same handling as Intel so that you guys don't
have anything vendor dependent in your code.


The first step would be to get the VA-API DRM extension to work with 
EGL. So that Kodi is able to export the YUV surfaces and import parts of 
them as separate R8/R16 or R8G8/R16G16 surfaces, right?


What EGL/GL extension do you guys use to import the surfaces? Marek is 
that stuff fully supported, e.g. do we also handle the offsets 
correctly? I've added the backend code for this while doing VDPAU 
interop, but the EGL/GL frontend code needs to handle it gracefully as well.


The second step is then to teach our DC how to handle RGB surfaces with 
10bit. I doubt the old code has support for that and we probably don't 
want to add it. So Harry, can you comment on how far along we got with 
that in DC?


Regards,
Christian.

On 19.03.2017 at 13:26, Peter Frühberger wrote:

Hi Christian,

thank you for your message. We are still wondering about the render 
part. How do you want us to display the decoded surfaces? Looking at 
mpv it seems it will only work via vaPutSurface and is therefore tied 
to X11. That means it depends on the visuals, which are 8 bit only.


We are working on a DRM-only Kodi and now ask ourselves: Is there a 
possibility to interop with a DRM extension and eglCreateImage on AMD 
hw, too? With the Intel-only R32, R8 linux buf methods we are also 
running successfully on Mir now; Wayland would work the very same way.


Best regards
Peter



2017-03-10 17:25 GMT+01:00 rainer.hochec...@onlinehome.de:


Hi Christian,
I already removed the check for Intel in my dev branch. On startup
Kodi does a functional test to check whether VAAPI works. If the test
passes, it is available regardless of the underlying type of
hardware/driver.
Regards,
Rainer
*Sent:* Wednesday, 08 March 2017 at 13:29
*From:* "Christian König" <deathsim...@vodafone.de>
*To:* mesa-dev@lists.freedesktop.org
*Cc:* rainer.hochec...@onlinehome.de, peter.fruehber...@gmail.com
*Subject:* 10bit HEVC decoding for RadeonSI v2
Hi guys,

I finally found time testing this and hammering out (hopefully) all the
remaining bugs. Playing a 10bit HEVC file through VAAPI with mpv/ffmpeg git
master from about two days ago now works flawlessly and has only about 15% CPU
load on one core on a Kaveri system.

The VDPAU path should work as well, but NVIDIA's implementation of this is still
completely broken and so nobody enables it and we don't have a way to test it.

Rainer/Peter maybe you guys want to take a look and enable it in Kodi.

The next logical step is to get our display code paths to be 10bit ready.

Please review and comment,
Christian.




--
 Key-ID: 0x1A995A9B
   keyserver: pgp.mit.edu 
==
Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2017-03-13 Thread rainer.hochec...@onlinehome.de

Hi Christian,

I already removed the check for Intel in my dev branch. On startup
Kodi does a functional test to check whether VAAPI works. If the test passes,
it is available regardless of the underlying type of hardware/driver.

Regards,
Rainer

Sent: Wednesday, 08 March 2017 at 13:29
From: "Christian König"
To: mesa-dev@lists.freedesktop.org
Cc: rainer.hochec...@onlinehome.de, peter.fruehber...@gmail.com
Subject: 10bit HEVC decoding for RadeonSI v2

Hi guys,

I finally found time testing this and hammering out (hopefully) all the
remaining bugs. Playing a 10bit HEVC file through VAAPI with mpv/ffmpeg git
master from about two days ago now works flawlessly and has only about 15% CPU
load on one core on a Kaveri system.

The VDPAU path should work as well, but NVIDIA's implementation of this is still
completely broken and so nobody enables it and we don't have a way to test it.

Rainer/Peter maybe you guys want to take a look and enable it in Kodi.

The next logical step is to get our display code paths to be 10bit ready.

Please review and comment,
Christian.
 





Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2017-03-08 Thread Mark Thompson
On 08/03/17 12:29, Christian König wrote:
> Hi guys,
> 
> I finally found time testing this and hammering out (hopefully) all the
> remaining bugs. Playing a 10bit HEVC file through VAAPI with mpv/ffmpeg git
> master from about two days ago now works flawlessly and has only about 15% CPU
> load on one core on a Kaveri system.

Um, libav* is querying the capabilities and finding that only 8-bit output is 
supported for Main10:

{
    "profile": 18,
    "name": "HEVCMain10",
    "description": "H.265 / MPEG-H part 2 (HEVC) Main 10 Profile",
    "entrypoints": [
        {
            "entrypoint": 1,
            "name": "VLD",
            "description": "Decode Slice",
            "attributes": [
                {
                    "rt_formats": [
                        "YUV420",
                        "YUV420_10BPP",
                    ],
                },
            ],
            "surface_formats": [
                {
                    "rt_format": 1,
                    "memory_types": [
                        "VA",
                        "DRM_PRIME",
                    ],
                    "max_width": 16384,
                    "max_height": 16384,
                    "pixel_formats": [
                        "NV12",
                    ],
                },
Unable to create config to test surface attributes: 14 (the requested RT Format is not supported)
            ],
        },
    ],
},

So, it works because it decodes to 8-bit surfaces and then everything is the 
same as 8-bit video after that.
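
The mismatch in the dump can also be checked mechanically. The following is only a sketch: the two bitmask values are the ones from va.h, the input data is copied from the dump above, and the helper names (`usable_mask`, `names`) are made up for illustration.

```python
# RT format bits as defined in va.h.
VA_RT_FORMAT_YUV420 = 0x00000001
VA_RT_FORMAT_YUV420_10BPP = 0x00000100

RT_FORMAT_NAMES = {
    VA_RT_FORMAT_YUV420: "YUV420",
    VA_RT_FORMAT_YUV420_10BPP: "YUV420_10BPP",
}


def usable_mask(surface_formats):
    """OR together the RT formats that have at least one pixel format."""
    mask = 0
    for entry in surface_formats:
        if entry["pixel_formats"]:
            mask |= entry["rt_format"]
    return mask


def names(mask):
    """Translate a bitmask into the names used in the dump."""
    return [name for bit, name in sorted(RT_FORMAT_NAMES.items()) if mask & bit]


# Values taken from the HEVCMain10 / VLD entry in the dump.
advertised = VA_RT_FORMAT_YUV420 | VA_RT_FORMAT_YUV420_10BPP
surface_formats = [{"rt_format": 1, "pixel_formats": ["NV12"]}]

usable = advertised & usable_mask(surface_formats)
missing = advertised & ~usable

print("advertised:", names(advertised))
print("usable:    ", names(usable))
print("missing:   ", names(missing))
```

With the data above, the 10-bit RT format is advertised but ends up in the missing set, which matches the "requested RT Format is not supported" error in the dump.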

(Continued in reply to 11/11.)