On 27.01.2017 at 13:51, Mark Thompson wrote:
On 26/01/17 16:59, Christian König wrote:
On 26.01.2017 at 13:14, Mark Thompson wrote:
[SNIP]
The problem here is that I need to know what will be done with the surface from the 
very beginning. E.g. if you want to scan it out directly to hardware you need a 
different memory layout than when you just want to sample from it.

The same applies to sampling from it in OpenGL, and in that case also to how you 
want to sample from it, etc.
For use in other places (like scanout or OpenGL), is this a correctness issue 
(things just won't work with the wrong layout) or a performance one (things 
will be slower or use more resources)?

It is a correctness issue, because only a subset of the formats is directly usable with certain sampling modes or for scanout, for example.

(For that matter, is there a list somewhere of the set of formats/layouts and 
what they are used for?)

Well, taking a look at all the use cases and options you can program into the decoder, you easily end up with 100+ format combinations.

For example, the decoder can split a frame into top and bottom fields during decode, apply different tiling modes, and lay out the YUV data either packed or in separate planes.
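To illustrate why the combinations multiply, here is a purely hypothetical sketch in C (none of these type or field names come from an actual driver; they only enumerate the independent choices mentioned above):

    /* Hypothetical illustration only -- not an actual driver structure.
     * Every independent choice below multiplies the number of memory
     * layouts the decode engine can produce. */
    enum field_layout   { FIELDS_INTERLEAVED, FIELDS_SEPARATE_TOP_BOTTOM };
    enum tiling_mode    { TILING_LINEAR, TILING_1D, TILING_2D };
    enum chroma_storage { CHROMA_PACKED_WITH_LUMA, CHROMA_SEPARATE_PLANE };

    struct decode_surface_layout {
        enum field_layout   fields;     /* whole frame vs. split fields   */
        enum tiling_mode    tiling;     /* tiling applied during decode   */
        enum chroma_storage chroma;     /* packed YUV vs. separate planes */
        unsigned            bit_depth;  /* e.g. 8 or 10 bits per sample   */
    };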

To my mind, the PixelFormat attribute (fourcc) is only specifying the 
appearance of the format from the point of view of the user.  That is, what you 
will get if you call vaDeriveImage() and then map the result.
And exactly that is complete nonsense. You CAN'T assume that the decoding 
result is immediately accessible by the CPU.

So the only proper way of handling this is to follow the VDPAU design. You create 
the surface without specifying any format, decode into it with the decoder and 
then the application tells the driver in which format it wants to access the data.

The driver then copies the data to CPU-accessible memory and does the 
conversion to the format desired by the application on the fly.
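As a rough sketch of that access model (assuming the VDPAU function pointers have already been obtained through VdpGetProcAddress(), and with all error handling omitted):

    #include <vdpau/vdpau.h>

    /* Sketch of the VDPAU model described above: the surface is created
     * without naming a pixel format, the decoder writes into it in
     * whatever layout the driver prefers, and a format is only chosen
     * when the application asks for the data to be copied out. */
    static void decode_and_read_back_nv12(VdpDevice device,
                                          VdpVideoSurfaceCreate *surface_create,
                                          VdpVideoSurfaceGetBitsYCbCr *get_bits,
                                          uint32_t width, uint32_t height,
                                          void *luma, void *chroma)
    {
        VdpVideoSurface surface;

        /* Only the chroma sampling is fixed here; the memory layout is
         * entirely up to the driver. */
        surface_create(device, VDP_CHROMA_TYPE_420, width, height, &surface);

        /* ... decode into 'surface' with VdpDecoderRender() ... */

        /* A format (NV12) is named only at read-back time; the driver
         * detiles/converts into the CPU-visible buffers on the fly. */
        void     *planes[2]  = { luma, chroma };
        uint32_t  pitches[2] = { width, width };
        get_bits(surface, VDP_YCBCR_FORMAT_NV12, planes, pitches);
    }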

    Tiling then doesn't cause any problems because it is a property of the DRM 
object and mapping can automagically take it into account.
No, it can't. Tiled surfaces are not meant to be CPU accessible, so the whole 
idea of mapping a surface doesn't make much sense to me.
If they aren't CPU accessible, doesn't this mean that the layout of the 
surfaces isn't observable by the user and therefore doesn't matter to the API?

Exactly, but the problem is that VA-API suggests otherwise.

E.g. we have had it multiple times that customers coded a specialized application, tested it on Intel hardware and then wanted to have it working on AMD in exactly the same way.

Unfortunately that didn't work because our internal memory layout is just completely different.

Since the user can't access the surface directly, it can be whatever is most 
suitable for the hardware and the user can't tell.

That won't work. An example we ran into was that a customer wanted to black out the first and last line of an image for BOB deinterlacing.

To do so he mapped the surface and just used memset() on the appropriate addresses. On Intel I was told this works because the mapping seems to be bidirectional and all changes done to it are reflected in the original video surface/image.
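For reference, the pattern in question looks roughly like this (a sketch using libva; it only behaves as the customer expected if the derived image actually aliases the decoder's surface memory, which VA-API does not guarantee):

    #include <stdint.h>
    #include <string.h>
    #include <va/va.h>

    /* Map the decoded surface through a derived image and write to it
     * directly.  Whether these writes end up in the buffer the decoder /
     * deinterlacer actually reads is implementation specific -- it happens
     * to work on Intel, but not with a different internal layout. */
    static void black_out_first_and_last_line(VADisplay dpy, VASurfaceID surface)
    {
        VAImage  image;
        uint8_t *data;

        vaDeriveImage(dpy, surface, &image);
        vaMapBuffer(dpy, image.buf, (void **)&data);

        /* Assume an NV12 layout: clear the first and last luma line. */
        uint8_t *luma = data + image.offsets[0];
        memset(luma, 0, image.width);
        memset(luma + (size_t)image.pitches[0] * (image.height - 1), 0, image.width);

        vaUnmapBuffer(dpy, image.buf);
        vaDestroyImage(dpy, image.image_id);
    }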

The API certainly admits the possibility that vaDeriveImage() just can't expose 
surfaces to the CPU directly, or that there are extra implicit copies so that 
it all appears consistent from the point of view of the user.

Yeah, but it doesn't enforce that. E.g. it isn't properly defined that the derived image is a copy, and since on Intel it just seems to work, people tend to rely on it.

I think my use of the word "mapping" wasn't helping there: I was using it to 
refer both to mapping into the CPU address space (which need not be supported) and to other 
APIs (OpenGL, OpenCL, whatever) which will use it on the GPU (which is far more 
important).  My real question on the tiling issue was: is tiling/layout/whatever a 
property of the DRM object, such that other APIs interacting with it can do the right 
thing without the user needing to know about it?

No, they can't. For example, when we import the UV plane of an NV12 surface into OpenGL, OpenGL needs to know that it originally comes from a decoded image. Otherwise it won't use the correct tiling parameters.

Currently the GL driver makes an educated guess, based on the offset, about what is being imported, but that isn't really good engineering.
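For context, the import on the GL side typically goes through EGL_EXT_image_dma_buf_import and looks roughly like the sketch below (fd, offset and pitch come from whatever API exported the surface). Note that the attribute list carries no tiling or layout metadata, which is why the driver is left guessing from the offset:

    #include <EGL/egl.h>
    #include <EGL/eglext.h>
    #include <drm_fourcc.h>

    /* Sketch: import the UV plane of an NV12 surface as a two-channel
     * (GR88) image.  Nothing here says the dma-buf came from the video
     * decoder, so the GL driver cannot know which tiling was used. */
    static EGLImageKHR import_uv_plane(EGLDisplay dpy, int dmabuf_fd,
                                       EGLint width, EGLint height,
                                       EGLint uv_offset, EGLint uv_pitch)
    {
        const EGLint attribs[] = {
            EGL_WIDTH,                     width / 2,
            EGL_HEIGHT,                    height / 2,
            EGL_LINUX_DRM_FOURCC_EXT,      DRM_FORMAT_GR88,
            EGL_DMA_BUF_PLANE0_FD_EXT,     dmabuf_fd,
            EGL_DMA_BUF_PLANE0_OFFSET_EXT, uv_offset,
            EGL_DMA_BUF_PLANE0_PITCH_EXT,  uv_pitch,
            EGL_NONE
        };
        PFNEGLCREATEIMAGEKHRPROC create_image =
            (PFNEGLCREATEIMAGEKHRPROC)eglGetProcAddress("eglCreateImageKHR");

        return create_image(dpy, EGL_NO_CONTEXT, EGL_LINUX_DMA_BUF_EXT,
                            NULL, attribs);
    }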

To be fair, I've run into exactly the same problem with VDPAU as well; that's the reason we currently have tiling disabled for video surfaces there, too.

If not, then the VAAPI buffer-sharing construction (vaCreateSurfaces() and 
vaAcquireBuffer() with VA_SURFACE_ATTRIB_TYPE_DRM_PRIME) doesn't contain enough 
information to ever work and we should be trying to fix it in the API.
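For reference, one common way this export is done in practice is via vaDeriveImage() and vaAcquireBufferHandle(); a sketch (error handling and cleanup omitted) is below. Everything the importer receives is a dma-buf fd plus the offsets/pitches from the VAImage, with no tiling or layout metadata:

    #include <va/va.h>

    /* Derive an image from the surface and acquire its backing buffer as a
     * DRM PRIME (dma-buf) handle.  The importer gets a fd plus the
     * offsets/pitches stored in the VAImage, but no tiling or layout
     * metadata -- which is exactly the gap discussed here. */
    static int export_surface_as_prime_fd(VADisplay dpy, VASurfaceID surface,
                                          VAImage *image_out)
    {
        VABufferInfo info = { .mem_type = VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME };

        vaDeriveImage(dpy, surface, image_out);
        vaAcquireBufferHandle(dpy, image_out->buf, &info);

        /* info.handle now holds a dma-buf fd; release it with
         * vaReleaseBufferHandle() and vaDestroyImage() when done. */
        return (int)info.handle;
    }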

Those problems won't be solved by changes to VA-API alone. Essentially we need to follow NVIDIA's proposal for the "Unix Device Memory Allocation" project.

Hopefully this will in the end result in a library which allows the drivers to negotiate the surface format based on the use case.
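Purely as an illustration of what such negotiation could mean (this is invented for the sketch and is not an API from that project or any existing library): each participating driver lists the surface layouts it can handle for its use case, and the allocator picks one from the intersection, falling back to a copy when there is none.

    #include <stddef.h>

    /* Hypothetical sketch only -- not part of any existing library.  Each
     * participant (decoder, GL, scanout, ...) provides the layout tokens it
     * supports; the allocator picks the first one everyone agrees on. */
    typedef unsigned layout_id;

    static layout_id negotiate_layout(const layout_id *const *supported,
                                      const size_t *counts,
                                      size_t n_participants)
    {
        for (size_t i = 0; i < counts[0]; i++) {
            layout_id candidate = supported[0][i];
            size_t matches = 1;

            for (size_t p = 1; p < n_participants; p++)
                for (size_t j = 0; j < counts[p]; j++)
                    if (supported[p][j] == candidate) { matches++; break; }

            if (matches == n_participants)
                return candidate;   /* usable by every participant */
        }
        return 0;                   /* no common layout: a copy is needed */
    }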

Regards,
Christian.

If so, then these issues seem resolvable, albeit possibly with extra copies in 
places where we haven't been able to work out far enough in advance what was 
needed (or where the user really does want to do multiple things which require 
different formats: maybe drawing something on a surface (OSD/subtitles, say) 
and then sending it to an encoder or scanout).

Thanks,

- Mark

