Re: [PATCH xserver] glx: Allow arbitrary context attributes for direct contexts

2017-07-28 Thread Eric Anholt
Adam Jackson  writes:

> On Fri, 2017-07-28 at 09:56 -0700, Eric Anholt wrote:
>> Adam Jackson  writes:
>> 
>> > For direct contexts, most context attributes don't require any
>> > particular awareness on the part of the server. Examples include
>> > GLX_ARB_create_context_no_error and GLX_ARB_context_flush_control, where
>> > all of the behavior change lives in the renderer; since that's on the
>> > client side for a direct context, there's no reason for the X server to
>> > validate the attribute.
>> > 
>> > The context attributes will still be validated on the client side, and
>> > we still validate attributes for indirect contexts since the server
>> > implementation might need to handle them. For example, the indirect
>> > code might internally use ARB_context_flush_control for all contexts, in
>> > which case it would need to manually emit glFlush when the client
>> > switches between two indirect contexts that didn't request the no-flush
>> > attribute.
>> > 
>> > Signed-off-by: Adam Jackson 
>> 
>> Does the client side even need to send the client-side attributes for
>> direct contexts?
>
> It would probably be legal for it to not? The ARB_create_context spec,
> discussing why direct/indirect is a parameter not an attribute, says
> "... different paths to the server may be taken for creating direct
> contexts, and parsing the attribute list in the client should not be
> required". One could read that to mean that creating the server-side of
> a direct context needn't be exactly like creating an indirect context,
> that you could send only those attributes the server needs to be aware
> of (although doing this would mean parsing the attribute list in the
> client, which the issue is otherwise trying to avoid...)
>
> However, none of the GLX extensions that define new context attributes
> seem to define whether that attrib is client or server state, and
> Mesa's glXCreateContextAttribsARB does not edit the attribute list
> before sending it to the server. It seems simpler to me for the server
> to ignore unknown attribs for direct contexts than for Mesa to grow a list
> of attributes to censor, because e.g. ARB_create_context_no_error would
> then be exactly as easy to wire up for direct GLX as for EGL.

My thinking was: if we do it client side, then new Mesa works even with
old X.  That said, the fix here is trivial, so let's do it anyway.
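
As a sketch of the policy in question (simplified stand-in names, not the
actual patch):

/* Illustrative only -- not the real xserver code. */
#include <stdbool.h>
#include <stdint.h>

static bool
server_knows_attrib(uint32_t attrib)
{
    switch (attrib) {
    case 0x2091: /* GLX_CONTEXT_MAJOR_VERSION_ARB */
    case 0x2092: /* GLX_CONTEXT_MINOR_VERSION_ARB */
        return true;
    default:
        return false; /* e.g. no_error, flush_control, ... */
    }
}

/* Whether the server should reject this attribute outright. */
static bool
reject_attrib(uint32_t attrib, bool is_direct)
{
    /* Direct contexts: the renderer -- and attribute validation --
     * lives in the client, so unknown attributes are ignored here.
     * Indirect contexts: the server is the renderer and must
     * understand every attribute it is asked to honour. */
    return !is_direct && !server_knows_attrib(attrib);
}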

Reviewed-by: Eric Anholt 



Re: [PATCH xserver] glx: Allow arbitrary context attributes for direct contexts

2017-07-28 Thread Adam Jackson
On Fri, 2017-07-28 at 09:56 -0700, Eric Anholt wrote:
> Adam Jackson  writes:
> 
> > For direct contexts, most context attributes don't require any
> > particular awareness on the part of the server. Examples include
> > GLX_ARB_create_context_no_error and GLX_ARB_context_flush_control, where
> > all of the behavior change lives in the renderer; since that's on the
> > client side for a direct context, there's no reason for the X server to
> > validate the attribute.
> > 
> > The context attributes will still be validated on the client side, and
> > we still validate attributes for indirect contexts since the server
> > implementation might need to handle them. For example, the indirect
> > code might internally use ARB_context_flush_control for all contexts, in
> > which case it would need to manually emit glFlush when the client
> > switches between two indirect contexts that didn't request the no-flush
> > attribute.
> > 
> > Signed-off-by: Adam Jackson 
> 
> Does the client side even need to send the client-side attributes for
> direct contexts?

It would probably be legal for it to not? The ARB_create_context spec,
discussing why direct/indirect is a parameter not an attribute, says
"... different paths to the server may be taken for creating direct
contexts, and parsing the attribute list in the client should not be
required". One could read that to mean that creating the server-side of
a direct context needn't be exactly like creating an indirect context,
that you could send only those attributes the server needs to be aware
of (although doing this would mean parsing the attribute list in the
client, which the issue is otherwise trying to avoid...)

However, none of the GLX extensions that define new context attributes
seem to define whether that attrib is client or server state, and
Mesa's glXCreateContextAttribsARB does not edit the attribute list
before sending it to the server. It seems simpler to me for the server
to ignore unknown attribs for direct contexts than for Mesa to grow a list
of attributes to censor, because e.g. ARB_create_context_no_error would
then be exactly as easy to wire up for direct GLX as for EGL.

- ajax

Re: help with Xorg

2017-07-28 Thread Adam Jackson
On Thu, 2017-07-27 at 21:59 -0500, Perez Rodriguez, Humberto I wrote:

> (04:54 PM) [gfx@gfx-desktop] [xorg]$ : cat Xorg.0.log | grep "(WW)"
>  (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
> [62.765] (WW) The directory "/usr/share/fonts/misc/" does not exist.
> [62.765] (WW) The directory "/usr/share/fonts/TTF/" does not exist.
> [62.765] (WW) The directory "/usr/share/fonts/OTF/" does not exist.
> [62.766] (WW) The directory "/usr/share/fonts/Type1/" does not exist.
> [62.766] (WW) The directory "/usr/share/fonts/100dpi/" does not exist.
> [62.766] (WW) The directory "/usr/share/fonts/75dpi/" does not exist.
> [62.944] (WW) Falling back to old probe method for modesetting
> [62.944] (WW) Falling back to old probe method for fbdev
> [62.946] (WW) Falling back to old probe method for vesa
> [62.946] (WW) VGA arbiter: cannot open kernel arbiter, no multi-card 
> support
> [   292.184] (WW) Option "xkb_variant" requires a string value
> [   292.184] (WW) Option "xkb_options" requires a string value
> 
> 
> I think that the issue is in the line "VGA arbiter: cannot open kernel
> arbiter, no multi-card support" but I am not sure about that

It almost certainly is not the issue. Please just provide the full X
log instead of grepping randomly.

- ajax

Re: [PATCH xserver] glx: Allow arbitrary context attributes for direct contexts

2017-07-28 Thread Eric Anholt
Adam Jackson  writes:

> For direct contexts, most context attributes don't require any
> particular awareness on the part of the server. Examples include
> GLX_ARB_create_context_no_error and GLX_ARB_context_flush_control, where
> all of the behavior change lives in the renderer; since that's on the
> client side for a direct context, there's no reason for the X server to
> validate the attribute.
>
> The context attributes will still be validated on the client side, and
> we still validate attributes for indirect contexts since the server
> implementation might need to handle them. For example, the indirect
> code might internally use ARB_context_flush_control for all contexts, in
> which case it would need to manually emit glFlush when the client
> switches between two indirect contexts that didn't request the no-flush
> attribute.
>
> Signed-off-by: Adam Jackson 

Does the client side even need to send the client-side attributes for
direct contexts?



Re: [PATCH dri3proto v2] Add modifier/multi-plane requests, bump to v1.1

2017-07-28 Thread Keith Packard
Michel Dänzer  writes:

> Declaring where? Once a pixmap is created, it only has a depth, no
> format, so there's nothing on which to base a declaration that e.g.
> CopyArea between two pixmaps of the same depth is undefined.

I think we'd need some extension request which provides the format data
and other attributes. We can define the transformation of underlying
data into depth-based pixel data to match core drawing.

> I'm getting less and less convinced that any reference at all to formats
> in the context of pixmaps in the DRI3 specification is a good idea.

Well, if we want to deal with YUV or compressed data, then we'd better
find a way to describe the contents of the buffer returned by
DRI3BufferFromPixmap.

However, it would also be 'nice' if the underlying raw data could be
accessed over the wire (ala PutImage/GetImage). Perhaps a different
extension which talks about new formats (including deep color?) and
provides for Get/Put semantics?

-- 
-keith



Re: [PATCH dri3proto v2] Add modifier/multi-plane requests, bump to v1.1

2017-07-28 Thread Keith Packard
Daniel Stone  writes:

> You're right that it makes little to no sense to mix the two, but I'm
> not sure what practical gain we get from expressing things as
> Pictures.

Well, abstractly, a Picture can represent any of the formats you've
talked about -- they have explicit knowledge of color and translucency,
and already provide a notion of 'source only' objects which need not
store ARGB values for each pixel.

> Ultimately, the server still deals in Window Pixmaps and
> Screen Pixmaps for backing storage, and the compositor interface is
> NameWindowPixmap. Your suggestion of another type seems nicer, but it
> doesn't look like we can avoid Pixmaps as the lowest common
> denominator anyway.

What's the plan for compositors which just want to turn window contents
into textures? I'd assume you'd want to update them to understand the
new format bits, which leaves us free to do something sensible here.

For X-based compositors (are there any left?), we can fake out the
pixmap easily enough (NameWindowPixmap creates a suitable 'real' pixmap,
CopyArea/Composite update the source region before the copy happens).

> That would much more clearly match the intention of the spec
> additions, which was to allow non-linearly-addressed (Intel X/Y*,
> Vivante tiled/supertiled, VC4 T-tiled, the infinite AMD/NV tiling
> modes) buffers.

I'm confused -- I can't see how tiling is relevant here; if you're
storing actual ARGB pixels, even in some crazy order, then you've got a
pixmap. I have to admit that all I know about is Intel X/Y tiling
though; are there ways in which the other tiling formats don't just
store ARGB pixels?

> tl;dr: we could replace FourCC with depth as the format determinant to
> better match DRI3 v1.0, or just declare that doing anything with those
> Pixmaps other than displaying them to a Window with a compatible
> Visual results in undefined behaviour

Hrm. I think I'm stuck on the notion that Pixmaps are concrete 2D
arrays of multi-bit values.

What about calling them 'source pixmaps' and offering extended requests
to determine the precise nature of the data within? Make core requests
report correct width and height values, and then fix them at depth 24
for RGB or depth 32 for ARGB. Using them as a rendering target could
generate BadMatch, or be ignored (I'd prefer the former).

DRI3BufferFromPixmap could create a shadow 'normal' pixmap and use
Damage to sync it with the source pixmap.

Creating a Picture from a 'source pixmap' could dtrt (at least at some
point; it could also just use a shadow pixmap).

A new DRI3BufferFromSourcePixmap request would be used to get the list
of buffers and appropriate format data so that updated compositors could
get improved results.
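
A purely hypothetical sketch of what such a reply could carry (none of
these names exist in dri3proto):

#include <stdint.h>

struct source_pixmap_buffers_reply {
    uint32_t format;        /* DRM FourCC of the underlying data         */
    uint32_t modifier_hi;   /* DRM format modifier, most significant 32  */
    uint32_t modifier_lo;   /* DRM format modifier, least significant 32 */
    uint8_t  num_planes;    /* one fd per plane follows via fd-passing   */
    uint8_t  pad[3];
    uint32_t strides[4];    /* per-plane stride, in bytes */
    uint32_t offsets[4];    /* per-plane offset, in bytes */
};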

-- 
-keith



Re: [Mesa-dev] [PATCH dri3proto v2] Add modifier/multi-plane requests, bump to v1.1

2017-07-28 Thread Nicolai Hähnle

Hi Daniel,

On 28.07.2017 12:46, Daniel Stone wrote:

> On 28 July 2017 at 10:24, Nicolai Hähnle  wrote:
>> On 28.07.2017 09:44, Daniel Stone wrote:
>>> No, I don't think it is. Tiled layouts still have a stride: if you
>>> look at i915 X/Y/Yf/Y_CCS/Yf_CCS (the latter two containing an
>>> auxiliary compression/fast-clear buffer), iMX/etnaviv
>>> tiled/supertiled, or VC4 T-tiled modifiers and how they're handled
>>> both for DRIImage and KMS interchange, they all specify a stride which
>>> is conceptually the same as linear, if you imagine linear to be 1x1
>>> tiled.
>>>
>>> Most tiling users accept any integer units of tiles (buffer width
>>> aligned to tile width), but not always. The NV12MT format used by
>>> Samsung media decoders (also shipped in some Qualcomm SoCs) is
>>> particularly special, IIRC requiring height to be aligned to a
>>> multiple of two tiles.
>>
>> Fair enough, but I think you need to distinguish between the chosen stride
>> and stride *requirements*. I do think it makes sense to consider the stride
>> requirement as part of the format/layout description, but more below.
>
> Right. Stride is a property of one buffer, stride requirements are a
> property of the users of that buffer (GPU, display control, media
> encode, etc). The requirements also depend on use, e.g. trying to do
> rotation through your scanout engine can change those requirements.

Right.



>>> It definitely seems attractive to kill two birds with one stone, but
>>> I'd really much rather not conflate format description/advertisement,
>>> and allocation restriction, into one enum. I'm still on the side of
>>> saying that this is a problem modifiers do not solve, deferring to the
>>> allocator we need anyway in order to determine things like placement
>>> and global optimality (e.g. rotated scanout placing further
>>> restrictions on allocation).
>>
>> Okay, the original issue here is that the allocator *cannot* determine the
>> alignment requirement in the use case that prompted this sub-thread.
>>
>> The use case is PRIME off-loading, where the rendering GPU supports linear
>> layouts with a 64 byte stride, while the display GPU requires a 256 byte
>> stride.
>>
>> The allocator *cannot* solve this issue, because the allocation happens on
>> the rendering GPU. We need to communicate somehow what the display GPU's
>> stride requirements are.
>>
>> How do you propose to do that?
>
> The allocator[0] in itself can't magically reach across processes to
> determine desired usage and resolve dependencies. But the entire
> design behind it was to be able to solve cross-device usage: between
> GPU and scanout, between both of those and media encode/decode, etc.
> Obviously it can't do that without help, so winsys will need to gain
> protocol in order to express those in terms the allocator will
> understand.
>
> The idea was to split information into positive capabilities and
> negative constraints. Modifier queries fall into the same boat as
> format queries: you're expressing an additional capability ('I can
> speak tiled'). Stride alignment, for me, falls into a negative
> constraint ('linear allocations must have stride aligned to 256
> bytes'). Similarly, placement constraints (VRAM possibly only
> accessible to SLI-type paired GPU vs. GTT vs. pure system RAM, etc)
> are in the same boat AFAICT. So this helps solve one side of the
> equation, but not the other.


I've been thinking about this some more, and I can see now that the 
changed modifier scheme that I originally proposed does not fit well 
into places where modifiers are used to express buffer properties (e.g. 
DRI3PixmapFromBuffers, DRI3BuffersFromPixmap).


But I see no proposal on how to fix the issue so far. You cannot fully 
separate capabilities from constraints. As is, we (AMD) cannot properly 
implement the proposed DRI3 v1.1: what would we return in 
DRI3GetSupportedModifiers?


The natural option is to return (at least) DRM_FORMAT_MOD_LINEAR, but 
that would be a lie, because we *don't* speak arbitrary linear formats.


I don't think this is difficult to fix in terms of protocol, although 
there's plenty of opportunity for bike-shedding :)


I see roughly two options:

1. Make the constraints per-modifier, and add a "constraints: 
ListOfCard32" (or 64) to the response to DRI3GetSupportedModifiers. We 
can then reserve some bits for global constraints (e.g. placement) and 
some bits on a per-modifier basis (e.g. stride alignment for linear). 
You could build constraints like DRM_CONSTRAINT_PLACEMENT_SYSTEM | 
DRM_CONSTRAINT_LINEAR_STRIDE_256B.


2. Make the constraints global, and add a DRI3GetConstraints protocol 
with the same signature as DRI3GetSupportedModifiers. We'd need vendor 
namespaces for the constraint defines, to support constraints that are 
specific to vendor-specific modifiers. You could have entries like 
DRM_CONSTRAINT_PLACEMENT(DRM_CONSTRAINT_PLACEMENT_SYSTEM) and, as a 
separate list entry, DRM_CONSTRAINT_LINEAR_STRIDE(256).
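
For concreteness, option 1 could look roughly like the following; the
names are hypothetical and nothing below exists in drm_fourcc.h today:

#include <stdint.h>

/* Hypothetical constraint bits for option 1. */
#define DRM_CONSTRAINT_PLACEMENT_SYSTEM       (1u << 0)
#define DRM_CONSTRAINT_PLACEMENT_DEVICE_LOCAL (1u << 1)

/* Per-modifier field: log2 of the required stride alignment for
 * DRM_FORMAT_MOD_LINEAR, packed into higher bits of the same word. */
#define DRM_CONSTRAINT_LINEAR_STRIDE_SHIFT 8
#define DRM_CONSTRAINT_LINEAR_STRIDE(log2_bytes) \
        ((uint32_t)(log2_bytes) << DRM_CONSTRAINT_LINEAR_STRIDE_SHIFT)

/* "System memory placement, linear stride aligned to 256 bytes": */
static const uint32_t example_constraints =
    DRM_CONSTRAINT_PLACEMENT_SYSTEM | DRM_CONSTRAINT_LINEAR_STRIDE(8);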


Cheers,
Nicolai

Re: [Mesa-dev] [PATCH dri3proto v2] Add modifier/multi-plane requests, bump to v1.1

2017-07-28 Thread Daniel Stone
Hi,

On 28 July 2017 at 10:24, Nicolai Hähnle  wrote:
> On 28.07.2017 09:44, Daniel Stone wrote:
>> No, I don't think it is. Tiled layouts still have a stride: if you
>> look at i915 X/Y/Yf/Y_CCS/Yf_CCS (the latter two containing an
>> auxiliary compression/fast-clear buffer), iMX/etnaviv
>> tiled/supertiled, or VC4 T-tiled modifiers and how they're handled
>> both for DRIImage and KMS interchange, they all specify a stride which
>> is conceptually the same as linear, if you imagine linear to be 1x1
>> tiled.
>>
>> Most tiling users accept any integer units of tiles (buffer width
>> aligned to tile width), but not always. The NV12MT format used by
>> Samsung media decoders (also shipped in some Qualcomm SoCs) is
>> particularly special, IIRC requiring height to be aligned to a
>> multiple of two tiles.
>
> Fair enough, but I think you need to distinguish between the chosen stride
> and stride *requirements*. I do think it makes sense to consider the stride
> requirement as part of the format/layout description, but more below.

Right. Stride is a property of one buffer, stride requirements are a
property of the users of that buffer (GPU, display control, media
encode, etc). The requirements also depend on use, e.g. trying to do
rotation through your scanout engine can change those requirements.

>> It definitely seems attractive to kill two birds with one stone, but
>> I'd really much rather not conflate format description/advertisement,
>> and allocation restriction, into one enum. I'm still on the side of
>> saying that this is a problem modifiers do not solve, deferring to the
>> allocator we need anyway in order to determine things like placement
>> and global optimality (e.g. rotated scanout placing further
>> restrictions on allocation).
>
> Okay, the original issue here is that the allocator *cannot* determine the
> alignment requirement in the use case that prompted this sub-thread.
>
> The use case is PRIME off-loading, where the rendering GPU supports linear
> layouts with a 64 byte stride, while the display GPU requires a 256 byte
> stride.
>
> The allocator *cannot* solve this issue, because the allocation happens on
> the rendering GPU. We need to communicate somehow what the display GPU's
> stride requirements are.
>
> How do you propose to do that?

The allocator[0] in itself can't magically reach across processes to
determine desired usage and resolve dependencies. But the entire
design behind it was to be able to solve cross-device usage: between
GPU and scanout, between both of those and media encode/decode, etc.
Obviously it can't do that without help, so winsys will need to gain
protocol in order to express those in terms the allocator will
understand.

The idea was to split information into positive capabilities and
negative constraints. Modifier queries fall into the same boat as
format queries: you're expressing an additional capability ('I can
speak tiled'). Stride alignment, for me, falls into a negative
constraint ('linear allocations must have stride aligned to 256
bytes'). Similarly, placement constraints (VRAM possibly only
accessible to SLI-type paired GPU vs. GTT vs. pure system RAM, etc)
are in the same boat AFAICT. So this helps solve one side of the
equation, but not the other.
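
A rough sketch of that capability/constraint split, with purely
hypothetical types (the real allocator API is still being designed):

#include <stdint.h>

/* Positive capability: "I can speak this format + modifier". */
struct capability {
    uint32_t fourcc;        /* e.g. DRM_FORMAT_XRGB8888   */
    uint64_t modifier;      /* e.g. DRM_FORMAT_MOD_LINEAR */
};

/* Negative constraint: restrictions one user imposes on any buffer it
 * will touch, largely independent of which modifier gets picked. */
struct constraint {
    uint32_t stride_align;   /* bytes; 256 for the display GPU in the
                                PRIME example above, 64 for the
                                rendering GPU */
    uint32_t offset_align;   /* base address alignment, in bytes */
    uint32_t placement_mask; /* VRAM / GTT / system RAM / ...    */
};

/* An allocation has to satisfy every user's constraints at once, e.g.
 * a stride aligned to max(64, 256) = 256 bytes in the PRIME case. */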

Cheers,
Daniel

[0]: Read as, 'the design we discussed for the allocator at XDC'.

Re: [PATCH dri3proto v2] Add modifier/multi-plane requests, bump to v1.1

2017-07-28 Thread Daniel Stone
Hi,

On 28 July 2017 at 09:54, Michel Dänzer  wrote:
> On 28/07/17 05:03 PM, Daniel Stone wrote:
>> By implementation if not spec, DRI3 v1.0 enforces that depth 16 is
>> RGB565, depth 24 is XRGB, and depth 32 is ARGB.
>
> No, it doesn't. How the bits stored in a pixmap are interpreted is
> outside of the scope of DRI3 1.0.

I mean that, when using glamor_egl and Mesa, PixmapFromBuffer
translates XRGB to depth 24, and BufferFromPixmap translates depth
24 to XRGB. So, de facto, these are the established translations
between depth and format required to use DRI3.
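
In other words, the de facto mapping boils down to (sketch, not glamor's
actual tables):

#include <stdint.h>
#include <drm_fourcc.h>

static const struct { uint8_t depth; uint32_t fourcc; } depth_format_map[] = {
    { 16, DRM_FORMAT_RGB565   },
    { 24, DRM_FORMAT_XRGB8888 },
    { 32, DRM_FORMAT_ARGB8888 },
};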

>> tl;dr: we could replace FourCC with depth as the format determinant to
>> better match DRI3 v1.0, or just declare that doing anything with those
>> Pixmaps other than displaying them to a Window with a compatible
>> Visual results in undefined behaviour
>
> I'm getting less and less convinced that any reference at all to formats
> in the context of pixmaps in the DRI3 specification is a good idea.

I'd be fine with that, but at least documenting the implemented
behaviour would probably be reasonable, to save other implementations
from having to reverse-engineer the actual semantics.

Cheers,
Daniel

Re: [Mesa-dev] [PATCH dri3proto v2] Add modifier/multi-plane requests, bump to v1.1

2017-07-28 Thread Nicolai Hähnle

On 28.07.2017 09:44, Daniel Stone wrote:

> Hi Nicolai,
> Trying to tackle the stride subthread in one go ...
>
> On 25 July 2017 at 09:28, Nicolai Hähnle  wrote:
>> On 22.07.2017 14:00, Daniel Stone wrote:
>>> On 21 July 2017 at 18:32, Michel Dänzer  wrote:
>>>> We just ran into an issue which might mean that there's still something
>>>> missing in this v2 proposal:
>>>>
>>>> The context is DRI3 PRIME render offloading of glxgears (not useful in
>>>> practice, but bear with me). The display GPU is Raven Ridge, which
>>>> requires
>>>> that the stride even of linear textures is a multiple of 256 bytes. The
>>>> renderer GPU is Polaris12, which still supports smaller alignment of the
>>>> stride. With the glxgears window of width 300, the renderer GPU driver
>>>> chooses a stride of 304 (* 4 / 256 = 4.75), whereas the display GPU would
>>>> require 320 (* 4 / 256 = 5). This cannot work.
>>>
>>> The obvious answer is just to increase padding on external/winsys
>>> surfaces? Increasing it for all allocations would probably be a
>>> non-starter, but winsys surfaces are rare enough that you could
>>> probably afford to take the hit, I guess.
>>>
>>>> I see two basic approaches to solve this:
>>>> [...]
>>>> Maybe there are other possible approaches I'm missing? Other comments?
>>>
>>> I don't have any great solution off the top of my head, but I'd be
>>> inclined to bundle stride in with placement. TTBOMK (from having
>>> looked at radv), buffers shared cross-GPU also need to be allocated
>>> from a separate externally-visible memory heap. And at the moment,
>>> lacking placement information at allocation time (at least for EGL
>>> allocations, via DRIImage), that happens via transparent migration at
>>> import time I think. Placement restrictions would probably also
>>> involve communicating base address alignment requirements.
>>
>> Stride isn't really in the same category as placement and base address
>> alignment, though.
>>
>> Placement and base address alignment requirements can apply to all possible
>> texture layouts, while the concept of stride is specific to linear layouts.
>
> No, I don't think it is. Tiled layouts still have a stride: if you
> look at i915 X/Y/Yf/Y_CCS/Yf_CCS (the latter two containing an
> auxiliary compression/fast-clear buffer), iMX/etnaviv
> tiled/supertiled, or VC4 T-tiled modifiers and how they're handled
> both for DRIImage and KMS interchange, they all specify a stride which
> is conceptually the same as linear, if you imagine linear to be 1x1
> tiled.
>
> Most tiling users accept any integer units of tiles (buffer width
> aligned to tile width), but not always. The NV12MT format used by
> Samsung media decoders (also shipped in some Qualcomm SoCs) is
> particularly special, IIRC requiring height to be aligned to a
> multiple of two tiles.


Fair enough, but I think you need to distinguish between the chosen 
stride and stride *requirements*. I do think it makes sense to consider 
the stride requirement as part of the format/layout description, but 
more below.




>>> Given that, I'm fairly inclined to punt those until we have the grand
>>> glorious allocator, rather than trying to add it to EGL/GBM
>>> separately. The modifiers stuff was a fairly obvious augmentation -
>>> EGL already had no-modifier format import but no query as to which
>>> formats it would accept, and modifiers are a logical extension of
>>> format - but adding the other restrictions is a bigger step forward.
>>
>> Perhaps a third option would be to encode the required pitch_in_bytes
>> alignment as part of the modifier?
>>
>> So DRI3GetSupportedModifiers would return DRM_FORMAT_MOD_LINEAR_256B when
>> the display GPU is a Raven Ridge.
>>
>> More generally, we could say that fourcc_mod_code(NONE, k) means that the
>> pitch_in_bytes has to be a multiple of 2**k for e.g. k <= 31. Or if you
>> prefer, we could have a stride requirement in elements or pixels instead of
>> bytes.
>
> I've been thinking this over for the past couple of days, and we could
> make it work in clients. But we already have DRM_FORMAT_MOD_LINEAR
> with a well-defined meaning, implemented in quite a few places. I
> think special-casing linear to encode stride alignment is something
> we'd regret in the long run, especially given that some hardware
> _does_ have more strict alignment requirements for tiled modes than
> just an integer number of tiles allocated.
>
> AFAICT, both AMD and NVIDIA are both going to use a fair bit of the
> tiling enum space to encode tile size as well as layout. If allocation
> alignment requirements (in both dimensions) needed to be added to
> that, it's entirely likely that there wouldn't be enough space and
> you'd need to put it somewhere else than the modifier, in which case
> we've not even really solved the problem.


At least for AMD, the alignment requirements are de facto part of the 
tiling description, so I don't think it makes a difference.




> It definitely seems attractive to kill two birds with one stone, but
> I'd really much rather not conflate format description/advertisement,
> and allocation restriction, into one enum. I'm still on the side of
> saying that this is a problem modifiers do not solve,

Re: [PATCH dri3proto v2] Add modifier/multi-plane requests, bump to v1.1

2017-07-28 Thread Michel Dänzer
On 28/07/17 04:14 PM, Daniel Stone wrote:
> On 26 July 2017 at 07:21, Michel Dänzer  wrote:
>> On 26/07/17 10:48 AM, Michel Dänzer wrote:
>>> On 26/07/17 06:20 AM, Eric Anholt wrote:
>>>> Daniel Stone  writes:
>>>>> +   The exact configuration of the buffer is specified by 'format',
>>>>> +   a DRM FourCC format token as defined in that project's
>>>>> +   drm_fourcc.h header, in combination with the modifier.
>>>>
>>>> Should we be specifying how the depth of the Pixmap is determined from
>>>> the fourcc?  Should we be specifying if X11 rendering works on various
>>>> fourccs, and between pixmaps of different fourccs?  It's not clear to me
>>>> what glamor would need to be able to do with these pixmaps (can I
>>>> CopyArea between XRGB8888 and BGRX?  What does that even mean?)
>>>
>>> X11 pixmaps provide storage for n bits per pixel, where n is the pixmap
>>> depth. CopyArea between pixmaps just copies the bits in the source to
>>> the destination verbatim. Note that this is only possible if both
>>> pixmaps have the same depth.
>>
>> This raises a question about multi-plane (e.g. YUV) formats, where
>> different planes may have different resolutions. Is this functionality
>> intended to be used for such formats? If so, how are X11 drawing
>> operations to/from pixmaps created from such formats envisioned to work?
> 
> Honestly, I wasn't planning on attacking YUV right now, especially
> render-to-YUV. Not least since that brings the complexity of wide vs.
> narrow range and selection of primaries into play. I wouldn't expect
> any server to advertise that as a supported format; one which would
> had better have that figured out ...
> 
> Should we perhaps insert text specifically enforcing only RGB formats?

Probably, assuming there should be any reference to formats at all.


BTW, what's the purpose of supporting four buffers per pixmap, if not
for multi-plane formats?


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

Re: [PATCH dri3proto v2] Add modifier/multi-plane requests, bump to v1.1

2017-07-28 Thread Michel Dänzer
On 28/07/17 05:03 PM, Daniel Stone wrote:
> On 28 July 2017 at 01:11, Keith Packard  wrote:
>> Eric Anholt  writes:
>>> I think one option would be to have this extension create pixmaps with a
>>> depth equal to the highest populated bit of the fourcc plus one.  Sure,
>>> it's weird that rgbx and xrgb8888 have a different depth, but "depth"
>>> is a pretty awful concept at this point.
>>
>> It's just supposed to express which bits in the pixel contribute to
>> generating the resulting color; bits outside of depth 'don't matter',
>> and may not even be retained. Of course, for fourcc values which spread
>> 'pixel' data across multiple storage units.
>>
>> Sorry for not reading the whole proposal up front; I've been out
>> crabbing in the San Juans for a week and trying to catch up on email in
>> the last few days...
> 
> Really can't fault your choices there.
> 
>> When I was doing Present, what I figured was that we'd define new kinds
>> of read-only picture which had image storage data associated with it
>> that could be in a non-pixel format -- various fourcc formats using
>> multiple buffers, jpeg, png or whatever. You could use those with Render
>> or Present to get data into RGB format or onto the screen. Trying to
>> mash them into 'pixmaps' makes little sense.
>>
>> In this case, I'd imagine we'd add fourcc pictures, and a new
>> DRI3PictureFromBuffers request. Adding a PresentPicture request would be
>> a nice bonus feature to make sure the copy was synchronized with vblank.
> 
> Hm. I think I prefer Eric's suggestion, of just declaring this to be
> undefined behaviour.

Declaring where? Once a pixmap is created, it only has a depth, no
format, so there's nothing on which to base a declaration that e.g.
CopyArea between two pixmaps of the same depth is undefined.


> You're right that it makes little to no sense to mix the two, but I'm
> not sure what practical gain we get from expressing things as
> Pictures. Ultimately, the server still deals in Window Pixmaps and
> Screen Pixmaps for backing storage, and the compositor interface is
> NameWindowPixmap. Your suggestion of another type seems nicer, but it
> doesn't look like we can avoid Pixmaps as the lowest common
> denominator anyway.
> 
> By implementation if not spec, DRI3 v1.0 enforces that depth 16 is
> RGB565, depth 24 is XRGB, and depth 32 is ARGB.

No, it doesn't. How the bits stored in a pixmap are interpreted is
outside of the scope of DRI3 1.0.


> Or at least in Mesa: the server only supports the latter two with
> glamor_egl. It seems like this has served well enough for long enough,
> so equally we could just eliminate the FourCC from the protocol, stick
> with the fixed depth <-> base format mapping, and encode that in the
> protocol text as well.
> 
> That would much more clearly match the intention of the spec
> additions, which was to allow non-linearly-addressed (Intel X/Y*,
> Vivante tiled/supertiled, VC4 T-tiled, the infinite AMD/NV tiling
> modes) buffers, as well as losslessly-compressed buffers (Intel CCS,
> whatever Tegra calls its mode which similarly has an auxiliary buffer
> to interpret the tiled primary colour buffer, ARM AFBC which is just
> an inline block format). Allowing YUV buffers, attaching colourspace
> information (e.g. converting between primaries/EOTF), replacing the
> use of Pixmaps with another buffer type (Readable as a counterpart to
> Drawable ... ?), would all be objectively good things to have, but
> equally I don't think trying to fix them in a protocol which is really
> just a better version of PutImage for Pixmaps is the best thing.
> 
> tl;dr: we could replace FourCC with depth as the format determinant to
> better match DRI3 v1.0, or just declare that doing anything with those
> Pixmaps other than displaying them to a Window with a compatible
> Visual results in undefined behaviour

I'm getting less and less convinced that any reference at all to formats
in the context of pixmaps in the DRI3 specification is a good idea.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

Re: [PATCH dri3proto v2] Add modifier/multi-plane requests, bump to v1.1

2017-07-28 Thread Daniel Stone
Hi,

On 28 July 2017 at 01:11, Keith Packard  wrote:
> Eric Anholt  writes:
>> I think one option would be to have this extension create pixmaps with a
>> depth equal to the highest populated bit of the fourcc plus one.  Sure,
>> it's weird that rgbx and xrgb8888 have a different depth, but "depth"
>> is a pretty awful concept at this point.
>
> It's just supposed to express which bits in the pixel contribute to
> generating the resulting color; bits outside of depth 'don't matter',
> and may not even be retained. Of course, for fourcc values which spread
> 'pixel' data across multiple storage units.
>
> Sorry for not reading the whole proposal up front; I've been out
> crabbing in the San Juans for a week and trying to catch up on email in
> the last few days...

Really can't fault your choices there.

> When I was doing Present, what I figured was that we'd define new kinds
> of read-only picture which had image storage data associated with it
> that could be in a non-pixel format -- various fourcc formats using
> multiple buffers, jpeg, png or whatever. You could use those with Render
> or Present to get data into RGB format or onto the screen. Trying to
> mash them into 'pixmaps' makes little sense.
>
> In this case, I'd imagine we'd add fourcc pictures, and a new
> DRI3PictureFromBuffers request. Adding a PresentPicture request would be
> a nice bonus feature to make sure the copy was synchronized with vblank.

Hm. I think I prefer Eric's suggestion, of just declaring this to be
undefined behaviour.

You're right that it makes little to no sense to mix the two, but I'm
not sure what practical gain we get from expressing things as
Pictures. Ultimately, the server still deals in Window Pixmaps and
Screen Pixmaps for backing storage, and the compositor interface is
NameWindowPixmap. Your suggestion of another type seems nicer, but it
doesn't look like we can avoid Pixmaps as the lowest common
denominator anyway.

By implementation if not spec, DRI3 v1.0 enforces that depth 16 is
RGB565, depth 24 is XRGB, and depth 32 is ARGB. Or at least in
Mesa: the server only supports the latter two with glamor_egl. It
seems like this has served well enough for long enough, so equally we
could just eliminate the FourCC from the protocol, stick with the
fixed depth <-> base format mapping, and encode that in the protocol
text as well.

That would much more clearly match the intention of the spec
additions, which was to allow non-linearly-addressed (Intel X/Y*,
Vivante tiled/supertiled, VC4 T-tiled, the infinite AMD/NV tiling
modes) buffers, as well as losslessly-compressed buffers (Intel CCS,
whatever Tegra calls its mode which similarly has an auxiliary buffer
to interpret the tiled primary colour buffer, ARM AFBC which is just
an inline block format). Allowing YUV buffers, attaching colourspace
information (e.g. converting between primaries/EOTF), replacing the
use of Pixmaps with another buffer type (Readable as a counterpart to
Drawable ... ?), would all be objectively good things to have, but
equally I don't think trying to fix them in a protocol which is really
just a better version of PutImage for Pixmaps is the best thing.

tl;dr: we could replace FourCC with depth as the format determinant to
better match DRI3 v1.0, or just declare that doing anything with those
Pixmaps other than displaying them to a Window with a compatible
Visual results in undefined behaviour

Cheers,
Daniel

Re: [Mesa-dev] [PATCH dri3proto v2] Add modifier/multi-plane requests, bump to v1.1

2017-07-28 Thread Daniel Stone
Hi Nicolai,
Trying to tackle the stride subthread in one go ...

On 25 July 2017 at 09:28, Nicolai Hähnle  wrote:
> On 22.07.2017 14:00, Daniel Stone wrote:
>> On 21 July 2017 at 18:32, Michel Dänzer  wrote:
>>> We just ran into an issue which might mean that there's still something
>>> missing in this v2 proposal:
>>>
>>> The context is DRI3 PRIME render offloading of glxgears (not useful in
>>> practice, but bear with me). The display GPU is Raven Ridge, which
>>> requires
>>> that the stride even of linear textures is a multiple of 256 bytes. The
>>> renderer GPU is Polaris12, which still supports smaller alignment of the
>>> stride. With the glxgears window of width 300, the renderer GPU driver
>>> chooses a stride of 304 (* 4 / 256 = 4.75), whereas the display GPU would
>>> require 320 (* 4 / 256 = 5). This cannot work.
>>
>>
>> The obvious answer is just to increase padding on external/winsys
>> surfaces? Increasing it for all allocations would probably be a
>> non-starter, but winsys surfaces are rare enough that you could
>> probably afford to take the hit, I guess.
>>
>>> I see two basic approaches to solve this:
>>> [...]
>>> Maybe there are other possible approaches I'm missing? Other comments?
>>
>> I don't have any great solution off the top of my head, but I'd be
>> inclined to bundle stride in with placement. TTBOMK (from having
>> looked at radv), buffers shared cross-GPU also need to be allocated
>> from a separate externally-visible memory heap. And at the moment,
>> lacking placement information at allocation time (at least for EGL
>> allocations, via DRIImage), that happens via transparent migration at
>> import time I think. Placement restrictions would probably also
>> involve communicating base address alignment requirements.
>
> Stride isn't really in the same category as placement and base address
> alignment, though.
>
> Placement and base address alignment requirements can apply to all possible
> texture layouts, while the concept of stride is specific to linear layouts.

No, I don't think it is. Tiled layouts still have a stride: if you
look at i915 X/Y/Yf/Y_CCS/Yf_CCS (the latter two containing an
auxiliary compression/fast-clear buffer), iMX/etnaviv
tiled/supertiled, or VC4 T-tiled modifiers and how they're handled
both for DRIImage and KMS interchange, they all specify a stride which
is conceptually the same as linear, if you imagine linear to be 1x1
tiled.

Most tiling users accept any integer units of tiles (buffer width
aligned to tile width), but not always. The NV12MT format used by
Samsung media decoders (also shipped in some Qualcomm SoCs) is
particularly special, IIRC requiring height to be aligned to a
multiple of two tiles.
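
To make 'linear is 1x1 tiled' concrete, here's a sketch with a made-up
tile_w parameter (not something you can query from a modifier today):

#include <stdint.h>

static uint32_t
align_up(uint32_t v, uint32_t a)
{
    return (v + a - 1) / a * a;
}

/* Stride is the byte distance between rows of tiles, so the same
 * formula covers linear and tiled layouts alike. */
static uint32_t
stride_bytes(uint32_t width_px, uint32_t cpp, uint32_t tile_w_px)
{
    return align_up(width_px, tile_w_px) * cpp;
}

/* Linear is just the tile_w_px == 1 case: stride_bytes(w, cpp, 1) is
 * w * cpp, before any extra alignment the allocator decides to add. */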

>> Given that, I'm fairly inclined to punt those until we have the grand
>> glorious allocator, rather than trying to add it to EGL/GBM
>> separately. The modifiers stuff was a fairly obvious augmentation -
>> EGL already had no-modifier format import but no query as to which
>> formats it would accept, and modifiers are a logical extension of
>> format - but adding the other restrictions is a bigger step forward.
>
> Perhaps a third option would be to encode the required pitch_in_bytes
> alignment as part of the modifier?
>
> So DRI3GetSupportedModifiers would return DRM_FORMAT_MOD_LINEAR_256B when
> the display GPU is a Raven Ridge.
>
> More generally, we could say that fourcc_mod_code(NONE, k) means that the
> pitch_in_bytes has to be a multiple of 2**k for e.g. k <= 31. Or if you
> prefer, we could have a stride requirement in elements or pixels instead of
> bytes.

I've been thinking this over for the past couple of days, and we could
make it work in clients. But we already have DRM_FORMAT_MOD_LINEAR
with a well-defined meaning, implemented in quite a few places. I
think special-casing linear to encode stride alignment is something
we'd regret in the long run, especially given that some hardware
_does_ have more strict alignment requirements for tiled modes than
just an integer number of tiles allocated.

AFAICT, both AMD and NVIDIA are both going to use a fair bit of the
tiling enum space to encode tile size as well as layout. If allocation
alignment requirements (in both dimensions) needed to be added to
that, it's entirely likely that there wouldn't be enough space and
you'd need to put it somewhere else than the modifier, in which case
we've not even really solved the problem.

It definitely seems attractive to kill two birds with one stone, but
I'd really much rather not conflate format description/advertisement,
and allocation restriction, into one enum. I'm still on the side of
saying that this is a problem modifiers do not solve, deferring to the
allocator we need anyway in order to determine things like placement
and global optimality (e.g. rotated scanout placing further
restrictions on allocation).

Cheers,
Daniel

Re: [PATCH xserver] composite: Make compIsAlternateVisual safe even if Composite is off

2017-07-28 Thread walter harms


Am 27.07.2017 22:02, schrieb Adam Jackson:
> As of ea483af9 we're calling this unconditionally from the GLX code so
> the synthetic visual is in a lower select group. If Composite has been
> disabled then GetCompScreen() will return NULL, and this would crash.
> 
> Rather than force the caller to check first, just always return FALSE if
> Composite is disabled (which is correct, since none of the visuals will
> be synthetic in that case).
> 
> Signed-off-by: Adam Jackson 
> ---
>  composite/compwindow.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/composite/compwindow.c b/composite/compwindow.c
> index 367f23eb7..f88238146 100644
> --- a/composite/compwindow.c
> +++ b/composite/compwindow.c
> @@ -326,7 +326,7 @@ compIsAlternateVisual(ScreenPtr pScreen, XID visual)
>  CompScreenPtr cs = GetCompScreen(pScreen);
>  int i;
>  
> -for (i = 0; i < cs->numAlternateVisuals; i++)
> +for (i = 0; cs && i < cs->numAlternateVisuals; i++)
>  if (cs->alternateVisuals[i] == visual)
>  return TRUE;
>  return FALSE;

IMHO this would be more readable:

cs = GetCompScreen(pScreen);
if (!cs)
    return FALSE;

for (...) ...


just my 2 cents ...

re,
 wh

Re: [PATCH dri3proto v2] Add modifier/multi-plane requests, bump to v1.1

2017-07-28 Thread Daniel Stone
Hi Michel,

On 26 July 2017 at 07:21, Michel Dänzer  wrote:
> On 26/07/17 10:48 AM, Michel Dänzer wrote:
>> On 26/07/17 06:20 AM, Eric Anholt wrote:
>>> Daniel Stone  writes:
>>>> +   The exact configuration of the buffer is specified by 'format',
>>>> +   a DRM FourCC format token as defined in that project's
>>>> +   drm_fourcc.h header, in combination with the modifier.
>>>
>>> Should we be specifying how the depth of the Pixmap is determined from
>>> the fourcc?  Should we be specifying if X11 rendering works on various
>>> fourccs, and between pixmaps of different fourccs?  It's not clear to me
>>> what glamor would need to be able to do with these pixmaps (can I
>>> CopyArea between XRGB8888 and BGRX?  What does that even mean?)
>>
>> X11 pixmaps provide storage for n bits per pixel, where n is the pixmap
>> depth. CopyArea between pixmaps just copies the bits in the source to
>> the destination verbatim. Note that this is only possible if both
>> pixmaps have the same depth.
>
> This raises a question about multi-plane (e.g. YUV) formats, where
> different planes may have different resolutions. Is this functionality
> intended to be used for such formats? If so, how are X11 drawing
> operations to/from pixmaps created from such formats envisioned to work?

Honestly, I wasn't planning on attacking YUV right now, especially
render-to-YUV. Not least since that brings the complexity of wide vs.
narrow range and selection of primaries into play. I wouldn't expect
any server to advertise that as a supported format; one which would
had better have that figured out ...

Should we perhaps insert text specifically enforcing only RGB formats?

Cheers,
Daniel

Re: [PATCH dri3proto v2] Add modifier/multi-plane requests, bump to v1.1

2017-07-28 Thread Daniel Stone
Hi,
Sorry, I've been sick the past couple of days - exactly when this
thread exploded ...

On 25 July 2017 at 22:20, Eric Anholt  wrote:
> Daniel Stone  writes:
>> DRI3 version 1.1 adds support for explicit format modifiers, including
>> multi-planar buffers.
>
> I still want proper 64-bit values, and I don't think the little XSync
> mess will be much of a blocker.

Cool, I'll happily review the CARD64 bits and flick the switch on the
protocol when those land.

>> +┌───
>> +DRI3GetSupportedModifiers
>> + window: WINDOW
>> + format: CARD32
>> +  ▶
>> + num_modifiers: CARD32
>> + modifiers: ListOfCARD32
>> +└───
>> + Errors: Window, Match
>> +
>> + For the Screen associated with 'window', return a list of
>> + supported DRM FourCC modifiers, as defined in drm_fourcc.h,
>> + supported as formats for DRI3 pixmap/buffer interchange.
>> + Each modifier is returned as a CARD32
>> + containing the most significant 32 bits, followed by a
>> + CARD32 containing the least significant 32 bits. The hi/lo
>> + pattern repeats 'num_modifiers' times, thus there are
>> + '2 * num_modifiers' CARD32 elements returned.
>
> Should any meaning be assumed from the ordering of modifiers?

Nope, arbitrary order. In practice, the client does its own sort
across the list anyway, selecting for local optimality.
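
Reassembling the hi/lo pairs on the client side is then just (sketch,
not Mesa's actual code):

#include <stdint.h>

static void
unpack_modifiers(const uint32_t *words, uint32_t num_modifiers, uint64_t *out)
{
    for (uint32_t i = 0; i < num_modifiers; i++) {
        uint64_t hi = words[2 * i];         /* most significant 32 bits  */
        uint64_t lo = words[2 * i + 1];     /* least significant 32 bits */
        out[i] = (hi << 32) | lo;
    }
}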

>> + Precisely how any additional information about the buffer is
>> + shared is outside the scope of this extension.
>
> Should we be specifying how the depth of the Pixmap is determined from
> the fourcc?  Should we be specifying if X11 rendering works on various
> fourccs, and between pixmaps of different fourccs?  It's not clear to me
> what glamor would need to be able to do with these pixmaps (can I
> CopyArea between XRGB8888 and BGRX?  What does that even mean?)

I'll come back to that in the subthread.

>> +┌───
>> +DRI3FenceFromDMAFenceFD
>> + drawable: DRAWABLE
>> + fence: FENCE
>> + fd: FD
>> +└───
>> + Errors: IDChoice, Drawable
>> +
>> + Creates a Sync extension Fence that provides the regular Sync
>> + extension semantics. The Fence will begin untriggered, and
>> + become triggered when the underlying dma-fence FD signals.
>> + The resulting Sync Fence is a one-shot, and may not be
>> + manually triggered, reset, or reused until it is destroyed.
>> + Details about the mechanism used with this file descriptor are
>> + outside the scope of the DRI3 extension.
>
> I was surprised to find this lumped in with a commit about
> multi-planar/buffer support -- is it actually related, and is it used?

Related, no. Used, not right now, but there'll be patches out to
implement explicit fencing for Vulkan clients next week. It's only
lumped in to save doing two version bumps at the exact same time.

> Must an implementation supporting 1.1 support this?  dma-fences seem
> like a pretty recent kernel feature.

You're right, a capability query would be better here.

>> +┌───
>> +DRI3DMAFenceFDFromFence
>> + drawable: DRAWABLE
>> + fence: FENCE
>> +  ▶
>> + fd: FD
>> +└───
>> + Errors: IDChoice, Drawable, Match
>> +
>> + Given a Sync extension Fence originally created by the
>> + DRI3FenceFromDMAFenceFD request, return the underlying
>> + dma-fence FD to the client. Details about the mechanism used
>> + with this file descriptor are outside the scope of the DRI3
>> + extension. 'drawable' must be associated with a direct
>> + rendering device that 'fence' can work with, otherwise a Match
>> + error results. NB: it is quite likely this will be forever
>> + unused, and may be removed in later revisions.
>> +
>
> Let's not introduce protocol if we can't come up with a use for it.

Happily, it is actually used, after a bit of back-and-forth on the
implementation. lfrb is at SIGGRAPH this week, but he has some
branches which work but are in need of cleanup here:
https://git.collabora.com/cgit/user/lfrb/xserver.git/log/?h=x11-fences
https://git.collabora.com/cgit/user/lfrb/mesa.git/log/?h=wip/2017-07/vulkan-fences

The idea is that when a VkSemaphore is passed into vkQueuePresentKHR,
the implementation extracts a dma-fence from the semaphore, creates an
X11 fence directly wrapping that, and passes that in as the wait_fence
to PresentPixmap. The server then inserts a hardware-side wait (either
EGL_ANDROID_native_fence_fd + eglWaitSyncKHR for Glamor, or
IN_FENCE_FD when directly flipping with KMS). On the converse side,
out-fences are implemented by creating an 'empty' DMAFence object,
passing that as the idle_fence to PresentPixmap, calling
DMAFenceFDFromFence when the PresentIdlePixmap event comes through,
then wrapping that into a VkSemaphore/VkFence returned via
vkAcquireNextImageKHR.

We'll do some cleanup across the branch - and this protocol text -
before sending it out though.

Cheers,
Daniel

Re: [RFC xserver 05/16] DRI3/Glamor: Implement PixmapFromBuffers request

2017-07-28 Thread Daniel Stone
Hi,

On 28 July 2017 at 00:03, Eric Anholt  wrote:
> Daniel Stone  writes:
>> +{ .format = DRM_FORMAT_ARGB2101010, .depth = 30, .bpp = 32 },
>> +{ .format = DRM_FORMAT_ABGR2101010, .depth = 30, .bpp = 32 },
>> +{ .format = DRM_FORMAT_RGBA1010102, .depth = 30, .bpp = 32 },
>> +{ .format = DRM_FORMAT_BGRA1010102, .depth = 30, .bpp = 32 },
>
> Regardless of the resolution of the core rendering/depth/etc discussion
> elsewhere, these 4 would be depth 32.

Yeah, obviously correct. Thanks for the detailed look. :)

Cheers,
Daniel