Re: [PATCH 2/7] drm/uAPI: Add "active color format" drm property as feedback for userspace
Hi, On Wed, 10 Jan 2024 at 10:44, Daniel Vetter wrote: > On Tue, Jan 09, 2024 at 11:12:11PM +, Andri Yngvason wrote: > > ţri., 9. jan. 2024 kl. 22:32 skrifađi Daniel Stone : > > > How does userspace determine what's happened without polling? Will it > > > only change after an `ALLOW_MODESET` commit, and be guaranteed to be > > > updated after the commit has completed and the event being sent? > > > Should it send a HOTPLUG event? Other? > > > > > > > Userspace does not determine what's happened without polling. The purpose > > of this property is not for programmatic verification that the preferred > > property was applied. It is my understanding that it's mostly intended for > > debugging purposes. It should only change as a consequence of modesetting, > > although I didn't actually look into what happens if you set the "preferred > > color format" outside of a modeset. > > This feels a bit irky to me, since we don't have any synchronization and > it kinda breaks how userspace gets to know about stuff. > > For context the current immutable properties are all stuff that's derived > from the sink (like edid, or things like that). Userspace is guaranteed to > get a hotplug event (minus driver bugs as usual) if any of these change, > and we've added infrastructure so that the hotplug event even contains the > specific property so that userspace can avoid re-read (which can cause > some costly re-probing) them all. Right. > [...] > > This thing here works entirely differently, and I think we need somewhat > new semantics for this: > > - I agree it should be read-only for userspace, so immutable sounds right. > > - But I also agree with Daniel Stone that this should be tied more > directly to the modeset state. > > So I think the better approach would be to put the output type into > drm_connector_state, require that drivers compute it in their > ->atomic_check code (which in the future would allow us to report it out > for TEST_ONLY commits too), and so guarantee that the value is updated > right after the kms ioctl returns (and not somewhen later for non-blocking > commits). That's exactly the point. Whether userspace gets an explicit notification or it has to 'know' when to read isn't much of an issue - just as long as it's well defined. I think the suggestion of 'do it in atomic_check and then it's guaranteed to be readable when the commit completes' is a good one. I do still have some reservations - for instance, why do we have the fallback to auto when userspace has explicitly requested a certain type? - but they may have been covered previously. Cheers, Daniel
Re: [PATCH 2/7] drm/uAPI: Add "active color format" drm property as feedback for userspace
Hi, On Tue, 9 Jan 2024 at 18:12, Andri Yngvason wrote: > + * active color format: > + * This read-only property tells userspace the color format actually used > + * by the hardware display engine "on the cable" on a connector. The > chosen > + * value depends on hardware capabilities, both display engine and > + * connected monitor. Drivers shall use > + * drm_connector_attach_active_color_format_property() to install this > + * property. Possible values are "not applicable", "rgb", "ycbcr444", > + * "ycbcr422", and "ycbcr420". How does userspace determine what's happened without polling? Will it only change after an `ALLOW_MODESET` commit, and be guaranteed to be updated after the commit has completed and the event being sent? Should it send a HOTPLUG event? Other? Cheers, Daniel
Re: [PATCH 3/3] drm/connector: Deprecate split for BT.2020 in drm_colorspace enum
On Wed, 15 Feb 2023 at 20:54, Harry Wentland wrote: > On 2/15/23 06:46, Daniel Stone wrote: > > On Tue, 14 Feb 2023 at 16:57, Harry Wentland wrote: > >> On 2/14/23 10:49, Sebastian Wick wrote: > >> From what I've seen recently I am inclined to favor an incremental > >> approach more. The reason is that any API, or portion thereof, is > >> useless unless it's enabled full stack. When it isn't it becomes > >> dead code quickly, or never really works because we overlooked > >> one thing. The colorspace debacle shows how even something as > >> simple as extra enum values in KMS APIs shouldn't be added unless > >> someone in a canonical upstream project actually uses them. I > >> would argue that such a canonical upstream project actually has > >> to be a production environment and not something like Weston. > > > > Just to chime in as well that it is a real production environment; > > it's probably actually shipped the most of any compositor by a long > > way. It doesn't have much place on the desktop, but it does live in > > planes, trains, automobiles, digital signage, kiosks, STBs/TVs, and > > about a billion other places you might not have expected. > > > > Understood. > > Curious if there's a list of some concrete examples. If I was allowed to name them, I'd definitely be doing a much better job of promoting it ... but if you've bought a car in the last 7-8 years, it's much more likely than not that its console display is using Weston. Probably about 50% odds that you've flown on a plane whose IFE is driven by Weston. You've definitely walked past a lot of digital signage advertisements and display walls which are driven by Weston. There are a huge number of consumer products (and other modes of transport, would you believe?) that are too, but I can't name them because it gets too specific. The cars are probably using a 10+ year old (and frankly awful) SoC. The display walls are probably using a 6ish-year-old SoC with notoriously poor memory bandwidth. Or TVs trying to make 4K UIs fly on an ancient (pre-unified-shader) GPU. The hits go on. We do ship things on nice and capable new hardware as well, but keeping old hardware working with new software stacks is non-negotiable for us, and we have to bend over backwards to make that happen. > >> We should look at this from a use-case angle, similar to what > >> the gamescope guys are doing. Small steps, like: > >> 1) Add HDR10 output (PQ, BT.2020) to the display > >> 2) Add ability to do sRGB linear blending > >> 3) Add ability to do sRGB and PQ linear blending > >> 4) Post-blending 3D LUT > >> 5) Pre-blending 3D LUT > >> > >> At each stage the whole stack needs to work together in production. > > > > Personally, I do think at this stage we probably have enough of an > > understanding to be able to work with an intermediate solution. We > > just need to think hard about what that intermediate solution is - > > making sure that we don't end up in the same tangle of impossible > > semantics like the old 'broadcast RGB' / colorspace / HDR properties > > which were never thought through - so that it is something we can > > build on rather than something we have to work around. But it would be > > really good to make HDR10/HDR10+ media and HDR games work on HDR > > displays, yeah. > > I have a feeling we'll make some progress here this year. I definitely > think the whole HDR/Colour work is on the right track in Weston and > Wayland which will hopefully give us a good base to work with over > many years. Yep! Coming to the point you were making in the other mail - Weston was traditionally used as _the_ enablement vehicle for KMS, because we cared about using the depth of hardware much more than anyone else (e.g. being years ahead on planes), and the vendor who wanted to enable it either wanted to enable Weston specifically or just didn't have an open userspace stack for it. The other compositors couldn't be that vehicle, either because they were more focused on desktop UI, or they could just afford to throw the GPU at it and suck up the occasional frame hitch / thermal burn / etc. I like to think we had a reputation for being pretty thoughtful and careful with our review as well, and didn't give it lightly to misguided ideas which caused long-term problems. But we've got a greater diversity in userspace these days, and that's no bad thing. If the best vehicle to demonstrate HDR GPU rendering is gamescope, then use gamescope as that vehicle. We'll be there if we can, and if it makes sense for us, but it's not a requirement. Cheers, Daniel
Re: [PATCH 3/3] drm/connector: Deprecate split for BT.2020 in drm_colorspace enum
Hi, On Tue, 14 Feb 2023 at 16:57, Harry Wentland wrote: > On 2/14/23 10:49, Sebastian Wick wrote: > From what I've seen recently I am inclined to favor an incremental > approach more. The reason is that any API, or portion thereof, is > useless unless it's enabled full stack. When it isn't it becomes > dead code quickly, or never really works because we overlooked > one thing. The colorspace debacle shows how even something as > simple as extra enum values in KMS APIs shouldn't be added unless > someone in a canonical upstream project actually uses them. I > would argue that such a canonical upstream project actually has > to be a production environment and not something like Weston. Just to chime in as well that it is a real production environment; it's probably actually shipped the most of any compositor by a long way. It doesn't have much place on the desktop, but it does live in planes, trains, automobiles, digital signage, kiosks, STBs/TVs, and about a billion other places you might not have expected. Probably the main factor that joins all these together - apart from not having much desktop-style click-and-drag reconfigurable UI - is that we need to use the hardware pipeline as efficiently as possible, because either we don't have the memory bandwidth to burn like desktops, or we need to minimise it for power/thermal reasons. Given that, we don't really want to paint ourselves into a corner with incremental solutions that mean we can't do fully efficient things later. We're also somewhat undermanned, and we've been using our effort to try to make sure that the full solution - including full colour-managed pathways for things like movie and TV post-prod composition, design, etc - is possible at some point through the full Wayland ecosystem at some point. The X11 experience was so horribly botched that it wasn't really possible without a complete professional setup, and that's something I personally don't want to see. However ... > I could see us getting to a fully new color pipeline API but > the only way to do that is with a development model that supports > it. While upstream needs to be our ultimate goal, a good way > to bring in new APIs and ensure a full-stack implementation is > to develop them in a downstream production kernel, alongside > userspace that makes use of it. Once the implementation is > proven in the downstream repos it can then go upstream. This > brings new challenges, though, as things don't get wide > testing and get out of sync with upstream quickly. The > alternative is the incremental approach. > > We should look at this from a use-case angle, similar to what > the gamescope guys are doing. Small steps, like: > 1) Add HDR10 output (PQ, BT.2020) to the display > 2) Add ability to do sRGB linear blending > 3) Add ability to do sRGB and PQ linear blending > 4) Post-blending 3D LUT > 5) Pre-blending 3D LUT > > At each stage the whole stack needs to work together in production. Personally, I do think at this stage we probably have enough of an understanding to be able to work with an intermediate solution. We just need to think hard about what that intermediate solution is - making sure that we don't end up in the same tangle of impossible semantics like the old 'broadcast RGB' / colorspace / HDR properties which were never thought through - so that it is something we can build on rather than something we have to work around. But it would be really good to make HDR10/HDR10+ media and HDR games work on HDR displays, yeah. Cheers, Daniel
Re: [PATCHv4] drm/amdgpu: disable ASPM on Intel Alder Lake based systems
On Sun, 1 May 2022 at 08:08, Paul Menzel wrote: > Am 26.04.22 um 15:53 schrieb Gong, Richard: > > I think so. We captured dmesg log. > > Then the (whole) system did *not* freeze, if you could still log in > (maybe over network) and execute `dmesg`. Please also paste the > amdgpu(?) error logs in the commit message. > > > As mentioned early we need support from Intel on how to get ASPM working > > for VI generation on Intel Alder Lake, but we don't know where things > > currently stand. > > Who is working on this, and knows? This has gone beyond the point of a reasonable request. The amount of detail you're demanding is completely unnecessary.
Re: Commit messages (was: [PATCH v11] drm/amdgpu: add drm buddy support to amdgpu)
On Wed, 23 Mar 2022 at 15:14, Alex Deucher wrote: > On Wed, Mar 23, 2022 at 11:04 AM Daniel Stone wrote: > > That's not what anyone's saying here ... > > > > No-one's demanding AMD publish RTL, or internal design docs, or > > hardware specs, or URLs to JIRA tickets no-one can access. > > > > This is a large and invasive commit with pretty big ramifications; > > containing exactly two lines of commit message, one of which just > > duplicates the subject. > > > > It cannot be the case that it's completely impossible to provide any > > justification, background, or details, about this commit being made. > > Unless, of course, it's to fix a non-public security issue, that is > > reasonable justification for eliding some of the details. But then > > again, 'huge change which is very deliberately opaque' is a really > > good way to draw a lot of attention to the commit, and it would be > > better to provide more detail about the change to help it slip under > > the radar. > > > > If dri-devel@ isn't allowed to inquire about patches which are posted, > > then CCing the list is just a façade; might as well just do it all > > internally and periodically dump out pull requests. > > I think we are in agreement. I think the withheld information > Christian was referring to was on another thread with Christian and > Paul discussing a workaround for a hardware bug: > https://www.spinics.net/lists/amd-gfx/msg75908.html Right, that definitely seems like some crossed wires. I don't see anything wrong with that commit at all: the commit message and a comment notes that there is a hardware issue preventing Raven from being able to do TMZ+GTT, and the code does the very straightforward and obvious thing to ensure that on VCN 1.0, any TMZ buffer must be VRAM-placed. This one, on the other hand, is much less clear ... Cheers, Daniel
Re: Commit messages (was: [PATCH v11] drm/amdgpu: add drm buddy support to amdgpu)
Hi Alex, On Wed, 23 Mar 2022 at 14:42, Alex Deucher wrote: > On Wed, Mar 23, 2022 at 10:00 AM Daniel Stone wrote: > > On Wed, 23 Mar 2022 at 08:19, Christian König > > wrote: > > > Well the key point is it's not about you to judge that. > > > > > > If you want to complain about the commit message then come to me with > > > that and don't request information which isn't supposed to be publicly > > > available. > > > > > > So to make it clear: The information is intentionally hold back and not > > > made public. > > > > In that case, the code isn't suitable to be merged into upstream > > trees; it can be resubmitted when it can be explained. > > So you are saying we need to publish the problematic RTL to be able to > fix a HW bug in the kernel? That seems a little unreasonable. Also, > links to internal documents or bug trackers don't provide much value > to the community since they can't access them. In general, adding > internal documents to commit messages is frowned on. That's not what anyone's saying here ... No-one's demanding AMD publish RTL, or internal design docs, or hardware specs, or URLs to JIRA tickets no-one can access. This is a large and invasive commit with pretty big ramifications; containing exactly two lines of commit message, one of which just duplicates the subject. It cannot be the case that it's completely impossible to provide any justification, background, or details, about this commit being made. Unless, of course, it's to fix a non-public security issue, that is reasonable justification for eliding some of the details. But then again, 'huge change which is very deliberately opaque' is a really good way to draw a lot of attention to the commit, and it would be better to provide more detail about the change to help it slip under the radar. If dri-devel@ isn't allowed to inquire about patches which are posted, then CCing the list is just a façade; might as well just do it all internally and periodically dump out pull requests. Cheers, Daniel
Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event
Hi, On Mon, 21 Mar 2022 at 16:02, Rob Clark wrote: > On Mon, Mar 21, 2022 at 2:30 AM Christian König > wrote: > > Well you can, it just means that their contexts are lost as well. > > Which is rather inconvenient when deqp-egl reset tests, for example, > take down your compositor ;-) Yeah. Or anything WebGL. System-wide collateral damage is definitely a non-starter. If that means that the userspace driver has to do what iris does and ensure everything's recreated and resubmitted, that works too, just as long as the response to 'my adblocker didn't detect a crypto miner ad' is something better than 'shoot the entire user session'. Cheers, Daniel
Re: Commit messages (was: [PATCH v11] drm/amdgpu: add drm buddy support to amdgpu)
On Wed, 23 Mar 2022 at 08:19, Christian König wrote: > Am 23.03.22 um 09:10 schrieb Paul Menzel: > > Sorry, I disagree. The motivation needs to be part of the commit > > message. For example see recent discussion on the LWN article > > *Donenfeld: Random number generator enhancements for Linux 5.17 and > > 5.18* [1]. > > > > How much the commit message should be extended, I do not know, but the > > current state is insufficient (too terse). > > Well the key point is it's not about you to judge that. > > If you want to complain about the commit message then come to me with > that and don't request information which isn't supposed to be publicly > available. > > So to make it clear: The information is intentionally hold back and not > made public. In that case, the code isn't suitable to be merged into upstream trees; it can be resubmitted when it can be explained. Cheers, Daniel
Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event
Hi, On Thu, 17 Mar 2022 at 09:21, Christian König wrote: > Am 17.03.22 um 09:42 schrieb Sharma, Shashank: > >> AFAIU you probably want to be passing around a `struct pid *`, and > >> then somehow use pid_vnr() in the context of the process reading the > >> event to get the numeric pid. Otherwise things will not do what you > >> expect if the process triggering the crash is in a different pid > >> namespace from the compositor. > > > > I am not sure if it is a good idea to add the pid extraction > > complexity in here, it is left upto the driver to extract this > > information and pass it to the work queue. In case of AMDGPU, its > > extracted from GPU VM. It would be then more flexible for the drivers > > as well. > > Yeah, but that is just used for debugging. > > If we want to use the pid for housekeeping, like for a daemon which > kills/restarts processes, we absolutely need that or otherwise won't be > able to work with containers. 100% this. Pushing back to the compositor is a red herring. The compositor is just a service which tries to handle window management and input. If you're looking to kill the offending process or whatever, then that should go through the session manager - be it systemd or something container-centric or whatever. At least that way it can deal with cgroups at the same time, unlike the compositor which is not really aware of what the thing on the other end of the socket is doing. This ties in with the support they already have for things like coredump analysis, and would also be useful for other devices. Some environments combine compositor and session manager, and a lot of them have them strongly related, but they're very definitely not the same thing ... Cheers, Daniel
Re: [RFC PATCH v2 0/3] Add support modifiers for drivers whose planes only support linear layout
Hi Esaki-san, On Thu, 13 Jan 2022 at 09:44, Tomohito Esaki wrote: > Some drivers whose planes only support linear layout fb do not support format > modifiers. > These drivers should support modifiers, however the DRM core should handle > this > rather than open-coding in every driver. > > In this patch series, these drivers expose format modifiers based on the > following suggestion[1]. Thanks for the series, it looks like the right thing to do. Can you please change the patch ordering though? At the moment there will be a bisection break at patch #1, because the legacy drivers will suddenly start gaining modifier support, before it is removed in patch #2. I think a better order would be: 1: add fb_modifiers_not_supported flag to core and drivers 2: add default modifiers (and set allow_fb_modifiers) if fb_modifiers_not_supported flag is not set 3: remove allow_fb_modifiers flag Cheers, Daniel
Re: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread
Hi Monk, On Thu, 2 Sept 2021 at 06:52, Liu, Monk wrote: > I didn't mean your changes on AMD driver need my personal approval or review > ... and I'm totally already get used that our driver is not 100% under > control by AMDers, > but supposedly any one from community (including you) who tend to change > AMD's driver need at least to get approvement from someone in AMD, e.g.: > AlexD or Christian, doesn't that reasonable? > just like we need your approve if we try to modify DRM-sched, or need > panfrost's approval if we need to change panfrost code ... > > by only CC AMD's engineers looks not quite properly, how do you know if your > changes (on AMD code part) are conflicting with AMD's on-going internal > features/refactoring or not ? Looking at the patches in question, they were (at least mostly) CCed both to the amd-gfx@ mailing list and also to ckoenig. Unfortunately it is not possible for every single patch to get mandatory signoff from every single stakeholder - e.g. if every AMD patch which touched the scheduler required explicit approval from Etnaviv, Freedreno, Lima, Panfrost, and V3D teams, it would become very difficult for AMD to merge any code. So the approach is that patches are sent for approval, they are CCed to people who should be interested, and after some time with no comments, they may be merged if it seems like a reasonable thing to do. The problem with internal work is that, well, it's internal. If the community sends patches to amd-gfx@, there is no comment from AMD, and then months later we are told that it should not have happened because it conflicts with development that AMD has been doing - how should the rest of the community have known about this? So unfortunately this is the compromise: if you decide to do private development, not inform anyone about your plans, and not join in any common discussion, then it is your responsibility to deal with any changes or conflicts that happen whilst you are developing privately. The only way we can successfully have support in the same ecosystem for AMD, Arm, Broadcom, Intel, NVIDIA, Qualcomm, and VeriSilicon, is that we are all working together openly. If community development had to stop because each of these vendors had been doing internal development for several months without even informing the community of their plans, any kind of shared development is clearly impossible. Cheers, Daniel
Re: [PATCH] drm/amdgpu: Ensure that the modifier requested is supported by plane.
Hi Mark, On Wed, 24 Mar 2021 at 14:58, Mark Yacoub wrote: > So you mean it would make more sense to be more explicit in handling > DRM_FORMAT_MOD_INVALID as an incoming modifier (which will, just like > DRM_FORMAT_MOD_LINEAR, will return true in > dm_plane_format_mod_supported)? > That's correct. Not passing any modifiers is the same as explicitly passing INVALID, both of which mean 'the driver will figure it out somehow'; that driver-specific determination is not the same as explicit LINEAR. (I cannot regret enough that INVALID is not 0.) Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: Ensure that the modifier requested is supported by plane.
On Wed, 24 Mar 2021 at 10:53, Bas Nieuwenhuizen wrote: > On Wed, Mar 24, 2021 at 11:13 AM Michel Dänzer wrote: > >> No modifier support does not imply linear. It's generally signalled via >> DRM_FORMAT_MOD_INVALID, which roughly means "tiling is determined by driver >> specific mechanisms". >> > > Doesn't quite work that way in the kernel sadly. If you don't set > DRM_MODE_FB_MODIFIERS then the modifier fields have to be 0 (which happens > to alias DRM_FORMAT_MOD_LINEAR and then now deprecated > DRM_FORMAT_MOD_NONE). This is verified in shared drm code. > > (and all userspace code I've seen simply doesn't set DRM_MODE_FB_MODIFIERS > if the incoming modifier is DRM_FORMAT_MOD_INVALID) > Yes, but even though the field is zero, the lack of the flag means it must be treated as INVALID. If the kernel is not doing this, the kernel is objectively wrong. (And I know it doesn't do this in most cases, because otherwise I wouldn't be able to use this GNOME session on an Intel laptop, where modifiers are blacklisted.) Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 0/3] Use implicit kref infra
Hi Luben, On Wed, 2 Sep 2020 at 16:16, Luben Tuikov wrote: > Not sure how I can do this when someone doesn't want to read up on > the kref infrastructure. Can you help? > > When someone starts off with "My understanding of ..." (as in the OP) you > know you're > in trouble and in for a rough times. > > Such is the nature of world-wide open-to-everyone mailing lists where > anyone can put forth an argument, regardless of their level of understanding. > The more obfuscated an argument, the more uncertainty. > > If one knows the kref infrastructure, it just clicks, no explanation > necessary. Evidently there are more points of view than yours. Evidently your method of persuasion is also not working, because this thread is now getting quite long and not converging on your point of view (which you are holding to be absolutely objectively correct). I think you need to re-evaluate the way in which you speak to people, considering that it costs nothing to be polite and considerate, and also takes effort to be rude and dismissive. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 0/3] Use implicit kref infra
Hi Luben, On Wed, 2 Sep 2020 at 15:51, Luben Tuikov wrote: > Of course it's true--good morning! > > Let me stop you right there--just read the documentation I pointed > to you at. > > No! > > I'm sorry, that doesn't make sense. > > No, that's horrible. > > No, that's horrible. > > You need to understand how the kref infrastructure works in the kernel. I've > said > it a million times: it's implicit. > > Or LESS. Less changes. Less is better. Basically revert and redo all this > "managed resources". There are many better ways to make your point. At the moment it's just getting lost in shouting. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [Intel-gfx] [PATCH 03/25] dma-buf.rst: Document why idenfinite fences are a bad idea
On Thu, 9 Jul 2020 at 09:05, Daniel Vetter wrote: > On Thu, Jul 09, 2020 at 08:36:43AM +0100, Daniel Stone wrote: > > On Tue, 7 Jul 2020 at 21:13, Daniel Vetter wrote: > > > Comes up every few years, gets somewhat tedious to discuss, let's > > > write this down once and for all. > > > > Thanks for writing this up! I wonder if any of the notes from my reply > > to the previous-version thread would be helpful to more explicitly > > encode the carrot of dma-fence's positive guarantees, rather than just > > the stick of 'don't do this'. ;) Either way, this is: > > I think the carrot should go into the intro section for dma-fence, this > section here is very much just the "don't do this" part. The previous > patches have an attempt at encoding this a bit, maybe see whether there's > a place for your reply (or parts of it) to fit? Sounds good to me. > > Acked-by: Daniel Stone > > > > > What I'm not sure about is whether the text should be more explicit in > > > flat out mandating the amdkfd eviction fences for long running compute > > > workloads or workloads where userspace fencing is allowed. > > > > ... or whether we just say that you can never use dma-fence in > > conjunction with userptr. > > Uh userptr is entirely different thing. That one is ok. It's userpsace > fences or gpu futexes or future fences or whatever we want to call them. > Or is there some other confusion here?. I mean generating a dma_fence from a batch which will try to page in userptr. Given that userptr could be backed by absolutely anything at all, it doesn't seem smart to allow fences to rely on a pointer to an mmap'ed NFS file. So it seems like batches should be mutually exclusive between arbitrary SVM userptr and generating a dma-fence? Speaking of entirely different things ... the virtio-gpu bit really doesn't belong in this patch. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [Intel-gfx] [PATCH 03/25] dma-buf.rst: Document why idenfinite fences are a bad idea
Hi, On Tue, 7 Jul 2020 at 21:13, Daniel Vetter wrote: > Comes up every few years, gets somewhat tedious to discuss, let's > write this down once and for all. Thanks for writing this up! I wonder if any of the notes from my reply to the previous-version thread would be helpful to more explicitly encode the carrot of dma-fence's positive guarantees, rather than just the stick of 'don't do this'. ;) Either way, this is: Acked-by: Daniel Stone > What I'm not sure about is whether the text should be more explicit in > flat out mandating the amdkfd eviction fences for long running compute > workloads or workloads where userspace fencing is allowed. ... or whether we just say that you can never use dma-fence in conjunction with userptr. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [Intel-gfx] [PATCH 01/25] dma-fence: basic lockdep annotations
Hi, On Wed, 8 Jul 2020 at 16:13, Daniel Vetter wrote: > On Wed, Jul 8, 2020 at 4:57 PM Christian König > wrote: > > Could we merge this controlled by a separate config option? > > > > This way we could have the checks upstream without having to fix all the > > stuff before we do this? > > Since it's fully opt-in annotations nothing blows up if we don't merge > any annotations. So we could start merging the first 3 patches. After > that the fun starts ... > > My rough idea was that first I'd try to tackle display, thus far > there's 2 actual issues in drivers: > - amdgpu has some dma_resv_lock in commit_tail, plus a kmalloc. I > think those should be fairly easy to fix (I'd try a stab at them even) > - vmwgfx has a full on locking inversion with dma_resv_lock in > commit_tail, and that one is functional. Not just reading something > which we can safely assume to be invariant anyway (like the tmz flag > for amdgpu, or whatever it was). > > I've done a pile more annotations patches for other atomic drivers > now, so hopefully that flushes out any remaining offenders here. Since > some of the annotations are in helper code worst case we might need a > dev->mode_config.broken_atomic_commit flag to disable them. At least > for now I have 0 plans to merge any of these while there's known > unsolved issues. Maybe if some drivers take forever to get fixed we > can then apply some duct-tape for the atomic helper annotation patch. > Instead of a flag we can also copypasta the atomic_commit_tail hook, > leaving the annotations out and adding a huge warning about that. How about an opt-in drm_driver DRIVER_DEADLOCK_HAPPY flag? At first this could just disable the annotations and nothing else, but as we see the annotations gaining real-world testing and maturity, we could eventually make it taint the kernel. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
Hi, Jumping in after a couple of weeks where I've paged most everything out of my brain ... On Fri, 19 Jun 2020 at 10:43, Daniel Vetter wrote: > On Fri, Jun 19, 2020 at 10:13:35AM +0100, Chris Wilson wrote: > > > The proposed patches might very well encode the wrong contract, that's > > > all up for discussion. But fundamentally questioning that we need one > > > is missing what upstream is all about. > > > > Then I have not clearly communicated, as my opinion is not that > > validation is worthless, but that the implementation is enshrining a > > global property on a low level primitive that prevents it from being > > used elsewhere. And I want to replace completion [chains] with fences, and > > bio with fences, and closures with fences, and what other equivalencies > > there are in the kernel. The fence is as central a locking construct as > > struct completion and deserves to be a foundational primitive provided > > by kernel/ used throughout all drivers for discrete problem domains. > > > > This is narrowing dma_fence whereby adding > > struct lockdep_map *dma_fence::wait_map > > and annotating linkage, allows you to continue to specify that all > > dma_fence used for a particular purpose must follow common rules, > > without restricting the primitive for uses outside of this scope. > > Somewhere else in this thread I had discussions with Jason Gunthorpe about > this topic. It might maybe change somewhat depending upon exact rules, but > his take is very much "I don't want dma_fence in rdma". Or pretty close to > that at least. > > Similar discussions with habanalabs, they're using dma_fence internally > without any of the uapi. Discussion there has also now concluded that it's > best if they remove them, and simply switch over to a wait_queue or > completion like every other driver does. > > The next round of the patches already have a paragraph to at least > somewhat limit how non-gpu drivers use dma_fence. And I guess actual > consensus might be pointing even more strongly at dma_fence being solely > something for gpus and closely related subsystem (maybe media) for syncing > dma-buf access. > > So dma_fence as general replacement for completion chains I think just > wont happen. > > What might make sense is if e.g. the lockdep annotations could be reused, > at least in design, for wait_queue or completion or anything else > really. I do think that has a fair chance compared to the automagic > cross-release annotations approach, which relied way too heavily on > guessing where barriers are. My experience from just a bit of playing > around with these patches here and discussing them with other driver > maintainers is that accurately deciding where critical sections start and > end is a job for humans only. And if you get it wrong, you will have a > false positive. > > And you're indeed correct that if we'd do annotations for completions and > wait queues, then that would need to have a class per semantically > equivalent user, like we have lockdep classes for mutexes, not just one > overall. > > But dma_fence otoh is something very specific, which comes with very > specific rules attached - it's not a generic wait_queue at all. Originally > it did start out as one even, but it is a very specialized wait_queue. > > So there's imo two cases: > > - Your completion is entirely orthogonal of dma_fences, and can never ever > block a dma_fence. Don't use dma_fence for this, and no problem. It's > just another wait_queue somewhere. > > - Your completion can eventually, maybe through lots of convolutions and > depdencies, block a dma_fence. In that case full dma_fence rules apply, > and the only thing you can do with a custom annotation is make the rules > even stricter. E.g. if a sub-timeline in the scheduler isn't allowed to > take certain scheduler locks. But the userspace visible/published fence > do take them, maybe as part of command submission or retirement. > Entirely hypotethical, no idea any driver actually needs this. I don't claim to understand the implementation of i915's scheduler and GEM handling, and it seems like there's some public context missing here. But to me, the above is a good statement of what I (and a lot of other userspace) have been relying on - that dma-fence is a very tightly scoped thing which is very predictable but in extremis. It would be great to have something like this enshrined in dma-fence documentation, visible to both kernel and external users. The properties we've so far been assuming for the graphics pipeline - covering production & execution of vertex/fragment workloads on the GPU, framebuffer display, and to the extent this is necessary involving compute - are something like this: A single dma-fence with no dependencies represents (the tail of) a unit of work, which has been all but committed to the hardware. Once committed to the hardware, this work will complete (successfully or in error) in bounded time. The unit of work referred to
Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
Hi, On Thu, 11 Jun 2020 at 09:44, Dave Airlie wrote: > On Thu, 11 Jun 2020 at 18:01, Chris Wilson wrote: > > Introducing a global lockmap that cannot capture the rules correctly, > > Can you document the rules all drivers should be following then, > because from here it looks to get refactored every version of i915, > and it would be nice if we could all aim for the same set of things > roughly. We've already had enough problems with amdgpu vs i915 vs > everyone else with fences, if this stops that in the future then I'd > rather we have that than just some unwritten rules per driver and > untestable. As someone who has sunk a bunch of work into explicit-fencing awareness in my compositor so I can never be blocked, I'd be disappointed if the infrastructure was ultimately pointless because the documented fencing rules were \_o_/ or thereabouts. Lockdep definitely isn't my area of expertise so I can't comment on the patch per se, but having something to ensure we don't hit deadlocks sure seems a lot better than nothing. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [pull] amdgpu, amdkfd drm-next-5.8
Hi, On Thu, 14 May 2020 at 14:36, Alex Deucher wrote: > On Thu, May 14, 2020 at 5:42 AM Daniel Stone wrote: > > Did this ever go through uAPI review? It's been pushed to the kernel > > before Mesa review was complete. Even then, Mesa only uses it when > > behind a magic environment variable, rather than through the EGL > > extensions which have been specifically designed for protected content > > (EGL_EXT_protected_content, protected_surface, etc). The winsys > > usecase was based on a Wayland system which seems like it will only > > work when composition bypass is available - not using any of the > > Wayland protected-content extensions, and it's completely unclear what > > will happen if composition bypass can't actually be achieved. > > > > I don't think this should be landing before all those open questions > > have been answered. We're trying to come up with a good and coherent > > story for handling protected content, and I'd rather not see AMD > > landing its own uAPI which might completely contradict that. > > Well, the patches have been public for months and we have a UAPI and > mesa userspace which works for our use cases and the mesa side has > been merged and the recent comments on the MR didn't seem like show > stoppers. As a generic compositor author, it's pretty important for me to understand what happens when trying to access protected content. Does the import simply fail? Does it sample rubbish? Does my context killed? etc. > I don't disagree with your point, but I feel like agreeing > on a what we want to do for EGL protected content could drag out for > months or years. I don't really see what the problem is, but it would've been nice if it was at least attempted, rather than just parachuted in ... I know I'm going to end up getting bug reports about it for Wayland/Weston, and then all of a sudden it's become my problem that this was just dropped in as a vendor-specific feature in a vendor-specific island. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [pull] amdgpu, amdkfd drm-next-5.8
Hi Alex, On Thu, 30 Apr 2020 at 22:30, Alex Deucher wrote: > UAPI: > - Add amdgpu UAPI for encrypted GPU memory > Used by: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401 Did this ever go through uAPI review? It's been pushed to the kernel before Mesa review was complete. Even then, Mesa only uses it when behind a magic environment variable, rather than through the EGL extensions which have been specifically designed for protected content (EGL_EXT_protected_content, protected_surface, etc). The winsys usecase was based on a Wayland system which seems like it will only work when composition bypass is available - not using any of the Wayland protected-content extensions, and it's completely unclear what will happen if composition bypass can't actually be achieved. I don't think this should be landing before all those open questions have been answered. We're trying to come up with a good and coherent story for handling protected content, and I'd rather not see AMD landing its own uAPI which might completely contradict that. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: gitlab.fd.o financial situation and impact on services
Hi Jan, On Fri, 28 Feb 2020 at 10:09, Jan Engelhardt wrote: > On Friday 2020-02-28 08:59, Daniel Stone wrote: > >I believe that in January, we had $2082 of network cost (almost > >entirely egress; ingress is basically free) and $1750 of > >cloud-storage cost (almost all of which was download). That's based > >on 16TB of cloud-storage (CI artifacts, container images, file > >uploads, Git LFS) egress and 17.9TB of other egress (the web service > >itself, repo activity). Projecting that out [×12 for a year] gives > >us roughly $45k of network activity alone, > > I had come to a similar conclusion a few years back: It is not very > economic to run ephemereal buildroots (and anything like it) between > two (or more) "significant locations" of which one end is located in > a Large Cloud datacenter like EC2/AWS/etc. > > As for such usecases, me and my surrounding peers have used (other) > offerings where there is 50 TB free network/month, and yes that may > have entailed doing more adminning than elsewhere - but an admin > appreciates $2000 a lot more than a corporation, too. Yes, absolutely. For context, our storage & network costs have increased >10x in the past 12 months (~$320 Jan 2019), >3x in the past 6 months (~$1350 July 2019), and ~2x in the past 3 months (~$2000 Oct 2019). I do now (personally) think that it's crossed the point at which it would be worthwhile paying an admin to solve the problems that cloud services currently solve for us - which wasn't true before. Such an admin could also deal with things like our SMTP delivery failure rate, which in the past year has spiked over 50% (see previous email), demand for new services such as Discourse which will enable user support without either a) users having to subscribe to a mailing list, or b) bug trackers being cluttered up with user requests and other non-bugs, etc. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services
On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund wrote: > On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote: > > Yeah, changes on vulkan drivers or backend compilers should be > > fairly > > sandboxed. > > > > We also have tools that only work for intel stuff, that should never > > trigger anything on other people's HW. > > > > Could something be worked out using the tags? > > I think so! We have the pre-defined environment variable > CI_MERGE_REQUEST_LABELS, and we can do variable conditions: > > https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables > > That sounds like a pretty neat middle-ground to me. I just hope that > new pipelines are triggered if new labels are added, because not > everyone is allowed to set labels, and sometimes people forget... There's also this which is somewhat more robust: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569 Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [Intel-gfx] gitlab.fd.o financial situation and impact on services
On Fri, 28 Feb 2020 at 08:48, Dave Airlie wrote: > On Fri, 28 Feb 2020 at 18:18, Daniel Stone wrote: > > The last I looked, Google GCP / Amazon AWS / Azure were all pretty > > comparable in terms of what you get and what you pay for them. > > Obviously providers like Packet and Digital Ocean who offer bare-metal > > services are cheaper, but then you need to find someone who is going > > to properly administer the various machines, install decent > > monitoring, make sure that more storage is provisioned when we need > > more storage (which is basically all the time), make sure that the > > hardware is maintained in decent shape (pretty sure one of the fd.o > > machines has had a drive in imminent-failure state for the last few > > months), etc. > > > > Given the size of our service, that's a much better plan (IMO) than > > relying on someone who a) isn't an admin by trade, b) has a million > > other things to do, and c) hasn't wanted to do it for the past several > > years. But as long as that's the resources we have, then we're paying > > the cloud tradeoff, where we pay more money in exchange for fewer > > problems. > > Admin for gitlab and CI is a full time role anyways. The system is > definitely not self sustaining without time being put in by you and > anholt still. If we have $75k to burn on credits, and it was diverted > to just pay an admin to admin the real hw + gitlab/CI would that not > be a better use of the money? I didn't know if we can afford $75k for > an admin, but suddenly we can afford it for gitlab credits? s/gitlab credits/GCP credits/ I took a quick look at HPE, which we previously used for bare metal, and it looks like we'd be spending $25-50k (depending on how much storage you want to provision, how much room you want to leave to provision more storage later, how much you care about backups) to run a similar level of service so that'd put a bit of a dint in your year-one budget. The bare-metal hosting providers also add up to more expensive than you might think, again especially if you want either redundancy or just backups. > > Yes, we could federate everything back out so everyone runs their own > > builds and executes those. Tinderbox did something really similar to > > that IIRC; not sure if Buildbot does as well. Probably rules out > > pre-merge testing, mind. > > Why? does gitlab not support the model? having builds done in parallel > on runners closer to the test runners seems like it should be a thing. > I guess artifact transfer would cost less then as a result. It does support the model but if every single build executor is also compiling Mesa from scratch locally, how long do you think that's going to take? > > Again, if you want everything to be centrally > > designed/approved/monitored/controlled, that's a fine enough idea, and > > I'd be happy to support whoever it was who was doing that for all of > > fd.o. > > I don't think we have any choice but to have someone centrally > controlling it, You can't have a system in place that lets CI users > burn largs sums of money without authorisation, and that is what we > have now. OK, not sure who it is who's going to be approving every update to every .gitlab-ci.yml in the repository, or maybe we just have zero shared runners and anyone who wants to do builds can BYO. ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [Intel-gfx] gitlab.fd.o financial situation and impact on services
On Fri, 28 Feb 2020 at 03:38, Dave Airlie wrote: > b) we probably need to take a large step back here. > > Look at this from a sponsor POV, why would I give X.org/fd.o > sponsorship money that they are just giving straight to google to pay > for hosting credits? Google are profiting in some minor way from these > hosting credits being bought by us, and I assume we aren't getting any > sort of discounts here. Having google sponsor the credits costs google > substantially less than having any other company give us money to do > it. The last I looked, Google GCP / Amazon AWS / Azure were all pretty comparable in terms of what you get and what you pay for them. Obviously providers like Packet and Digital Ocean who offer bare-metal services are cheaper, but then you need to find someone who is going to properly administer the various machines, install decent monitoring, make sure that more storage is provisioned when we need more storage (which is basically all the time), make sure that the hardware is maintained in decent shape (pretty sure one of the fd.o machines has had a drive in imminent-failure state for the last few months), etc. Given the size of our service, that's a much better plan (IMO) than relying on someone who a) isn't an admin by trade, b) has a million other things to do, and c) hasn't wanted to do it for the past several years. But as long as that's the resources we have, then we're paying the cloud tradeoff, where we pay more money in exchange for fewer problems. > If our current CI architecture is going to burn this amount of money a > year and we hadn't worked this out in advance of deploying it then I > suggest the system should be taken offline until we work out what a > sustainable system would look like within the budget we have, whether > that be never transferring containers and build artifacts from the > google network, just having local runner/build combos etc. Yes, we could federate everything back out so everyone runs their own builds and executes those. Tinderbox did something really similar to that IIRC; not sure if Buildbot does as well. Probably rules out pre-merge testing, mind. The reason we hadn't worked everything out in advance of deploying is because Mesa has had 3993 MRs in the not long over a year since moving, and a similar number in GStreamer, just taking the two biggest users. At the start it was 'maybe let's use MRs if you want to but make sure everything still goes through the list', and now it's something different. Similarly the CI architecture hasn't been 'designed', so much as that people want to run dEQP and Piglit on their hardware pre-merge in an open fashion that's actually accessible to people, and have just done it. Again, if you want everything to be centrally designed/approved/monitored/controlled, that's a fine enough idea, and I'd be happy to support whoever it was who was doing that for all of fd.o. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: gitlab.fd.o financial situation and impact on services
Hi Matt, On Thu, 27 Feb 2020 at 23:45, Matt Turner wrote: > We're paying 75K USD for the bandwidth to transfer data from the > GitLab cloud instance. i.e., for viewing the https site, for > cloning/updating git repos, and for downloading CI artifacts/images to > the testing machines (AFAIU). I believe that in January, we had $2082 of network cost (almost entirely egress; ingress is basically free) and $1750 of cloud-storage cost (almost all of which was download). That's based on 16TB of cloud-storage (CI artifacts, container images, file uploads, Git LFS) egress and 17.9TB of other egress (the web service itself, repo activity). Projecting that out gives us roughly $45k of network activity alone, so it looks like this figure is based on a projected increase of ~50%. The actual compute capacity is closer to $1150/month. > I was not aware that we were being charged for anything wrt GitLab > hosting yet (and neither was anyone on my team at Intel that I've > asked). This... kind of needs to be communicated. > > A consistent concern put forth when we were discussing switching to > GitLab and building CI was... how do we pay for it. It felt like that > concern was always handwaved away. I heard many times that if we > needed more runners that we could just ask Google to spin up a few > more. If we needed testing machines they'd be donated. No one > mentioned that all the while we were paying for bandwidth... Perhaps > people building the CI would make different decisions about its > structure if they knew it was going to wipe out the bank account. The original answer is that GitLab themselves offered to sponsor enough credit on Google Cloud to get us started. They used GCP themselves so they could assist us (me) in getting bootstrapped, which was invaluable. After that, Google's open-source program office offered to sponsor us for $30k/year, which was I believe last April. Since then the service usage has increased roughly by a factor of 10, so our 12-month sponsorship is no longer enough to cover 12 months. > What percentage of the bandwidth is consumed by transferring CI > images, etc? Wouldn't 75K USD would be enough to buy all the testing > machines we need and host them within Google or wherever so we don't > need to pay for huge amounts of bandwidth? Unless the Google Cloud Platform starts offering DragonBoards, it wouldn't reduce our bandwidth usage as the corporate network is treated separately for egress. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH][next[ drm/amd/display: fix a couple of spelling mistakes
Hi Colin, On Wed, 26 Jun 2019 at 14:24, Colin King wrote: > There are a couple of spelling mistakes in dm_error messages and > a comment. Fix these. Whilst there, you might fix the '[next[' typo in the commit message. Cheers, Daniel
Fwd: PSA: Mailman changes, From addresses no longer accurate
Hi all, Unfortunately, freedesktop.org's job of sending mail to a huge number of people whilst pretending to be other people, has just got even harder than it was. fd.o can no longer (at least for the moment) send mail with the From addresses of DMARC/DKIM/SPF-protected sender domains. When we try to do so, large providers reject the mail, despite DMARC records explicitly specifying that the mail should be accepted. Worse than that, not only does the immediate delivery attempt fail, but it punishes our sender reputation enough that _other_ mail delivery fails: after we hit a DMARC-related failure, large providers often refuse delivery attempts for mail from non-DMARC-protected domains. As a result, if the sender domain has a DMARC record, we rewrite the From address to be the list's address, preserving the original author in Reply-To. I'm chasing this up through a few channels, but in the meantime, please be aware that the From address on mails may no longer be accurate. If you are attempting to apply patches with git-am, please check that the commit author is not 'Person Name via listname-devel '. If you are replying privately to a list mail, please be _very_ careful that you are replying to the original sender (in Reply-To) and not the list itself. Cheers, Daniel -- Forwarded message - From: Daniel Stone Date: Mon, 11 Feb 2019 at 23:38 Subject: PSA: Mailman changes, From addresses no longer accurate To: , Hi all, We have hit another step change in aggressive anti-spam techniques from major mail providers. Over the past few days, we saw a huge spike in the number of mails we were failing to deliver to GMail and outlook.com in particular. It looks like it is now no longer acceptable for us to break DMARC/DKIM/SPF. These are DNS-based extensions to SMTP, which allow domains to publish policies as to who should be allowed to send email on their behalf. SPF provides source filtering, so e.g. freedesktop.org could specify that no-one should accept mail with a From: *@freedesktop.org unless it came from gabe.freedesktop.org. Mailman completely breaks this: if I specified a filter only allowing Google to send mail for @fooishbar.org, then anyone enforcing SPF would reject receipt of this mail, as it would arrive from fd.o with my From address. DKIM allows domains to publish a public key in DNS, inserting a header into mails sent from their domain cryptographically signing the value of named headers. Mailman breaks this too: changing the Sender header (such that bounces get processed by Mailman, rather than sending a deluge of out-of-office and mailbox-over-quota messages to whoever posts to the list) can break most DKIM signatures. Mailman adding the unsubscribe footer also breaks this; we could make it not add the footer, but in order to do so we'd have to convince ourselves that we were not decreasing our GDPR compliance. DMARC ties the two together, allowing domains to specify whether or not DKIM/SPF should be mandatory, and if they fail, what action should be taken. Despite some domains specifying a fail action of 'none' (receiving MTA to send an advisory report to a named email address, but still allow the email), it seems that popular services still interpret 'none' as 'reject'. As a result, Google in particular is dropping some number of our mails on the floor. This does _not_ just apply to mails which fail DMARC/DKIM/SPF: every mail we send that fails these filters (which is a lot - e.g. Intel and NVIDIA both have SPF) decreases our sender reputation with GMail and causes it to reject legitimate mails. I've reached out to Google through a couple of channels to see what we can do to increase our delivery rate to them. In the meantime, if your mail is hosted by Google, or Outlook, and you think you're missing mails - you probably are. Mailman has also now been reconfigured such that if it spots a potential DMARC violation, it rewrites the From address to be @lists.freedesktop.org, keeping the original author in Reply-To. It also strips DKIM headers. This seems to be about the best we can do, unless and until the major mail service providers offer us some alternate way to send mail. If you are replying privately to someone, you should check very carefully that you are replying to them and not to the list. Unfortunately we don't have a good answer in the long run. The latest published RFC at https://tools.ietf.org/html/rfc6377 suggests that there are no good solutions. If anyone is blessed with an abundance of time and familiar with the current email landscape, I would love to talk to you and get your help to work on this. Unfortunately we don't have the manpower to actually properly monitor email; it can often take a collapse in successful-delivery rates for us to notice. Ultimately, I suspect the best solution is to move most of our discussions to dedicated fora like GitLab issues, or something like Discourse. Fundamentally, the thing we're trying to do (send email
Re: [RFC] drm/amdgpu: Add macros and documentation for format modifiers.
Hi, On Tue, 4 Sep 2018 at 11:44, Christian König wrote: > Am 04.09.2018 um 12:15 schrieb Daniel Stone: > > Right. The conclusion, after people went through and started sorting > > out the kinds of formats for which they would _actually_ export real > > colour buffers for, that most vendors definitely have fewer than > > 115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936 > > possible formats to represent, very likely fewer than > > 340,282,366,920,938,463,463,374,607,431,768,211,456 formats, probably > > fewer than 72,057,594,037,927,936 formats, and even still generally > > fewer than 281,474,976,710,656 if you want to be generous and leave 8 > > bits of the 56 available. > > The problem here is that at least for some parameters we actually don't > know which formats are actually used. > > The following are not real world examples, but just to explain the > general problem. > > The memory configuration for example can be not ASIC specific, but > rather determined by whoever took the ASIC and glued it together with > VRAM on a board. It is not likely that somebody puts all the VRAM chips > on one channel, but it is still perfectly possible. > > Same is true for things like harvesting, e.g. of 16 channels halve of > them could be bad and we need to know which to actually use. Sure, that's fine, but I'm not sure why, for instance, 'half of the memory channels are bad and must be avoided' would be attached to a format in the same way that macrotile size/depth would be? Similarly, what happens if you encode a channel or lane blacklist mask into a modifier, and then you import a buffer with that modifier into a client API on another device. Does that also apply to the other device or only the source device? How then do you handle the capability query between them: is it relevant to device B attempting to import and sample from an image that device A only has a 128-bit memory bus because of SKU configuration? How does generic userspace look at a token which encodes all of this information and know that the buffer is interchangeable between devices? Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [RFC] drm/amdgpu: Add macros and documentation for format modifiers.
Hi, On Tue, 4 Sep 2018 at 11:05, Daniel Vetter wrote: > On Tue, Sep 4, 2018 at 3:00 AM, Bas Nieuwenhuizen > wrote: > > +/* The chip this is compatible with. > > + * > > + * If compression is disabled, use > > + * - AMDGPU_CHIP_TAHITI for GFX6-GFX8 > > + * - AMDGPU_CHIP_VEGA10 for GFX9+ > > + * > > + * With compression enabled please use the exact chip. > > + * > > + * TODO: Do some generations share DCC format? > > + */ > > +#define AMDGPU_MODIFIER_CHIP_GEN_SHIFT 40 > > +#define AMDGPU_MODIFIER_CHIP_GEN_MASK 0xff > > Do you really need all the combinations here of DCC + gpu gen + tiling > details? When we had the entire discussion with nvidia folks they > eventually agreed that they don't need the massive pile with every > possible combination. Do you really plan to share all these different > things? > > Note that e.g. on i915 we spec some of the tiling depending upon > buffer size and buffer format (because that's how the hw works), not > using explicit modifier flags for everything. Right. The conclusion, after people went through and started sorting out the kinds of formats for which they would _actually_ export real colour buffers for, that most vendors definitely have fewer than 115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936 possible formats to represent, very likely fewer than 340,282,366,920,938,463,463,374,607,431,768,211,456 formats, probably fewer than 72,057,594,037,927,936 formats, and even still generally fewer than 281,474,976,710,656 if you want to be generous and leave 8 bits of the 56 available. If you do use 256 bits in order to represent 115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936 modifiers per format, userspace would start hitting OOM pretty quickly as it attempted to enumerate and negotiate acceptable modifiers. Either that or we need to replace the fixed 64-bit modifier tokens with some kind of eBPF script. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [igt-dev] RFC: Migration to Gitlab
Hi Rodrigo, On Wed, 22 Aug 2018 at 17:06, Rodrigo Vivi wrote: > On Wed, Aug 22, 2018 at 10:19:19AM -0400, Adam Jackson wrote: > > On Wed, 2018-08-22 at 16:13 +0300, Jani Nikula wrote: > > > - Sticking to fdo bugzilla and disabling gitlab issues for at least > > > drm-intel for the time being. Doing that migration in the same go is a > > > bit much I think. Reassignment across bugzilla and gitlab will be an > > > issue. > > > > Can you elaborate a bit on the issues here? The actual move-the-bugs > > process has been pretty painless for the parts of xorg we've done so > > far. > > I guess there is nothing against moving the bugs there. The concern is only on > doing everything at once. > > I'm in favor of moving gits for now and after we are confident that > everything is there and working we move the bugs. As Daniel alluded to, the only issue I really have is moving _all_ the kernel repos at once. At the end of the year we'll have easy automatic scaling thanks to the independent services being separated. As it is, all the GitLab services (apart from CI runners) run on a single machine, so we have limited options if it becomes overwhelmed with load. Do you have a particular concern about the repos? e.g. what would you check for to make sure things are 'there and working'? > One question about the bugzilla: > > Will all the referrences on all commit messages get outdated after > bugzilla is dead? > Or bugzilla will stay up for referrence but closed for interaction? > or all old closed stuff are always moved and bugzilla.fd.o as well and > bugzilla.fd.o will be mirroring gitlab? When bugs are migrated from Bugzilla to GitLab, only open bugs are migrated. Closed ones are left in place, as is; open ones have a comment at the end saying that the bug has moved to GitLab, a URL linking to the new GitLab issue, and telling them to please chase it up there. Even when we move everyone completely off Bugzilla, we will keep it as a read-only mirror forever. Even with Phabricator, which very few people ever used, has had all its bugs and code review captured and archived, so we can continue to preserve all the old content and links, without having to run the actual service. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [igt-dev] RFC: Migration to Gitlab
Hi, On Wed, 22 Aug 2018 at 15:44, Daniel Vetter wrote: > On Wed, Aug 22, 2018 at 3:13 PM, Jani Nikula > wrote: > > Just a couple of concerns from drm/i915 perspective for starters: > > > > - Patchwork integration. I think we'll want to keep patchwork for at > > least intel-gfx etc. for the time being. IIUC the one thing we need is > > some server side hook to update patchwork on git push. > > > > - Sticking to fdo bugzilla and disabling gitlab issues for at least > > drm-intel for the time being. Doing that migration in the same go is a > > bit much I think. Reassignment across bugzilla and gitlab will be an > > issue. > > Good points, forgot about both. Patchwork reading the mailing list > should keep working as-is, but the update hook needs looking into. All the hooks are retained. gitlab.fd.o pushes to git.fd.o, and git.fd.o still executes all the old hooks. This includes Patchwork, readthedocs, AppVeyor, and whatever else. > For merge requests I think best approach is to enable them very > selectively at first for testing out, and then making a per-subproject > decision whether they make sense. E.g. I think for maintainer-tools > integrating make check and the doc building into gitlab CI would be > sweet, and worth looking into gitlab merge requests just to automate > that. Again best left out of scope for the initial migration. You don't need MRs to do that. You can build a .gitlab-ci.yml file which will run make check or build the docs or whatever, and have that run on pushes. Anyone can then fork the repo, push their changes to that fork, and see the CI results from there. It's like Travis: the CI configuration is a (mutable) part of the codebase, not a separate 'thing' hanging off a specific repo. So if you write the CI pipeline first, you can have people running CI on push completely independently of switching the workflow to use MRs. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: RFC: Migration to Gitlab
Hi, On Wed, 22 Aug 2018 at 16:02, Emil Velikov wrote: > On 22 August 2018 at 12:44, Daniel Vetter wrote: > > I think it's time to brainstorm a bit about the gitlab migration. Basic > > reasons: > > > > - fd.o admins want to deprecate shell accounts and hand-rolled > > infrastructure, because it's a pain to keep secure > > > > - gitlab will allow us to add committers on our own, greatly > > simplifying that process (and offloading that task from fd.o admins). > > Random thought - I really wish the admins spoke early and louder about issues. > From infra to manpower and adhoc tool maintenance. I thought I mostly had it covered, but fair enough. What knowledge are you missing and how should it best be delivered? One first-order issue is that as documented at https://www.freedesktop.org/wiki/AccountRequests/ creating accounts requires manual admin intervention. You can also search the 'freedesktop.org' product on Bugzilla to see the amount of time we spend shuffling around GPG & SSH keys, which is about the worst possible use of my time. Many other people have had access to drive the ancient shell-script frontend to LDAP before, but for some reason they mostly aren't very enthusiastic about doing it all the time. In the mesa-dev@ thread about Mesa's migration, which is linked from my blog post, I went into quite a lot of detail about why Jenkins was not suitable to roll out across fd.o globally. That knowledge was gained from trial & error, which was a lot of time burnt. The end result is that we don't have any CI, except if people hang Travis/AppVeyor off GitHub mirrors. You've personally seen what's involved in setting up Git repository hooks so we can build docs. We can't give access to let people work on those, without giving them direct access to the literal Git repository itself on disk. The hooks mostly involve bespoke sets of rsync jobs and hand-managed SSH credentials and external services to build docs and so on and so forth. None of this is accessible to a newcomer who wants to make a non-code contribution: you have to find someone with access to the repo to go bash some shell scripts directly and hope you didn't screw it up. None of this is auditable, so if the repo mysteriously gets wiped, then hopefully there are some backups somewhere. But there definitely aren't any logs. Luckily we prevent most people from having access to most repositories via a mandatory 'shell' which only allows people to push Git; this was written by hand by us 12 years ago. What else? Our fork of Patchwork was until recently based on an ancient fork of Django, in a bespoke unreproducible production environment. Bugzilla is patched for spam and templates (making upgrades complex), yet we still have a surprising amount of spam pass through. There's no way to delete spam, but you have to manually move every bug to the 'spam' group, then go through and delete the user which involves copying & pasting the email and a few clicks per user. Mailman is patched to support Recaptcha, bringing more upgrade fun. ikiwiki breaks all the time because it's designed to have the public-access web server on the same host as the writeable Git repositories. Our servers are several years old and approaching life expiry, and we have no more spare disk. You can see in #freedesktop IRC the constant network connectivity issues people have to PSU almost every day. Our attempts to root-cause and solve those have got nowhere. I could go on, but the 'moved elsewhere' list in https://gitlab.freedesktop.org/freedesktop/freedesktop/issues/2 indicates that we do have problems to solve, and that changing peoples' SSH keys for them isn't the best way for us to be spending the extremely limited time that we do have. > > For the full in-depth writeup of everything, see > > > > https://www.fooishbar.org/blog/gitlab-fdo-introduction/ If you haven't seen this, or the post it was linked from, they would be a good start: https://lists.freedesktop.org/archives/freedesktop/2018-July/000370.html There's also the long thread on mesa-dev a long time back which covers a lot of this ground already. > > - Figuring out the actual migration - we've been adding a pile of > > committers since fd.o LDAP was converted to gitlab once back in > > spring. We need to at least figure out how to move the new > > accounts/committers. > > As a observer, allow me to put some ideas. You've mostly covered them > all, my emphasis is to seriously stick with _one_ thing at a time. > Attempting to do multiple things in parallel will end up with > sub-optimal results. > > - (at random point) cleanup the committers list - people who have not > contributed in the last 1 year? libdrm was previously covered under the Mesa ACL. Here are the other committer lists, which you can see via 'getent group' on annarchy or another machine: amdkfd:x:922:fxkuehl,agd5f,deathsimple,danvet,jazhou,jbridgman,hwentland,tstdenis,gitlab-mirror,rui drm-meson:x:936:narmstrong drm:x:940:airlied,danvet
Re: gitlab migration
Hi Alexander, On 12 June 2018 at 14:53, Alexander E. Patrakov wrote: > Daniel Stone : >> > That said, I could not even create an account with a noscript/xhtml basic >> > browser (if you want to test that, install the famous noscript module with >> > an >> > empty "white list" in firefox or chromium, or use lynx or links or w3m). >> >> No need to test; it's guaranteed to fail since we require Recaptcha >> for login due to massive spam issues. > > Have you tested whether Chinese or Russian users can login or sign up? > Asking because Recaptcha was blocked for quite a long time by Russian > authorities. We do have active users from both countries currently. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: gitlab migration
Hi Sylvain, On 12 June 2018 at 13:38, wrote: > On Tue, Jun 12, 2018 at 12:34:31PM +0100, Daniel Stone wrote: >> GitLab has a pretty comprehensive and well-documented API which might >> help if you don't want to use a web browser. There are also clients >> like 'lab': https://gitlab.com/zaquestion/lab which provide a good CLI >> interface. > > Those "web APIs" usually require the use of an heavy javascript browser for > authentification or "app authorization". > > That said, I could not even create an account with a noscript/xhtml basic > browser (if you want to test that, install the famous noscript module with an > empty "white list" in firefox or chromium, or use lynx or links or w3m). No need to test; it's guaranteed to fail since we require Recaptcha for login due to massive spam issues. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: gitlab migration
Hi Michel, On 11 June 2018 at 11:33, Michel Dänzer wrote: > On 2018-06-08 08:08 PM, Adam Jackson wrote: >> I'd like us to start moving repos and bug tracking into gitlab. >> Hopefully everyone's aware that gitlab exists and why fdo projects are >> migrating to it. If not, the thread about Mesa's migration provides >> some useful background: >> >> https://lists.x.org/archives/mesa-dev/2018-May/195634.html >> >> This should be mostly mechanical, except for moving some of the older >> junk into the archive and deciding which drivers _not_ to move yet (I >> imagine intel has some of its processes hooked up to bz, for example). > > As the maintainer of xf86-video-amdgpu and -ati, I'm fine with migrating > these to GitLab for Git and patch review. > > However, I'm not sure what to do about bugs/issues. My first thought was > to allow creating new issues in GitLab and disable creating new reports > in Bugzilla, but not to migrate existing reports from Bugzilla. However, > it still happens fairly often that a report is initially filed against > the Xorg drivers, even though it's actually an issue in the X server, > Mesa or the kernel (old habits die hard...). Therefore, I'm inclined to > stick to Bugzilla until at least the X server and Mesa have moved to > GitLab for issue tracking, to hopefully allow moving such misfiled issues. One thing some Mesa people said during that discussion is that they like the ability to move issues between Mesa and the kernel, which won't be possible if they're on split systems. So I probably wouldn't hold my breath for that to be honest. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: gitlab migration
Hi Sylvain, It looks like Mutt is mangling email addresses - I've fixed Michel's. On 11 June 2018 at 14:30, wrote: > On Mon, Jun 11, 2018 at 12:33:52PM +0200, Michel Dänzer wrote: >> Adding the amd-gfx list, in cases somebody there has concerns or other >> feedback. > > For feedback: > I attempted a migration on gitlab of my repos but I was blocked because it's > not noscript/xhtml basic browser friendly. > > I was successfull with launchpad.net (which has now git support). > > I did not test if the issue system was noscript/xhtml basic friendly though. GitLab has a pretty comprehensive and well-documented API which might help if you don't want to use a web browser. There are also clients like 'lab': https://gitlab.com/zaquestion/lab which provide a good CLI interface. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: RFC for a render API to support adaptive sync and VRR
Hi, On 20 April 2018 at 21:32, Manasi Navarewrote: > On Wed, Apr 18, 2018 at 09:39:02AM +0200, Daniel Vetter wrote: >> On Wed, Apr 18, 2018 at 5:58 AM, Keith Packard wrote: >> > I'd also encourage using a single unit for all of these values, >> > preferably nanoseconds. Absolute times should all be referenced to >> > CLOCK_MONOTONIC. >> >> +1 on everything Keith said. I got somehow dragged in khr vk >> discussions around preventing micro-stuttering, and consensus seems to >> be that timestamps for scheduling frames is the way to go, most likely >> absolute ones (not everything is running Linux unfortunately, so can't >> go outright and claim it's guaranteed to be CLOCK_MONOTONIC). > > And yes I also got consensus from Mesa and media folks about using the > absolute timestamp for scheduling the frames and then the driver will > modify the vblank logic to "present no earlier than the timestamp" Whilst we're all piling in, here's another AOL. We didn't yet implement this for Wayland because of the fun involved in adding a FIFO mode to a very mailbox window system, but at some point we'll have to suck it up and push it. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 00/24] drm_framebuffer boilerplate removal
Hi Alex, On 30 March 2018 at 15:47, Alex Deucher <alexdeuc...@gmail.com> wrote: > On Fri, Mar 30, 2018 at 10:11 AM, Daniel Stone <dani...@collabora.com> wrote: >> I intend to remove create_handle when all drivers are converted over >> to placing BOs directly inside drm_framebuffer. For most drivers >> there's a relatively easy conversion to using the helpers for >> basically all framebuffer handling and fbdev emulation as well, though >> that's a bit further than I was willing to go without hardware to test >> on ... > > Series is: > Acked-by: Alex Deucher <alexander.deuc...@amd.com> Thanks a lot for the review! Are you taking the radeon/amdgpu patches through your tree? They don't have any dependencies on core code. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 24/24] drm/amdgpu: Move GEM BO to drm_framebuffer
Since drm_framebuffer can now store GEM objects directly, place them there rather than in our own subclass. As this makes the framebuffer create_handle and destroy functions the same as the GEM framebuffer helper, we can reuse those. Signed-off-by: Daniel Stone <dani...@collabora.com> Cc: Alex Deucher <alexander.deuc...@amd.com> Cc: Christian König <christian.koe...@amd.com> Cc: David (ChunMing) Zhou <david1.z...@amd.com> Cc: amd-gfx@lists.freedesktop.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 6 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 36 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c| 10 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h | 1 - drivers/gpu/drm/amd/amdgpu/dce_v10_0.c| 17 --- drivers/gpu/drm/amd/amdgpu/dce_v11_0.c| 17 --- drivers/gpu/drm/amd/amdgpu/dce_v6_0.c | 17 --- drivers/gpu/drm/amd/amdgpu/dce_v8_0.c | 17 --- drivers/gpu/drm/amd/amdgpu/dce_virtual.c | 4 +-- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 11 +++ 10 files changed, 40 insertions(+), 96 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 34af664b9f93..fbc0676c6bcc 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2539,7 +2539,7 @@ int amdgpu_device_suspend(struct drm_device *dev, bool suspend, bool fbcon) /* unpin the front buffers and cursors */ list_for_each_entry(crtc, >mode_config.crtc_list, head) { struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc); - struct amdgpu_framebuffer *rfb = to_amdgpu_framebuffer(crtc->primary->fb); + struct drm_framebuffer *fb = crtc->primary->fb; struct amdgpu_bo *robj; if (amdgpu_crtc->cursor_bo) { @@ -2551,10 +2551,10 @@ int amdgpu_device_suspend(struct drm_device *dev, bool suspend, bool fbcon) } } - if (rfb == NULL || rfb->obj == NULL) { + if (fb == NULL || fb->obj[0] == NULL) { continue; } - robj = gem_to_amdgpu_bo(rfb->obj); + robj = gem_to_amdgpu_bo(fb->obj[0]); /* don't unpin kernel fb objects */ if (!amdgpu_fbdev_robj_is_fb(adev, robj)) { r = amdgpu_bo_reserve(robj, true); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c index 93f700ab1bfb..b83ae998fe27 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c @@ -35,6 +35,7 @@ #include #include #include +#include #include static void amdgpu_display_flip_callback(struct dma_fence *f, @@ -151,8 +152,6 @@ int amdgpu_display_crtc_page_flip_target(struct drm_crtc *crtc, struct drm_device *dev = crtc->dev; struct amdgpu_device *adev = dev->dev_private; struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc); - struct amdgpu_framebuffer *old_amdgpu_fb; - struct amdgpu_framebuffer *new_amdgpu_fb; struct drm_gem_object *obj; struct amdgpu_flip_work *work; struct amdgpu_bo *new_abo; @@ -174,15 +173,13 @@ int amdgpu_display_crtc_page_flip_target(struct drm_crtc *crtc, work->async = (page_flip_flags & DRM_MODE_PAGE_FLIP_ASYNC) != 0; /* schedule unpin of the old buffer */ - old_amdgpu_fb = to_amdgpu_framebuffer(crtc->primary->fb); - obj = old_amdgpu_fb->obj; + obj = crtc->primary->fb->obj[0]; /* take a reference to the old object */ work->old_abo = gem_to_amdgpu_bo(obj); amdgpu_bo_ref(work->old_abo); - new_amdgpu_fb = to_amdgpu_framebuffer(fb); - obj = new_amdgpu_fb->obj; + obj = fb->obj[0]; new_abo = gem_to_amdgpu_bo(obj); /* pin the new buffer */ @@ -482,28 +479,9 @@ bool amdgpu_display_ddc_probe(struct amdgpu_connector *amdgpu_connector, return true; } -static void amdgpu_display_user_framebuffer_destroy(struct drm_framebuffer *fb) -{ - struct amdgpu_framebuffer *amdgpu_fb = to_amdgpu_framebuffer(fb); - - drm_gem_object_put_unlocked(amdgpu_fb->obj); - drm_framebuffer_cleanup(fb); - kfree(amdgpu_fb); -} - -static int amdgpu_display_user_framebuffer_create_handle( - struct drm_framebuffer *fb, - struct drm_file *file_priv, - unsigned int *handle) -{ - struct amdgpu_framebuffer *amdgpu_fb = to_amdgpu_framebuffer(fb); - - return drm_gem_handle_create(file_priv, amdgpu_fb->obj, handle); -} - static const struct drm_framebuffer_funcs
[PATCH 22/24] drm/radeon: Move GEM BO to drm_framebuffer
Since drm_framebuffer can now store GEM objects directly, place them there rather than in our own subclass. As this makes the framebuffer create_handle and destroy functions the same as the GEM framebuffer helper, we can reuse those. Signed-off-by: Daniel Stone <dani...@collabora.com> Cc: Alex Deucher <alexander.deuc...@amd.com> Cc: Christian König <christian.koe...@amd.com> Cc: David (ChunMing) Zhou <david1.z...@amd.com> Cc: amd-gfx@lists.freedesktop.org --- drivers/gpu/drm/radeon/atombios_crtc.c | 10 +- drivers/gpu/drm/radeon/radeon_device.c | 4 ++-- drivers/gpu/drm/radeon/radeon_display.c | 31 +++-- drivers/gpu/drm/radeon/radeon_fb.c | 8 drivers/gpu/drm/radeon/radeon_legacy_crtc.c | 11 -- drivers/gpu/drm/radeon/radeon_mode.h| 1 - 6 files changed, 22 insertions(+), 43 deletions(-) diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c b/drivers/gpu/drm/radeon/atombios_crtc.c index 02baaaf20e9d..028a811c1462 100644 --- a/drivers/gpu/drm/radeon/atombios_crtc.c +++ b/drivers/gpu/drm/radeon/atombios_crtc.c @@ -1176,7 +1176,7 @@ static int dce4_crtc_do_set_base(struct drm_crtc *crtc, /* If atomic, assume fb object is pinned & idle & fenced and * just update base pointers */ - obj = radeon_fb->obj; + obj = radeon_fb->base.obj[0]; rbo = gem_to_radeon_bo(obj); r = radeon_bo_reserve(rbo, false); if (unlikely(r != 0)) @@ -1442,7 +1442,7 @@ static int dce4_crtc_do_set_base(struct drm_crtc *crtc, if (!atomic && fb && fb != crtc->primary->fb) { radeon_fb = to_radeon_framebuffer(fb); - rbo = gem_to_radeon_bo(radeon_fb->obj); + rbo = gem_to_radeon_bo(radeon_fb->base.obj[0]); r = radeon_bo_reserve(rbo, false); if (unlikely(r != 0)) return r; @@ -1490,7 +1490,7 @@ static int avivo_crtc_do_set_base(struct drm_crtc *crtc, target_fb = crtc->primary->fb; } - obj = radeon_fb->obj; + obj = radeon_fb->base.obj[0]; rbo = gem_to_radeon_bo(obj); r = radeon_bo_reserve(rbo, false); if (unlikely(r != 0)) @@ -1642,7 +1642,7 @@ static int avivo_crtc_do_set_base(struct drm_crtc *crtc, if (!atomic && fb && fb != crtc->primary->fb) { radeon_fb = to_radeon_framebuffer(fb); - rbo = gem_to_radeon_bo(radeon_fb->obj); + rbo = gem_to_radeon_bo(radeon_fb->base.obj[0]); r = radeon_bo_reserve(rbo, false); if (unlikely(r != 0)) return r; @@ -2153,7 +2153,7 @@ static void atombios_crtc_disable(struct drm_crtc *crtc) struct radeon_bo *rbo; radeon_fb = to_radeon_framebuffer(crtc->primary->fb); - rbo = gem_to_radeon_bo(radeon_fb->obj); + rbo = gem_to_radeon_bo(radeon_fb->base.obj[0]); r = radeon_bo_reserve(rbo, false); if (unlikely(r)) DRM_ERROR("failed to reserve rbo before unpin\n"); diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index e415d2c097a7..30c5bc20a60b 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -1599,10 +1599,10 @@ int radeon_suspend_kms(struct drm_device *dev, bool suspend, } } - if (rfb == NULL || rfb->obj == NULL) { + if (rfb == NULL || rfb->base.obj[0] == NULL) { continue; } - robj = gem_to_radeon_bo(rfb->obj); + robj = gem_to_radeon_bo(rfb->base.obj[0]); /* don't unpin kernel fb objects */ if (!radeon_fbdev_robj_is_fb(rdev, robj)) { r = radeon_bo_reserve(robj, false); diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c index 26129b2b082d..dc300128283d 100644 --- a/drivers/gpu/drm/radeon/radeon_display.c +++ b/drivers/gpu/drm/radeon/radeon_display.c @@ -32,6 +32,7 @@ #include #include +#include #include #include #include @@ -502,14 +503,14 @@ static int radeon_crtc_page_flip_target(struct drm_crtc *crtc, /* schedule unpin of the old buffer */ old_radeon_fb = to_radeon_framebuffer(crtc->primary->fb); - obj = old_radeon_fb->obj; + obj = old_radeon_fb->base.obj[0]; /* take a reference to the old object */ drm_gem_object_get(obj); work->old_rbo = gem_to_radeon_bo(obj); new_radeon_fb = to_radeon_framebuffer(fb); - obj = new_radeon_fb->obj; + obj = new_radeon_fb->base.obj[0]; new_
[PATCH 23/24] drm/radeon: radeon_framebuffer -> drm_framebuffer
Since drm_framebuffer can now store GEM objects directly, place them there rather than in our own subclass. As this makes the framebuffer create_handle and destroy functions the same as the GEM framebuffer helper, we can reuse those. Signed-off-by: Daniel Stone <dani...@collabora.com> Cc: Alex Deucher <alexander.deuc...@amd.com> Cc: Christian König <christian.koe...@amd.com> Cc: David (ChunMing) Zhou <david1.z...@amd.com> Cc: amd-gfx@lists.freedesktop.org --- drivers/gpu/drm/radeon/atombios_crtc.c | 32 - drivers/gpu/drm/radeon/radeon_device.c | 6 +++--- drivers/gpu/drm/radeon/radeon_display.c | 30 --- drivers/gpu/drm/radeon/radeon_fb.c | 20 +- drivers/gpu/drm/radeon/radeon_legacy_crtc.c | 11 +++--- drivers/gpu/drm/radeon/radeon_mode.h| 7 +-- 6 files changed, 39 insertions(+), 67 deletions(-) diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c b/drivers/gpu/drm/radeon/atombios_crtc.c index 028a811c1462..efbd5816082d 100644 --- a/drivers/gpu/drm/radeon/atombios_crtc.c +++ b/drivers/gpu/drm/radeon/atombios_crtc.c @@ -1145,7 +1145,6 @@ static int dce4_crtc_do_set_base(struct drm_crtc *crtc, struct radeon_crtc *radeon_crtc = to_radeon_crtc(crtc); struct drm_device *dev = crtc->dev; struct radeon_device *rdev = dev->dev_private; - struct radeon_framebuffer *radeon_fb; struct drm_framebuffer *target_fb; struct drm_gem_object *obj; struct radeon_bo *rbo; @@ -1164,19 +1163,15 @@ static int dce4_crtc_do_set_base(struct drm_crtc *crtc, return 0; } - if (atomic) { - radeon_fb = to_radeon_framebuffer(fb); + if (atomic) target_fb = fb; - } - else { - radeon_fb = to_radeon_framebuffer(crtc->primary->fb); + else target_fb = crtc->primary->fb; - } /* If atomic, assume fb object is pinned & idle & fenced and * just update base pointers */ - obj = radeon_fb->base.obj[0]; + obj = target_fb->obj[0]; rbo = gem_to_radeon_bo(obj); r = radeon_bo_reserve(rbo, false); if (unlikely(r != 0)) @@ -1441,8 +1436,7 @@ static int dce4_crtc_do_set_base(struct drm_crtc *crtc, WREG32(EVERGREEN_MASTER_UPDATE_MODE + radeon_crtc->crtc_offset, 0); if (!atomic && fb && fb != crtc->primary->fb) { - radeon_fb = to_radeon_framebuffer(fb); - rbo = gem_to_radeon_bo(radeon_fb->base.obj[0]); + rbo = gem_to_radeon_bo(fb->obj[0]); r = radeon_bo_reserve(rbo, false); if (unlikely(r != 0)) return r; @@ -1463,7 +1457,6 @@ static int avivo_crtc_do_set_base(struct drm_crtc *crtc, struct radeon_crtc *radeon_crtc = to_radeon_crtc(crtc); struct drm_device *dev = crtc->dev; struct radeon_device *rdev = dev->dev_private; - struct radeon_framebuffer *radeon_fb; struct drm_gem_object *obj; struct radeon_bo *rbo; struct drm_framebuffer *target_fb; @@ -1481,16 +1474,12 @@ static int avivo_crtc_do_set_base(struct drm_crtc *crtc, return 0; } - if (atomic) { - radeon_fb = to_radeon_framebuffer(fb); + if (atomic) target_fb = fb; - } - else { - radeon_fb = to_radeon_framebuffer(crtc->primary->fb); + else target_fb = crtc->primary->fb; - } - obj = radeon_fb->base.obj[0]; + obj = target_fb->obj[0]; rbo = gem_to_radeon_bo(obj); r = radeon_bo_reserve(rbo, false); if (unlikely(r != 0)) @@ -1641,8 +1630,7 @@ static int avivo_crtc_do_set_base(struct drm_crtc *crtc, WREG32(AVIVO_D1MODE_MASTER_UPDATE_MODE + radeon_crtc->crtc_offset, 3); if (!atomic && fb && fb != crtc->primary->fb) { - radeon_fb = to_radeon_framebuffer(fb); - rbo = gem_to_radeon_bo(radeon_fb->base.obj[0]); + rbo = gem_to_radeon_bo(fb->obj[0]); r = radeon_bo_reserve(rbo, false); if (unlikely(r != 0)) return r; @@ -2149,11 +2137,9 @@ static void atombios_crtc_disable(struct drm_crtc *crtc) atombios_crtc_dpms(crtc, DRM_MODE_DPMS_OFF); if (crtc->primary->fb) { int r; - struct radeon_framebuffer *radeon_fb; struct radeon_bo *rbo; - radeon_fb = to_radeon_framebuffer(crtc->primary->fb); - rbo = gem_to_radeon_bo(radeon_fb->base.obj[0]); + rbo = gem_to_radeon_bo(crtc->primary->fb->obj[0]); r = radeon_bo_reserve(rbo, false); if
[PATCH 00/24] drm_framebuffer boilerplate removal
Hi, I've been working on a getfb2[0] ioctl, which amongst other things supports multi-planar framebuffers as well as modifiers. getfb currently calls the framebuffer's handle_create hook, which doesn't support multiple planes. Thanks to Noralf's recent work, drivers can just store GEM objects directly in drm_framebuffer. I use this directly in getfb2: we need direct access to the GEM objects and not a vfunc in order to not hand out duplicate GEM names for the same object. This series converts all drivers except for nouveau, which was a little too non-trivial for my comfort, to storing GEM objects directly in drm_framebuffer. For those drivers whose driver_framebuffer struct was nothing but drm_framebuffer + BO, it deletes the driver-specific struct. It also makes use of Noralf's generic framebuffer helpers for create_handle and destroy where possible. I don't have the hardware for most of these drivers, so have had to settle for just staring really hard at the diff. I intend to remove create_handle when all drivers are converted over to placing BOs directly inside drm_framebuffer. For most drivers there's a relatively easy conversion to using the helpers for basically all framebuffer handling and fbdev emulation as well, though that's a bit further than I was willing to go without hardware to test on ... Cheers, Daniel [0]: https://lists.freedesktop.org/archives/dri-devel/2018-March/170512.html ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 0/6] drm: tackle byteorder issues, take two
Hi, On 24 April 2017 at 15:26, Ville Syrjäläwrote: > On Mon, Apr 24, 2017 at 04:54:25PM +0900, Michel Dänzer wrote: >> On 24/04/17 04:36 PM, Gerd Hoffmann wrote: >> > When running in opengl mode there will be a hardware-specific mesa >> > driver in userspace, which will either know what the gpu view is (for >> > example because there is only one way to implement this in hardware) or >> > it can use hardware-specific ioctls to ask the kernel driver what the >> > gpu view is. >> >> Not sure this can be hidden in the OpenGL driver. How would e.g. a >> Wayland compositor or the Xorg modesetting driver know which OpenGL >> format corresponds to a given DRM_FORMAT? > > How are GL formats defined? /me needs to go read the spec again. They aren't, per se. Only relative to 'native formats', which for this discussion is the set of GBM formats, which is in turn just drm_fourcc.h. Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [RFC] Using DC in amdgpu for upcoming GPU
Hi Harry, I've been loathe to jump in here, not least because both cop roles seem to be taken, but ... On 13 December 2016 at 01:49, Harry Wentlandwrote: > On 2016-12-11 09:57 PM, Dave Airlie wrote: >> On 8 December 2016 at 12:02, Harry Wentland wrote: >> Sharing code is a laudable goal and I appreciate the resourcing >> constraints that led us to the point at which we find ourselves, but >> the way forward involves finding resources to upstream this code, >> dedicated people (even one person) who can spend time on a day by day >> basis talking to people in the open and working upstream, improving >> other pieces of the drm as they go, reading atomic patches and >> reviewing them, and can incrementally build the DC experience on top >> of the Linux kernel infrastructure. Then having the corresponding >> changes in the DC codebase happen internally to correspond to how the >> kernel code ends up looking. Lots of this code overlaps with stuff the >> drm already does, lots of is stuff the drm should be doing, so patches >> to the drm should be sent instead. > > Personally I'm with you on this and hope to get us there. I'm learning... > we're learning. I agree that changes on atomic, removing abstractions, etc. > should happen on dri-devel. > > When it comes to brand-new technologies (MST, Freesync), though, we're often > the first which means that we're spending a considerable amount of time to > get things right, working with HW teams, receiver vendors and other partners > internal and external to AMD. By the time we do get it right it's time to > hit the market. This gives us fairly little leeway to work with the > community on patches that won't land in distros for another half a year. > We're definitely hoping to improve some of this but it's not easy and in > some case impossible ahead of time (though definitely possibly after initial > release). Speaking with my Wayland hat on, I think these need to be very carefully considered. Both MST and FreeSync have _significant_ UABI implications, which may not be immediately obvious when working with a single implementation. Having them working and validated with a vertical stack of amdgpu-DC/KMS + xf86-video-amdgpu + Mesa-amdgpu/AMDGPU-Pro is one thing, but looking outside the X11 world we now have Weston, Mutter and KWin all directly driving KMS, plus whatever Mir/Unity ends up doing (presumably the same), and that's just on the desktop. Beyond the desktop, there's also CrOS/Freon and Android/HWC. For better or worse, outside of Xorg and HWC, we no longer have a vendor-provided userspace component driving KMS. It was also easy to get away with loose semantics before with X11 imposing little to no structure on rendering, but we now have the twin requirements of an atomic and timing-precise ABI - see Mario Kleiner's unending quest for accuracy - and also a vendor-independent ABI. So a good part of the (not insignificant) pain incurred in the atomic transition for drivers, was in fact making those drivers conform to the expectations of the KMS UABI contract, which just happened to not have been tripped over previously. Speaking with my Collabora hat on now: we did do a substantial amount of demidlayering on the Exynos driver, including an atomic conversion, on Google's behalf. The original Exynos driver happened to work with the Tizen stack, but ChromeOS exposed a huge amount of subtle behaviour differences between that and other drivers when using Freon. We'd also hit the same issues when attempting to use Weston on Exynos in embedded devices for OEMs we worked with, so took on the project to remove the midlayer and have as much as possible driven from generic code. How the hardware is programmed is of course ultimately up to you, and in this regard AMD will be very different from Intel is very different from Nouveau is very different from Rockchip. But especially for new features like FreeSync, I think we need to be very conscious of walking the line between getting those features in early, and setting unworkable UABI in stone. It would be unfortunate if later on down the line, you had to choose between breaking older xf86-video-amdgpu userspace which depended on specific behaviours of the amdgpu kernel driver, or breaking the expectations of generic userspace such as Weston/Mutter/etc. One good way to make sure you don't get into that position, is to have core KMS code driving as much of the machinery as possible, with a very clear separation of concerns between actual hardware programming, versus things which may be visible to userspace. This I think is DanielV's point expressed at much greater length. ;) I should be clear though that this isn't unique to AMD, nor a problem of your creation. For example, I'm currently looking at a flip-timing issue in Rockchip - a fairly small, recent, atomic-native, and generally exemplary driver - which I'm pretty sure is going to be resolved by deleting more driver
Re: [PATCH 0/6] drm: Explicit target vblank seqno for page flips
On 4 August 2016 at 11:01, Michel Dänzer <mic...@daenzer.net> wrote: > On 04.08.2016 18:51, Daniel Stone wrote: >> On 4 August 2016 at 04:39, Michel Dänzer <mic...@daenzer.net> wrote: >>> Patch 6 extends the ioctl with new flags, which allow userspace to >>> explicitly specify the target vblank seqno. This can also avoid delaying >>> flips in some cases where we are already in the target vertical blank >>> period when the ioctl is called. >> >> Is there open userspace for this? > > Sure, referenced in patch 6: > > https://cgit.freedesktop.org/~daenzer/xf86-video-ati/commit/?id=fc884a8af25345c32bd4104c864ecfeb9bb3db9b > > https://cgit.freedesktop.org/~daenzer/xf86-video-amdgpu/commit/?id=b8631a9ba49c0d0ebe5dcd1dbfb68fcfe907296f > > >> What's the behaviour vs. modeset: does the modeset request block until >> the last-requested flip is complete? If so, is there some kind of upper >> bound on the number of blank periods to wait for? > > Did you read the patch? :) Nope, for some reason my mailer decided it didn't like it, but I did find it in the archives. > The only change compared to the existing ioctl is that userspace can ask > for a flip to take effect in the current vblank seqno. The code added by > the patch checks for target vblank seqno > current vblank seqno + 1 and > returns -EINVAL in that case. This is also documented in drm_mode.h. Is there any particular benefit to having split absolute/relative modes in this case? Personally I'm struggling to see the use of relative. >> Is all this tested somewhere? > > Yes, I've been using it for a while on all my machines. I mean in a test suite. :) Cheers, Daniel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx