On Thu, Jan 25, 2024 at 8:57 AM Jose Fonseca <jose.fons...@broadcom.com> wrote:

> > So far, we've been trying to build those components in terms of the
> > Vulkan API itself with calls jumping back into the dispatch table to try
> > and get inside the driver. This is working but it's getting more and more
> > fragile the more tools we add to that box. A lot of what I want to do with
> > gallium2 or whatever we're calling it is to fix our layering problems so
> > that calls go in one direction and we can untangle the jumble. I'm still
> > not sure what I want that to look like but I think I want it to look a lot
> > like Vulkan, just with a handier interface.
>
> That resonates with my experience. For example, Gallium draw module does
> some of this too -- it provides its own internal interfaces for drivers,
> but it also loops back into Gallium top interface to set FS and rasterizer
> state -- and that has *always* been a source of grief. Having control
> flow proceeding through layers in one direction only seems an important
> principle to observe. It's fine if the lower interface is the same
> interface (e.g., Gallium to Gallium, or Vulkan to Vulkan as you allude),
> but they shouldn't be the same exact entry-points/modules (i.e., no
> reentrancy/recursion.)
>
> It's also worth considering that Vulkan extensibility could come in handy
> too in what you want to achieve. For example, Mesa Vulkan drivers could
> have their own VK_MESA_internal_xxxx extensions that could be used by the
> shared Vulkan code to do lower level things.

We already do that for a handful of things. The fact that Vulkan doesn't
ever check the stuff in the pNext chain is really useful for that. 😅

~Faith

> Jose
>
>
> On Wed, Jan 24, 2024 at 3:26 PM Faith Ekstrand <fa...@gfxstrand.net>
> wrote:
>
>> Jose,
>>
>> Thanks for your thoughts!
>>
>> On Wed, Jan 24, 2024 at 4:30 AM Jose Fonseca <jose.fons...@broadcom.com>
>> wrote:
>> >
>> > I don't know much about the current Vulkan driver internals to have or
>> > provide an informed opinion on the path forward, but I'd like to share my
>> > backwards looking perspective.
>> >
>> > Looking back, Gallium was two things effectively:
>> > (1) an abstraction layer, that's watertight (as in upper layers
>> > shouldn't reach through to lower layers)
>> > (2) an ecosystem of reusable components (draw, util, tgsi, etc.)
>> >
>> > (1) was of course important -- and the discipline it imposed is what
>> > enabled great simplifications -- but it also became a straight-jacket,
>> > as GPUs didn't stand still, and sooner or later the
>> > see-every-hardware-as-the-same lenses stop reflecting reality.
>> >
>> > If I had to pick one, I'd say that (2) is far more useful and
>> > practical. Take components like gallium's draw and other util modules. A
>> > driver can choose to use them or not. One could fork them within Mesa
>> > source tree, and only the drivers that opt-in into the fork would need to
>> > be tested/adapted/etc.
>> >
>> > On the flip side, Vulkan API is already a pretty low level HW
>> > abstraction. It's also very flexible and extensible, so it's hard to
>> > provide a watertight abstraction underneath it without either taking the
>> > lowest common denominator, or having lots of optional bits of functionality
>> > governed by a myriad of caps like you alluded to.
>>
>> There is a third thing that isn't really recognized in your description:
>>
>> (3) A common "language" to talk about GPUs and data structures that
>> represent that language
>>
>> This is precisely what the Vulkan runtime today doesn't have. Classic
>> meta sucked because we were trying to implement GL in GL. u_blitter,
>> on the other hand, is pretty fantastic because Gallium provides a much
>> more sane interface to write those common components in terms of.
>>
>> So far, we've been trying to build those components in terms of the
>> Vulkan API itself with calls jumping back into the dispatch table to
>> try and get inside the driver. This is working but it's getting more
>> and more fragile the more tools we add to that box. A lot of what I
>> want to do with gallium2 or whatever we're calling it is to fix our
>> layering problems so that calls go in one direction and we can
>> untangle the jumble. I'm still not sure what I want that to look like
>> but I think I want it to look a lot like Vulkan, just with a handier
>> interface.
>>
>> ~Faith
>>
>> > Not sure how useful this is in practice to you, but the lesson from my
>> > POV is that opt-in reusable and shared libraries are always time well spent
>> > as they can bend and adapt with the times, whereas no opt-out watertight
>> > abstractions inherently have a shelf life.
>> >
>> > Jose
>> >
>> > On Fri, Jan 19, 2024 at 5:30 PM Faith Ekstrand <fa...@gfxstrand.net>
>> > wrote:
>> >>
>> >> Yeah, this one's gonna hit Phoronix...
>> >>
>> >> When we started writing Vulkan drivers back in the day, there was this
>> >> notion that Vulkan was a low-level API that directly targets hardware.
>> >> Vulkan drivers were these super thin things that just blasted packets
>> >> straight into the hardware. What little code was common was small and
>> >> pretty easy to just copy+paste around. It was a nice thought...
>> >>
>> >> What's happened in the intervening 8 years is that Vulkan has grown. A
>> >> lot.
>> >>
>> >> We already have several places where we're doing significant layering.
>> >> It started with sharing the WSI code and some Python for generating
>> >> dispatch tables. Later we added common synchronization code and a few
>> >> vkFoo2 wrappers. Then render passes and...
>> >>
>> >> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024
>> >>
>> >> That's been my project the last couple weeks: A common VkPipeline
>> >> implementation built on top of an ESO-like interface. The big
>> >> deviation this MR makes from prior art is that I make no attempt at
>> >> pretending it's a layered implementation. The vtable for shader
>> >> objects looks like ESO but takes its own path when it's useful to do
>> >> so. For instance, shader creation always consumes NIR and a handful of
>> >> lowering passes are run for you. It's no st_glsl_to_nir but it is a
>> >> bit opinionated. Also, a few of the bits that are missing from ESO
>> >> such as robustness have been added to the interface.
>> >>
>> >> This marks a pretty fundamental shift in how the Vulkan runtime
>> >> works, at least in my mind. Previously, everything was
>> >> designed to be a toolbox where you can kind of pick and choose what
>> >> you want to use. Also, everything at least tried to act like a layer
>> >> where you still implemented Vulkan but you could leave out bits like
>> >> render passes if you implemented the new thing and were okay with the
>> >> layer. With the ESO code, you implement something that isn't Vulkan
>> >> entrypoints and the actual entrypoints live in the runtime. This lets
>> >> us expand and adjust the interface as needed for our purposes as well
>> >> as sanitize certain things even in the modern API.
>> >>
>> >> The result is that NVK is starting to feel like a gallium driver. 🙃
>> >>
>> >> So here's the question: do we like this? Do we want to push in this
>> >> direction? Should we start making more things work more this way? I'm
>> >> not looking for MRs just yet nor do I have more reworks directly
>> >> planned. I'm more looking for thoughts and opinions as to how the
>> >> various Vulkan driver teams feel about this. We'll leave the detailed
>> >> planning for the Mesa issue tracker.
>> >>
>> >> It's worth noting that, even though I said we've tried to keep things
>> >> layerish, there are other parts of the runtime that look like this.
>> >> The synchronization code is a good example. The vk_sync interface is
>> >> pretty significantly different from the Vulkan objects it's used to
>> >> implement. That's worked out pretty well, IMO. With something as
>> >> complicated as pipelines or synchronization, trying to keep the
>> >> illusion of a layer just isn't practical.
>> >>
>> >> So, do we like this? Should we be pushing more towards drivers being a
>> >> backend of the runtime instead of a user of it?
>> >>
>> >> Now, before anyone asks, no, I don't really want to build a multi-API
>> >> abstraction with a Vulkan state tracker. If we were doing this 5 years
>> >> ago and Zink didn't already exist, one might be able to make an
>> >> argument for pushing in that direction. However, that would add a huge
>> >> amount of weight to the project and make it even harder to develop the
>> >> runtime than it already is and for little benefit at this point.
>> >>
>> >> Here are a few other constraints on what I'm thinking:
>> >>
>> >> 1. I want it to still be possible for drivers to implement an
>> >> extension without piles of runtime plumbing or even bypass the runtime
>> >> on occasion as needed.
>> >>
>> >> 2. I don't want to recreate the gallium cap disaster; drivers should
>> >> know exactly what they're advertising. We may want to have some
>> >> internal features or properties that are used by the runtime to make
>> >> decisions but they'll be in addition to the features and properties in
>> >> Vulkan.
>> >>
>> >> 3. We've got some meta stuff already but we probably want more.
>> >> However, I don't want to force meta on folks who don't want it.
>> >>
>> >> The big thing here is that if we do this, I'm going to need help.
>> >> I'm happy to do a lot of the architectural work but drivers are going to
>> >> have to keep up with the changes and I can't take on the burden of
>> >> moving 8 different drivers forward. I can answer questions and maybe
>> >> help out a bit but the refactoring is going to be too much for one
>> >> person, even if that person is me.
>> >>
>> >> Thoughts?
>> >>
>> >> ~Faith
>> >
>> >
>> > This electronic communication and the information and any files
>> > transmitted with it, or attached to it, are confidential and are intended
>> > solely for the use of the individual or entity to whom it is addressed and
>> > may contain information that is confidential, legally privileged, protected
>> > by privacy laws, or otherwise restricted from disclosure to anyone else. If
>> > you are not the intended recipient or the person responsible for delivering
>> > the e-mail to the intended recipient, you are hereby notified that any use,
>> > copying, distributing, dissemination, forwarding, printing, or copying of
>> > this e-mail is strictly prohibited. If you received this e-mail in error,
>> > please return the e-mail to the sender, delete it from your computer, and
>> > destroy any printed copy of it.