Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem

2020-03-18 Thread Lucas Stach
On Tuesday, 17.03.2020 at 11:33 -0400, Nicolas Dufresne wrote:
> On Monday, 16 March 2020 at 23:15 +0200, Laurent Pinchart wrote:
> > Hi Jason,
> > 
> > On Mon, Mar 16, 2020 at 10:06:07AM -0500, Jason Ekstrand wrote:
> > > On Mon, Mar 16, 2020 at 5:20 AM Laurent Pinchart wrote:
> > > > On Wed, Mar 11, 2020 at 04:18:55PM -0400, Nicolas Dufresne wrote:
> > > > > (I know I'm going to be spammed by so many mailing list ...)
> > > > > 
> > > > > On Wednesday, 11 March 2020 at 14:21 -0500, Jason Ekstrand wrote:
> > > > > > On Wed, Mar 11, 2020 at 12:31 PM Jason Ekstrand wrote:
> > > > > > > All,
> > > > > > > 
> > > > > > > Sorry for casting such a broad net with this one. I'm sure most people
> > > > > > > who reply will get at least one mailing list rejection.  However, this
> > > > > > > is an issue that affects a LOT of components and that's why it's thorny
> > > > > > > to begin with.  Please pardon the length of this e-mail as well; I
> > > > > > > promise there's a concrete point/proposal at the end.
> > > > > > > 
> > > > > > > 
> > > > > > > Explicit synchronization is the future of graphics and media.  At
> > > > > > > least, that seems to be the consensus among all the graphics people
> > > > > > > I've talked to.  I had a chat with one of the lead Android graphics
> > > > > > > engineers recently who told me that doing explicit sync from the start
> > > > > > > was one of the best engineering decisions Android ever made.  It's also
> > > > > > > the direction being taken by more modern APIs such as Vulkan.
> > > > > > > 
> > > > > > > 
> > > > > > > ## What are implicit and explicit synchronization?
> > > > > > > 
> > > > > > > For those that aren't familiar with this space, GPUs, media encoders,
> > > > > > > etc. are massively parallel and synchronization of some form is
> > > > > > > required to ensure that everything happens in the right order and avoid
> > > > > > > data races.  Implicit synchronization is when bits of work (3D,
> > > > > > > compute, video encode, etc.) are implicitly based on the absolute
> > > > > > > CPU-time order in which API calls occur.  Explicit synchronization is
> > > > > > > when the client (whatever that means in any given context) provides the
> > > > > > > dependency graph explicitly via some sort of synchronization
> > > > > > > primitives.  If you're still confused, consider the following examples:
> > > > > > > 
> > > > > > > With OpenGL and EGL, almost everything is implicit sync.  Say you have
> > > > > > > two OpenGL contexts sharing an image where one writes to it and the
> > > > > > > other textures from it.  The way the OpenGL spec works, the client has
> > > > > > > to make the API calls to render to the image before (in CPU time) it
> > > > > > > makes the API calls which texture from the image.  As long as it does
> > > > > > > this (and maybe inserts a glFlush?), the driver will ensure that the
> > > > > > > rendering completes before the texturing happens and you get correct
> > > > > > > contents.
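
In code, the implicit-sync pattern described above looks roughly like the
sketch below (a sketch only; the display, the two contexts/surfaces and the
shared texture are assumed to have been set up elsewhere, e.g. via shared
contexts or an EGLImage):

#include <EGL/egl.h>
#include <GLES2/gl2.h>

/* Sketch only: dpy, the writer/reader contexts and shared_tex are assumed
 * to exist already. */
void render_then_sample(EGLDisplay dpy,
                        EGLSurface writer_surf, EGLContext writer_ctx,
                        EGLSurface reader_surf, EGLContext reader_ctx,
                        GLuint shared_tex)
{
    /* Producer: the rendering API calls are issued first (in CPU time). */
    eglMakeCurrent(dpy, writer_surf, writer_surf, writer_ctx);
    /* ... draw into the shared image ... */
    glFlush();  /* make sure the rendering commands are queued */

    /* Consumer: its API calls come later; the driver orders the GPU work. */
    eglMakeCurrent(dpy, reader_surf, reader_surf, reader_ctx);
    glBindTexture(GL_TEXTURE_2D, shared_tex);
    /* ... draw using shared_tex; implicit sync makes the contents correct ... */
}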
> > > > > > > 
> > > > > > > Implicit synchronization can also happen across processes.  Wayland,
> > > > > > > for instance, is currently built on implicit sync where the client does
> > > > > > > their rendering and then does a hand-off (via wl_surface::commit) to
> > > > > > > tell the compositor it's done at which point the compositor can now
> > > > > > > texture from the surface.  The hand-off ensures that the client's
> > > > > > > OpenGL API calls happen before the server's OpenGL API calls.
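
The hand-off point in a Wayland client is the commit; a minimal sketch
(surface and buffer are assumed to exist, and the client's rendering into
buffer is assumed to be finished or at least flushed):

#include <stdint.h>
#include <wayland-client.h>

void hand_off_frame(struct wl_surface *surface, struct wl_buffer *buffer,
                    int32_t width, int32_t height)
{
    wl_surface_attach(surface, buffer, 0, 0);
    wl_surface_damage(surface, 0, 0, width, height);
    /* Only after this commit may the compositor texture from the buffer. */
    wl_surface_commit(surface);
}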
> > > > > > > 
> > > > > > > A good example of explicit synchronization is the Vulkan API.  There,
> > > > > > > a client (or multiple clients) can simultaneously build command buffers
> > > > > > > in different threads where one of those command buffers renders to an
> > > > > > > image and the other textures from it and then submit both of them at
> > > > > > > the same time with instructions to the driver for which order to
> > > > > > > execute them in.  The execution order is described via the VkSemaphore
> > > > > > > primitive.  With the new VK_KHR_timeline_semaphore extension, you can
> > > > > > > even submit the work which does the texturing BEFORE the work which
> > > > > > > does the rendering and the driver will sort it out.
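
A minimal sketch of the explicit variant using a binary VkSemaphore (the
queue, the two command buffers and the semaphore are assumed to have been
created elsewhere):

#include <vulkan/vulkan.h>

void submit_render_then_texture(VkQueue queue,
                                VkCommandBuffer render_cmd,
                                VkCommandBuffer texture_cmd,
                                VkSemaphore sem)
{
    VkPipelineStageFlags wait_stage = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT;

    VkSubmitInfo submits[2] = {
        {   /* produces the image and signals sem when done */
            .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
            .commandBufferCount = 1,
            .pCommandBuffers = &render_cmd,
            .signalSemaphoreCount = 1,
            .pSignalSemaphores = &sem,
        },
        {   /* samples the image; waits on sem before the fragment stage */
            .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
            .waitSemaphoreCount = 1,
            .pWaitSemaphores = &sem,
            .pWaitDstStageMask = &wait_stage,
            .commandBufferCount = 1,
            .pCommandBuffers = &texture_cmd,
        },
    };

    /* The dependency is carried by the semaphore, not by CPU-time order. */
    vkQueueSubmit(queue, 2, submits, VK_NULL_HANDLE);
}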
> > > > > > > 
> > > > > > > The #1 problem with implicit synchronization (which explicit solves) is
> > > > > > > that it leads to a lot of over-synchronization both in client space and
> > > > > > > in driver/device space.  The client has to synchronize a lot more
> > > > > > > because it has to ensure that the

Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem

2020-03-18 Thread Lucas Stach
On Tuesday, 17.03.2020 at 10:12 -0700, Jacob Lifshay wrote:
> One related issue with explicit sync using sync_file: combined
> CPUs/GPUs (where the CPU cores *are* the GPU cores) that do all the
> rendering in userspace (like llvmpipe, but for Vulkan and with extra
> instructions for GPU tasks) yet need to synchronize with other
> drivers/processes require some way to create an explicit
> fence/semaphore from userspace and signal it later. This seems to
> conflict with the requirement for a sync_file to complete in finite
> time, since the user process could be stopped or killed.
> 
> Any ideas?

Finite just means "not infinite". If you stop the process that's doing
part of the pipeline processing, you block the pipeline; you get to keep
the pieces in that case. That's one of the issues with implicit sync
that explicit sync may solve: a single client taking way too much time
to render something can block the whole pipeline up until the display
flip. With explicit sync the compositor can simply decide to reuse the
previous client buffer if the latest one isn't ready by some deadline.
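
As a sketch, such a deadline check can be done by polling the sync_file fd
that came with the latest client buffer (the fd and the timeout are
assumptions of this example, not part of any existing compositor):

#include <poll.h>

/* Returns 1 if the fence signalled within timeout_ms, 0 otherwise, in
 * which case the compositor would fall back to the previous buffer. */
static int fence_ready_before_deadline(int fence_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fence_fd, .events = POLLIN };
    return poll(&pfd, 1, timeout_ms) == 1 && (pfd.revents & POLLIN);
}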

With regard to the process getting killed: whatever your sync primitive
is, you need to make sure the fence gets signalled (possibly with an
error condition set) when you are not going to make progress anymore. So
whatever your means of creating the sync_fd from your software renderer
is, it needs to signal any outstanding fences on the sync_fd when the
fd is closed.

Regards,
Lucas



Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem

2020-03-18 Thread Lucas Stach
On Tuesday, 17.03.2020 at 10:59 -0700, Jacob Lifshay wrote:
> On Tue, Mar 17, 2020 at 10:21 AM Lucas Stach wrote:
> > On Tuesday, 17.03.2020 at 10:12 -0700, Jacob Lifshay wrote:
> > > One related issue with explicit sync using sync_file: combined
> > > CPUs/GPUs (where the CPU cores *are* the GPU cores) that do all the
> > > rendering in userspace (like llvmpipe, but for Vulkan and with extra
> > > instructions for GPU tasks) yet need to synchronize with other
> > > drivers/processes require some way to create an explicit
> > > fence/semaphore from userspace and signal it later. This seems to
> > > conflict with the requirement for a sync_file to complete in finite
> > > time, since the user process could be stopped or killed.
> > > 
> > > Any ideas?
> > 
> > Finite just means "not infinite". If you stop the process that's doing
> > part of the pipeline processing, you block the pipeline; you get to keep
> > the pieces in that case.
> 
> Seems reasonable.
> 
> > That's one of the issues with implicit sync
> > that explicit sync may solve: a single client taking way too much time
> > to render something can block the whole pipeline up until the display
> > flip. With explicit sync the compositor can simply decide to reuse the
> > previous client buffer if the latest one isn't ready by some deadline.
> > 
> > With regard to the process getting killed: whatever your sync primitive
> > is, you need to make sure the fence gets signalled (possibly with an
> > error condition set) when you are not going to make progress anymore. So
> > whatever your means of creating the sync_fd from your software renderer
> > is, it needs to signal any outstanding fences on the sync_fd when the
> > fd is closed.
> 
> I think I found a userspace-accessible way to create sync_files and
> dma_fences that would fulfill the requirements:
> https://github.com/torvalds/linux/blob/master/drivers/dma-buf/sw_sync.c
> 
> I'm just not sure if that's a good interface to use, since it appears
> to be designed only for debugging. Will have to check for additional
> requirements of signalling an error when the process that created the
> fence is killed.

Something like that can certainly be lifted for general use if it makes
sense. But then with a software renderer I don't really see how fences
help you at all. With a software renderer you know exactly when the
frame is finished and you can just defer pushing it over to the next
pipeline element until that time. You won't gain any parallelism by
using fences as the CPU is busy doing the rendering and will not run
other stuff concurrently, right?
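
For reference, a sketch of how the sw_sync debug interface mentioned above
can be driven from userspace (the ioctl ABI is copied here because there is
no UAPI header for it; the debugfs path assumes debugfs is mounted, and this
is a testing interface, not something intended for production use):

#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* ABI of drivers/dma-buf/sw_sync.c, repeated here for the example. */
struct sw_sync_create_fence_data {
    uint32_t value;     /* timeline point at which the fence signals */
    char     name[32];
    int32_t  fence;     /* out: sync_file fd */
};

#define SW_SYNC_IOC_MAGIC        'W'
#define SW_SYNC_IOC_CREATE_FENCE _IOWR(SW_SYNC_IOC_MAGIC, 0, \
                                       struct sw_sync_create_fence_data)
#define SW_SYNC_IOC_INC          _IOW(SW_SYNC_IOC_MAGIC, 1, uint32_t)

/* Create a timeline and a fence at point 1 on it, and return the fence fd;
 * the caller signals it later by advancing the timeline. Closing the
 * timeline fd signals any remaining fences with an error, which covers the
 * "process got killed" case discussed above. */
int create_userspace_fence(int *timeline_fd_out)
{
    int timeline_fd = open("/sys/kernel/debug/sync/sw_sync", O_RDWR);
    if (timeline_fd < 0)
        return -1;

    struct sw_sync_create_fence_data data;
    memset(&data, 0, sizeof(data));
    data.value = 1;
    strcpy(data.name, "sw-renderer-frame");
    if (ioctl(timeline_fd, SW_SYNC_IOC_CREATE_FENCE, &data)) {
        close(timeline_fd);
        return -1;
    }

    *timeline_fd_out = timeline_fd;
    return data.fence;  /* hand this sync_file fd to the consumer */
}

void signal_userspace_fence(int timeline_fd)
{
    uint32_t inc = 1;
    ioctl(timeline_fd, SW_SYNC_IOC_INC, &inc);  /* fence at point 1 signals */
}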

Regards,
Lucas



Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Lucas Stach
On Fri, 2020-02-28 at 10:47 +0100, Daniel Vetter wrote:
> On Fri, Feb 28, 2020 at 10:29 AM Erik Faye-Lund wrote:
> > On Fri, 2020-02-28 at 13:37 +1000, Dave Airlie wrote:
> > > On Fri, 28 Feb 2020 at 07:27, Daniel Vetter wrote:
> > > > Hi all,
> > > > 
> > > > You might have read the short take in the X.org board meeting minutes
> > > > already, here's the long version.
> > > > 
> > > > The good news: gitlab.fd.o has become very popular with our
> > > > communities, and is used extensively. This especially includes all the
> > > > CI integration. Modern development process and tooling, yay!
> > > > 
> > > > The bad news: The cost in growth has also been tremendous, and it's
> > > > breaking our bank account. With reasonable estimates for continued
> > > > growth we're expecting hosting expenses totalling 75k USD this year,
> > > > and 90k USD next year. With the current sponsors we've set up we can't
> > > > sustain that. We estimate that hosting expenses for gitlab.fd.o without
> > > > any of the CI features enabled would total 30k USD, which is within
> > > > X.org's ability to support through various sponsorships, mostly
> > > > through XDC.
> > > > 
> > > > Note that X.org no longer sponsors any CI runners itself; we've
> > > > stopped that. The huge additional expenses are all just in storing and
> > > > serving build artifacts and images to outside CI runners sponsored by
> > > > various companies. A related topic is that with the growth in fd.o
> > > > it's becoming infeasible to maintain it all on volunteer admin time.
> > > > X.org is therefore also looking for admin sponsorship, at least medium
> > > > term.
> > > > 
> > > > Assuming that we want cash flow reserves for one year of gitlab.fd.o
> > > > (without CI support) and a trimmed XDC, and assuming no sponsor
> > > > payment meanwhile, we'd have to cut CI services somewhere between May
> > > > and June this year. The board is of course working on acquiring
> > > > sponsors, but filling a shortfall of this magnitude is neither easy
> > > > nor quick work, and we therefore decided to give an early warning as
> > > > soon as possible. Any help in finding sponsors for fd.o is very much
> > > > appreciated.
> > > 
> > > a) Ouch.
> > > 
> > > b) we probably need to take a large step back here.
> > > 
> > 
> > I kinda agree, but maybe the step doesn't have to be *too* large?
> > 
> > I wonder if we could solve this by restructuring the project a bit. I'm
> > talking purely from a Mesa point of view here, so it might not solve the
> > full problem, but:
> > 
> > 1. It feels silly that we need to test changes to e.g. the i965 driver
> > on dragonboards. We only have a big "do not run CI at all" escape-hatch.
> > 
> > 2. A lot of us are working for a company that can probably pay for their
> > own needs in terms of CI. Perhaps moving some costs "up front" to the
> > company that needs it can make the future of CI more sustainable for
> > those who can't do this.
> > 
> > 3. I think we need a much more detailed break-down of the cost to make
> > educated changes. For instance, how expensive are Docker image
> > uploads/downloads (e.g. intermediary artifacts) compared to build logs
> > and final test-results? What kind of artifacts?
> 
> We have logs somewhere, but no one has yet got around to analyzing them.
> That will be quite a bit of work, since the cloud storage is totally
> disconnected from the gitlab front-end, so making the connection to which
> project or CI job caused what is going to require scripting. Volunteers
> are definitely very much welcome, I think.

It's very surprising to me that this kind of cost monitoring is treated
as an afterthought, especially since one of the main jobs of the X.Org
board is to keep spending under control and transparent.

Also, from all the conversations it's still unclear to me whether the
Google hosting costs are already over the sponsored credits (and thus
burning a hole into the X.org bank account right now) or whether this is
only going to happen at a later point in time.

Even with CI disabled it seems that the board estimates a cost of 30k USD
annually for the plain gitlab hosting. Is this covered by the credits
sponsored by Google? If not, why wasn't there a board vote on this
spending? All other spending seems to require pre-approval by the board.
Why wasn't the gitlab hosting cost discussed much earlier in the public
board meetings, especially if it's going to be such a big chunk of the
overall spending of the X.Org Foundation?

Regards,
Lucas



[PATCH:xf86-video-tegra] make TegraPlatformProbe actually work

2012-12-19 Thread Lucas Stach
This is mostly the same thing as omapdrm does. Makes tegra work on
X servers with "Revert xf86: Fix non-PCI configuration-less setups"
applied.

Signed-off-by: Lucas Stach d...@lynxeye.de
---
 src/driver.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/driver.c b/src/driver.c
index 1d14694..705c6c4 100644
--- a/src/driver.c
+++ b/src/driver.c
@@ -770,15 +770,13 @@ static Bool
 TegraPlatformProbe(DriverPtr driver, int entity_num, int flags,
                    struct xf86_platform_device *dev, intptr_t match_data)
 {
-    char *path = xf86_get_platform_device_attrib(dev, ODEV_ATTRIB_PATH);
+    char *busid = xf86_get_platform_device_attrib(dev, ODEV_ATTRIB_BUSID);
     ScrnInfoPtr scrn = NULL;
-    int scr_flags = 0;
-
-    if (flags & PLATFORM_PROBE_GPU_SCREEN)
-        scr_flags = XF86_ALLOCATE_GPU_SCREEN;
+    int fd;
 
-    if (TegraProbeHardware(path)) {
-        scrn = xf86AllocateScreen(driver, scr_flags);
+    fd = drmOpen(NULL, busid);
+    if (fd != -1) {
+        scrn = xf86AllocateScreen(driver, 0);
 
         xf86AddEntityToScreen(scrn, entity_num);
 
@@ -794,7 +792,9 @@ TegraPlatformProbe(DriverPtr driver, int entity_num, int flags,
         scrn->ValidMode = TegraValidMode;
 
         xf86DrvMsg(scrn->scrnIndex, X_INFO, "using %s\n",
-                   path ? path : "default device");
+                   busid ? busid : "default device");
+
+        drmClose(fd);
     }
 
     return scrn != NULL;
-- 
1.7.11.7


Re: NVIDIA Tegra DDX

2012-12-15 Thread Lucas Stach
On Monday, 10.12.2012 at 13:34 +0100, Thierry Reding wrote:
> On Mon, Dec 10, 2012 at 12:40:21PM +0100, Marc Dietrich wrote:
> > On Monday, 10 December 2012 at 12:01:26, Michal Suchanek wrote:
> > 
> > > Perhaps it could be named xf86-video-tegra-kms or something.
> > 
> > or opentegra (to make clear that this driver is not the closed source
> > one). I know this is a bit fanciless but it fits better IMHO.
> 
I actually like the name opentegra as it clearly states what's inside.

> I just remembered that back when I started working on Tegra DRM I wrote
> some small utilities to help with the reverse engineering and some test
> programs. At the time, some other projects were appearing and the trend
> seemed to be to reverse the syllables of the original to obtain a new
> name. So I went ahead and collectively named the utilities grate. I'll
> throw xf86-video-grate in as another potential candidate.
> 
Now that we've heard some candidates for the naming, can we please come
to a conclusion?

I would like to start extending the DDX in the next few days, and it
would be really nice to have the naming question out of the way by then.

Regards,
Lucas




Re: xf86-video-tegra or xf86-video-modesetting?

2012-11-26 Thread Lucas Stach
On Saturday, 24.11.2012 at 22:09 +0100, Thierry Reding wrote:
> Hi,
> 
> With tegra-drm going into Linux 3.8 and NVIDIA posting initial patches
> for 2D acceleration on top of it, I've been looking at the various ways
> how this can best be leveraged.
> 
> The most obvious choice would be to start work on an xf86-video-tegra
> driver that uses the code currently in the works to implement the EXA
> callbacks that allow some of the rendering to be offloaded to the GPU.
> The way I would go about this is to fork xf86-video-modesetting, do some
> rebranding and add the various bits required to offload rendering.
> 
As much as I dislike saying this, forking the modesetting driver to
bring in the Tegra-specific 2D accel might be the best way to go for
now, especially looking at the very limited resources available for
tegradrm development and NVIDIA's expressed desire to do as few changes
as possible to their downstream work.

> However, that has all the usual drawbacks of a fork so I thought maybe
> it would be better to write some code to xf86-video-modesetting to add
> GPU-specific acceleration on top. Such code could be leveraged by other
> drivers as well and all of them could share a common base for the
> functionality provided through the standard DRM IOCTLs.
> 
We don't have any standard DRM IOCTLs for doing acceleration today. The
single fact that we are stitching together command streams in userspace
for execution by the GPU renders a common interface unusable. We don't
even have a common interface to allocate GPU resources suitable for
acceleration: the dumb IOCTLs are only guaranteed to give you a buffer
the display engine can scan out from; nothing in there lets you set up
fancier things like tiling etc., which might be needed to operate on
the buffer with other engines in some way.
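
For illustration, this is roughly all the dumb-buffer interface lets you
ask for (a sketch using libdrm; the function name is made up for this
example):

#include <stdint.h>
#include <string.h>
#include <xf86drm.h>

/* Allocate a scanout-capable "dumb" buffer. Note that the request only
 * carries width/height/bpp -- there is no way to ask for tiling or any
 * other engine-specific layout, which is the limitation described above. */
int alloc_dumb_bo(int drm_fd, uint32_t width, uint32_t height,
                  uint32_t *handle, uint32_t *pitch)
{
    struct drm_mode_create_dumb req;
    memset(&req, 0, sizeof(req));
    req.width = width;
    req.height = height;
    req.bpp = 32;

    if (drmIoctl(drm_fd, DRM_IOCTL_MODE_CREATE_DUMB, &req))
        return -1;

    *handle = req.handle;   /* GEM handle, usable for scanout */
    *pitch = req.pitch;
    return 0;
}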

> That approach has some disadvantages of its own, like the potential
> bloat if many GPUs do the same. It would also be a bit of a step back
> to the old monolithic days of X.
> 
For some thoughts about how a unified accelerated driver for various
hardware devices could be done, I would like to point to my presentation
at this year's XDC.
However, doing this right might prove to be a major task, so as I already
said it might be more worthwhile to just stuff the Tegra-specific bits
into a fork of the modesetting driver.

Regards,
Lucas

