Re: [Mesa3d-dev] Gallium software fallback/draw command failure

Keith Whitwell Mon, 01 Mar 2010 04:44:43 -0800

On Mon, 2010-03-01 at 04:07 -0800, Keith Whitwell wrote:
> On Mon, 2010-03-01 at 03:55 -0800, Jerome Glisse wrote:
> > On Mon, Mar 01, 2010 at 11:46:08AM +0000, Keith Whitwell wrote:
> > > On Mon, 2010-03-01 at 03:21 -0800, José Fonseca wrote:
> > > > On Sun, 2010-02-28 at 11:25 -0800, Jerome Glisse wrote:
> > > > > Hi,
> > > > > 
> > > > > > I am a bit puzzled: how should a pipe driver handle
> > > > > > draw callback failure? On radeon (pretty sure nouveau
> > > > > > and intel hit the same issue) we only know whether we
> > > > > > can do the rendering once one of the draw_* context
> > > > > > callbacks is called.
> > > > > 
> > > > > > The failure here is dictated by memory constraints, i.e.
> > > > > > if the user binds a big texture, a big VBO, ... we might not
> > > > > > have enough GPU address space to bind all the desired objects
> > > > > > (even for drawing a single triangle).
> > > > > > 
> > > > > > What should we do? None of the draw callbacks can return
> > > > > > a value. Maybe for a GL state tracker we should report
> > > > > > GL_OUT_OF_MEMORY all the way up to the app? Anyway, the bottom
> > > > > > line is I think pipe drivers are missing something here. Any
> > > > > > ideas? Thoughts? Is there already a plan to address that? :)
> > > > 
> > > > Gallium draw calls had return codes before. They were used for the
> > > > fallover driver IIRC and were recently deleted.
> > > > 
> > > > Either we put the return codes back, or we add a new
> > > > pipe_context::validate() that would ensure that all necessary conditions
> > > > to draw successfully are met.
> > > > 
> > > > Putting return codes on bind time won't work, because one can't set all
> > > > atoms simultaneously -- atoms are set one by one, so when one's setting
> > > > the state there are state combinations which may exceed the available
> > > > resources but that are never drawn with. E.g. you have a huge VB you
> > > > finished drawing, and then you switch to drawing with a small VB with a
> > > > huge texture, but in between it may happen that you have both bound
> > > > simultaneously.
> > > > 
> > > > If ignoring is not an alternative, then I'd prefer a validate call.
> > > > 
> > > > Whether to fallback to software or not -- it seems to me it's really a
> > > > problem that must be decided case by case. Drivers are supposed to be
> > > > useful -- if hardware is so limited that it can't do anything useful
> > > > then falling back to software is sensible. I don't think that a driver
> > > > should support every imaginable thing -- apps should check errors, and
> > > > users should ensure they have enough hardware resources for the
> > > > workloads they want.
> > > > 
> > > > Personally I think state trackers shouldn't emulate anything with CPU
> > > > beyond unsupported pixel formats. If the hardware is so limited that it
> > > > needs CPU assistance, this should be taken care of transparently by the
> > > > pipe driver. Nevertheless we can and should provide auxiliary libraries like
> > > > draw to simplify the pipe driver implementation.
> > > 
> > > 
> > > My opinion on this is similar: the pipe driver is responsible for
> > > getting the rendering done.  If it needs to pull in a fallback module to
> > > achieve that, it is the pipe driver's responsibility to do so.
> > > 
> > > Understanding the limitations of hardware and the best ways to work
> > > around those limitations is really something that the driver itself is
> > > best positioned to handle.
> > > 
> > > The slight quirk of OpenGL is that there are some conditions where
> > > theoretically the driver is allowed to throw an OUT_OF_MEMORY error (or
> > > similar) and not render.  This option isn't really available to gallium
> > > drivers, mainly because we don't know inside gallium whether the API
> > > permits this.  Unfortunately, even in OpenGL, very few applications
> > > actually check the error conditions, or do anything sensible when they
> > > fail.
> > > 
> > > I don't really like the idea of pipe drivers being able to fail render
> > > calls, as it means that every state tracker and every bit of utility
> > > code that issues a pipe->draw() call will have to check the return code
> > > and hook in fallback code on failure.
> > > 
> > > One interesting thing would be to consider creating a layer that exposes
> > > a pipe_context interface to the state tracker, but revives some of the
> > > failover ideas internally - maybe as a first step just lifting the draw
> > > module usage up to a layer above the actual hardware driver.
> > > 
> > > Keith
> > > 
> > 
> > So you don't like José's pipe_context::validate()? My
> > preference is for pipe_context::validate(), with the state
> > tracker setting the proper flag according to the API it
> > supports (GL_OUT_OF_MEMORY for GL); this means we just drop
> > rendering commands that we can't do.
> 
> I think it's useful as a method for implementing GL_OUT_OF_MEMORY, but
> the pipe driver should:
> 
> a) not rely on validate() being called - ie it is just a query, not a
> mandatory prepare-to-render notification.
> 
> b) make a best effort to render in subsequent draw() calls, even if
> validate has been called - ie. it is just a query, does not modify pipe
> driver behaviour.
> 
> > I am not really interested in doing software fallback. What
> > would be nice is someone testing with a closed source driver
> > what happens when you try to draw something the GPU can't
> > handle. Maybe even people from the closed source world can give
> > us a clue about what they do in such a situation :)
> 
> I've seen various things, but usually they try to render something even
> if it's incorrect.


It's always interesting to think about the OpenGL mechanisms and
understand why they do things a particular way.

In this case we bump into a fairly interesting bit of OpenGL -- the
asynchronous error mechanism.  Why doesn't OpenGL just return
OUT_OF_MEMORY from glBegin() or glDrawElements()?  Basically GL is
preserving asynchronous operation between the application and the GL
implementation, e.g. for indirect contexts, but also for cases where the
errors are generated by the memory manager or even the hardware long
after the actual draw() call itself.

I think we probably will face the same issues in gallium.  Nobody has
tried to do a "remote gallium" yet, but any sort of synchronous
round-trip query (like validate, or return codes from draw calls) will
be a pain to accommodate in that environment.  Likewise errors that are
raised by TTM at command-buffer submission would be better handled by an
asynchronous error mechanism.
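
To make the asynchronous idea concrete, here is a minimal sketch in C.
It is not real gallium code: every name in it (sketch_context,
sketch_draw, sketch_flush, sketch_get_error) is hypothetical, and the
size comparison is only a stand-in for whatever the memory manager
would decide at submission time.  The draw call returns nothing, the
failure is noticed at flush, and the caller picks it up later through a
glGetError-style query:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_PENDING 8

/* Hypothetical error codes; not part of any real gallium interface. */
enum sketch_error { SKETCH_NO_ERROR = 0, SKETCH_OUT_OF_MEMORY };

/* Hypothetical context holding a tiny queue of not-yet-submitted draws. */
struct sketch_context {
    size_t bytes_available;             /* what the pretend GPU can map */
    size_t pending[SKETCH_MAX_PENDING]; /* memory needed by each queued draw */
    unsigned num_pending;
    unsigned draws_executed;
    enum sketch_error sticky;           /* first unreported error */
};

/* "Command-buffer submission": failures are only discovered here. */
void sketch_flush(struct sketch_context *ctx)
{
    unsigned i;

    assert(ctx != NULL);
    for (i = 0; i < ctx->num_pending; i++) {
        if (ctx->pending[i] > ctx->bytes_available) {
            /* Keep only the first error until somebody asks for it. */
            if (ctx->sticky == SKETCH_NO_ERROR)
                ctx->sticky = SKETCH_OUT_OF_MEMORY;
            continue; /* drop this draw, carry on with the rest */
        }
        ctx->draws_executed++;
    }
    ctx->num_pending = 0;
}

/* No return value: the caller never has to wait for an answer. */
void sketch_draw(struct sketch_context *ctx, size_t bytes_needed)
{
    assert(ctx != NULL);
    if (ctx->num_pending == SKETCH_MAX_PENDING)
        sketch_flush(ctx);
    ctx->pending[ctx->num_pending++] = bytes_needed;
}

/* The only synchronous point: report the recorded error and clear it. */
enum sketch_error sketch_get_error(struct sketch_context *ctx)
{
    enum sketch_error err;

    assert(ctx != NULL);
    err = ctx->sticky;
    ctx->sticky = SKETCH_NO_ERROR;
    return err;
}
```

A state tracker built on something like this could call the query
whenever its API gives it a natural place to do so (glGetError for GL),
and a remote implementation would only pay for a round trip at that
point.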

For now, validate() sounds fine, but at some point in the future a
less-synchronous version may be appealing.
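
For what it's worth, a rough sketch in C of validate() as a pure query.
Nothing here is an existing interface: pipe_context::validate() is only
a proposal in this thread, and demo_pipe, demo_validate and demo_draw
are made-up names.  The "degraded" counter just stands in for whatever
best-effort path a real driver would take:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bound state, reduced to two sizes. */
struct demo_state {
    size_t texture_bytes;
    size_t vertex_buffer_bytes;
};

/* Hypothetical driver context. */
struct demo_pipe {
    size_t bytes_available;   /* address space the pretend GPU can map */
    struct demo_state state;  /* atoms are set one at a time by the caller */
    unsigned hw_draws;        /* draws that fit and went to hardware */
    unsigned degraded_draws;  /* draws rendered on a best-effort path */
};

/* Pure query: takes a const pointer and changes nothing in the driver. */
bool demo_validate(const struct demo_pipe *pipe)
{
    assert(pipe != NULL);
    /* Written this way so the size check cannot overflow. */
    return pipe->state.texture_bytes <= pipe->bytes_available &&
           pipe->state.vertex_buffer_bytes <=
               pipe->bytes_available - pipe->state.texture_bytes;
}

/* Draw behaves the same whether or not the caller ever asked. */
void demo_draw(struct demo_pipe *pipe)
{
    assert(pipe != NULL);
    if (demo_validate(pipe))
        pipe->hw_draws++;
    else
        pipe->degraded_draws++; /* still try to render something */
}
```

Because the check runs against the state as it is at draw time, a
transient combination such as a huge VB and a huge texture bound
together, but never drawn with, does not produce a failure.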

Keith


