The post-nir_register future

2023-08-04 Thread Alyssa Rosenzweig
FYI for people who don't follow GitLab--

We've now removed the following deprecated functionality from NIR:

* nir_register
* ALU write masks
* abs/neg/sat modifiers
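
(For backend authors catching up: shader-local registers are now expressed
as ordinary intrinsics -- decl_reg, load_reg, store_reg -- producing and
consuming SSA values, instead of the old nir_register lists with write
masks and modifiers. A minimal sketch of the new shape follows, assuming
nir_builder helpers of the same names; the signatures are paraphrased from
memory and may not match the tree exactly, so treat it as illustrative only.)

/* Illustrative only: helper names/signatures paraphrased, not copied
 * from nir_builder.h. */
#include "nir_builder.h"

static void
copy_through_reg(nir_builder *b, nir_def *value)
{
        /* Was: a nir_register with num_components/bit_size fields. */
        nir_def *reg = nir_decl_reg(b, 4, 32, 0 /* not an array */);

        /* Was: an ALU dest pointing at the register, plus a write mask. */
        nir_store_reg(b, value, reg);

        /* Was: a source reading the register. */
        nir_def *loaded = nir_load_reg(b, reg);
        (void)loaded;
}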

For background on how we did it and why, see the description of
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089 and the
transition tracker
https://gitlab.freedesktop.org/mesa/mesa/-/issues/9051

Now that this is merged, casual benchmarking shows ~8% speed-up in
shader compile time.

Going forward, there's a lot of clean up potential. The plan is outlined
in https://gitlab.freedesktop.org/mesa/mesa/-/issues/9396 ... I am
holding off on this until NVK is merged for Faith's rebasing benefit.

If you are sitting on big NIR branches, now is the time to land them.

Here's to the new NIR :~)

Cheers,

-A


Re: Retiring the GitHub mirrors

2023-01-20 Thread Alyssa Rosenzweig
> Among the people present in this discussion, the consensus was that we
> should delete them.

I wasn't present but +1 from me.


Re: Zink MR signoff tags

2022-10-05 Thread Alyssa Rosenzweig
> > + for not requiring rb/ab tags ...
> 
> I think it's time to think about making this change all over Mesa as
> well. We're deeply in bed with GitLab by now, so I don't think there's
> a realistic chance that this isn't going to just be duplicate info any
> time soon...

I agree, but I don't want to be the one on the hook for the politics if
people disagree (as I think some do).

> > I kinda like the s-o-b tags but those
> > don't require fiddly rebases, just -s in the right place..
> > 
> > On Tue, Oct 04, 2022 at 10:44:31PM -0500, Mike Blumenkrantz wrote:
> > >  Hi,
> > >  After some vigorous and robust discussion with Erik, we've
> > > decided that
> > >  zink will no longer require any rb/ab/etb tags to be applied to
> > > patches in
> > >  MRs.
> > >  Following in Turnip's footsteps, any MR that receives sufficient
> > > reviewage
> > >  in gitlab comments can be merged directly with no further action
> > >  necessary.
> > >  Mike


Re: Zink MR signoff tags

2022-10-05 Thread Alyssa Rosenzweig
+ for not requiring rb/ab tags ... I kinda like the s-o-b tags but those
don't require fiddly rebases, just -s in the right place..

On Tue, Oct 04, 2022 at 10:44:31PM -0500, Mike Blumenkrantz wrote:
>Hi,
>After some vigorous and robust discussion with Erik, we've decided that
>zink will no longer require any rb/ab/etb tags to be applied to patches in
>MRs.
>Following in Turnip's footsteps, any MR that receives sufficient reviewage
>in gitlab comments can be merged directly with no further action
>necessary.
>Mike


Re: Moving amber into separate repo?

2022-09-24 Thread Alyssa Rosenzweig
>   2. repo is growing large. Amber kinda requires long history, modern
>  mesa does not. This may be a good spot to split if cleanup is required.

Mesa absolutely uses long history. There is nothing to clean up. Those
bytes of disk space are well worth it.

(Neutral on the other points; I don't work on stuff supported in Amber.)


Re: Enable OpenGL software rendering on macOS

2022-05-06 Thread Alyssa Rosenzweig
Having spent enough time wrangling Mesa/macOS on Arm Macs:

the correct solution is to migrate to Linux. Everything else is pain.

On Wed, May 04, 2022 at 12:38:55AM +, Jose Fonseca wrote:
>I'm not sure exactly what Homebrew provides, and I'm not able to
>investigate it now.
>Yes, using Mesa software rendering to fill the void left by OpenGL
>deprecation makes some sense.   But note it is still software rendering, not
>GPU accelerated.  And one needs to consider the x86 -> arm transition.
>You could consider somehow integrating JOGL with osmesa -- offscreen
>rendering Mesa -- which builds and runs pretty much anywhere Mesa builds
>-- thereby bypassing the headaches of integrating with Mac-specific
>technologies such as CGL and Cocoa.
>There are also other alternatives worth considering, such
>as https://moltengl.com/moltengl/
>Jose
> 
>  --
> 
>From: Martin Pernollet 
>Sent: Monday, May 2, 2022 1:31:43 AM
>To: Jose Fonseca 
>Cc: mesa-dev@lists.freedesktop.org 
>Subject: Re: Enable OpenGL software rendering on macOS
> 
> 
> 
>Hi Jose,
>Many thanks for your answer.
>The JOGL classes enabling GL binding on macOS all refer to CGL, so yes, I think I
>want to rely on CGL (there are, however, a couple of classes in JOGL allowing
>work with X11, but they're used on Linux only).
>I can't access the Apple M1 on which I built Mesa right now, but homebrew
>on another macOS provides libGL and not libGLX.
> 
>ls /usr/local/Cellar/mesa/21.1.2/lib
> 
>dri  libGL.dylib  libGLESv1_CM.dylib  libGLESv2.dylib  libglapi.dylib
> 
>libGL.1.dylib  libGLESv1_CM.1.dylib  libGLESv2.2.dylib  libglapi.0.dylib 
>pkgconfig
> 
> 
>One motivation for enabling Mesa on macOS is actually the deprecated
>status of OpenGL. I would expect this software implementation of GL to
>simply provide an image to be copied to a native window/frame, without
>having to rely on CGL. Hence, one could continue working with simple
>OpenGL even if macOS doesn't provide such API.
>Does it sound reasonable? 
>Martin
>--- Original Message ---
>On Friday, April 29, 2022 at 16:08, Jose Fonseca wrote:
> 
>  The difficulty with OpenGL on macOS is that all driver interfaces are
>  both undocumented and deprecated.
>  If you want to override the system OpenGL, you can use apitrace code as
>  reference.  There are two approaches:
>   1. DYLD_FRAMEWORK_PATH  
> https://github.com/apitrace/apitrace/blob/master/cli/cli_trace.cpp
>   2. DYLD_INSERT_LIBRARIES 
> https://github.com/apitrace/apitrace/tree/dyld-interpose (experimental
>  branch)
>   
>  AFAIK, the Mesa build for macOS generates a libGLX which depends on X11,
>  which is probably not what you want.  You want to use Mesa for macOS
>  apps which use CGL as opposed to GLX, right?
>  So, if one wants to have a SW renderer on macOS with llvmpipe without
>  depending on X11, then one would need to implement:
>* a new Gallium frontend that implements CGL API (equivalent to the
>  WGL frontend that exists for Windows)
>* a new SW renderer winsys that draws pixels to Cocoa window somehow
>  (equivalent to the GDI winsys that draws to a Windows GDI surface) 
>  It's not a matter of just integrating existing components together --
>  there's a lot of new code that would be needed here -- I'd reckon 2 months
>  for somebody familiar with Mesa/macOS, 6 - 12 months for somebody more
>  novice.  And let's be frank: given macOS's deprecation of OpenGL and the
>  migration away from Intel to Arm, the usefulness of this in the long
>  term is dubious.
>  Jose
> 
>  --
> 
>  From: mesa-dev  on behalf of
>  Martin Pernollet 
>  Sent: Friday, April 29, 2022 12:32
>  To: mesa-dev@lists.freedesktop.org 
>  Subject: Enable OpenGL software rendering on macOS
>   
> 
> 
>  TLDR: I failed to get Mesa software rendering working on macOS. I am looking
>  for advice on invoking Mesa's libGL.dylib without relying on macOS's system
>  GL.
>  Hi everyone,
>  I am building (Java) software involving Mesa for CPU rendering. I use
>  CPU rendering as a fallback when JOGL (the OpenGL binding for Java) fails to
>  use the GPU natively. This is sometimes the case for old Linux
>  distributions, and it will certainly be frequent on macOS in the future
>  due to Apple's OpenGL deprecation.
>  Mesa CPU rendering is working great on Ubuntu (by enabling
>  LIBGL_ALWAYS_SOFTWARE=true) and Windows (by simply loading Mesa's DLL
>  instead of the system DLL, with no need to ask for software mode).

Re: Mesa 20.0 backlog

2022-04-25 Thread Alyssa Rosenzweig
> 2022-03-14 FIXES  d5870c45ae panfrost: Optimise recalculation of max sampler view

Unless Icecream95 feels otherwise, I'd drop this one for stable. It
*is* a fix but only impacts a bit of CPU overhead.


Re: Replacing NIR with SPIR-V?

2022-01-21 Thread Alyssa Rosenzweig
>In principle, all the properties you highlight in your blog as key points
>of NIR also apply to SPIR-V.

Those key points are relative to GLSL IR, the IR it displaces. The
purpose of SPIR-V is so different from both NIR and GLSL IR that it isn't
interesting to do a comparison. Comparing with LLVM IR would be more
appropriate. (The relationship between SPIR-V and LLVM IR
notwithstanding.)

>NIR starts shining as a more suitable IR than SPIR-V for the
>task of communicating front-end and back-end

That is not the goal of NIR. The frontend can pass SPIR-V if it is so
inclined; that's a single spirv_to_nir pass away, and that's how the
OpenGL SPIR-V extension and Vulkan work.
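
For concreteness, a hedged sketch of that "one pass away" point: roughly
how a Vulkan driver (or the GL SPIR-V path) hands the SPIR-V words to
spirv_to_nir(). The parameter list is paraphrased from
src/compiler/spirv/nir_spirv.h, and the stage, entry point, and options
below are illustrative assumptions rather than a recipe.

/* Sketch only: options left at defaults, stage/entry point illustrative. */
#include "compiler/spirv/nir_spirv.h"

static nir_shader *
consume_spirv(const uint32_t *words, size_t word_count,
              const nir_shader_compiler_options *backend_opts)
{
        const struct spirv_to_nir_options spirv_opts = {0};

        return spirv_to_nir(words, word_count,
                            NULL, 0,              /* no specialization constants */
                            MESA_SHADER_FRAGMENT, /* stage, illustrative */
                            "main",               /* entry point name */
                            &spirv_opts, backend_opts);
}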

As for backends, it makes 0 sense whatsoever for any backend in all of
Mesa to consume SPIR-V. That's not what SPIR-V is for.

>- Arrays and structs: SPIR-V's OpAccessChain would need to be processed by
>a backend and translated to pointer arithmetic plus dereferencing (kind of
>the same thing as having to process a nir_deref). This translation can be
>done in RISC-V with no issue, whether it is OpAccessChain or nir_deref.

NIR does that translation for you if you tell it to.
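
As a concrete example of "if you tell it to", here is a hedged sketch of
how a driver might ask NIR to rewrite deref chains for SSBO/global memory
into explicit pointer arithmetic before handing the shader to its backend;
the mode set and address format are illustrative assumptions, not a recipe.

/* Sketch: after this pass the backend only sees loads/stores with
 * computed addresses for these modes, no nir_deref chains. */
#include "nir.h"

static void
lower_derefs_for_backend(nir_shader *s)
{
        NIR_PASS_V(s, nir_lower_explicit_io,
                   nir_var_mem_ssbo | nir_var_mem_global,
                   nir_address_format_64bit_global);
}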


Re: revenge of CALLOC_STRUCT

2021-12-24 Thread Alyssa Rosenzweig
Remind me the difference between MALLOC and malloc (etc)? Like, why
don't we s/MALLOC/malloc/g the whole codebase and drop u_memory.h? I
guess portability (?) even though this is presumably stdlib...
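
(For anyone else wondering: my rough understanding is that the wrappers
boil down to something like the sketch below. The real u_memory.h routes
through the os_* allocation layer so embedders can swap the allocator;
exact definitions may differ.)

/* Approximation of src/util/u_memory.h, not a verbatim copy. */
#include <stdlib.h>

#define MALLOC(size)      malloc(size)
#define CALLOC(count, sz) calloc(count, sz)
#define FREE(ptr)         free(ptr)

/* The genuinely handy bit: allocate a zeroed struct by tag name. */
#define CALLOC_STRUCT(T)  ((struct T *) CALLOC(1, sizeof(struct T)))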

On Thu, Dec 23, 2021 at 08:35:38AM +1000, Dave Airlie wrote:
> Hey,
> 
> Happy holidays, and as though to consider over the break,
> 
> We have the VMware MALLOC/FREE/CALLOC/CALLOC_STRUCT wrappers used
> throughout gallium.
> 
> We have ST_CALLOC_STRUCT in use in the mesa state tracker, not used in 
> gallium.
> 
> With the state tracker merging into mesa proper, and even prior to this, a
> few CALLOC_STRUCTs have started to leak into src/mesa/*.
> 
> Now I don't think we want that, but CALLOC_STRUCT is a genuinely
> useful macro on its own, so
> I'm considering just defining a calloc_struct instead for the
> non-gallium use that goes with malloc/calloc/free.
> 
> Any opinions, or should mesa just get u_memory.h support for Xmas?
> 
> Dave.


Re: Moving code around, post classic

2021-12-07 Thread Alyssa Rosenzweig
> 1. Move src/mesa into src/gallium/frontends/mesa (I have patches for
>this)
> 
>Seems like a pretty obvoius thing to do, given that all of the other
>gallium state trackers live there (OpenCL, video, d3d9, etc)

Ack from me.

> 2. Move src/compiler/glsl into src/gallium/frontends/mesa as well
> 
> Given that there are now no? drivers that use GLSL-IR directly, it
> might make sense to move the glsl compiler into the mesa
> state_tracker, and just have that lower to TGSI or NIR, and treat
> GLSL-IR as an implementation detail of the OpenGL frontend.

It would be an ack from me, but...

> Unfortunately, there is a lot of code outside of glsl that uses the
> linked list implementation in the glsl compiler, and not the one in
> util.

If it were just linked lists, I'd say someone should write the
Coccinelle to transform the tree to use the one in util and call it a
day. It's a bit more complicated, though: NIR depends on GLSL types.
Those could probably continue to live in their current location even
if glsl moves? Might breed confusion.

> 3. Move src/gallium* to src/
> 
> This was suggested, though given the existance of Vulkan, it wasn't
> clear that this was a good idea or not

If src/gallium/drivers/* is distributed to src/*/* this becomes a lot
less interesting I think?


Re: Moving code around, post classic

2021-12-07 Thread Alyssa Rosenzweig
>If we're going to do this, I wonder if we don't want to go even further
>and get rid of src/gallium/drivers and move the respective folders to
>src/vendor.** So, instead of src/gallium/drivers/(iris|crocus), we'd have
>src/intel/gallium/iris and src/intel/gallium/crocus or maybe
>src/intel/iris and src/intel/crocus.
>--Jason

If we do that, only one level of indirection -- i.e. src/intel/(iris|crocus) --
is my preference, because src/panfrost/gallium/panfrost is a silly path.
src/panfrost/gallium or (ick) src/panfrost/panfrost saves the extra nesting.
The same goes for any vendor with only a single Gallium driver (most of them...?)


Re: [Mesa-dev] GitLab Tag Proposal

2021-11-01 Thread Alyssa Rosenzweig
> I'd like to propose adding/changing two tags:
> 
> * change `feature_request` -> `Feature` (this is barely used at present, so 
> renaming shouldn't affect anyone negatively)
> * add `Bug`
> 
> I would use these primarily for tagging open MRs so that reviewers can more 
> appropriately prioritize bug fixes at a glance over (less urgent) feature 
> implementations. They would also serve the same purpose for issues, though 
> those are usually more obvious from the title anyway.

Sounds reasonable to me.


Re: [Mesa-dev] Workflow Proposal

2021-10-19 Thread Alyssa Rosenzweig
> > >>  Upstream should do what's best for upstream, not for Intel's "unique"
> > >>  management.
> > >>
> > >>Not sure how, from Emma explaining how Rb tags were used by Intel
> > >>management, it follows that they were used in that way only by
> > >>Intel management. Spoiler: they are not.
> > >
> > > Sorry, I'll make that point more emphatic.
> > >
> > > Upstream must do what's best for upstream with zero regard for the
> > > whims of management. Doubly so for bad management.
> >
> > If the r-b process ever had any notice from any company's management, I
> > haven't seen it. (Actually, I think most management would rather have
> > the short sighted view of skipping code review to more quickly merge
> > patches.) In terms of who to "track down", that is also a tenuous
> > connection.
> 
> All of the above is true but also totally irrelevant to the actual discussion.
> 
> When R-b as a metric came up at the time of the first switch, I wrote
> a really trivial Python script which used the GitLab API to scrape MR
> discussions and pull 'Reviewed-by: ...' comments out and print a
> leaderboard for number of reviewed MRs over the past calendar month.
> Adapting that to look at approvals rather than comments would cut it
> down to about 10 LoC.
> 
> Whether it's Reviewed-by in the trailer or an approval, both are
> explicitly designed to be machine readable, which means it's trivial
> to turn it into a metric if you want to. Whether or not that's a good
> idea is the problem of whoever wields it.

Fair enough. I don't have strong views on the tags themselves. If
upstream has good reasons to use them, let's use them; if upstream has
good reasons to omit them, let's omit them.

What I oppose is management metrics playing a role in upstream's
discussion. That goes for any company's management; I do not mean to
single out Intel here.

But given the aforementioned script, even the managers will be happy
either way. My apologies for prolonging what in retrospect is a silly
discussion.


Re: [Mesa-dev] Workflow Proposal

2021-10-13 Thread Alyssa Rosenzweig
> > >> I would love to see this be the process across Mesa.  We already don't
> > >> rewrite commit messages for freedreno and i915g, and I only have to do
> > >> the rebase (busy-)work for my projects in other areas of the tree.
> > > Likewise for Panfrost. At least, I don't do the rewriting. Some Panfrost
> > > devs do, which I'm fine with. But it's not a requirement for merging.
> > >
> > > The arguments about "who can help support this years from now?" are moot
> > > at our scale... the team is small enough that the reviewer is likely
> > > the code owner / maintainer, and patches regularly go in
> > > unreviewed for lack of review bandwidth.
> >
> > There is another reason for the Rb tag, that is to measure the quantity
> > of patch review people do.
> >
> > This was well summarized some years ago by Matt Turner, as it was
> > minimized (even suggested to be removed) on a different thread:
> >
> > https://lists.freedesktop.org/archives/mesa-dev/2019-January/213586.html
> 
> I was part of the Intel team when people started doing this r-b
> counting.  I believe that it was being done due to Intel management's
> failure to understand who was doing the work on the team and credit
> them appropriately, and also to encourage those doing less to step up.
> Unfortunately, the problem with Intel management wasn't a lack of
> available information, and I didn't see publishing the counts change
> reviews either.



Upstream should do what's best for upstream, not for Intel's "unique"
management.


Re: [Mesa-dev] Workflow Proposal

2021-10-13 Thread Alyssa Rosenzweig
>  Upstream should do what's best for upstream, not for Intel's "unique"
>  management.
> 
>Not sure how, from Emma explaining how Rb tags were used by Intel
>management, it follows that they were used in that way only by
>Intel management. Spoiler: they are not.

Sorry, I'll make that point more emphatic.

Upstream must do what's best for upstream with zero regard for the
whims of management. Doubly so for bad management.


Re: [Mesa-dev] Workflow Proposal

2021-10-08 Thread Alyssa Rosenzweig
> I would love to see this be the process across Mesa.  We already don't
> rewrite commit messages for freedreno and i915g, and I only have to do
> the rebase (busy-)work for my projects in other areas of the tree.

Likewise for Panfrost. At least, I don't do the rewriting. Some Panfrost
devs do, which I'm fine with. But it's not a requirement for merging.

The arguments about "who can help support this years from now?" are moot
at our scale... the team is small enough that the reviewer is likely
the code owner / maintainer, and patches regularly go in
unreviewed for lack of review bandwidth.

I think it's reasonable that small driver teams and large common
components have different review needs, and it's ok for the process to
reflect that.


Re: [Mesa-dev] [RFC] Concrete proposal to split classic

2021-06-17 Thread Alyssa Rosenzweig
> By far the limiting factor for i915g progress now that I've got some
> CI rigged up is review.  My preference would be that we all agree that
> nobody wants to look at i915, and some responsible folks (ajax and a
> couple Intel volunteers, perhaps?) bless me to merge without review
> once an i915g-only MR has been up for a week.

Congratulations, you're the new i915g maintainer ;-)


Re: [Mesa-dev] Outstanding Mesa 21.0 patches

2021-04-12 Thread Alyssa Rosenzweig
Hi Dylan,

Commentary on the two Panfrost patches:

> cff5c40fc3 pan/bi: Fix blend shaders using LD_TILE with MRT

This needs to be backported; it fixes a conformance failure on ES 3.0 on
Bifrost. Cc'ing Boris -- maybe you could take care of this backport?
Otherwise I can give it a shot. Thank you.

> 3436e5295b pan/bi: Treat +DISCARD.f32 as message-passing

We're pretty sure this works around a hardware bug exposed by
out-of-order scheduling, but that only landed after the 21.0 branch
point, so this doesn't need to be backported. Feel free to de-nominate it.

Regards,

Alyssa


Re: [Mesa-dev] [RFC] Concrete proposal to split classic

2021-03-30 Thread Alyssa Rosenzweig
> Probably nv30 would do well to "move on" as well. But it also presents
> an interesting question -- the nv30 driver has lots of problems. I
> have no plans to fix them, nor am I aware of anyone else with such
> plans. However if such a developer were to turn up, would it be
> reasonable to assume that their work would ultimately land in this
> "lts" branch/tree/whatever? Some of the "fixes" will require large-ish
> changes to the driver...

AFAIU, the issue isn't large changes to the driver, but large changes to
common code. So unless you're switching it to NIR or something, it
probably doesn't matter.


Re: [Mesa-dev] [RFC] Concrete proposal to split classic

2021-03-24 Thread Alyssa Rosenzweig
> And, yeah, I'd love to drop vec4 but yeah...  One advantage to keeping
> vec4 in the tree for some stuff is that it means we have full-featured
> hardware able to test vec4 NIR.  That seems like a feature.

I have to keep caring about Midgard for the next indefinitely so please
don't break vec4 NIR ;-)


Re: [Mesa-dev] [RFC] Concrete proposal to split classic

2021-03-23 Thread Alyssa Rosenzweig
I'd like to see it happen, though I don't understand how to make these
coexist for distros. Would like to hear from the Debian/etc maintainers
of mesa.

Then again I *think* classic-lts doesn't need to be built for
armhf/arm64 at all, so I suppose I'm personally unaffected :-P

On Mon, Mar 22, 2021 at 03:15:30PM -0700, Dylan Baker wrote:
> Hi list,
> 
> We've talked about it a number of times, but I think it's time time to
> discuss splitting the classic drivers off of the main development branch
> again, although this time I have a concrete plan for how this would
> work.
> 
> First, why? Basically, all of the classic drivers are in maintenance
> mode (even i965). Second, many of them rely on code that no one works
> on, and very few people still understand. There is no CI for most of
> them, and the Intel CI is not integrated with gitlab, so it's easy to
> unintentionally break them, and this breakage usually isn't noticed
> until just before or just after a release. 21.0 was held up (in small
> part, also me just getting behind) because of such breakages.
> 
> I know there is some interest in getting i915g in good enough shape that
> it could replace i915c, at least for the common case. I also am aware
> that Dave, Ilia, and Eric (with some pointers from Ken) have been
> working on a gallium driver to replace i965. Neither of those things are
> ready yet, but I've taken them into account.
> 
> Here's the plan:
> 
> 1) 21.1 release happens
> 2) we remove classic from master
> 3) 21.1 reaches EOL because of 21.2
> 4) we fork the 21.1 branch into a "classic-lts"[1] branch
> 5) we disable all vulkan and gallium drivers in said branch, at least at
>the Meson level
> 6) We change the name and precedence of the glvnd loader file
> 7) apply any build fixups (turn off Intel generators for versions >= 7.5,
>for example)
> 8) maintain that branch with build and critical bug fixes only
> 
> This gives distros and end users two options:
> 1) They can build *only* the legacy branch, in the normal "Mesa provides
>libGL interfaces" fashion
> 2) They can use glvnd and install current mesa and the legacy branch in
>parallel
> 
> Because of glvnd, we can control which driver will get loaded first, and
> thus if we decide i915g or the i965 replacement is ready and turn it on
> by default it will be loaded by default. An end user who doesn't like
> this can add a new glvnd loader file that gives the classic drivers
> higher precedence and continue to use them.
> 
> Why fork from 21.1 instead of master?
> 
> First, it allows us to delete classic immediately, which will allow
> refactoring to happen earlier in the cycle, and for any fallout to be
> caught and hopefully fixed before the release. Second, it means that
> when a user is switched from 21.1 to the new classic-lts branch, there
> will be no regressions, and no one has to spend time figuring out what
> broke and fixing the lts branch.
> 
> When you say "build and critical bug fixes", what do you mean?
> 
> I mean updating Meson if we rely on something that is later
> deprecated and removed and would prevent building the branch, or
> relying on some compiler behavior that changes, gaping exploitable
> security holes, that kind of thing.
> 
> footnotes
> [1] Or whatever color you like your bikeshed






Re: [Mesa-dev] docs: consistent language

2021-03-17 Thread Alyssa Rosenzweig
>Absolutely. Apparently I forgot to write the final line which is that I
>think standardising on US English across the board is a good thing

Standardise on any flavour of English you'd like, one of them is still
wrong ;-)


Re: [Mesa-dev] Mesa 21.0 release

2021-03-12 Thread Alyssa Rosenzweig
> 3436e5295b pan/bi: Treat +DISCARD.f32 as message-passing

Looking at actual dates, the bug this fixes didn't materialize until
after the 21.0 branch point; I just goofed with the tag. All good on my
end.


Re: [Mesa-dev] NIR Debugging Tips

2021-02-26 Thread Alyssa Rosenzweig
Hi Tommy,

Unfortunately NIR is a bit lacking in documentation, save for comments
in src/compiler/nir/nir_intrinsics.py. For Vulkan ones, the best bet is
likely looking at spirv_to_nir (src/compiler/spirv/) and seeing what
generates it, and working up from there.

Debug strategy varies per driver but lots and lots of conformance
testing (VK-GL-CTS) :)
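
One concrete, hedged tip: the workhorse is printing the IR and diffing it
across passes. nir_print_shader() is from src/compiler/nir/nir.h; wiring
it into a driver's compile path like this is just an illustration.

/* Illustration: dump the IR before/after a suspect pass and diff. */
#include <stdio.h>
#include "nir.h"

static void
debug_dump(nir_shader *s, const char *when)
{
        fprintf(stderr, "=== NIR %s ===\n", when);
        nir_print_shader(s, stderr);
}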

Alyssa

On Tue, Feb 23, 2021 at 10:35:08AM -0800, Tommy Chou wrote:
>Hi,
>Could I get some tips on figuring out what the NIR intrinsics
>do, specifically the Vulkan-related ones? Also, what is the debug
>strategy that is used to debug NIR shaders to check if the implementation
>is correct?
>Thanks,
>Tommy



Re: [Mesa-dev] Perfetto CPU/GPU tracing

2021-02-18 Thread Alyssa Rosenzweig
> But on many other embedded OSes  - at least Google ones like  CrOS and
> Android - the security model is way stricter.  We could argue that is
> bad / undesirable / too draconian but that is something that any of us
> has the power to change. At some point each platform decides where it
> wants to be in the spectrum of "easy to hack" and "secure for the
> user". CrOS model is: you can hack as much as you want, but you need
> first to re-flash it in dev-mode.

Off-topic, but speaking as someone who grew up in libre software
"purist" circles, I'm a big fan of the CrOS model here.
Draconian is when the user _can't_ put it in dev mode. If you can,
there's nothing wrong with sane defaults.


Re: [Mesa-dev] Perfetto CPU/GPU tracing

2021-02-16 Thread Alyssa Rosenzweig
> That said, I'm ok with making perfetto support a build-time option
> that is default disabled.  And I think it would be even ok to limit
> use of perfetto to individual drivers (ie. no core mesa/gallium
> perfetto dependency) to start.  And, well, CrOS has plenty of mesa
> contributors so I think you can consider us signed up to maintain
> this.

As a general rule, I have no objection to build-time, default-disabled,
driver-specific stuff. It's not my job to make technical decisions for other
driver teams.


Re: [Mesa-dev] Perfetto CPU/GPU tracing

2021-02-12 Thread Alyssa Rosenzweig
Sure, I definitely see the use case for virgl :)

On Fri, Feb 12, 2021 at 02:43:25PM -0800, Chia-I Wu wrote:
> For virgl, where the biggest perf gaps often come from unnecessary CPU
> waits or high latencies of fence signaling, being able to insert
> userspace driver trace events and combine them with kernel ftrace
> events are a big plus.  Admittedly, there is no HW counters and my
> needs are simpler (inserting function begin/end and wait begin/end and
> combining them with virtio-gpu and dma-fence ftrace events).
> 
> On Fri, Feb 12, 2021 at 2:13 PM Alyssa Rosenzweig
>  wrote:
> >
> > My 2c for Mali/Panfrost --
> >
> > For us, capturing GPU perf counters is orthogonal to rendering. It's
> > expected (e.g. with Arm's tools) to do this from a separate process.
> > Neither Mesa nor the DDK should require custom instrumentation for the
> > low-level data. Fahien's gfx-pps handles this correctly for Panfrost +
> > Perfetto as it is. So for us I don't see the value in modifying Mesa for
> > tracing.
> >
> > On Fri, Feb 12, 2021 at 01:34:51PM -0800, John Bates wrote:
> > > (responding from correct address this time)
> > >
> > > On Fri, Feb 12, 2021 at 12:03 PM Mark Janes  
> > > wrote:
> > >
> > > > I've recently been using GPUVis to look at trace events.  On Intel
> > > > platforms, GPUVis incorporates ftrace events from the i915 driver,
> > > > performance metrics from igt-gpu-tools, and userspace ftrace markers
> > > > that I locally hack up in Mesa.
> > > >
> > >
> > > GPUVis is great. I would love to see that data combined with
> > > userspace events without any need for local hacks. Perfetto provides
> > > on-demand trace events with lower overhead compared to ftrace, so for
> > > example it is acceptable to have production trace instrumentation that can
> > > be captured without dev builds. To do that with ftrace it may require a 
> > > way
> > > to enable and disable the ftrace file writes to avoid the overhead when
> > > tracing is not in use. This is what Android does with systrace/atrace, for
> > > example, it uses Binder to notify processes about trace sessions. Perfetto
> > > does that in a more portable way.
> > >
> > >
> > > >
> > > > It is very easy to compile the GPUVis UI.  Userspace instrumentation
> > > > requires a single C/C++ header.  You don't have to access an external
> > > > web service to analyze trace data (a big no-no for devs working on
> > > > preproduction hardware).
> > > >
> > > > Is it possible to build and run the Perfetto UI locally?
> > >
> > >
> > > Yes, local UI builds are possible
> > > <https://github.com/google/perfetto/blob/5ff758df67da94d17734c2e70eb6738c4902953e/ui/README.md>.
> > > Also confirmed with the perfetto team <https://discord.gg/35ShE3A> that
> > > trace data is not uploaded unless you use the 'share' feature.
> > >
> > >
> > > >   Can it display
> > > > arbitrary trace events that are written to
> > > > /sys/kernel/tracing/trace_marker ?
> > >
> > >
> > > Yes, I believe it does support that via linux.ftrace data source
> > > <https://perfetto.dev/docs/quickstart/linux-tracing>. We use that for
> > > example to overlay CPU sched data to show what process is on each core
> > > throughout the timeline. There are many ftrace event types
> > > <https://github.com/google/perfetto/tree/5ff758df67da94d17734c2e70eb6738c4902953e/protos/perfetto/trace/ftrace>
> > > in
> > > the perfetto protos.
> > >
> > >
> > > > Can it be extended to show i915 and
> > > > i915-perf-recorder events?
> > > >
> > >
> > > It can be extended to consume custom data sources. One way this is done is
> > > via a bridge daemon, such as traced_probes which is responsible for
> > > capturing data from ftrace and /proc during a trace session and sending it
> > > to traced. traced is the main perfetto tracing daemon that notifies all
> > > trace data sources to start/stop tracing and communicates with user 
> > > tracing
> > > requests via the 'perfetto' command.
> > >
> > >
> > >
> > > >
> > > > John Bates  writes:
> > > >
> > > > > I recently opened issue 4262
> > > > > <https://gitlab.freedesktop.org/mesa/mesa/-/issues/4262> to begin the
> > > >

Re: [Mesa-dev] Perfetto CPU/GPU tracing

2021-02-12 Thread Alyssa Rosenzweig
My 2c for Mali/Panfrost --

For us, capturing GPU perf counters is orthogonal to rendering. It's
expected (e.g. with Arm's tools) to do this from a separate process.
Neither Mesa nor the DDK should require custom instrumentation for the
low-level data. Fahien's gfx-pps handles this correctly for Panfrost +
Perfetto as it is. So for us I don't see the value in modifying Mesa for
tracing.

On Fri, Feb 12, 2021 at 01:34:51PM -0800, John Bates wrote:
> (responding from correct address this time)
> 
> On Fri, Feb 12, 2021 at 12:03 PM Mark Janes  wrote:
> 
> > I've recently been using GPUVis to look at trace events.  On Intel
> > platforms, GPUVis incorporates ftrace events from the i915 driver,
> > performance metrics from igt-gpu-tools, and userspace ftrace markers
> > that I locally hack up in Mesa.
> >
> 
> GPUVis is great. I would love to see that data combined with
> userspace events without any need for local hacks. Perfetto provides
> on-demand trace events with lower overhead compared to ftrace, so for
> example it is acceptable to have production trace instrumentation that can
> be captured without dev builds. To do that with ftrace it may require a way
> to enable and disable the ftrace file writes to avoid the overhead when
> tracing is not in use. This is what Android does with systrace/atrace, for
> example, it uses Binder to notify processes about trace sessions. Perfetto
> does that in a more portable way.
> 
> 
> >
> > It is very easy to compile the GPUVis UI.  Userspace instrumentation
> > requires a single C/C++ header.  You don't have to access an external
> > web service to analyze trace data (a big no-no for devs working on
> > preproduction hardware).
> >
> > Is it possible to build and run the Perfetto UI locally?
> 
> 
> Yes, local UI builds are possible
> .
> Also confirmed with the perfetto team  that
> trace data is not uploaded unless you use the 'share' feature.
> 
> 
> >   Can it display
> > arbitrary trace events that are written to
> > /sys/kernel/tracing/trace_marker ?
> 
> 
> Yes, I believe it does support that via linux.ftrace data source
> . We use that for
> example to overlay CPU sched data to show what process is on each core
> throughout the timeline. There are many ftrace event types
> 
> in
> the perfetto protos.
> 
> 
> > Can it be extended to show i915 and
> > i915-perf-recorder events?
> >
> 
> It can be extended to consume custom data sources. One way this is done is
> via a bridge daemon, such as traced_probes which is responsible for
> capturing data from ftrace and /proc during a trace session and sending it
> to traced. traced is the main perfetto tracing daemon that notifies all
> trace data sources to start/stop tracing and communicates with user tracing
> requests via the 'perfetto' command.
> 
> 
> 
> >
> > John Bates  writes:
> >
> > > I recently opened issue 4262
> > >  to begin the
> > > discussion on integrating perfetto into mesa.
> > >
> > > *Background*
> > >
> > > System-wide tracing is an invaluable tool for developers to find and fix
> > > performance problems. The perfetto project enables a combined view of
> > trace
> > > data from kernel ftrace, GPU driver and various manually-instrumented
> > > tracepoints throughout the application and system. This helps developers
> > > quickly answer questions like:
> > >
> > >- How long are frames taking?
> > >- What caused a particular frame drop?
> > >- Is it CPU bound or GPU bound?
> > >- Did a CPU core frequency drop cause something to go slower than
> > usual?
> > >- Is something else running that is stealing CPU or GPU time? Could I
> > >fix that with better thread/context priorities?
> > >- Are all CPU cores being used effectively? Do I need
> > sched_setaffinity
> > >to keep my thread on a big or little core?
> > >- What’s the latency between CPU frame submit and GPU start?
> > >
> > > *What Does Mesa + Perfetto Provide?*
> > >
> > > Mesa is in a unique position to produce GPU trace data for several GPU
> > > vendors without requiring the developer to build and install additional
> > > tools like gfx-pps .
> > >
> > > The key is making it easy for developers to use. Ideally, perfetto is
> > > eventually available by default in mesa so that if your system has
> > perfetto
> > > traced running, you just need to run perfetto (perhaps along with setting
> > > an environment variable) with the mesa categories to see:
> > >
> > >- GPU processing timeline events.
> > >- GPU counters.
> > >- CPU events for potentially slow functions in mesa like shader
> > compiles.
> > >
> > > 

[Mesa-dev] Developer access

2021-01-21 Thread Alyssa Rosenzweig
Hi all,

Icecream95[1], a long-time Mesa/Panfrost contributor, has requested
developer access to mesa on the GitLab issue tracker [2]. Quoting here
for convenience of those who monitor the mailing list but not the
tracker:

> @alyssa keeps complaining about getting poked to assign stuff to Marge
> for me, so I think it's finally time to ask for Mesa developer access,
> so I can do the endless retries until CI stops flaking myself.
>
> Assuming a platform with char as unsigned, I think my 210 patches in
> Mesa is more than the 32 commits some have suggested to be required
> for the lofty privilege of labelling MRs and bothering the merge bot.

All else being equal, with my Panfrost maintainer hat, I am in favour.
Marge Bot access would reduce merge friction for their frequent
contributions, given the substantial timezone difference with our team
at Collabora. Additionally, labeling access would ease triage for us;
although I subscribe specifically to Panfrost MRs, I have to monitor all
MRs to ensure I haven't missed any from them, a frequent occurrence.

[ The cynic reads: This is more for my benefit than theirs ;-) ]

Thanks,

Alyssa

[1] https://gitlab.freedesktop.org/icecream95
[2] https://gitlab.freedesktop.org/mesa/mesa/-/issues/4135




Re: [Mesa-dev] Rust drivers in Mesa

2020-10-14 Thread Alyssa Rosenzweig
> Since the majority opinion seemed to be "if someone wanted to use it
> in a leaf node without making everyone use it, that's fine", I've
> started trying to put together the CI bits necessary to enable it.
> Currently fighting with meson cross files a bit, but the linting works
> and the amd64 build works.
> 
> https://gitlab.freedesktop.org/anholt/mesa/-/commits/ci-rust

Good luck :)

I didn't think to enforce a lint during CI. Part of me wonders if we
should be doing that for C too, but we can shave that yak some other time.





Re: [Mesa-dev] Rust drivers in Mesa

2020-10-14 Thread Alyssa Rosenzweig
> > I think it's just going to get more messy and complicated for people who 
> > don't want to learn or use another language. Mesa already requires people 
> > to know C, Python, and now newly Gitlab CI scripts just to get stuff done 
> > and merged. Another language would only exacerbate the issue and steepen 
> > the learning curve.
> 
> To some extent I agree, and I think that's a good reason to start with
> Rust in a leaf-node of the project (a driver or compiler backend)
> rather than in a common piece of infrastructure.

Indeed, at least to probe the waters I'm interested in a backend
compiler, which necessarily only touches a few people and a few
architectures (fewer cross-platform issues. Mali is only* ever
paired with ARM CPUs, which have a good Rust story).

* We don't talk about SoFIA




Re: [Mesa-dev] Rust drivers in Mesa

2020-10-14 Thread Alyssa Rosenzweig
> I have found that other tools like RAII/drop, the closely related smart
> pointer types, and safe containers (vectors, strings etc.) even without
> the borrow checker niceties, to be relatively more useful in preventing
> memory errors. However, these are features that modern C++ also offers,
> along with a seamless integration story with existing C and C++ code. I
> find that Rust has an edge in thread-safety, but I am not sure if this
> is a strong selling point in the context of Mesa, where the current
> design seems to be well served (for now) by the traditional thread
> safety patterns.

That's fair... aesthetically I've found Rust's presentation of these
patterns to be "nicer" than C++'s (although I recognize modern C++ is a
different beast than what I once learned). Technically you're right that
both languages have evolved quite some overlap.

> As an aside, as much as I like the practicality and richness of the
> crate ecosystem, which is one of the strong points of Rust, I do have
> concerns about its current security model. For example, the crates are
> not signed and thus vulnerable to several plausible attacks, like
> compromised github accounts or, even worse due to scale, compromised
> crate repositories. This is further aggravated by the sometimes large
> indirect dependency trees. Such concerns are especially relevant to Mesa
> (and other high-profile projects) since it could be an attractive target
> for malicious entities.  As enticing as it could be sometimes, I would
> be very hesitant to introduce external crate dependencies at this point
> in Mesa.

As mentioned, Rust with nothing more than the standard library is still
richer than what C gives us out-of-the-box. I don't expect to need
external crates flying around, at least for backend compiler purposes.




Re: [Mesa-dev] Rust drivers in Mesa

2020-10-14 Thread Alyssa Rosenzweig
> Yep. Before we can land a single bit of code, we need to do a bunch of
> annoying things like build-system integration, FFI bridging, agreeing
> on conventions and style, etc etc. Trying to do that whilst also
> replacing the GLSL compiler or vtn is way too much; it's 100% doomed
> to failure, even if they're the highest-value targets on some axes.

For some data points, Eric started working on build system integration
(see the email below). FFI will depend greatly on the component in question.
For a greenfield backend compiler that does NIR -> backend IR in C and
exposes the backend IR builders from Rust over a C ABI, there's little
in the way of FFI work needed. Conventions and style are easy --
go with rustfmt; the language is quite opinionated, so we can quit
quibbling and get coding :)




Re: [Mesa-dev] Rust drivers in Mesa

2020-10-14 Thread Alyssa Rosenzweig
> drive-by comment: for something like a gl driver that a lot of other
> things depend on, making it harder for us to depend on other external
> things is actually a good thing

I agree with this as well. The Rust standard library is richer than C's;
if we can get by fine with C + util/, that should be good enough for
Rust as well. We might end up growing a Rust-y version of util/ when we
have multiple Rust pieces in-tree, but Meson should be ok with that.




Re: [Mesa-dev] Rust drivers in Mesa

2020-10-04 Thread Alyssa Rosenzweig
Cc'd.

On Sun, Oct 04, 2020 at 03:17:28PM +0300, Michael Shigorin wrote:
>   Hello,
> regarding this proposal:
> http://lists.freedesktop.org/archives/mesa-dev/2020-October/224639.html
> 
> Alyssa, Rust is not "naturally fit for graphics driver
> development" since it's not as universally available as
> C/C++ in terms of mature compilers; you're basically
> trying to put a cart before a horse, which will just
> put more pressure on both Rust developers team *and*
> on those of us working on non-x86 arches capable of
> driving modern GPUs with Mesa.
> 
> For one, I'm porting ALT Linux onto e2k platform
> (there's only an early non-optimizing version of
> Rust port there by now), and we're maintaining repos
> for aarch64, armv7hf, ppc64el, mipsel, and riscv64 either
> -- none of which seem to be described as better than
> "experimental" at http://doc.rust-lang.org/core/arch page
> in terms of Rust support.
> 
> And there's another problem: no kind of tooling helps
> careless developers avoid mistakes as many previous
> silver bullet candidates have shown already; some of
> those only made things actually worse.
> 
> I am no Mesa developer, my closest contribution to it
> is probably an LLVM patch proposal.  But I've seen
> enough deterioration brought with the best intents
> of course -- both in free software during last ten
> years or so and in real life (I used to live in Kiev
> but not anymore, Denis can witness that if needed).
> 
> PS: feel free to CC: mesa-dev@, of course.
> 
> -- 
>   WBR, Michael Shigorin / http://altlinux.org
>   -- http://opennet.ru / http://anna-news.info


[Mesa-dev] Rust drivers in Mesa

2020-10-01 Thread Alyssa Rosenzweig
Hi all,

Recently I've been thinking about the potential for the Rust programming
language in Mesa. Rust bills itself as a safe systems programming language
with comparable performance to C [0], which is a natural fit for
graphics driver development.

Mesa today is written primarily in C, a notoriously low-level language,
with some components in C++. To handle the impedance mismatch, we've
built up a number of abstractions in-tree, including multiple ad hoc
code generators (GenXML, NIR algebraic passes, Bifrost disassembler). A
higher level language can help avoid the web of metaprogramming and
effect code that is simpler and easier to reason about. Similarly, a
better type system can aid static analysis.

Beyond abstraction, Rust's differentiating feature is the borrow checker
to guarantee memory safety. Historically, safety has not been a primary
concern of graphics drivers, since drivers are implemented as regular
userspace code running in the process of the application calling them.
Unfortunately, now that OpenGL is being exposed to untrusted code via
WebGL, the driver does become an attack vector.

For the time being, Mesa attempts to minimize memory bugs with defensive
programming, safe in-tree abstractions (including ralloc), and static
analysis via Coverity. Nevertheless, these are all heuristic solutions.
Static analysis is imperfect and in our case, proprietary software.
Ideally, the bugs we've been fixing via Coverity could be caught at
compile-time with a free and open source toolchain.
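
For readers unfamiliar with ralloc, the in-tree pattern looks roughly like
the sketch below (src/util/ralloc.h). The struct is hypothetical; the point
is that parenting allocations to a context removes a whole class of
leak and double-free bugs without any borrow checker.

/* Hypothetical example of the ralloc ownership pattern. */
#include "util/ralloc.h"

struct hypothetical_ir {
        unsigned num_blocks;
};

static void
build_and_discard(void)
{
        void *mem_ctx = ralloc_context(NULL);

        /* Zeroed allocation, owned by mem_ctx. */
        struct hypothetical_ir *ir = rzalloc(mem_ctx, struct hypothetical_ir);
        ir->num_blocks = 1;

        /* Frees ir and everything else parented to mem_ctx in one call. */
        ralloc_free(mem_ctx);
}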

As Rust would allow exactly this, I see the primary benefit of Rust in
verifying correctness and robustness, rather than security concerns per
se.  Indeed, safety guarantees do translate well beyond WebGL.

Practically, how would Rust fit in with our existing C codebase?
Obviously I'm not suggesting a rewrite of Mesa's more than 15 million
lines of C. Instead, I see value in introducing Rust in targeted parts
of the tree. In particular, I envision backend compilers written in part
in Rust. While creating an idiomatic Rust wrapper for NIR or Gallium
would be prohibitively costly for now, a backend compiler could be
written in Rust with IR builders exported for use of the NIR -> backend
IR translator written in C.
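
To make the shape of that split concrete, here is a hedged sketch of the C
side: the NIR -> backend IR translator stays in C and calls into builder
entry points implemented in Rust behind a plain C ABI. Every name below is
hypothetical; nothing like this exists in-tree.

/* Hypothetical header; the functions would be #[no_mangle] extern "C"
 * items on the Rust side. */
#include <stdint.h>

struct backend_ir; /* opaque to C; the Rust side owns the layout */

struct backend_ir *backend_ir_create(void);
void backend_ir_emit_fadd(struct backend_ir *ir,
                          uint32_t dest, uint32_t src0, uint32_t src1);
void backend_ir_finish(struct backend_ir *ir);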

This would have minimal impact on the tree. Users that are not building
such a driver would be unaffected. For those who _are_ building Rust
code, the Rust compiler would be added as a build-time dependency and
the (statically linked) Rust standard library would be added as a
runtime dependency. There is concern about the Rust compiler requiring
LLVM as a dependency, but again this is build-time, and no worse than
Mesa already requiring LLVM as a runtime dependency for llvmpipe and
clover. As for the standard library, it is possible to eliminate the
dependency as embedded Rust does, perhaps calling out to the C standard
library via the FFI, but this is likely quixotic. I do regret the binary
size increase, however.

Implications for the build system vary. Rust prefers to be built by its
own package manager, Cargo, which is tricky to integrate with other
build systems. Actually, Meson has native support for Rust, invoking the
compiler directly and skipping Cargo, as if it were C code. This support
is not widely adopted as it prevents linking with external libraries
("crates", in Rust parlance), with discussions between Rust and Meson
developers ending in a stand-still [1]. For Mesa, this might be just
fine. Our out-of-tree run-time dependencies are minimal for the C code,
and Rust's standard library largely avoids the need for us to maintain a
Rust version of util/ in-tree. If this proves impractical in the
long-term, it is possible to integrate Cargo with Meson on our end [2].

One outstanding concern is build-time, which has been a notorious
growing pain for Rust due to both language design and LLVM itself [3],
although there is active work to improve both fronts [4][5]. I build
Mesa on my Arm laptop, so I suppose I'd be hurt more than many of us.
There are also awkward bootstrapping questions, but there is work here too
[6].

If this is of interest, please discuss. It's clear to me Rust is not
going away any time soon, and I see value in Mesa embracing the new
technology. I'd like to hear other Mesa developers' thoughts.

Thanks,

Alyssa

[0] https://www.rust-lang.org/
[1] https://github.com/mesonbuild/meson/issues/2173
[2] https://gitlab.gnome.org/GNOME/fractal/-/blob/master/meson.build
[3] https://pingcap.com/blog/rust-compilation-model-calamity/
[4] 
https://blog.mozilla.org/nnethercote/2020/04/24/how-to-speed-up-the-rust-compiler-in-2020/
[5] https://github.com/bjorn3/rustc_codegen_cranelift
[6] https://github.com/thepowersgang/mrustc




[Mesa-dev] [PATCH 30/30] pan/midgard: Allow scheduling conditions with constants

2019-09-28 Thread Alyssa Rosenzweig
Now that we have constant adjustment logic abstracted, we can do this
safely. Along with the csel inversion patch, this allows many more
common csel ops to inline their condition in the bundle.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/panfrost/midgard/midgard_schedule.c b/src/panfrost/midgard/midgard_schedule.c
index 2d20b617d1a..a26ea05ee50 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -654,6 +654,7 @@ static unsigned
 mir_comparison_mobile(
 compiler_context *ctx,
 midgard_instruction **instructions,
+struct midgard_predicate *predicate,
 unsigned count,
 unsigned cond)
 {
@@ -676,9 +677,9 @@ mir_comparison_mobile(
 if (GET_CHANNEL_COUNT(alu_opcode_props[instructions[i]->alu.op].props))
 return ~0;
 
-/* TODO: moving conditionals with constants */
+/* Ensure it will fit with constants */
 
-if (instructions[i]->has_constants)
+if (!mir_adjust_constants(instructions[i], predicate, false))
 return ~0;
 
 /* Ensure it is written only once */
@@ -689,6 +690,10 @@ mir_comparison_mobile(
 ret = i;
 }
 
+/* Inject constants now that we are sure we want to */
+if (ret != ~0)
+mir_adjust_constants(instructions[ret], predicate, true);
+
 return ret;
 }
 
@@ -700,6 +705,7 @@ static midgard_instruction *
 mir_schedule_comparison(
 compiler_context *ctx,
 midgard_instruction **instructions,
+struct midgard_predicate *predicate,
 BITSET_WORD *worklist, unsigned count,
 unsigned cond, bool vector, unsigned swizzle,
 midgard_instruction *user)
@@ -707,7 +713,7 @@ mir_schedule_comparison(
 /* TODO: swizzle when scheduling */
 unsigned comp_i =
 (!vector && (swizzle == 0)) ?
-mir_comparison_mobile(ctx, instructions, count, cond) : ~0;
+mir_comparison_mobile(ctx, instructions, predicate, count, cond) : ~0;
 
 /* If we can, schedule the condition immediately */
 if ((comp_i != ~0) && BITSET_TEST(worklist, comp_i)) {
@@ -747,7 +753,7 @@ mir_schedule_condition(compiler_context *ctx,
 /* Grab the conditional instruction */
 
 midgard_instruction *cond = mir_schedule_comparison(
-ctx, instructions, worklist, count, last->src[condition_index],
+ctx, instructions, predicate, worklist, count, last->src[condition_index],
 vector, last->cond_swizzle, last);
 
 /* We have exclusive reign over this (possibly move) conditional
-- 
2.23.0


[Mesa-dev] [PATCH 27/30] pan/midgard: Tightly pack 32-bit constants

2019-09-28 Thread Alyssa Rosenzweig
If we can reuse constant slots from other instructions, we would like to
do so to include more instructions per bundle.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 129 +---
 1 file changed, 113 insertions(+), 16 deletions(-)

diff --git a/src/panfrost/midgard/midgard_schedule.c b/src/panfrost/midgard/midgard_schedule.c
index a35699893bd..2d20b617d1a 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -370,26 +370,123 @@ mir_adjust_constants(midgard_instruction *ins,
 if (!ins->has_constants)
 return true;
 
-/* TODO: Deduplicate; permit multiple constants within a bundle */
-
-if (destructive && !pred->constant_count) {
-if (ins->alu.reg_mode == midgard_reg_mode_16) {
-  /* TODO: Fix packing XXX */
-uint16_t *bundles = (uint16_t *) pred->constants;
-uint32_t *constants = (uint32_t *) ins->constants;
-
-/* Copy them wholesale */
-for (unsigned i = 0; i < 4; ++i)
-bundles[i] = constants[i];
-} else {
-memcpy(pred->constants, ins->constants, 16);
-}
+if (ins->alu.reg_mode == midgard_reg_mode_16) {
+/* TODO: 16-bit constant combining */
+if (pred->constant_count)
+return false;
+
+uint16_t *bundles = (uint16_t *) pred->constants;
+uint32_t *constants = (uint32_t *) ins->constants;
+
+/* Copy them wholesale */
+for (unsigned i = 0; i < 4; ++i)
+bundles[i] = constants[i];
 
 pred->constant_count = 16;
-return true;
+} else {
+/* Pack 32-bit constants */
+uint32_t *bundles = (uint32_t *) pred->constants;
+uint32_t *constants = (uint32_t *) ins->constants;
+unsigned r_constant = SSA_FIXED_REGISTER(REGISTER_CONSTANT);
+unsigned mask = mir_mask_of_read_components(ins, r_constant);
+
+/* First, check if it fits */
+unsigned count = DIV_ROUND_UP(pred->constant_count, sizeof(uint32_t));
+unsigned existing_count = count;
+
+for (unsigned i = 0; i < 4; ++i) {
+if (!(mask & (1 << i)))
+continue;
+
+bool ok = false;
+
+/* Look for existing constant */
+for (unsigned j = 0; j < existing_count; ++j) {
+if (bundles[j] == constants[i]) {
+ok = true;
+break;
+}
+}
+
+if (ok)
+continue;
+
+/* If the constant is new, check ourselves */
+for (unsigned j = 0; j < i; ++j) {
+if (constants[j] == constants[i]) {
+ok = true;
+break;
+}
+}
+
+if (ok)
+continue;
+
+/* Otherwise, this is a new constant */
+count++;
+}
+
+/* Check if we have space */
+if (count > 4)
+return false;
+
+/* If non-destructive, we're done */
+if (!destructive)
+return true;
+
+/* If destructive, let's copy in the new constants and adjust
+ * swizzles to pack it in. */
+
+uint32_t indices[4] = { 0 };
+
+/* Reset count */
+count = existing_count;
+
+for (unsigned i = 0; i < 4; ++i) {
+if (!(mask & (1 << i)))
+continue;
+
+uint32_t cons = constants[i];
+bool constant_found = false;
+
+/* Search for the constant */
+for (unsigned j = 0; j < count; ++j) {
+if (bundles[j] != cons)
+continue;
+
+/* We found it, reuse */
+indices[i] = j;
+constant_found = true;
+break;
+}
+
+if (constant_found)
+ 

[Mesa-dev] [PATCH 28/30] pan/midgard: Add mir_flip helper

2019-09-28 Thread Alyssa Rosenzweig
Useful for various operations on both commutative and anticommutative
ops.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/compiler.h   |  1 +
 src/panfrost/midgard/midgard_opt_invert.c | 13 +++--
 src/panfrost/midgard/mir.c| 17 +
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/src/panfrost/midgard/compiler.h b/src/panfrost/midgard/compiler.h
index cb7d53dd653..8bbf67aa4e7 100644
--- a/src/panfrost/midgard/compiler.h
+++ b/src/panfrost/midgard/compiler.h
@@ -524,6 +524,7 @@ bool mir_nontrivial_outmod(midgard_instruction *ins);
 
 void mir_insert_instruction_before_scheduled(compiler_context *ctx, midgard_block *block, midgard_instruction *tag, midgard_instruction ins);
 void mir_insert_instruction_after_scheduled(compiler_context *ctx, midgard_block *block, midgard_instruction *tag, midgard_instruction ins);
+void mir_flip(midgard_instruction *ins);
 
 /* MIR goodies */
 
diff --git a/src/panfrost/midgard/midgard_opt_invert.c 
b/src/panfrost/midgard/midgard_opt_invert.c
index 729169f9a4a..f846bca5bf9 100644
--- a/src/panfrost/midgard/midgard_opt_invert.c
+++ b/src/panfrost/midgard/midgard_opt_invert.c
@@ -256,16 +256,9 @@ midgard_opt_fuse_src_invert(compiler_context *ctx, 
midgard_block *block)
 if (both) {
 ins->alu.op = mir_demorgan_op(ins->alu.op);
 } else if (right || (left && !ins->has_inline_constant)) {
-if (left) {
-/* Commute */
-unsigned temp = ins->src[0];
-ins->src[0] = ins->src[1];
-ins->src[1] = temp;
-
-temp = ins->alu.src1;
-ins->alu.src1 = ins->alu.src2;
-ins->alu.src2 = temp;
-}
+/* Commute arguments */
+if (left)
+mir_flip(ins);
 
 ins->alu.op = mir_notright_op(ins->alu.op);
 } else if (left && ins->has_inline_constant) {
diff --git a/src/panfrost/midgard/mir.c b/src/panfrost/midgard/mir.c
index faeac16d18d..f02527ff219 100644
--- a/src/panfrost/midgard/mir.c
+++ b/src/panfrost/midgard/mir.c
@@ -527,3 +527,20 @@ mir_insert_instruction_after_scheduled(
 memcpy(bundles + after + 1, &new, sizeof(new));
 list_addtail([0]->link, 
_bundle_1->instructions[0]->link);
 }
+
+/* Flip the first two arguments of a (binary) op. Currently ALU
+ * only, no known uses for ldst/tex */
+
+void
+mir_flip(midgard_instruction *ins)
+{
+unsigned temp = ins->src[0];
+ins->src[0] = ins->src[1];
+ins->src[1] = temp;
+
+assert(ins->type == TAG_ALU_4);
+
+temp = ins->alu.src1;
+ins->alu.src1 = ins->alu.src2;
+ins->alu.src2 = temp;
+}
-- 
2.23.0


[Mesa-dev] [PATCH 25/30] pan/midgard: Allow 6 instructions per bundle

2019-09-28 Thread Alyssa Rosenzweig
We never had a scheduler good enough to hit this case before! :)

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/compiler.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/panfrost/midgard/compiler.h b/src/panfrost/midgard/compiler.h
index cf943ea6995..cb7d53dd653 100644
--- a/src/panfrost/midgard/compiler.h
+++ b/src/panfrost/midgard/compiler.h
@@ -198,9 +198,10 @@ typedef struct midgard_bundle {
 /* Tag for the overall bundle */
 int tag;
 
-/* Instructions contained by the bundle */
+/* Instructions contained by the bundle. instruction_count <= 6 (vmul,
+ * sadd, vadd, smul, vlut, branch) */
 int instruction_count;
-midgard_instruction *instructions[5];
+midgard_instruction *instructions[6];
 
 /* Bundle-wide ALU configuration */
 int padding;
-- 
2.23.0


[Mesa-dev] [PATCH 29/30] pan/midgard: Add csel invert optimization

2019-09-28 Thread Alyssa Rosenzweig
Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/compiler.h   |  1 +
 src/panfrost/midgard/midgard_compile.c|  1 +
 src/panfrost/midgard/midgard_opt_invert.c | 25 +++
 3 files changed, 27 insertions(+)

diff --git a/src/panfrost/midgard/compiler.h b/src/panfrost/midgard/compiler.h
index 8bbf67aa4e7..b0b5ba07143 100644
--- a/src/panfrost/midgard/compiler.h
+++ b/src/panfrost/midgard/compiler.h
@@ -654,6 +654,7 @@ void midgard_lower_invert(compiler_context *ctx, 
midgard_block *block);
 bool midgard_opt_not_propagate(compiler_context *ctx, midgard_block *block);
 bool midgard_opt_fuse_src_invert(compiler_context *ctx, midgard_block *block);
 bool midgard_opt_fuse_dest_invert(compiler_context *ctx, midgard_block *block);
+bool midgard_opt_csel_invert(compiler_context *ctx, midgard_block *block);
 bool midgard_opt_promote_fmov(compiler_context *ctx, midgard_block *block);
 
 #endif
diff --git a/src/panfrost/midgard/midgard_compile.c 
b/src/panfrost/midgard/midgard_compile.c
index f1f8b7ecb9a..857e6c70112 100644
--- a/src/panfrost/midgard/midgard_compile.c
+++ b/src/panfrost/midgard/midgard_compile.c
@@ -2512,6 +2512,7 @@ midgard_compile_shader_nir(struct midgard_screen *screen, 
nir_shader *nir, midga
 progress |= midgard_opt_not_propagate(ctx, block);
 progress |= midgard_opt_fuse_src_invert(ctx, block);
 progress |= midgard_opt_fuse_dest_invert(ctx, block);
+progress |= midgard_opt_csel_invert(ctx, block);
 }
 } while (progress);
 
diff --git a/src/panfrost/midgard/midgard_opt_invert.c 
b/src/panfrost/midgard/midgard_opt_invert.c
index f846bca5bf9..3a4c455877c 100644
--- a/src/panfrost/midgard/midgard_opt_invert.c
+++ b/src/panfrost/midgard/midgard_opt_invert.c
@@ -275,3 +275,28 @@ midgard_opt_fuse_src_invert(compiler_context *ctx, 
midgard_block *block)
 
 return progress;
 }
+
+/* Optimizes a .not away when used as the source of a conditional select:
+ *
+ * csel(a, b, c)  = { b if a, c if !a }
+ * csel(!a, b, c) = { b if !a, c if !(!a) } = { c if a, b if !a } = csel(a, c, 
b)
+ * csel(!a, b, c) = csel(a, c, b)
+ */
+
+bool
+midgard_opt_csel_invert(compiler_context *ctx, midgard_block *block)
+{
+bool progress = false;
+
+mir_foreach_instr_in_block_safe(block, ins) {
+if (ins->type != TAG_ALU_4) continue;
+if (!OP_IS_CSEL(ins->alu.op)) continue;
+if (!mir_single_use(ctx, ins->src[2])) continue;
+if (!mir_strip_inverted(ctx, ins->src[2])) continue;
+
+mir_flip(ins);
+progress |= true;
+}
+
+return progress;
+}
-- 
2.23.0


[Mesa-dev] [PATCH 24/30] pan/midgard: Only one conditional per bundle allowed

2019-09-28 Thread Alyssa Rosenzweig
There's no r32 to save ya after you use up r31 :)

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index 5f271608a30..6dfe8d5a72e 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -335,6 +335,10 @@ struct midgard_predicate {
 
 /* Exclude this destination (if not ~0) */
 unsigned exclude;
+
+/* Don't schedule instructions consuming conditionals (since we already
+ * scheduled one). Excludes conditional branches and csel */
+bool no_cond;
 };
 
 /* For an instruction that can fit, adjust it to fit and update the constants
@@ -394,12 +398,14 @@ mir_choose_instruction(
 unsigned unit = predicate->unit;
 bool branch = alu && (unit == ALU_ENAB_BR_COMPACT);
 bool scalar = (unit != ~0) && (unit & UNITS_SCALAR);
+bool no_cond = predicate->no_cond;
 
 /* Iterate to find the best instruction satisfying the predicate */
 unsigned i;
 BITSET_WORD tmp;
 
 signed best_index = -1;
+bool best_conditional = false;
 
 /* Enforce a simple metric limiting distance to keep down register
  * pressure. TODO: replace with liveness tracking for much better
@@ -434,11 +440,18 @@ mir_choose_instruction(
 if (alu && !mir_adjust_constants(instructions[i], predicate, 
false))
 continue;
 
+bool conditional = alu && !branch && 
OP_IS_CSEL(instructions[i]->alu.op);
+conditional |= (branch && !instructions[i]->prepacked_branch 
&& instructions[i]->branch.conditional);
+
+if (conditional && no_cond)
+continue;
+
 /* Simulate in-order scheduling */
 if ((signed) i < best_index)
 continue;
 
 best_index = i;
+best_conditional = conditional;
 }
 
 
@@ -455,6 +468,9 @@ mir_choose_instruction(
 
 if (alu)
 mir_adjust_constants(instructions[best_index], 
predicate, true);
+
+/* Once we schedule a conditional, we can't again */
+predicate->no_cond |= best_conditional;
 }
 
 return instructions[best_index];
-- 
2.23.0


[Mesa-dev] [PATCH 23/30] pan/midgard: Schedule to smul/sadd

2019-09-28 Thread Alyssa Rosenzweig
Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index 094451ceb9d..5f271608a30 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -753,6 +753,8 @@ mir_schedule_alu(
 unreachable("Bad condition");
 }
 
+mir_choose_alu(&smul, instructions, worklist, len, &predicate, 
UNIT_SMUL);
+
 if (!writeout)
 mir_choose_alu(&vlut, instructions, worklist, len, &predicate, 
UNIT_VLUT);
 
@@ -777,6 +779,9 @@ mir_schedule_alu(
 unreachable("Bad condition");
 }
 
+/* Stage 2, let's schedule sadd before vmul for writeout */
+mir_choose_alu(&sadd, instructions, worklist, len, &predicate, 
UNIT_SADD);
+
 /* Check if writeout reads its own register */
 bool bad_writeout = false;
 
-- 
2.23.0


[Mesa-dev] [PATCH 20/30] pan/midgard: Use new scheduler

2019-09-28 Thread Alyssa Rosenzweig
We still emit in-order but we switch to using the bundles created from
the new scheduler, which will allow greater flexibility and room for
out-of-order optimization.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/compiler.h |   6 -
 src/panfrost/midgard/midgard_compile.c  | 103 +---
 src/panfrost/midgard/midgard_schedule.c | 699 +---
 3 files changed, 130 insertions(+), 678 deletions(-)

diff --git a/src/panfrost/midgard/compiler.h b/src/panfrost/midgard/compiler.h
index 60d5b9d0e20..cf943ea6995 100644
--- a/src/panfrost/midgard/compiler.h
+++ b/src/panfrost/midgard/compiler.h
@@ -97,12 +97,6 @@ typedef struct midgard_instruction {
 /* I.e. (1 << alu_bit) */
 int unit;
 
-/* When emitting bundle, should this instruction have a break forced
- * before it? Used for r31 writes which are valid only within a single
- * bundle and *need* to happen as early as possible... this is a hack,
- * TODO remove when we have a scheduler */
-bool precede_break;
-
 bool has_constants;
 uint32_t constants[4];
 uint16_t inline_constant;
diff --git a/src/panfrost/midgard/midgard_compile.c 
b/src/panfrost/midgard/midgard_compile.c
index 337f46f2e0d..f1f8b7ecb9a 100644
--- a/src/panfrost/midgard/midgard_compile.c
+++ b/src/panfrost/midgard/midgard_compile.c
@@ -617,82 +617,6 @@ nir_is_non_scalar_swizzle(nir_alu_src *src, unsigned 
nr_components)
 return false;
 }
 
-/* Midgard puts scalar conditionals in r31.w; move an arbitrary source (the
- * output of a conditional test) into that register */
-
-static void
-emit_condition(compiler_context *ctx, nir_src *src, bool for_branch, unsigned 
component)
-{
-int condition = nir_src_index(ctx, src);
-
-/* Source to swizzle the desired component into w */
-
-const midgard_vector_alu_src alu_src = {
-.swizzle = SWIZZLE(component, component, component, component),
-};
-
-/* There is no boolean move instruction. Instead, we simulate a move by
- * ANDing the condition with itself to get it into r31.w */
-
-midgard_instruction ins = {
-.type = TAG_ALU_4,
-
-/* We need to set the conditional as close as possible */
-.precede_break = true,
-.unit = for_branch ? UNIT_SMUL : UNIT_SADD,
-.mask = 1 << COMPONENT_W,
-.src = { condition, condition, ~0 },
-.dest = SSA_FIXED_REGISTER(31),
-
-.alu = {
-.op = midgard_alu_op_iand,
-.outmod = midgard_outmod_int_wrap,
-.reg_mode = midgard_reg_mode_32,
-.dest_override = midgard_dest_override_none,
-.src1 = vector_alu_srco_unsigned(alu_src),
-.src2 = vector_alu_srco_unsigned(alu_src)
-},
-};
-
-emit_mir_instruction(ctx, ins);
-}
-
-/* Or, for mixed conditions (with csel_v), here's a vector version using all of
- * r31 instead */
-
-static void
-emit_condition_mixed(compiler_context *ctx, nir_alu_src *src, unsigned nr_comp)
-{
-int condition = nir_src_index(ctx, &src->src);
-
-/* Source to swizzle the desired component into w */
-
-const midgard_vector_alu_src alu_src = {
-.swizzle = SWIZZLE_FROM_ARRAY(src->swizzle),
-};
-
-/* There is no boolean move instruction. Instead, we simulate a move by
- * ANDing the condition with itself to get it into r31.w */
-
-midgard_instruction ins = {
-.type = TAG_ALU_4,
-.precede_break = true,
-.mask = mask_of(nr_comp),
-.src = { condition, condition, ~0 },
-.dest = SSA_FIXED_REGISTER(31),
-.alu = {
-.op = midgard_alu_op_iand,
-.outmod = midgard_outmod_int_wrap,
-.reg_mode = midgard_reg_mode_32,
-.dest_override = midgard_dest_override_none,
-.src1 = vector_alu_srco_unsigned(alu_src),
-.src2 = vector_alu_srco_unsigned(alu_src)
-},
-};
-
-emit_mir_instruction(ctx, ins);
-}
-
 #define ALU_CASE(nir, _op) \
case nir_op_##nir: \
op = midgard_alu_op_##_op; \
@@ -980,21 +904,16 @@ emit_alu(compiler_context *ctx, nir_alu_instr *instr)
 bool mixed = nir_is_non_scalar_swizzle(>src[0], 
nr_components);
 op = mixed ? midgard_alu_op_icsel_v : midgard_alu_op_icsel;
 
-/* csel works as a two-arg in Midgard, since the condition is 
hardcoded in r31.w */
-nr_inputs = 2;
-
-/* Emit the condition into r31 */
-
-if (mixed)
-emit_condition_mixed(ctx, &g

[Mesa-dev] [PATCH 26/30] pan/midgard: Allow writeout to see into the future

2019-09-28 Thread Alyssa Rosenzweig
If an instruction could be scheduled to vmul to satisfy the writeout
conditions, let's do that and save an instruction+cycle per fragment
shader.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 41 -
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index 6dfe8d5a72e..a35699893bd 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -339,6 +339,12 @@ struct midgard_predicate {
 /* Don't schedule instructions consuming conditionals (since we already
  * scheduled one). Excludes conditional branches and csel */
 bool no_cond;
+
+/* Require a minimal mask and (if nonzero) given destination. Used for
+ * writeout optimizations */
+
+unsigned mask;
+unsigned dest;
 };
 
 /* For an instruction that can fit, adjust it to fit and update the constants
@@ -400,6 +406,10 @@ mir_choose_instruction(
 bool scalar = (unit != ~0) && (unit & UNITS_SCALAR);
 bool no_cond = predicate->no_cond;
 
+unsigned mask = predicate->mask;
+unsigned dest = predicate->dest;
+bool needs_dest = mask & 0xF;
+
 /* Iterate to find the best instruction satisfying the predicate */
 unsigned i;
 BITSET_WORD tmp;
@@ -440,6 +450,12 @@ mir_choose_instruction(
 if (alu && !mir_adjust_constants(instructions[i], predicate, 
false))
 continue;
 
+if (needs_dest && instructions[i]->dest != dest)
+continue;
+
+if (mask && ((~instructions[i]->mask) & mask))
+continue;
+
 bool conditional = alu && !branch && 
OP_IS_CSEL(instructions[i]->alu.op);
 conditional |= (branch && !instructions[i]->prepacked_branch 
&& instructions[i]->branch.conditional);
 
@@ -817,7 +833,30 @@ mir_schedule_alu(
 bad_writeout |= mir_has_arg(stages[i], branch->src[0]);
 }
 
-/* Add a move if necessary */
+/* It's possible we'll be able to schedule something into vmul
+ * to fill r0. Let's peek into the future, trying to schedule
+ * vmul specially that way. */
+
+if (!bad_writeout && writeout_mask != 0xF) {
+predicate.unit = UNIT_VMUL;
+predicate.dest = src;
+predicate.mask = writeout_mask ^ 0xF;
+
+struct midgard_instruction *peaked =
+mir_choose_instruction(instructions, worklist, 
len, &predicate);
+
+if (peaked) {
+vmul = peaked;
+vmul->unit = UNIT_VMUL;
+writeout_mask |= predicate.mask;
+assert(writeout_mask == 0xF);
+}
+
+/* Cleanup */
+predicate.dest = predicate.mask = 0;
+}
+
+/* Finally, add a move if necessary */
 if (bad_writeout || writeout_mask != 0xF) {
 unsigned temp = (branch->src[0] == ~0) ? 
SSA_FIXED_REGISTER(0) : make_compiler_temp(ctx);
 midgard_instruction mov = v_mov(src, blank_alu_src, 
temp);
-- 
2.23.0


[Mesa-dev] [PATCH 17/30] pan/midgard: Implement load/store pairing

2019-09-28 Thread Alyssa Rosenzweig
We can bundle two load/store ops together. This also eliminates the need
for explicit load/store pairing in a prepass.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 67 +
 1 file changed, 12 insertions(+), 55 deletions(-)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index 92756e15189..e2641ea0180 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -1141,17 +1141,26 @@ mir_schedule_ldst(
 .exclude = ~0
 };
 
+/* Try to pick two load/store ops. Second not guaranteed to exist */
+
 midgard_instruction *ins =
 mir_choose_instruction(instructions, worklist, len, 
&predicate);
 
-mir_update_worklist(worklist, len, instructions, ins);
+midgard_instruction *pair =
+mir_choose_instruction(instructions, worklist, len, 
&predicate);
 
 struct midgard_bundle out = {
 .tag = TAG_LOAD_STORE_4,
-.instruction_count = 1,
-.instructions = { ins }
+.instruction_count = pair ? 2 : 1,
+.instructions = { ins, pair }
 };
 
+/* We have to update the worklist atomically, since the two
+ * instructions run concurrently (TODO: verify it's not pipelined) */
+
+mir_update_worklist(worklist, len, instructions, ins);
+mir_update_worklist(worklist, len, instructions, pair);
+
 return out;
 }
 
@@ -1341,54 +1350,6 @@ schedule_block(compiler_context *ctx, midgard_block 
*block)
 ctx->quadword_count += block->quadword_count;
 }
 
-/* The following passes reorder MIR instructions to enable better scheduling */
-
-static void
-midgard_pair_load_store(compiler_context *ctx, midgard_block *block)
-{
-mir_foreach_instr_in_block_safe(block, ins) {
-if (ins->type != TAG_LOAD_STORE_4) continue;
-
-/* We've found a load/store op. Check if next is also 
load/store. */
-midgard_instruction *next_op = mir_next_op(ins);
-if (&next_op->link != &block->instructions) {
-if (next_op->type == TAG_LOAD_STORE_4) {
-/* If so, we're done since we're a pair */
-ins = mir_next_op(ins);
-continue;
-}
-
-/* Maximum search distance to pair, to avoid register 
pressure disasters */
-int search_distance = 8;
-
-/* Otherwise, we have an orphaned load/store -- search 
for another load */
-mir_foreach_instr_in_block_from(block, c, 
mir_next_op(ins)) {
-/* Terminate search if necessary */
-if (!(search_distance--)) break;
-
-if (c->type != TAG_LOAD_STORE_4) continue;
-
-/* We can only reorder if there are no sources 
*/
-
-bool deps = false;
-
-for (unsigned s = 0; s < ARRAY_SIZE(ins->src); 
++s)
-deps |= (c->src[s] != ~0);
-
-if (deps)
-continue;
-
-/* We found one! Move it up to pair and remove 
it from the old location */
-
-mir_insert_instruction_before(ctx, ins, *c);
-mir_remove_instruction(c);
-
-break;
-}
-}
-}
-}
-
 /* When we're 'squeezing down' the values in the IR, we maintain a hash
  * as such */
 
@@ -1665,10 +1626,6 @@ schedule_program(compiler_context *ctx)
 
 midgard_promote_uniforms(ctx, 16);
 
-mir_foreach_block(ctx, block) {
-midgard_pair_load_store(ctx, block);
-}
-
 /* Must be lowered right before RA */
 mir_squeeze_index(ctx);
 mir_lower_special_reads(ctx);
-- 
2.23.0


[Mesa-dev] [PATCH 21/30] pan/midgard: Don't double check SCALAR units

2019-09-28 Thread Alyssa Rosenzweig
Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index f883830cf86..e71f65f6004 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -188,10 +188,6 @@ is_single_component_mask(unsigned mask)
 static bool
 mir_is_scalar(midgard_instruction *ains)
 {
-/* Does the op support scalar units? */
-if (!(alu_opcode_props[ains->alu.op].props & UNITS_SCALAR))
-return false;
-
 /* Do we try to use it as a vector op? */
 if (!is_single_component_mask(ains->mask))
 return false;
-- 
2.23.0


[Mesa-dev] [PATCH 16/30] pan/midgard: Extend csel_swizzle to branches

2019-09-28 Thread Alyssa Rosenzweig
Conditions for branches don't have a swizzle explicitly in the emitted
binary, but they do implicitly get swizzled in whatever instruction
wrote r31, so we need to handle that.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/compiler.h| 4 ++--
 src/panfrost/midgard/midgard_compile.c | 2 +-
 src/panfrost/midgard/mir.c | 9 +++--
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/panfrost/midgard/compiler.h b/src/panfrost/midgard/compiler.h
index 780e4323554..60d5b9d0e20 100644
--- a/src/panfrost/midgard/compiler.h
+++ b/src/panfrost/midgard/compiler.h
@@ -88,8 +88,8 @@ typedef struct midgard_instruction {
 unsigned src[3];
 unsigned dest;
 
-/* Swizzle for the conditional for a csel */
-unsigned csel_swizzle;
+/* Swizzle for the conditional for a csel/branch */
+unsigned cond_swizzle;
 
 /* Special fields for an ALU instruction */
 midgard_reg_info registers;
diff --git a/src/panfrost/midgard/midgard_compile.c 
b/src/panfrost/midgard/midgard_compile.c
index 95ec48e9563..337f46f2e0d 100644
--- a/src/panfrost/midgard/midgard_compile.c
+++ b/src/panfrost/midgard/midgard_compile.c
@@ -1095,7 +1095,7 @@ emit_alu(compiler_context *ctx, nir_alu_instr *instr)
 };
 
 if (nr_inputs == 3) {
-ins.csel_swizzle = SWIZZLE_FROM_ARRAY(nirmods[2]->swizzle);
+ins.cond_swizzle = SWIZZLE_FROM_ARRAY(nirmods[2]->swizzle);
 assert(!nirmods[2]->abs);
 assert(!nirmods[2]->negate);
 }
diff --git a/src/panfrost/midgard/mir.c b/src/panfrost/midgard/mir.c
index 9e5ba7abcb0..faeac16d18d 100644
--- a/src/panfrost/midgard/mir.c
+++ b/src/panfrost/midgard/mir.c
@@ -42,8 +42,8 @@ unsigned
 mir_get_swizzle(midgard_instruction *ins, unsigned idx)
 {
 if (ins->type == TAG_ALU_4) {
-if (idx == 2)
-return ins->csel_swizzle;
+if (idx == 2 || ins->compact_branch)
+return ins->cond_swizzle;
 
 unsigned b = (idx == 0) ? ins->alu.src1 : ins->alu.src2;
 
@@ -105,6 +105,11 @@ void
 mir_set_swizzle(midgard_instruction *ins, unsigned idx, unsigned new)
 {
 if (ins->type == TAG_ALU_4) {
+if (idx == 2 || ins->compact_branch) {
+ins->cond_swizzle = new;
+return;
+}
+
 unsigned b = (idx == 0) ? ins->alu.src1 : ins->alu.src2;
 
 midgard_vector_alu_src s =
-- 
2.23.0


[Mesa-dev] [PATCH 18/30] pan/midgard: Add mir_choose_alu helper

2019-09-28 Thread Alyssa Rosenzweig
Based on a given unit.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index e2641ea0180..dea8b023e9d 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -818,6 +818,9 @@ struct midgard_predicate {
 /* True if we want to pop off the chosen instruction */
 bool destructive;
 
+/* For ALU, choose only this unit */
+unsigned unit;
+
 /* State for bundle constants. constants is the actual constants
  * for the bundle. constant_count is the number of bytes (up to
  * 16) currently in use for constants. When picking in destructive
@@ -957,6 +960,27 @@ mir_choose_bundle(
 return ~0;
 }
 
+/* We want to choose an ALU instruction filling a given unit */
+static void
+mir_choose_alu(midgard_instruction **slot,
+midgard_instruction **instructions,
+BITSET_WORD *worklist, unsigned len,
+struct midgard_predicate *predicate,
+unsigned unit)
+{
+/* Did we already schedule to this slot? */
+if ((*slot) != NULL)
+return;
+
+/* Try to schedule something, if not */
+predicate->unit = unit;
+*slot = mir_choose_instruction(instructions, worklist, len, predicate);
+
+/* Store unit upon scheduling */
+if (*slot && !((*slot)->compact_branch))
+(*slot)->unit = unit;
+}
+
 /* When we are scheduling a branch/csel, we need the consumed condition in the
  * same block as a pipeline register. There are two options to enable this:
  *
-- 
2.23.0


[Mesa-dev] [PATCH 05/30] pan/midgard: Calculate dependency graph

2019-09-28 Thread Alyssa Rosenzweig
Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/compiler.h |  10 ++
 src/panfrost/midgard/midgard_schedule.c | 121 
 2 files changed, 131 insertions(+)

diff --git a/src/panfrost/midgard/compiler.h b/src/panfrost/midgard/compiler.h
index 8612bab7686..780e4323554 100644
--- a/src/panfrost/midgard/compiler.h
+++ b/src/panfrost/midgard/compiler.h
@@ -136,6 +136,16 @@ typedef struct midgard_instruction {
 /* Generic hint for intra-pass use */
 bool hint;
 
+/* During scheduling, the backwards dependency graph
+ * (DAG). nr_dependencies is the number of unscheduled
+ * instructions that must still be scheduled after
+ * (before) this instruction. dependents are which
+ * instructions need to be scheduled before (after) this
+ * instruction. */
+
+unsigned nr_dependencies;
+BITSET_WORD *dependents;
+
 union {
 midgard_load_store_word load_store;
 midgard_vector_alu alu;
diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index 75295b5d123..70fa030390c 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -55,6 +55,119 @@
  *
  */
 
+/* We create the dependency graph with per-component granularity */
+
+#define COMPONENT_COUNT 8
+
+static void
+add_dependency(struct util_dynarray *table, unsigned index, unsigned mask, 
midgard_instruction **instructions, unsigned child)
+{
+for (unsigned i = 0; i < COMPONENT_COUNT; ++i) {
+if (!(mask & (1 << i)))
+continue;
+
+struct util_dynarray *parents = [(COMPONENT_COUNT * 
index) + i];
+
+util_dynarray_foreach(parents, unsigned, parent) {
+BITSET_WORD *dependents = 
instructions[*parent]->dependents;
+
+/* Already have the dependency */
+if (BITSET_TEST(dependents, child))
+continue;
+
+BITSET_SET(dependents, child);
+instructions[child]->nr_dependencies++;
+}
+}
+}
+
+static void
+mark_access(struct util_dynarray *table, unsigned index, unsigned mask, 
unsigned parent)
+{
+for (unsigned i = 0; i < COMPONENT_COUNT; ++i) {
+if (!(mask & (1 << i)))
+continue;
+
+util_dynarray_append([(COMPONENT_COUNT * index) + i], 
unsigned, parent);
+}
+}
+
+static void
+mir_create_dependency_graph(midgard_instruction **instructions, unsigned 
count, unsigned node_count)
+{
+size_t sz = node_count * COMPONENT_COUNT;
+
+struct util_dynarray *last_read = calloc(sizeof(struct util_dynarray), 
sz);
+struct util_dynarray *last_write = calloc(sizeof(struct 
util_dynarray), sz);
+
+for (unsigned i = 0; i < sz; ++i) {
+util_dynarray_init(_read[i], NULL);
+util_dynarray_init(_write[i], NULL);
+}
+
+/* Initialize dependency graph */
+for (unsigned i = 0; i < count; ++i) {
+instructions[i]->dependents =
+calloc(BITSET_WORDS(count), sizeof(BITSET_WORD));
+
+instructions[i]->nr_dependencies = 0;
+}
+
+/* Populate dependency graph */
+for (signed i = count - 1; i >= 0; --i) {
+if (instructions[i]->compact_branch)
+continue;
+
+unsigned dest = instructions[i]->dest;
+unsigned mask = instructions[i]->mask;
+
+mir_foreach_src((*instructions), s) {
+unsigned src = instructions[i]->src[s];
+
+if (src < node_count) {
+unsigned readmask = 
mir_mask_of_read_components(instructions[i], src);
+add_dependency(last_write, src, readmask, 
instructions, i);
+}
+}
+
+if (dest < node_count) {
+add_dependency(last_read, dest, mask, instructions, i);
+add_dependency(last_write, dest, mask, instructions, 
i);
+mark_access(last_write, dest, mask, i);
+}
+
+mir_foreach_src((*instructions), s) {
+unsigned src = instructions[i]->src[s];
+
+if (src < node_count) {
+unsigned readmask = 
mir_mask_of_read_components(instructions[i], src);
+mark_access(last_read, src, readmask, i);
+}
+}
+}
+
+/* If there is a branch, all instructions depend on it, as interblock
+ * execution must be pu

[Mesa-dev] [PATCH 04/30] pan/midgard: Add flatten_mir helper

2019-09-28 Thread Alyssa Rosenzweig
We would like to flatten a linked list of midgard_instructions into an
array of midgard_instruction pointers on the heap.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index 2ed25da2e0d..75295b5d123 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -620,6 +620,28 @@ schedule_bundle(compiler_context *ctx, midgard_block 
*block, midgard_instruction
 return bundle;
 }
 
+/* We would like to flatten the linked list of midgard_instructions in a bundle
+ * to an array of pointers on the heap for easy indexing */
+
+static midgard_instruction **
+flatten_mir(midgard_block *block, unsigned *len)
+{
+*len = list_length(>instructions);
+
+if (!(*len))
+return NULL;
+
+midgard_instruction **instructions =
+calloc(sizeof(midgard_instruction *), *len);
+
+unsigned i = 0;
+
+mir_foreach_instr_in_block(block, ins)
+instructions[i++] = ins;
+
+return instructions;
+}
+
 /* Schedule a single block by iterating its instruction to create bundles.
  * While we go, tally about the bundle sizes to compute the block size. */
 
-- 
2.23.0


[Mesa-dev] [PATCH 00/30] pan/midgard: Implement out-of-order scheduler

2019-09-28 Thread Alyssa Rosenzweig
For each block, we compute an instruction dependency graph, which allows
us to maintain a worklist of instructions able to be scheduled (no
dependencies). We traverse this graph backwards, emitting instructions
backwards and reversing the block at the end, to allow conditionals and
writeout to be expressed naturally and to enable liveness tracking in
the future.
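
For readers who want the shape of the algorithm in one place, here is a
minimal, self-contained sketch of a backwards list scheduler of this kind.
It is only an illustration of the approach described above, not the Mesa
code: the toy instruction struct, the explicit edge table and the
pick-the-last-ready rule are simplifications.

    /* cc -o sched sched.c && ./sched */
    #include <stdbool.h>
    #include <stdio.h>

    #define N 4

    struct toy_ins {
            const char *name;
            unsigned nr_dependencies;  /* unscheduled consumers remaining */
            bool releases[N];          /* releases[p]: p is waiting on us */
            bool done;
    };

    /* Picking the highest ready index mimics in-order scheduling, which is
     * what the series does before smarter heuristics arrive. */
    static int
    choose(struct toy_ins *block)
    {
            int best = -1;

            for (int i = 0; i < N; ++i) {
                    if (!block[i].done && block[i].nr_dependencies == 0)
                            best = i;
            }

            return best;
    }

    int
    main(void)
    {
            /* a and b feed c; c feeds d */
            struct toy_ins block[N] = { {"a"}, {"b"}, {"c"}, {"d"} };

            /* producer -> consumer edges; the *producer* gains a dependency,
             * because we schedule backwards from the end of the block */
            int edge[][2] = { {0, 2}, {1, 2}, {2, 3} };

            for (unsigned e = 0; e < sizeof(edge) / sizeof(edge[0]); ++e) {
                    block[edge[e][1]].releases[edge[e][0]] = true;
                    block[edge[e][0]].nr_dependencies++;
            }

            const char *out[N];
            unsigned emitted = 0;

            while (emitted < N) {
                    int i = choose(block);
                    if (i < 0)
                            break;  /* cycle; cannot happen for a valid block */

                    block[i].done = true;
                    out[emitted++] = block[i].name;

                    /* The equivalent of mir_update_worklist: release whoever
                     * was waiting on i */
                    for (int p = 0; p < N; ++p) {
                            if (block[i].releases[p])
                                    block[p].nr_dependencies--;
                    }
            }

            /* We emitted backwards; reverse to recover program order */
            for (unsigned k = emitted; k > 0; --k)
                    printf("%s ", out[k - 1]);
            printf("\n");  /* prints: a b c d */

            return 0;
    }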

At the moment, we do not schedule for register pressure; we just aim to
minimize bundles (register pressure is capped here by a distance
metric). Scheduling for register pressure will be the next step as soon
as this series is finalized; at that point, we'll augment
mir_choose_instruction and mir_choose_bundle with pressure-aware
heuristics.

Overall, I'm quite happy with how this came out. Full stats below.
Highlight is a 10% reduction of cycle count for glmark shaders.

total instructions in shared programs: 3500 -> 3472 (-0.80%)
instructions in affected programs: 685 -> 657 (-4.09%)
helped: 15
HURT: 1
helped stats (abs) min: 1 max: 5 x̄: 1.93 x̃: 1
helped stats (rel) min: 1.49% max: 12.20% x̄: 4.65% x̃: 3.45%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 4.00% max: 4.00% x̄: 4.00% x̃: 4.00%
95% mean confidence interval for instructions value: -2.61 -0.89
95% mean confidence interval for instructions %-change: -6.03% -2.18%
Instructions are helped.

total bundles in shared programs: 2020 -> 1804 (-10.69%)
bundles in affected programs: 1343 -> 1127 (-16.08%)
helped: 45
HURT: 0
helped stats (abs) min: 1 max: 48 x̄: 4.80 x̃: 2
helped stats (rel) min: 5.56% max: 34.78% x̄: 13.50% x̃: 12.50%
95% mean confidence interval for bundles value: -7.15 -2.45
95% mean confidence interval for bundles %-change: -15.98% -11.03%
Bundles are helped.

total quadwords in shared programs: 3336 -> 3130 (-6.18%)
quadwords in affected programs: 2162 -> 1956 (-9.53%)
helped: 51
HURT: 0
helped stats (abs) min: 1 max: 17 x̄: 4.04 x̃: 2
helped stats (rel) min: 1.72% max: 22.92% x̄: 10.94% x̃: 11.11%
95% mean confidence interval for quadwords value: -5.32 -2.76
95% mean confidence interval for quadwords %-change: -12.47% -9.42%
Quadwords are helped.

total registers in shared programs: 343 -> 327 (-4.66%)
registers in affected programs: 123 -> 107 (-13.01%)
helped: 22
HURT: 5
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 12.50% max: 33.33% x̄: 25.61% x̃: 25.00%
HURT stats (abs)   min: 1 max: 2 x̄: 1.20 x̃: 1
HURT stats (rel)   min: 8.33% max: 100.00% x̄: 37.00% x̃: 20.00%
95% mean confidence interval for registers value: -0.94 -0.24
95% mean confidence interval for registers %-change: -25.65% -2.38%
Registers are helped.

total threads in shared programs: 451 -> 459 (1.77%)
threads in affected programs: 8 -> 16 (100.00%)
helped: 4
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for threads value: 2.00 2.00
95% mean confidence interval for threads %-change: 100.00% 100.00%

total spills in shared programs: 3 -> 3 (0.00%)
spills in affected programs: 0 -> 0
helped: 0
HURT: 0

total fills in shared programs: 5 -> 6 (20.00%)
fills in affected programs: 5 -> 6 (20.00%)
helped: 0
HURT: 1

Alyssa Rosenzweig (30):
  pan/midgard: Add missing parans in SWIZZLE definition
  pan/midgard: Fix component count handling for ldst
  pan/midgard: Squeeze indices before scheduling
  pan/midgard: Add flatten_mir helper
  pan/midgard: Calculate dependency graph
  pan/midgard: Initialize worklist
  pan/midgard: Add mir_choose_instruction stub
  pan/midgard: Add mir_update_worklist helper
  pan/midgard: Add mir_choose_bundle helper
  pan/midgard: Add mir_schedule_texture/ldst/alu helpers
  pan/midgard: Remove csel constant unit force
  pan/midgard: Add constant intersection filters
  pan/midgard: Add predicate->exclude
  pan/midgard: Implement predicate->unit
  pan/midgard: Add helpers for scheduling conditionals
  pan/midgard: Extend csel_swizzle to branches
  pan/midgard: Implement load/store pairing
  pan/midgard: Add mir_choose_alu helper
  pan/midgard: Add distance metric to choose_instruction
  pan/midgard: Use new scheduler
  pan/midgard: Don't double check SCALAR units
  pan/midgard: Extend choose_instruction for scalar units
  pan/midgard: Schedule to smul/sadd
  pan/midgard: Only one conditional per bundle allowed
  pan/midgard: Allow 6 instructions per bundle
  pan/midgard: Allow writeout to see into the future
  pan/midgard: Tightly pack 32-bit constants
  pan/midgard: Add mir_flip helper
  pan/midgard: Add csel invert optimization
  pan/midgard: Allow scheduling conditions with constants

 src/panfrost/midgard/compiler.h   |   27 +-
 src/panfrost/midgard/helpers.h|   17 +-
 src/panfrost/midgard/midgard_compile.c|  109 +-
 src/panfrost/midgard/midgard_opt_invert.c |   38 +-
 src/panfrost/midgard/midgard_schedule.c   | 1303 ++---
 src/panfrost

[Mesa-dev] [PATCH 22/30] pan/midgard: Extend choose_instruction for scalar units

2019-09-28 Thread Alyssa Rosenzweig
Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index e71f65f6004..094451ceb9d 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -393,6 +393,7 @@ mir_choose_instruction(
 bool alu = tag == TAG_ALU_4;
 unsigned unit = predicate->unit;
 bool branch = alu && (unit == ALU_ENAB_BR_COMPACT);
+bool scalar = (unit != ~0) && (unit & UNITS_SCALAR);
 
 /* Iterate to find the best instruction satisfying the predicate */
 unsigned i;
@@ -427,6 +428,9 @@ mir_choose_instruction(
 if (branch && !instructions[i]->compact_branch)
 continue;
 
+if (alu && scalar && !mir_is_scalar(instructions[i]))
+continue;
+
 if (alu && !mir_adjust_constants(instructions[i], predicate, 
false))
 continue;
 
-- 
2.23.0


[Mesa-dev] [PATCH 02/30] pan/midgard: Fix component count handling for ldst

2019-09-28 Thread Alyssa Rosenzweig
It's not based on the writemask and it can't be inferred; it's just
intrinsic to the op itself.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/helpers.h | 15 +++--
 src/panfrost/midgard/mir.c | 59 ++
 2 files changed, 37 insertions(+), 37 deletions(-)

diff --git a/src/panfrost/midgard/helpers.h b/src/panfrost/midgard/helpers.h
index 343fad0fea8..024a6380d29 100644
--- a/src/panfrost/midgard/helpers.h
+++ b/src/panfrost/midgard/helpers.h
@@ -392,9 +392,20 @@ swizzle_to_component(unsigned swizzle)
 
 
 static inline unsigned
-component_to_swizzle(unsigned c)
+component_to_swizzle(unsigned c, unsigned count)
 {
-return SWIZZLE(c, c, c, c);
+switch (count) {
+case 1:
+return SWIZZLE(c, c, c, c);
+case 2:
+return SWIZZLE(c, c + 1, c + 1, c + 1);
+case 3:
+return SWIZZLE(c, c + 1, c + 2, c + 2);
+case 4:
+return SWIZZLE(c, c + 1, c + 2, c + 3);
+default:
+unreachable("Invalid component count");
+}
 }
 
 #endif
diff --git a/src/panfrost/midgard/mir.c b/src/panfrost/midgard/mir.c
index 8874937aa5d..9e5ba7abcb0 100644
--- a/src/panfrost/midgard/mir.c
+++ b/src/panfrost/midgard/mir.c
@@ -64,7 +64,24 @@ mir_get_swizzle(midgard_instruction *ins, unsigned idx)
 uint8_t raw =
 (idx == 2) ? ins->load_store.arg_2 : 
ins->load_store.arg_1;
 
-return 
component_to_swizzle(midgard_ldst_select(raw).component);
+/* TODO: Integrate component count with properties */
+unsigned components = 1;
+switch (ins->load_store.op) {
+case midgard_op_ld_int4:
+components = (idx == 0) ? 2 : 1;
+break;
+case midgard_op_st_int4:
+components = (idx == 1) ? 2 : 1;
+break;
+case midgard_op_ld_cubemap_coords:
+components = 3;
+break;
+default:
+components = 1;
+break;
+}
+
+return 
component_to_swizzle(midgard_ldst_select(raw).component, components);
 }
 default:
 unreachable("Unknown load/store source");
@@ -362,26 +379,6 @@ mir_source_count(midgard_instruction *ins)
 }
 }
 
-static unsigned
-mir_component_count_implicit(midgard_instruction *ins, unsigned i)
-{
-if (ins->type == TAG_LOAD_STORE_4) {
-switch (ins->load_store.op) {
-/* Address implicitly 64-bit */
-case midgard_op_ld_int4:
-return (i == 0) ? 1 : 0;
-
-case midgard_op_st_int4:
-return (i == 1) ? 1 : 0;
-
-default:
-return 0;
-}
-}
-
-return 0;
-}
-
 unsigned
 mir_mask_of_read_components(midgard_instruction *ins, unsigned node)
 {
@@ -394,21 +391,13 @@ mir_mask_of_read_components(midgard_instruction *ins, 
unsigned node)
 if (ins->compact_branch && ins->writeout && (i == 0))
 return 0xF;
 
-unsigned swizzle = mir_get_swizzle(ins, i);
-unsigned m = mir_mask_of_read_components_single(swizzle, 
ins->mask);
-
-/* Sometimes multi-arg ops are passed implicitly */
-unsigned implicit = mir_component_count_implicit(ins, i);
-assert(implicit < 2);
-
-/* Extend the mask */
-if (implicit == 1) {
-/* Ensure it's a single bit currently */
-assert((m >> __builtin_ctz(m)) == 0x1);
+/* ALU ops act componentwise so we need to pay attention to
+ * their mask. Texture/ldst does not so we don't clamp source
+ * readmasks based on the writemask */
+unsigned qmask = (ins->type == TAG_ALU_4) ? ins->mask : 0xF;
 
-/* Set the next bit to extend one*/
-m |= (m << 1);
-}
+unsigned swizzle = mir_get_swizzle(ins, i);
+unsigned m = mir_mask_of_read_components_single(swizzle, 
qmask);
 
 /* Handle dot products and things */
 if (ins->type == TAG_ALU_4 && !ins->compact_branch) {
-- 
2.23.0


[Mesa-dev] [PATCH 19/30] pan/midgard: Add distance metric to choose_instruction

2019-09-28 Thread Alyssa Rosenzweig
We require chosen instructions to be "close", to avoid ballooning
register pressure. This is a kludge that will go away once we have
proper liveness tracking in the scheduler, but for now it prevents a lot
of needless spilling.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index dea8b023e9d..6177d0072f7 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -898,7 +898,21 @@ mir_choose_instruction(
 
 signed best_index = -1;
 
+/* Enforce a simple metric limiting distance to keep down register
+ * pressure. TODO: replace with liveness tracking for much better
+ * results */
+
+unsigned max_active = 0;
+unsigned max_distance = 8;
+
 BITSET_FOREACH_SET(i, tmp, worklist, count) {
+max_active = MAX2(max_active, i);
+}
+
+BITSET_FOREACH_SET(i, tmp, worklist, count) {
+if ((max_active - i) >= max_distance)
+continue;
+
 if (tag != ~0 && instructions[i]->type != tag)
 continue;
 
-- 
2.23.0


[Mesa-dev] [PATCH 15/30] pan/midgard: Add helpers for scheduling conditionals

2019-09-28 Thread Alyssa Rosenzweig
Conditional instructions (csel and conditional branches) require their
condition to be written to a special condition pipeline register (r31.w
for scalar, r31.xyzw for vector). However, pipeline registers are live
only for the duration of a single bundle. As such, the logic to schedule
conditionals correctly is surprisingly complex. Essentially, we see if we
could stuff the conditional within the same bundle as the csel/branch
without breaking anything; if we can, we do that. If we can't, we add a
dummy move to make room.
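
As a rough model of that decision (the struct and field names below are
invented for illustration; the real logic operates on midgard_instruction
and the scheduler's predicate, not on a toy struct):

    #include <stdbool.h>
    #include <stdio.h>

    /* Invented summary of the choice described above */
    struct cond_src {
            bool in_this_block;
            bool single_use;
            bool has_embedded_constants;
    };

    static const char *
    schedule_condition(const struct cond_src *c)
    {
            /* Preferred: pull the comparison itself into the bundle, so its
             * r31 write is live exactly when the csel/branch consumes it */
            if (c->in_this_block && c->single_use && !c->has_embedded_constants)
                    return "move the comparison into the bundle";

            /* Fallback: synthesize a move of the already-computed condition
             * into r31 and schedule that alongside the consumer instead */
            return "emit a copy (mov) of the condition";
    }

    int
    main(void)
    {
            struct cond_src local  = { true, true,  false };
            struct cond_src shared = { true, false, false };

            printf("%s\n", schedule_condition(&local));
            printf("%s\n", schedule_condition(&shared));
            return 0;
    }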

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 146 
 1 file changed, 146 insertions(+)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index eae73868e2a..92756e15189 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -957,6 +957,152 @@ mir_choose_bundle(
 return ~0;
 }
 
+/* When we are scheduling a branch/csel, we need the consumed condition in the
+ * same block as a pipeline register. There are two options to enable this:
+ *
+ *  - Move the conditional into the bundle. Preferred, but only works if the
+ *conditional is used only once and is from this block.
+ *  - Copy the conditional.
+ *
+ * We search for the conditional. If it's in this block, single-use, and
+ * without embedded constants, we schedule it immediately. Otherwise, we
+ * schedule a move for it.
+ *
+ * mir_comparison_mobile is a helper to find the moveable condition.
+ */
+
+static unsigned
+mir_comparison_mobile(
+compiler_context *ctx,
+midgard_instruction **instructions,
+unsigned count,
+unsigned cond)
+{
+if (!mir_single_use(ctx, cond))
+return ~0;
+
+unsigned ret = ~0;
+
+for (unsigned i = 0; i < count; ++i) {
+if (instructions[i]->dest != cond)
+continue;
+
+/* Must fit in an ALU bundle */
+if (instructions[i]->type != TAG_ALU_4)
+return ~0;
+
+/* We'll need to rewrite to .w but that doesn't work for vector
+ * ops that don't replicate (ball/bany), so bail there */
+
+if 
(GET_CHANNEL_COUNT(alu_opcode_props[instructions[i]->alu.op].props))
+return ~0;
+
+/* TODO: moving conditionals with constants */
+
+if (instructions[i]->has_constants)
+return ~0;
+
+/* Ensure it is written only once */
+
+if (ret != ~0)
+return ~0;
+else
+ret = i;
+}
+
+return ret;
+}
+
+/* Using the information about the moveable conditional itself, we either pop
+ * that condition off the worklist for use now, or create a move to
+ * artificially schedule instead as a fallback */
+
+static midgard_instruction *
+mir_schedule_comparison(
+compiler_context *ctx,
+midgard_instruction **instructions,
+BITSET_WORD *worklist, unsigned count,
+unsigned cond, bool vector, unsigned swizzle,
+midgard_instruction *user)
+{
+/* TODO: swizzle when scheduling */
+unsigned comp_i =
+(!vector && (swizzle == 0)) ?
+mir_comparison_mobile(ctx, instructions, count, cond) : ~0;
+
+/* If we can, schedule the condition immediately */
+if ((comp_i != ~0) && BITSET_TEST(worklist, comp_i)) {
+assert(comp_i < count);
+BITSET_CLEAR(worklist, comp_i);
+return instructions[comp_i];
+}
+
+/* Otherwise, we insert a move */
+midgard_vector_alu_src csel = {
+.swizzle = swizzle
+};
+
+midgard_instruction mov = v_mov(cond, csel, cond);
+mov.mask = vector ? 0xF : 0x1;
+
+return mir_insert_instruction_before(ctx, user, mov);
+}
+
+/* Most generally, we need instructions writing to r31 in the appropriate
+ * components */
+
+static midgard_instruction *
+mir_schedule_condition(compiler_context *ctx,
+struct midgard_predicate *predicate,
+BITSET_WORD *worklist, unsigned count,
+midgard_instruction **instructions,
+midgard_instruction *last)
+{
+/* For a branch, the condition is the only argument; for csel, third */
+bool branch = last->compact_branch;
+unsigned condition_index = branch ? 0 : 2;
+
+/* csel_v is vector; otherwise, conditions are scalar */
+bool vector = !branch && OP_IS_CSEL_V(last->alu.op);
+
+/* Grab the conditional instruction */
+
+midgard_instruction *cond = mir_schedule_comparison(
+ctx, instructi

[Mesa-dev] [PATCH 13/30] pan/midgard: Add predicate->exclude

2019-09-28 Thread Alyssa Rosenzweig
A bit of a kludge but allows setting an implicit dependency of synthetic
conditional moves on the actual condition, fixing code generated like:

   vmul.feq r0, ..
   sadd.imov r31, .., r0
   vadd.fcsel [...]

The imov runs simultaneously with the feq, so it gets garbage results, but
practically speaking it's too late to add a real dependency, since the new
synthetic imov doesn't have a node associated with it.
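
To make the mechanism concrete, here is a tiny standalone illustration of
how the exclude filter is meant to be used. The caller below is invented
for illustration; only the dest-equals-exclude test mirrors this patch.

    #include <stdio.h>

    /* Only the exclude field is modelled; ~0u means "exclude nothing" */
    struct toy_predicate { unsigned exclude; };

    static int
    allowed(const struct toy_predicate *p, unsigned dest)
    {
            return p->exclude == ~0u || dest != p->exclude;
    }

    int
    main(void)
    {
            unsigned cond_node = 7;  /* node written by the vmul.feq above */
            unsigned candidates[] = { 3, 7, 9 };

            /* While filling the bundle holding the synthetic imov + csel,
             * refuse anything that writes the condition node */
            struct toy_predicate pred = { .exclude = cond_node };

            for (unsigned i = 0; i < 3; ++i) {
                    printf("node %u: %s\n", candidates[i],
                           allowed(&pred, candidates[i]) ?
                           "schedulable" : "excluded");
            }

            return 0;
    }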

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index 86a77149c78..4c72844679c 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -827,6 +827,9 @@ struct midgard_predicate {
 uint8_t *constants;
 unsigned constant_count;
 bool blend_constant;
+
+/* Exclude this destination (if not ~0) */
+unsigned exclude;
 };
 
 /* For an instruction that can fit, adjust it to fit and update the constants
@@ -893,6 +896,9 @@ mir_choose_instruction(
 if (tag != ~0 && instructions[i]->type != tag)
 continue;
 
+if (predicate->exclude != ~0 && instructions[i]->dest == 
predicate->exclude)
+continue;
+
 /* Simulate in-order scheduling */
 if ((signed) i < best_index)
 continue;
@@ -930,7 +936,8 @@ mir_choose_bundle(
 
 struct midgard_predicate predicate = {
 .tag = ~0,
-.destructive = false
+.destructive = false,
+.exclude = ~0
 };
 
 midgard_instruction *chosen = mir_choose_instruction(instructions, 
worklist, count, &predicate);
@@ -950,7 +957,8 @@ mir_schedule_texture(
 {
 struct midgard_predicate predicate = {
 .tag = TAG_TEXTURE_4,
-.destructive = true
+.destructive = true,
+.exclude = ~0
 };
 
 midgard_instruction *ins =
@@ -974,7 +982,8 @@ mir_schedule_ldst(
 {
 struct midgard_predicate predicate = {
 .tag = TAG_LOAD_STORE_4,
-.destructive = true
+.destructive = true,
+.exclude = ~0
 };
 
 midgard_instruction *ins =
@@ -1003,7 +1012,8 @@ mir_schedule_alu(
 
 struct midgard_predicate predicate = {
 .tag = TAG_ALU_4,
-.destructive = true
+.destructive = true,
+.exclude = ~0
 };
 
 midgard_instruction *ins =
-- 
2.23.0


[Mesa-dev] [PATCH 14/30] pan/midgard: Implement predicate->unit

2019-09-28 Thread Alyssa Rosenzweig
This allows ALUs to select for each unit of the bundle separately.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index 4c72844679c..eae73868e2a 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -885,6 +885,9 @@ mir_choose_instruction(
 {
 /* Parse the predicate */
 unsigned tag = predicate->tag;
+bool alu = tag == TAG_ALU_4;
+unsigned unit = predicate->unit;
+bool branch = alu && (unit == ALU_ENAB_BR_COMPACT);
 
 /* Iterate to find the best instruction satisfying the predicate */
 unsigned i;
@@ -899,6 +902,12 @@ mir_choose_instruction(
 if (predicate->exclude != ~0 && instructions[i]->dest == 
predicate->exclude)
 continue;
 
+if (alu && !branch && 
!(alu_opcode_props[instructions[i]->alu.op].props & unit))
+continue;
+
+if (branch && !instructions[i]->compact_branch)
+continue;
+
 /* Simulate in-order scheduling */
 if ((signed) i < best_index)
 continue;
-- 
2.23.0


[Mesa-dev] [PATCH 03/30] pan/midgard: Squeeze indices before scheduling

2019-09-28 Thread Alyssa Rosenzweig
This allows node_count to be correct while scheduling.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index b8d9b5ec9be..2ed25da2e0d 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -983,6 +983,7 @@ schedule_program(compiler_context *ctx)
 /* Must be lowered right before RA */
 mir_squeeze_index(ctx);
 mir_lower_special_reads(ctx);
+mir_squeeze_index(ctx);
 
 /* Lowering can introduce some dead moves */
 
-- 
2.23.0


[Mesa-dev] [PATCH 08/30] pan/midgard: Add mir_update_worklist helper

2019-09-28 Thread Alyssa Rosenzweig
After we've chosen an instruction, popped it off, and processed it, it's
time to update the worklist, removing that instruction from the
dependency graph to allow its dependents to be put onto the worklist.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 39 +
 1 file changed, 39 insertions(+)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index 046480ba2ac..8f324805cdb 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -767,6 +767,45 @@ mir_initialize_worklist(BITSET_WORD *worklist, 
midgard_instruction **instruction
 }
 }
 
+/* Update the worklist after an instruction terminates. Remove its edges from
+ * the graph and if that causes any node to have no dependencies, add it to the
+ * worklist */
+
+static void
+mir_update_worklist(
+BITSET_WORD *worklist, unsigned count,
+midgard_instruction **instructions, midgard_instruction *done)
+{
+/* Sanity check: if no instruction terminated, there is nothing to do.
+ * If the instruction that terminated had dependencies, that makes no
+ * sense and means we messed up the worklist. Finally, as the purpose
+ * of this routine is to update dependents, we abort early if there are
+ * no dependents defined. */
+
+if (!done)
+return;
+
+assert(done->nr_dependencies == 0);
+
+if (!done->dependents)
+return;
+
+/* We have an instruction with dependents. Iterate each dependent to
+ * remove one dependency (`done`), adding dependents to the worklist
+ * where possible. */
+
+unsigned i;
+BITSET_WORD tmp;
+BITSET_FOREACH_SET(i, tmp, done->dependents, count) {
+assert(instructions[i]->nr_dependencies);
+
+if (!(--instructions[i]->nr_dependencies))
+BITSET_SET(worklist, i);
+}
+
+free(done->dependents);
+}
+
 /* While scheduling, we need to choose instructions satisfying certain
  * criteria. As we schedule backwards, we choose the *last* instruction in the
  * worklist to simulate in-order scheduling. Chosen instructions must satisfy a
-- 
2.23.0


[Mesa-dev] [PATCH 11/30] pan/midgard: Remove csel constant unit force

2019-09-28 Thread Alyssa Rosenzweig
Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_compile.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/panfrost/midgard/midgard_compile.c 
b/src/panfrost/midgard/midgard_compile.c
index 585591d9356..95ec48e9563 100644
--- a/src/panfrost/midgard/midgard_compile.c
+++ b/src/panfrost/midgard/midgard_compile.c
@@ -1991,9 +1991,6 @@ inline_alu_constants(compiler_context *ctx, midgard_block 
*block)
 midgard_instruction ins = 
v_mov(SSA_FIXED_REGISTER(REGISTER_CONSTANT), blank_alu_src, scratch);
attach_constants(ctx, &ins, entry, alu->src[1] 
+ 1);
 
-/* Force a break XXX Defer r31 writes */
-ins.unit = UNIT_VLUT;
-
 /* Set the source */
 alu->src[1] = scratch;
 
-- 
2.23.0


[Mesa-dev] [PATCH 12/30] pan/midgard: Add constant intersection filters

2019-09-28 Thread Alyssa Rosenzweig
In the future, we will want to keep track of which components of
constants of various sizes correspond to which parts of the bundle
constants, like in the old scheduler. For now, let's just stub it out
with a simple rule of one instruction with embedded constants per bundle.
We can eventually do better, of course.
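
As a hint of where this is headed, here is a standalone sketch of the kind
of dedup-and-pack rule a later patch in this series ("Tightly pack 32-bit
constants") moves toward. The helper below is invented for illustration
and, unlike mir_adjust_constants, it has no separate non-destructive fit
check.

    #include <stdint.h>
    #include <stdio.h>

    /* A bundle embeds at most 16 bytes of constants (4 x 32-bit words).
     * Invented helper: add one instruction's constants, reusing identical
     * words, and record which slot each one landed in. Note it modifies
     * the bundle even when it eventually fails; the real code avoids that
     * by checking the fit non-destructively first. */
    static int
    pack_constants(uint32_t *bundle, unsigned *bundle_count,
                   const uint32_t *wanted, unsigned wanted_count,
                   unsigned *slots)
    {
            unsigned count = *bundle_count;

            for (unsigned i = 0; i < wanted_count; ++i) {
                    unsigned slot = count;

                    /* Reuse an existing identical word if possible */
                    for (unsigned j = 0; j < count; ++j) {
                            if (bundle[j] == wanted[i]) {
                                    slot = j;
                                    break;
                            }
                    }

                    if (slot == count) {
                            if (count >= 4)
                                    return 0;  /* does not fit */

                            bundle[count++] = wanted[i];
                    }

                    slots[i] = slot;
            }

            *bundle_count = count;
            return 1;
    }

    int
    main(void)
    {
            uint32_t bundle[4] = { 0 };
            unsigned bundle_count = 0, slots[4];

            uint32_t a[] = { 0x3f800000, 0x00000000 };  /* 1.0f, 0 */
            uint32_t b[] = { 0x00000000, 0x40000000 };  /* 0 reused, 2.0f */

            if (pack_constants(bundle, &bundle_count, a, 2, slots) &&
                pack_constants(bundle, &bundle_count, b, 2, slots))
                    printf("packed into %u words\n", bundle_count);  /* 3 */

            return 0;
    }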

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 55 +
 1 file changed, 55 insertions(+)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index 72c6ec42980..86a77149c78 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -817,8 +817,63 @@ struct midgard_predicate {
 
 /* True if we want to pop off the chosen instruction */
 bool destructive;
+
+/* State for bundle constants. constants is the actual constants
+ * for the bundle. constant_count is the number of bytes (up to
+ * 16) currently in use for constants. When picking in destructive
+ * mode, the constants array will be updated, and the instruction
+ * will be adjusted to index into the constants array */
+
+uint8_t *constants;
+unsigned constant_count;
+bool blend_constant;
 };
 
+/* For an instruction that can fit, adjust it to fit and update the constants
+ * array, in destructive mode. Returns whether the fitting was successful. */
+
+static bool
+mir_adjust_constants(midgard_instruction *ins,
+struct midgard_predicate *pred,
+bool destructive)
+{
+/* Blend constants dominate */
+if (ins->has_blend_constant) {
+if (pred->constant_count)
+return false;
+else if (destructive) {
+pred->blend_constant = true;
+pred->constant_count = 16;
+return true;
+}
+}
+
+/* No constant, nothing to adjust */
+if (!ins->has_constants)
+return true;
+
+/* TODO: Deduplicate; permit multiple constants within a bundle */
+
+if (destructive && !pred->constant_count) {
+if (ins->alu.reg_mode == midgard_reg_mode_16) {
+  /* TODO: Fix packing XXX */
+uint16_t *bundles = (uint16_t *) pred->constants;
+uint32_t *constants = (uint32_t *) ins->constants;
+
+/* Copy them wholesale */
+for (unsigned i = 0; i < 4; ++i)
+bundles[i] = constants[i];
+} else {
+memcpy(pred->constants, ins->constants, 16);
+}
+
+pred->constant_count = 16;
+return true;
+}
+
+return !pred->constant_count;
+}
+
 static midgard_instruction *
 mir_choose_instruction(
 midgard_instruction **instructions,
-- 
2.23.0


[Mesa-dev] [PATCH 07/30] pan/midgard: Add mir_choose_instruction stub

2019-09-28 Thread Alyssa Rosenzweig
In the future, this routine will implement the core scheduling logic to
decide which instruction out of the worklist will be scheduled next, in
a way that minimizes cycle count and register pressure.

In the present, we are more interested in replicating in-order
scheduling with the much-more-powerful out-of-order model. So rather
than discriminating by a register pressure estimate, we simply choose
the latest possible instruction in the worklist.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 55 +
 1 file changed, 55 insertions(+)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index aff93399a80..046480ba2ac 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -767,6 +767,61 @@ mir_initialize_worklist(BITSET_WORD *worklist, 
midgard_instruction **instruction
 }
 }
 
+/* While scheduling, we need to choose instructions satisfying certain
+ * criteria. As we schedule backwards, we choose the *last* instruction in the
+ * worklist to simulate in-order scheduling. Chosen instructions must satisfy a
+ * given predicate. */
+
+struct midgard_predicate {
+/* TAG or ~0 for dont-care */
+unsigned tag;
+
+/* True if we want to pop off the chosen instruction */
+bool destructive;
+};
+
+static midgard_instruction *
+mir_choose_instruction(
+midgard_instruction **instructions,
+BITSET_WORD *worklist, unsigned count,
+struct midgard_predicate *predicate)
+{
+/* Parse the predicate */
+unsigned tag = predicate->tag;
+
+/* Iterate to find the best instruction satisfying the predicate */
+unsigned i;
+BITSET_WORD tmp;
+
+signed best_index = -1;
+
+BITSET_FOREACH_SET(i, tmp, worklist, count) {
+if (tag != ~0 && instructions[i]->type != tag)
+continue;
+
+/* Simulate in-order scheduling */
+if ((signed) i < best_index)
+continue;
+
+best_index = i;
+}
+
+
+/* Did we find anything?  */
+
+if (best_index < 0)
+return NULL;
+
+/* If we found something, remove it from the worklist */
+assert(best_index < count);
+
+if (predicate->destructive) {
+BITSET_CLEAR(worklist, best_index);
+}
+
+return instructions[best_index];
+}
+
 /* Schedule a single block by iterating its instruction to create bundles.
  * While we go, tally about the bundle sizes to compute the block size. */
 
-- 
2.23.0


[Mesa-dev] [PATCH 09/30] pan/midgard: Add mir_choose_bundle helper

2019-09-28 Thread Alyssa Rosenzweig
It's not always obvious what the optimal bundle type should be. Let's
break out the logic to decide.

Currently set for purely in-order operation.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index 8f324805cdb..969460235d1 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -861,6 +861,31 @@ mir_choose_instruction(
 return instructions[best_index];
 }
 
+/* Still, we don't choose instructions in a vacuum. We need a way to choose the
+ * best bundle type (ALU, load/store, texture). Nondestructive. */
+
+static unsigned
+mir_choose_bundle(
+midgard_instruction **instructions,
+BITSET_WORD *worklist, unsigned count)
+{
+/* At the moment, our algorithm is very simple - use the bundle of the
+ * best instruction, regardless of what else could be scheduled
+ * alongside it. This is not optimal but it works okay for in-order */
+
+struct midgard_predicate predicate = {
+.tag = ~0,
+.destructive = false
+};
+
+midgard_instruction *chosen = mir_choose_instruction(instructions, 
worklist, count, );
+
+if (chosen)
+return chosen->type;
+else
+return ~0;
+}
+
 /* Schedule a single block by iterating its instruction to create bundles.
  * While we go, tally about the bundle sizes to compute the block size. */
 
-- 
2.23.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 06/30] pan/midgard: Initialize worklist

2019-09-28 Thread Alyssa Rosenzweig
This flows naturally from the dependency graph: any instruction with no
remaining dependencies is immediately ready to be scheduled.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index 70fa030390c..aff93399a80 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -755,6 +755,18 @@ flatten_mir(midgard_block *block, unsigned *len)
 return instructions;
 }
 
+/* The worklist is the set of instructions that can be scheduled now; that is,
+ * the set of instructions with no remaining dependencies */
+
+static void
+mir_initialize_worklist(BITSET_WORD *worklist, midgard_instruction 
**instructions, unsigned count)
+{
+for (unsigned i = 0; i < count; ++i) {
+if (instructions[i]->nr_dependencies == 0)
+BITSET_SET(worklist, i);
+}
+}
+
 /* Schedule a single block by iterating its instruction to create bundles.
  * While we go, tally about the bundle sizes to compute the block size. */
 
@@ -769,6 +781,11 @@ schedule_block(compiler_context *ctx, midgard_block *block)
 unsigned node_count = ctx->temp_count + 1;
 mir_create_dependency_graph(instructions, len, node_count);
 
+/* Allocate the worklist */
+size_t sz = BITSET_WORDS(len) * sizeof(BITSET_WORD);
+BITSET_WORD *worklist = calloc(sz, 1);
+mir_initialize_worklist(worklist, instructions, len);
+
 util_dynarray_init(>bundles, NULL);
 
 block->quadword_count = 0;
-- 
2.23.0


[Mesa-dev] [PATCH 01/30] pan/midgard: Add missing parans in SWIZZLE definition

2019-09-28 Thread Alyssa Rosenzweig
TODO: Move me to front of series.
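
For the record, the failure mode is the usual macro precedence one; a
quick illustration with a hypothetical caller (not from the tree):

unsigned flip = 1;

/* With the old macro, the last term expands to
 *     (flip ? COMPONENT_Y : COMPONENT_X << 6)
 * so the shift binds to COMPONENT_X alone, and with flip set the
 * unshifted COMPONENT_Y lands in the low bits on top of the A slot. */
unsigned s = SWIZZLE(COMPONENT_X, COMPONENT_X, COMPONENT_X,
                     flip ? COMPONENT_Y : COMPONENT_X);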

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/helpers.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/panfrost/midgard/helpers.h b/src/panfrost/midgard/helpers.h
index ac58fd50327..343fad0fea8 100644
--- a/src/panfrost/midgard/helpers.h
+++ b/src/panfrost/midgard/helpers.h
@@ -189,7 +189,7 @@ quadword_size(int tag)
 
 /* Swizzle support */
 
-#define SWIZZLE(A, B, C, D) ((D << 6) | (C << 4) | (B << 2) | (A << 0))
+#define SWIZZLE(A, B, C, D) (((D) << 6) | ((C) << 4) | ((B) << 2) | ((A) << 0))
 #define SWIZZLE_FROM_ARRAY(r) SWIZZLE(r[0], r[1], r[2], r[3])
 #define COMPONENT_X 0x0
 #define COMPONENT_Y 0x1
-- 
2.23.0


[Mesa-dev] [PATCH 10/30] pan/midgard: Add mir_schedule_texture/ldst/alu helpers

2019-09-28 Thread Alyssa Rosenzweig
We don't actually do any scheduling here yet, but add per-tag helpers to
consume an instruction, print it, pop it off the worklist.

Signed-off-by: Alyssa Rosenzweig 
---
 src/panfrost/midgard/midgard_schedule.c | 190 
 1 file changed, 190 insertions(+)

diff --git a/src/panfrost/midgard/midgard_schedule.c 
b/src/panfrost/midgard/midgard_schedule.c
index 969460235d1..72c6ec42980 100644
--- a/src/panfrost/midgard/midgard_schedule.c
+++ b/src/panfrost/midgard/midgard_schedule.c
@@ -886,9 +886,199 @@ mir_choose_bundle(
 return ~0;
 }
 
+/* Schedules a single bundle of the given type */
+
+static midgard_bundle
+mir_schedule_texture(
+midgard_instruction **instructions,
+BITSET_WORD *worklist, unsigned len)
+{
+struct midgard_predicate predicate = {
+.tag = TAG_TEXTURE_4,
+.destructive = true
+};
+
+midgard_instruction *ins =
+mir_choose_instruction(instructions, worklist, len, 
);
+
+mir_update_worklist(worklist, len, instructions, ins);
+
+struct midgard_bundle out = {
+.tag = TAG_TEXTURE_4,
+.instruction_count = 1,
+.instructions = { ins }
+};
+
+return out;
+}
+
+static midgard_bundle
+mir_schedule_ldst(
+midgard_instruction **instructions,
+BITSET_WORD *worklist, unsigned len)
+{
+struct midgard_predicate predicate = {
+.tag = TAG_LOAD_STORE_4,
+.destructive = true
+};
+
+midgard_instruction *ins =
+mir_choose_instruction(instructions, worklist, len, 
);
+
+mir_update_worklist(worklist, len, instructions, ins);
+
+struct midgard_bundle out = {
+.tag = TAG_LOAD_STORE_4,
+.instruction_count = 1,
+.instructions = { ins }
+};
+
+return out;
+}
+
+static midgard_bundle
+mir_schedule_alu(
+compiler_context *ctx,
+midgard_instruction **instructions,
+BITSET_WORD *worklist, unsigned len)
+{
+struct midgard_bundle bundle = {};
+
+unsigned bytes_emitted = sizeof(bundle.control);
+
+struct midgard_predicate predicate = {
+.tag = TAG_ALU_4,
+.destructive = true
+};
+
+midgard_instruction *ins =
+mir_choose_instruction(instructions, worklist, len, 
);
+
+midgard_instruction *vmul = NULL;
+midgard_instruction *vadd = NULL;
+midgard_instruction *vlut = NULL;
+midgard_instruction *smul = NULL;
+midgard_instruction *sadd = NULL;
+midgard_instruction *branch = NULL;
+
+mir_update_worklist(worklist, len, instructions, ins);
+
+if (ins->compact_branch) {
+branch = ins;
+} else if (!ins->unit) {
+unsigned units = alu_opcode_props[ins->alu.op].props;
+
+if (units & UNIT_VMUL) {
+ins->unit = UNIT_VMUL;
+vmul = ins;
+} else if (units & UNIT_VADD) {
+ins->unit = UNIT_VADD;
+vadd = ins;
+} else if (units & UNIT_VLUT) {
+ins->unit = UNIT_VLUT;
+vlut = ins;
+} else
+assert(0);
+}
+
+bundle.has_embedded_constants = ins->has_constants;
+bundle.has_blend_constant = ins->has_blend_constant;
+
+if (ins->alu.reg_mode == midgard_reg_mode_16) {
+  /* TODO: Fix packing XXX */
+uint16_t *bundles = (uint16_t *) bundle.constants;
+uint32_t *constants = (uint32_t *) ins->constants;
+
+/* Copy them wholesale */
+for (unsigned i = 0; i < 4; ++i)
+bundles[i] = constants[i];
+} else {
+memcpy(bundle.constants, ins->constants, 
sizeof(bundle.constants));
+}
+
+if (ins->writeout) {
+unsigned src = (branch->src[0] == ~0) ? SSA_FIXED_REGISTER(0) 
: branch->src[0];
+unsigned temp = (branch->src[0] == ~0) ? SSA_FIXED_REGISTER(0) 
: make_compiler_temp(ctx);
+midgard_instruction mov = v_mov(src, blank_alu_src, temp);
+vmul = mem_dup(, sizeof(midgard_instruction));
+vmul->unit = UNIT_VMUL;
+vmul->mask = 0xF;
+/* TODO: Don't leak */
+
+/* Rewrite to use our temp */
+midgard_instruction *stages[] = { sadd, vadd, smul };
+
+for (unsigned i = 0; i < ARRAY_SIZE(stages); ++i) {
+if (stages[i])
+mir_rewrite_index_dst_single(stages[i], src, 
temp);
+   

Re: [Mesa-dev] [PATCH 1/3] panfrost: Make sure a clear does not re-use a pre-existing batch

2019-09-23 Thread Alyssa Rosenzweig
> One more thing: optimization of the above scenario is probably
> something we'll want to have at some point, but I think the current
> patch is worth applying in the meantime. All this patch does is
> enforcing ordering of clears/draws to make sure the end result matches
> users expectation.

Hum alright, R-b in that case, it's worth it to fix so many tests. But
please add a "TODO" comment explaining what we just emailed about, so it
doesn't get lost to the voids of future mailing lists :)
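
Something along these lines would be plenty (wording is only a
suggestion, feel free to rephrase):

/* TODO: A clear does not strictly need a fresh batch. A mid-frame
 * partial clear could instead be drawn as a full-screen quad on the
 * current batch (see the mailing list discussion), avoiding the extra
 * flush and wallpaper reload. */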



Re: [Mesa-dev] [PATCH v3 01/17] panfrost: Extend the panfrost_batch_add_bo() API to pass access flags

2019-09-22 Thread Alyssa Rosenzweig
> +your collabora address

Thank you

> > > > I think this batch_add_bo should probably dropped altogether? This loop
> > > > is dealing with constant buffers; the shaders themselves were added  
> > > 
> > > I'll double check. I couldn't find where BOs containing shader programs
> > > were added last time I looked.  
> > 
> > Masking a real bug :o
> > 
> > It should probably happen in panfrost_patch_shader_state?
> 
> Ok, I'll add it there, I wasn't sure this function was called for all
> shaders, but looking at the code a second time it seems to be the case.

I think so as well, yeah.

> > As I stated before, I thought we should be adding the BO for
> > wallpapering when we actually wallpaper, since that's a slow path. Not
> > wallpapering is the default and ideally what most apps should do.
> 
> Wallpapering happens too late (when we are flushing the batch) to have
> an impact on the dep graph, but we can probably know that wallpapering
> will be needed before that. My question remains though, are
> vertex/tiler supposed to touch the texture BO they're reading from, or
> should we only flag the BO for FRAGMENT use.

Vertex/tiler should not touch the texture BO, unless you're texturing
from the vertex shader (which we're not for wallpapering).
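
Concretely, at the point where we actually emit the wallpaper quad,
something like this should be all that's needed (sketch only, assuming
the extended panfrost_batch_add_bo() from this series, with rsrc being
whatever resource backs the colour buffer we're reloading):

panfrost_batch_add_bo(batch, rsrc->bo,
                      PAN_BO_ACCESS_SHARED |
                      PAN_BO_ACCESS_READ |
                      PAN_BO_ACCESS_FRAGMENT);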



Re: [Mesa-dev] [PATCH 1/3] panfrost: Make sure a clear does not re-use a pre-existing batch

2019-09-22 Thread Alyssa Rosenzweig
> > To be clear, if we have a batch and do the following operations:
> > 
> > clear red
> > draw 1
> > clear green
> > draw 2
> > flush
> > 
> > All we should see is #2 on a green background, which this patch handles
> > by the second clear invalidating all the clears/draws that came before
> > it (provided there is no flush in between). 
> > 
> > I might just be tripped up by the "freeze" name. That really means throw
> > away / free here, I guess?
> 
> Nope. Freeze means "stop queuing new draws to this batch". I guess we
> could free the batch as well if the result of the previous draws/clear
> are really overwritten by this new clear, but that implies checking the
> new clear flags to make sure they target the same depth/stencil/color
> combination. On the other hand, I'm wondering if it's really the
> driver's job to try to optimize silly things the apps might do. I mean,
> the sequence you describe does not look like a wise thing to do since
> the "clear red+draw 1" end up being overwritten by "clear green + draw
> 2".

I'm quite confused about how this patch works, then.

A few thoughts: if the app clears all buffers in the middle of a frame,
then yes, it's silly and yes, we may as well optimize it out. (Should
that even be something GL drivers have to do? I mean, only if the other
drivers are doing it too...)

If the sequence is more like:

clear all buffers
draw 1
clear color buffer (preserve depth stencil)
draw 2
flush

That second clear should really be done by drawing a full-screen quad,
just like if we were wallpapering, except loading its colour from a
uniform instead of a texture.

Similarly, a depth-only clear mid-frame can be emulated by drawing a
full-screen quad with the gl_Position.zw components jury-rigged to the
desired depth value, and disabling colour writes by setting the colour
mask to 0x0. That also means you can skip having any shader at all
(literally set the shader pointer to 0x0), so that's faster.

Finally, a stencil-only clear can be done the same way, by playing
tricks with the stencil test parameters.

I suspect u_blitter or mesa/st is capable of doing these sorts of tricks
on our behalf, but I have not researched it extensively.
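
To make the depth-only case concrete, here is roughly the state it boils
down to. This is only a sketch, not actual Panfrost code, and
emulate_depth_only_clear()/emit_fullscreen_quad() are made-up names for
whatever quad-drawing path we end up reusing (wallpaper or u_blitter):

static void
emulate_depth_only_clear(struct panfrost_context *ctx, double depth)
{
        /* Mask off all colour writes so only Z is touched */
        struct pipe_blend_state blend = { 0 };
        blend.rt[0].colormask = 0;

        /* Always pass the depth test and write the clear value */
        struct pipe_depth_stencil_alpha_state dsa = { 0 };
        dsa.depth.enabled = true;
        dsa.depth.writemask = true;
        dsa.depth.func = PIPE_FUNC_ALWAYS;

        /* Draw a quad with gl_Position.z rigged to the clear depth;
         * with colour writes masked, no fragment shader is needed. */
        emit_fullscreen_quad(ctx, &blend, &dsa, depth);
}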

In any case, for a partial clear mid-frame, we would much rather do:

clear all buffers
draw 1
draw fullscreen quad (disable Z/S writes)
draw 2
flush

than what it sounds like this patch does

clear all buffers
draw 1
flush

clear all buffers
wallpaper
draw 2
flush

Please do correct me if I've misunderstood the logic here.

> 
> > 
> > Provided that's the idea (and we're not somehow saving the original draw
> > 1), it's Reviewed-by A R 
> > 
> > On Fri, Sep 20, 2019 at 04:53:37PM +0200, Boris Brezillon wrote:
> > > glClear()s are expected to be the first thing GL apps do before drawing
> > > new things. If there's already an existing batch targetting the same
> > > FBO that has draws attached to it, we should make sure the new clear
> > > gets a new batch assigned to guaranteed that the FB content is actually
> > > cleared with the requested color/depth/stencil values.
> > > 
> > > We create a panfrost_get_fresh_batch_for_fbo() helper for that and
> > > call it from panfrost_clear().
> > > 
> > > Signed-off-by: Boris Brezillon 
> > > ---
> > >  src/gallium/drivers/panfrost/pan_context.c |  2 +-
> > >  src/gallium/drivers/panfrost/pan_job.c | 21 +
> > >  src/gallium/drivers/panfrost/pan_job.h |  3 +++
> > >  3 files changed, 25 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/src/gallium/drivers/panfrost/pan_context.c 
> > > b/src/gallium/drivers/panfrost/pan_context.c
> > > index ac01461a07fe..b2f2a9da7a51 100644
> > > --- a/src/gallium/drivers/panfrost/pan_context.c
> > > +++ b/src/gallium/drivers/panfrost/pan_context.c
> > > @@ -162,7 +162,7 @@ panfrost_clear(
> > >  double depth, unsigned stencil)
> > >  {
> > >  struct panfrost_context *ctx = pan_context(pipe);
> > > -struct panfrost_batch *batch = panfrost_get_batch_for_fbo(ctx);
> > > +struct panfrost_batch *batch = 
> > > panfrost_get_fresh_batch_for_fbo(ctx);
> > >  
> > >  panfrost_batch_add_fbo_bos(batch);
> > >  panfrost_batch_clear(batch, buffers, color, depth, stencil);
> > > diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> > > b/src/gallium/drivers/panfrost/pan_job.c
> > > index d8330bc133a6..4ec2aa0565d7 100644
> > > --- a/src/gallium/drivers/panfrost/pan_job.c
> > > +++ b/src/gallium/drivers/panfrost/pan_job.c
> > > @@ -298,6 +298,27 @@ panfrost_get_batch_for_fbo(struct panfrost_context 
> > > *ctx)
> > >  return batch;
> > >  }
> > >  
> > > +struct panfrost_batch *
> > > +panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx)
> > > +{
> > > +struct panfrost_batch *batch;
> > > +
> > > +batch = panfrost_get_batch(ctx, >pipe_framebuffer);
> > > +
> > > + 

Re: [Mesa-dev] [PATCH v3 01/17] panfrost: Extend the panfrost_batch_add_bo() API to pass access flags

2019-09-22 Thread Alyssa Rosenzweig
> > Although actually I am not at all sure what this batch_add_bo is doing
> > at all?
> > 
> > I think this batch_add_bo should probably dropped altogether? This loop
> > is dealing with constant buffers; the shaders themselves were added
> 
> I'll double check. I couldn't find where BOs containing shader programs
> were added last time I looked.

Masking a real bug :o

It should probably happen in panfrost_patch_shader_state?

> > >  void panfrost_batch_add_fbo_bos(struct panfrost_batch *batch)
> > >  {
> > > +uint32_t flags = PAN_BO_ACCESS_SHARED | PAN_BO_ACCESS_WRITE |
> > > + PAN_BO_ACCESS_VERTEX_TILER |
> > > + PAN_BO_ACCESS_FRAGMENT;  
> > 
> > I think we can drop VERTEX_TILER here...? The buffers are written right
> > at the end of the FRAGMENT job, not touched before that.
> 
> What about the read done when drawing the wallpaper? I guess it's also
> only read by the fragment job, but I wasn't sure.

As I stated before, I thought we should be adding the BO for
wallpapering when we actually wallpaper, since that's a slow path. Not
wallpapering is the default and ideally what most apps should do.

> > If nothing else is broken, this should allow a nice perf boost with
> > pipelining, so the vertex/tiler from frame n+1 can run in parallel with
> > the fragment of frame n (rather than blocking on frame n finishing with
> > the FBOs).
> 
> Would require the kernel patches I posted earlier for that to
> happen ;-). Right now all jobs touching the same BO are serialized
> because of the implicit BO fences added by the kernel driver.

Sure~ Maybe this sort of bug was the reason you weren't seeing
improvement from those kernel patches?

Re: [Mesa-dev] [PATCH v3 00/17] panfrost: Support batch pipelining

2019-09-20 Thread Alyssa Rosenzweig
Series looks quite good, overall. Just a few minor issues, but probably
not even enough to justify a respin. Congratulations! :p

On Wed, Sep 18, 2019 at 03:24:22PM +0200, Boris Brezillon wrote:
> Hello,
> 
> This is the third attempt at supporting batch pipelining. This time I
> implemented it using a dependency graph (as suggested by Alyssa and
> Steven) so that batch submission can be delayed even more: the only
> time we flush batches now is when we have an explicit flush or when
> the CPU needs to access a BO (we might want to tweak that a bit to
> avoid the extra latency incurred by this solution). With that in place
> we hope to increase GPU utilization.
> 
> Patches 15 and 16 are optional, but I remember reading (I think it was
> Steven who mentioned that) that draw order matters when queueing render
> operations for different frames (frame N should ideally be ready before
> frame N+1). Not sure if enforcing draw call order is enough to guarantee
> that rendering of frame N always finishes before frame N+1 though.
> If that's something you don't want to merge, I can drop it.
> 
> Regards,
> 
> Boris
> 
> Boris Brezillon (17):
>   panfrost: Extend the panfrost_batch_add_bo() API to pass access flags
>   panfrost: Make panfrost_batch->bos a hash table
>   panfrost: Add a batch fence
>   panfrost: Use the per-batch fences to wait on the last submitted batch
>   panfrost: Add a panfrost_freeze_batch() helper
>   panfrost: Start tracking inter-batch dependencies
>   panfrost: Prepare panfrost_fence for batch pipelining
>   panfrost: Add a panfrost_flush_all_batches() helper
>   panfrost: Add a panfrost_flush_batches_accessing_bo() helper
>   panfrost: Kill the explicit serialization in panfrost_batch_submit()
>   panfrost: Get rid of the flush in panfrost_set_framebuffer_state()
>   panfrost: Add flags to reflect the BO imported/exported state
>   panfrost: Make sure the BO is 'ready' when picked from the cache
>   panfrost: Do fine-grained flushing when preparing BO for CPU accesses
>   panfrost: Rename ctx->batches into ctx->fbo_to_batch
>   panfrost: Take draw call order into account
>   panfrost/ci: New tests are passing
> 
>  .../drivers/panfrost/ci/expected-failures.txt |   4 -
>  src/gallium/drivers/panfrost/pan_allocate.c   |  14 +-
>  src/gallium/drivers/panfrost/pan_blend_cso.c  |   6 +-
>  src/gallium/drivers/panfrost/pan_bo.c | 116 ++-
>  src/gallium/drivers/panfrost/pan_bo.h |  44 ++
>  src/gallium/drivers/panfrost/pan_compute.c|   2 +-
>  src/gallium/drivers/panfrost/pan_context.c| 121 ++--
>  src/gallium/drivers/panfrost/pan_context.h|  15 +-
>  src/gallium/drivers/panfrost/pan_instancing.c |   5 +-
>  src/gallium/drivers/panfrost/pan_job.c| 668 --
>  src/gallium/drivers/panfrost/pan_job.h|  58 +-
>  src/gallium/drivers/panfrost/pan_resource.c   |  27 +-
>  src/gallium/drivers/panfrost/pan_screen.c |  65 +-
>  src/gallium/drivers/panfrost/pan_screen.h |   3 +-
>  src/gallium/drivers/panfrost/pan_varyings.c   |  10 +-
>  15 files changed, 956 insertions(+), 202 deletions(-)
> 
> -- 
> 2.21.0

Re: [Mesa-dev] [PATCH v3 15/17] panfrost: Rename ctx->batches into ctx->fbo_to_batch

2019-09-20 Thread Alyssa Rosenzweig
Erm, un-r-b, sorry, I didn't realize this was the optional one. Let's
hold off on this patch and the succeeding one for now.

On Wed, Sep 18, 2019 at 03:24:37PM +0200, Boris Brezillon wrote:
> We are about to add a batch queue to keep track of submission order.
> Let's rename the existing batches hash table (which is used to get the
> batch attached to an FBO) into fbo_to_batch to avoid confusion.
> 
> Signed-off-by: Boris Brezillon 
> ---
>  src/gallium/drivers/panfrost/pan_context.c  |  2 +-
>  src/gallium/drivers/panfrost/pan_context.h  |  2 +-
>  src/gallium/drivers/panfrost/pan_job.c  | 21 +++--
>  src/gallium/drivers/panfrost/pan_resource.c | 16 
>  4 files changed, 21 insertions(+), 20 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_context.c 
> b/src/gallium/drivers/panfrost/pan_context.c
> index 07bafad58a00..0330b5852676 100644
> --- a/src/gallium/drivers/panfrost/pan_context.c
> +++ b/src/gallium/drivers/panfrost/pan_context.c
> @@ -1355,7 +1355,7 @@ panfrost_flush(
>   */
>  if (fence) {
>  util_dynarray_init(, NULL);
> -hash_table_foreach(ctx->batches, hentry) {
> +hash_table_foreach(ctx->fbo_to_batch, hentry) {
>  struct panfrost_batch *batch = hentry->data;
>  
>  panfrost_batch_fence_reference(batch->out_sync);
> diff --git a/src/gallium/drivers/panfrost/pan_context.h 
> b/src/gallium/drivers/panfrost/pan_context.h
> index d50ed57d5d8a..f13967f51b46 100644
> --- a/src/gallium/drivers/panfrost/pan_context.h
> +++ b/src/gallium/drivers/panfrost/pan_context.h
> @@ -112,7 +112,7 @@ struct panfrost_context {
>  
>  /* Bound job batch and map of panfrost_batch_key to job batches */
>  struct panfrost_batch *batch;
> -struct hash_table *batches;
> +struct hash_table *fbo_to_batch;
>  
>  /* panfrost_bo -> panfrost_bo_access */
>  struct hash_table *accessed_bos;
> diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> b/src/gallium/drivers/panfrost/pan_job.c
> index a56f4044fda0..45f9d9d24b41 100644
> --- a/src/gallium/drivers/panfrost/pan_job.c
> +++ b/src/gallium/drivers/panfrost/pan_job.c
> @@ -132,9 +132,9 @@ panfrost_freeze_batch(struct panfrost_batch *batch)
>   * matches. This way, next draws/clears targeting this FBO will 
> trigger
>   * the creation of a new batch.
>   */
> -entry = _mesa_hash_table_search(ctx->batches, >key);
> +entry = _mesa_hash_table_search(ctx->fbo_to_batch, >key);
>  if (entry && entry->data == batch)
> -_mesa_hash_table_remove(ctx->batches, entry);
> +_mesa_hash_table_remove(ctx->fbo_to_batch, entry);
>  
>  /* If this is the bound batch, the panfrost_context parameters are
>   * relevant so submitting it invalidates those parameters, but if 
> it's
> @@ -153,7 +153,7 @@ static bool panfrost_batch_is_frozen(struct 
> panfrost_batch *batch)
>  struct panfrost_context *ctx = batch->ctx;
>  struct hash_entry *entry;
>  
> -entry = _mesa_hash_table_search(ctx->batches, >key);
> +entry = _mesa_hash_table_search(ctx->fbo_to_batch, >key);
>  if (entry && entry->data == batch)
>  return false;
>  
> @@ -248,7 +248,8 @@ panfrost_get_batch(struct panfrost_context *ctx,
> const struct pipe_framebuffer_state *key)
>  {
>  /* Lookup the job first */
> -struct hash_entry *entry = _mesa_hash_table_search(ctx->batches, 
> key);
> +struct hash_entry *entry = _mesa_hash_table_search(ctx->fbo_to_batch,
> +   key);
>  
>  if (entry)
>  return entry->data;
> @@ -258,7 +259,7 @@ panfrost_get_batch(struct panfrost_context *ctx,
>  struct panfrost_batch *batch = panfrost_create_batch(ctx, key);
>  
>  /* Save the created job */
> -_mesa_hash_table_insert(ctx->batches, >key, batch);
> +_mesa_hash_table_insert(ctx->fbo_to_batch, >key, batch);
>  
>  return batch;
>  }
> @@ -915,7 +916,7 @@ panfrost_flush_all_batches(struct panfrost_context *ctx, 
> bool wait)
>  util_dynarray_init(, NULL);
>  }
>  
> -hash_table_foreach(ctx->batches, hentry) {
> +hash_table_foreach(ctx->fbo_to_batch, hentry) {
>  struct panfrost_batch *batch = hentry->data;
>  
>  assert(batch);
> @@ -931,7 +932,7 @@ panfrost_flush_all_batches(struct panfrost_context *ctx, 
> bool wait)
>  panfrost_batch_submit(batch);
>  }
>  
> -assert(!ctx->batches->entries);
> +assert(!ctx->fbo_to_batch->entries);
>  
>  /* Collect batch fences before returning */
>  panfrost_gc_fences(ctx);
> @@ -1183,9 +1184,9 @@ panfrost_batch_is_scanout(struct panfrost_batch *batch)
>  void
>  

Re: [Mesa-dev] [PATCH v3 15/17] panfrost: Rename ctx->batches into ctx->fbo_to_batch

2019-09-20 Thread Alyssa Rosenzweig
R-b
On Wed, Sep 18, 2019 at 03:24:37PM +0200, Boris Brezillon wrote:
> We are about to add a batch queue to keep track of submission order.
> Let's rename the existing batches hash table (which is used to get the
> batch attached to an FBO) into fbo_to_batch to avoid confusion.
> 
> Signed-off-by: Boris Brezillon 
> ---
>  src/gallium/drivers/panfrost/pan_context.c  |  2 +-
>  src/gallium/drivers/panfrost/pan_context.h  |  2 +-
>  src/gallium/drivers/panfrost/pan_job.c  | 21 +++--
>  src/gallium/drivers/panfrost/pan_resource.c | 16 
>  4 files changed, 21 insertions(+), 20 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_context.c 
> b/src/gallium/drivers/panfrost/pan_context.c
> index 07bafad58a00..0330b5852676 100644
> --- a/src/gallium/drivers/panfrost/pan_context.c
> +++ b/src/gallium/drivers/panfrost/pan_context.c
> @@ -1355,7 +1355,7 @@ panfrost_flush(
>   */
>  if (fence) {
>  util_dynarray_init(, NULL);
> -hash_table_foreach(ctx->batches, hentry) {
> +hash_table_foreach(ctx->fbo_to_batch, hentry) {
>  struct panfrost_batch *batch = hentry->data;
>  
>  panfrost_batch_fence_reference(batch->out_sync);
> diff --git a/src/gallium/drivers/panfrost/pan_context.h 
> b/src/gallium/drivers/panfrost/pan_context.h
> index d50ed57d5d8a..f13967f51b46 100644
> --- a/src/gallium/drivers/panfrost/pan_context.h
> +++ b/src/gallium/drivers/panfrost/pan_context.h
> @@ -112,7 +112,7 @@ struct panfrost_context {
>  
>  /* Bound job batch and map of panfrost_batch_key to job batches */
>  struct panfrost_batch *batch;
> -struct hash_table *batches;
> +struct hash_table *fbo_to_batch;
>  
>  /* panfrost_bo -> panfrost_bo_access */
>  struct hash_table *accessed_bos;
> diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> b/src/gallium/drivers/panfrost/pan_job.c
> index a56f4044fda0..45f9d9d24b41 100644
> --- a/src/gallium/drivers/panfrost/pan_job.c
> +++ b/src/gallium/drivers/panfrost/pan_job.c
> @@ -132,9 +132,9 @@ panfrost_freeze_batch(struct panfrost_batch *batch)
>   * matches. This way, next draws/clears targeting this FBO will 
> trigger
>   * the creation of a new batch.
>   */
> -entry = _mesa_hash_table_search(ctx->batches, >key);
> +entry = _mesa_hash_table_search(ctx->fbo_to_batch, >key);
>  if (entry && entry->data == batch)
> -_mesa_hash_table_remove(ctx->batches, entry);
> +_mesa_hash_table_remove(ctx->fbo_to_batch, entry);
>  
>  /* If this is the bound batch, the panfrost_context parameters are
>   * relevant so submitting it invalidates those parameters, but if 
> it's
> @@ -153,7 +153,7 @@ static bool panfrost_batch_is_frozen(struct 
> panfrost_batch *batch)
>  struct panfrost_context *ctx = batch->ctx;
>  struct hash_entry *entry;
>  
> -entry = _mesa_hash_table_search(ctx->batches, >key);
> +entry = _mesa_hash_table_search(ctx->fbo_to_batch, >key);
>  if (entry && entry->data == batch)
>  return false;
>  
> @@ -248,7 +248,8 @@ panfrost_get_batch(struct panfrost_context *ctx,
> const struct pipe_framebuffer_state *key)
>  {
>  /* Lookup the job first */
> -struct hash_entry *entry = _mesa_hash_table_search(ctx->batches, 
> key);
> +struct hash_entry *entry = _mesa_hash_table_search(ctx->fbo_to_batch,
> +   key);
>  
>  if (entry)
>  return entry->data;
> @@ -258,7 +259,7 @@ panfrost_get_batch(struct panfrost_context *ctx,
>  struct panfrost_batch *batch = panfrost_create_batch(ctx, key);
>  
>  /* Save the created job */
> -_mesa_hash_table_insert(ctx->batches, >key, batch);
> +_mesa_hash_table_insert(ctx->fbo_to_batch, >key, batch);
>  
>  return batch;
>  }
> @@ -915,7 +916,7 @@ panfrost_flush_all_batches(struct panfrost_context *ctx, 
> bool wait)
>  util_dynarray_init(, NULL);
>  }
>  
> -hash_table_foreach(ctx->batches, hentry) {
> +hash_table_foreach(ctx->fbo_to_batch, hentry) {
>  struct panfrost_batch *batch = hentry->data;
>  
>  assert(batch);
> @@ -931,7 +932,7 @@ panfrost_flush_all_batches(struct panfrost_context *ctx, 
> bool wait)
>  panfrost_batch_submit(batch);
>  }
>  
> -assert(!ctx->batches->entries);
> +assert(!ctx->fbo_to_batch->entries);
>  
>  /* Collect batch fences before returning */
>  panfrost_gc_fences(ctx);
> @@ -1183,9 +1184,9 @@ panfrost_batch_is_scanout(struct panfrost_batch *batch)
>  void
>  panfrost_batch_init(struct panfrost_context *ctx)
>  {
> -ctx->batches = _mesa_hash_table_create(ctx,
> -

Re: [Mesa-dev] [PATCH v3 14/17] panfrost: Do fine-grained flushing when preparing BO for CPU accesses

2019-09-20 Thread Alyssa Rosenzweig
Looks good, still r-b. But while we're here, let's get this right:

> @@ -578,10 +578,8 @@ panfrost_transfer_map(struct pipe_context *pctx,
>  is_bound |= fb->cbufs[c]->texture == resource;
>  }
>  
> -if (is_bound && (usage & PIPE_TRANSFER_READ)) {
> -assert(level == 0);
> -panfrost_flush_all_batches(ctx, true);
> -}
> +if (is_bound && (usage & PIPE_TRANSFER_READ))
> + assert(level == 0);

Everything from "Check if we're bound..." to "flush_all_batches()}" can
be removed along with this commit :)

I.e. just delete all the lines from L566 - L585 on master as of
46b7512b0a73 ... of course your line numbers are a bit different.

Re: [Mesa-dev] [PATCH v3 13/17] panfrost: Make sure the BO is 'ready' when picked from the cache

2019-09-20 Thread Alyssa Rosenzweig
Very nice! I'm quite happy with this version, all considered, so R-b!

On Wed, Sep 18, 2019 at 03:24:35PM +0200, Boris Brezillon wrote:
> This is needed if we want to free the panfrost_batch object at submit
> time in order to not have to GC the batch on the next job submission.
> 
> Signed-off-by: Boris Brezillon 
> ---
> Changes in v3:
> * Move the patch later in the series and squash "panfrost: Cache GPU
>   accesses to BOs" in it
> * Add extra comments to explain what we're doing
> ---
>  src/gallium/drivers/panfrost/pan_bo.c  | 112 -
>  src/gallium/drivers/panfrost/pan_bo.h  |   9 ++
>  src/gallium/drivers/panfrost/pan_job.c |  11 +++
>  3 files changed, 109 insertions(+), 23 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_bo.c 
> b/src/gallium/drivers/panfrost/pan_bo.c
> index 9daddf9d0cc2..37602688d630 100644
> --- a/src/gallium/drivers/panfrost/pan_bo.c
> +++ b/src/gallium/drivers/panfrost/pan_bo.c
> @@ -23,6 +23,7 @@
>   * Authors (Collabora):
>   *   Alyssa Rosenzweig 
>   */
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -101,6 +102,63 @@ panfrost_bo_free(struct panfrost_bo *bo)
>  ralloc_free(bo);
>  }
>  
> +/* Returns true if the BO is ready, false otherwise.
> + * access_type is encoding the type of access one wants to ensure is done.
> + * Say you want to make sure all writers are done writing, you should pass
> + * PAN_BO_ACCESS_WRITE.
> + * If you want to wait for all users, you should pass PAN_BO_ACCESS_RW.
> + * PAN_BO_ACCESS_READ would work too as waiting for readers implies
> + * waiting for writers as well, but we want to make things explicit and 
> waiting
> + * only for readers is impossible.
> + */
> +bool
> +panfrost_bo_wait(struct panfrost_bo *bo, int64_t timeout_ns,
> + uint32_t access_type)
> +{
> +struct drm_panfrost_wait_bo req = {
> +.handle = bo->gem_handle,
> + .timeout_ns = timeout_ns,
> +};
> +int ret;
> +
> +assert(access_type == PAN_BO_ACCESS_WRITE ||
> +   access_type == PAN_BO_ACCESS_RW);
> +
> +/* If the BO has been exported or imported we can't rely on the 
> cached
> + * state, we need to call the WAIT_BO ioctl.
> + */
> +if (!(bo->flags & (PAN_BO_IMPORTED | PAN_BO_EXPORTED))) {
> +/* If ->gpu_access is 0, the BO is idle, no need to wait. */
> +if (!bo->gpu_access)
> +return true;
> +
> +/* If the caller only wants to wait for writers and no
> + * writes are pending, we don't have to wait.
> + */
> +if (access_type == PAN_BO_ACCESS_WRITE &&
> +!(bo->gpu_access & PAN_BO_ACCESS_WRITE))
> +return true;
> +}
> +
> +/* The ioctl returns >= 0 value when the BO we are waiting for is 
> ready
> + * -1 otherwise.
> + */
> +ret = drmIoctl(bo->screen->fd, DRM_IOCTL_PANFROST_WAIT_BO, );
> +if (ret != -1) {
> +/* Set gpu_access to 0 so that the next call to bo_wait()
> + * doesn't have to call the WAIT_BO ioctl.
> + */
> +bo->gpu_access = 0;
> +return true;
> +}
> +
> +/* If errno is not ETIMEDOUT or EBUSY that means the handle we passed
> + * is invalid, which shouldn't happen here.
> + */
> +assert(errno == ETIMEDOUT || errno == EBUSY);
> +return false;
> +}
> +
>  /* Helper to calculate the bucket index of a BO */
>  
>  static unsigned
> @@ -137,9 +195,8 @@ pan_bucket(struct panfrost_screen *screen, unsigned size)
>   * BO. */
>  
>  static struct panfrost_bo *
> -panfrost_bo_cache_fetch(
> -struct panfrost_screen *screen,
> -size_t size, uint32_t flags)
> +panfrost_bo_cache_fetch(struct panfrost_screen *screen,
> +size_t size, uint32_t flags, bool dontwait)
>  {
>  pthread_mutex_lock(>bo_cache_lock);
>  struct list_head *bucket = pan_bucket(screen, size);
> @@ -147,27 +204,30 @@ panfrost_bo_cache_fetch(
>  
>  /* Iterate the bucket looking for something suitable */
>  list_for_each_entry_safe(struct panfrost_bo, entry, bucket, link) {
> -if (entry->size >= size &&
> -entry->flags == flags) {
> -int ret;
> -struct drm_panfrost_madvise madv;
> +if (entry->size <

Re: [Mesa-dev] [PATCH v3 08/17] panfrost: Add a panfrost_flush_all_batches() helper

2019-09-20 Thread Alyssa Rosenzweig
(Still r-b)

On Wed, Sep 18, 2019 at 03:24:30PM +0200, Boris Brezillon wrote:
> And use it in panfrost_flush() to flush all batches, and not only the
> one currently bound to the context.
> 
> We also replace all internal calls to panfrost_flush() by
> panfrost_flush_all_batches() ones.
> 
> Signed-off-by: Boris Brezillon 
> Reviewed-by: Alyssa Rosenzweig 
> ---
> Changes in v3:
> * Add missing blank line
> * Collect R-b
> ---
>  src/gallium/drivers/panfrost/pan_compute.c  |  2 +-
>  src/gallium/drivers/panfrost/pan_context.c  | 23 +++
>  src/gallium/drivers/panfrost/pan_job.c  | 46 -
>  src/gallium/drivers/panfrost/pan_job.h  |  2 +-
>  src/gallium/drivers/panfrost/pan_resource.c |  6 +--
>  5 files changed, 64 insertions(+), 15 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_compute.c 
> b/src/gallium/drivers/panfrost/pan_compute.c
> index 4639c1b03c38..036dffbb17be 100644
> --- a/src/gallium/drivers/panfrost/pan_compute.c
> +++ b/src/gallium/drivers/panfrost/pan_compute.c
> @@ -133,7 +133,7 @@ panfrost_launch_grid(struct pipe_context *pipe,
>  /* Queue the job */
>  panfrost_scoreboard_queue_compute_job(batch, transfer);
>  
> -panfrost_flush(pipe, NULL, PIPE_FLUSH_END_OF_FRAME);
> +panfrost_flush_all_batches(ctx, true);
>  }
>  
>  void
> diff --git a/src/gallium/drivers/panfrost/pan_context.c 
> b/src/gallium/drivers/panfrost/pan_context.c
> index aad69e3f9991..861b4b621602 100644
> --- a/src/gallium/drivers/panfrost/pan_context.c
> +++ b/src/gallium/drivers/panfrost/pan_context.c
> @@ -1348,7 +1348,6 @@ panfrost_flush(
>  unsigned flags)
>  {
>  struct panfrost_context *ctx = pan_context(pipe);
> -struct panfrost_batch *batch = panfrost_get_batch_for_fbo(ctx);
>  struct util_dynarray fences;
>  
>  /* We must collect the fences before the flush is done, otherwise 
> we'll
> @@ -1356,13 +1355,18 @@ panfrost_flush(
>   */
>  if (fence) {
>  util_dynarray_init(, NULL);
> -panfrost_batch_fence_reference(batch->out_sync);
> -util_dynarray_append(, struct panfrost_batch_fence *,
> - batch->out_sync);
> +hash_table_foreach(ctx->batches, hentry) {
> +struct panfrost_batch *batch = hentry->data;
> +
> +panfrost_batch_fence_reference(batch->out_sync);
> +util_dynarray_append(,
> + struct panfrost_batch_fence *,
> + batch->out_sync);
> +}
>  }
>  
> -/* Submit the frame itself */
> -panfrost_batch_submit(batch);
> +/* Submit all pending jobs */
> +panfrost_flush_all_batches(ctx, false);
>  
>  if (fence) {
>  struct panfrost_fence *f = panfrost_fence_create(ctx, 
> );
> @@ -2321,7 +2325,7 @@ panfrost_set_framebuffer_state(struct pipe_context 
> *pctx,
>  }
>  
>  if (!is_scanout || has_draws)
> -panfrost_flush(pctx, NULL, PIPE_FLUSH_END_OF_FRAME);
> +panfrost_flush_all_batches(ctx, true);
>  else
>  
> assert(!ctx->payloads[PIPE_SHADER_VERTEX].postfix.framebuffer &&
> 
> !ctx->payloads[PIPE_SHADER_FRAGMENT].postfix.framebuffer);
> @@ -2553,6 +2557,7 @@ panfrost_get_query_result(struct pipe_context *pipe,
>union pipe_query_result *vresult)
>  {
>  struct panfrost_query *query = (struct panfrost_query *) q;
> +struct panfrost_context *ctx = pan_context(pipe);
>  
>  
>  switch (query->type) {
> @@ -2560,7 +2565,7 @@ panfrost_get_query_result(struct pipe_context *pipe,
>  case PIPE_QUERY_OCCLUSION_PREDICATE:
>  case PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE:
>  /* Flush first */
> -panfrost_flush(pipe, NULL, PIPE_FLUSH_END_OF_FRAME);
> +panfrost_flush_all_batches(ctx, true);
>  
>  /* Read back the query results */
>  unsigned *result = (unsigned *) query->transfer.cpu;
> @@ -2576,7 +2581,7 @@ panfrost_get_query_result(struct pipe_context *pipe,
>  
>  case PIPE_QUERY_PRIMITIVES_GENERATED:
>  case PIPE_QUERY_PRIMITIVES_EMITTED:
> -panfrost_flush(pipe, NULL, PIPE_FLUSH_END_OF_FRAME);
> +panfrost_flush_all_batches(ctx, true);
>  vresult->u64 = query->end - query-&

Re: [Mesa-dev] [PATCH v3 09/17] panfrost: Add a panfrost_flush_batches_accessing_bo() helper

2019-09-20 Thread Alyssa Rosenzweig
"
On Wed, Sep 18, 2019 at 03:24:31PM +0200, Boris Brezillon wrote:
> This will allow us to only flush batches touching a specific resource,
> which is particularly useful when the CPU needs to access a BO.
> 
> Signed-off-by: Boris Brezillon 
> Reviewed-by: Alyssa Rosenzweig 
> ---
> Changes in v3:
> * Collect R-b
> ---
>  src/gallium/drivers/panfrost/pan_job.c | 31 ++
>  src/gallium/drivers/panfrost/pan_job.h |  4 
>  2 files changed, 35 insertions(+)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> b/src/gallium/drivers/panfrost/pan_job.c
> index 3ccf4bb6b3e9..e7eae399830f 100644
> --- a/src/gallium/drivers/panfrost/pan_job.c
> +++ b/src/gallium/drivers/panfrost/pan_job.c
> @@ -952,6 +952,37 @@ panfrost_flush_all_batches(struct panfrost_context *ctx, 
> bool wait)
>  util_dynarray_fini();
>  }
>  
> +void
> +panfrost_flush_batches_accessing_bo(struct panfrost_context *ctx,
> +struct panfrost_bo *bo,
> +uint32_t access_type)
> +{
> +struct panfrost_bo_access *access;
> +struct hash_entry *hentry;
> +
> +/* It doesn't make any sense to flush only the readers. */
> +assert(access_type == PAN_BO_ACCESS_WRITE ||
> +   access_type == PAN_BO_ACCESS_RW);
> +
> +hentry = _mesa_hash_table_search(ctx->accessed_bos, bo);
> +access = hentry ? hentry->data : NULL;
> +if (!access)
> +return;
> +
> +if (access_type & PAN_BO_ACCESS_WRITE && access->writer &&
> +access->writer->batch)
> +panfrost_batch_submit(access->writer->batch);
> +
> +if (!(access_type & PAN_BO_ACCESS_READ))
> +return;
> +
> +util_dynarray_foreach(>readers, struct panfrost_batch_fence 
> *,
> +  reader) {
> +if (*reader && (*reader)->batch)
> +panfrost_batch_submit((*reader)->batch);
> +}
> +}
> +
>  void
>  panfrost_batch_set_requirements(struct panfrost_batch *batch)
>  {
> diff --git a/src/gallium/drivers/panfrost/pan_job.h 
> b/src/gallium/drivers/panfrost/pan_job.h
> index e95e156a40f8..25905b516739 100644
> --- a/src/gallium/drivers/panfrost/pan_job.h
> +++ b/src/gallium/drivers/panfrost/pan_job.h
> @@ -185,6 +185,10 @@ panfrost_batch_create_bo(struct panfrost_batch *batch, 
> size_t size,
>  void
>  panfrost_flush_all_batches(struct panfrost_context *ctx, bool wait);
>  
> +void
> +panfrost_flush_batches_accessing_bo(struct panfrost_context *ctx,
> +struct panfrost_bo *bo, uint32_t flags);
> +
>  void
>  panfrost_batch_set_requirements(struct panfrost_batch *batch);
>  
> -- 
> 2.21.0

Re: [Mesa-dev] [PATCH v3 07/17] panfrost: Prepare panfrost_fence for batch pipelining

2019-09-20 Thread Alyssa Rosenzweig
(Still r-b)

On Wed, Sep 18, 2019 at 03:24:29PM +0200, Boris Brezillon wrote:
> The panfrost_fence logic currently waits on the last submitted batch,
> but the batch serialization that was enforced in
> panfrost_batch_submit() is about to go away, allowing for several
> batches to be pipelined, and the last submitted one is not necessarily
> the one that will finish last.
> 
> We need to make sure the fence logic waits on all flushed batches, not
> only the last one.
> 
> Signed-off-by: Boris Brezillon 
> Reviewed-by: Alyssa Rosenzweig 
> ---
> Changes in v3:
> * Fix a comment
> * Adjust things to match the changes done in "panfrost: Add a batch fence"
> ---
>  src/gallium/drivers/panfrost/pan_context.c | 18 +-
>  src/gallium/drivers/panfrost/pan_context.h |  5 +-
>  src/gallium/drivers/panfrost/pan_job.c | 16 -
>  src/gallium/drivers/panfrost/pan_screen.c  | 71 +++---
>  src/gallium/drivers/panfrost/pan_screen.h  |  3 +-
>  5 files changed, 57 insertions(+), 56 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_context.c 
> b/src/gallium/drivers/panfrost/pan_context.c
> index 312a9e93e455..aad69e3f9991 100644
> --- a/src/gallium/drivers/panfrost/pan_context.c
> +++ b/src/gallium/drivers/panfrost/pan_context.c
> @@ -1349,14 +1349,30 @@ panfrost_flush(
>  {
>  struct panfrost_context *ctx = pan_context(pipe);
>  struct panfrost_batch *batch = panfrost_get_batch_for_fbo(ctx);
> +struct util_dynarray fences;
> +
> +/* We must collect the fences before the flush is done, otherwise 
> we'll
> + * lose track of them.
> + */
> +if (fence) {
> +util_dynarray_init(, NULL);
> +panfrost_batch_fence_reference(batch->out_sync);
> +util_dynarray_append(, struct panfrost_batch_fence *,
> + batch->out_sync);
> +}
>  
>  /* Submit the frame itself */
>  panfrost_batch_submit(batch);
>  
>  if (fence) {
> -struct panfrost_fence *f = panfrost_fence_create(ctx);
> +struct panfrost_fence *f = panfrost_fence_create(ctx, 
> );
>  pipe->screen->fence_reference(pipe->screen, fence, NULL);
>  *fence = (struct pipe_fence_handle *)f;
> +
> +util_dynarray_foreach(, struct panfrost_batch_fence 
> *, fence)
> +panfrost_batch_fence_unreference(*fence);
> +
> +util_dynarray_fini();
>  }
>  }
>  
> diff --git a/src/gallium/drivers/panfrost/pan_context.h 
> b/src/gallium/drivers/panfrost/pan_context.h
> index 3b09952345cf..d50ed57d5d8a 100644
> --- a/src/gallium/drivers/panfrost/pan_context.h
> +++ b/src/gallium/drivers/panfrost/pan_context.h
> @@ -94,7 +94,7 @@ struct panfrost_query {
>  
>  struct panfrost_fence {
>  struct pipe_reference reference;
> -int fd;
> +struct util_dynarray syncfds;
>  };
>  
>  struct panfrost_streamout {
> @@ -193,9 +193,6 @@ struct panfrost_context {
>  
>  /* True for t6XX, false for t8xx. */
>  bool is_t6xx;
> -
> -/* The out sync fence of the last submitted batch. */
> -struct panfrost_batch_fence *last_out_sync;
>  };
>  
>  /* Corresponds to the CSO */
> diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> b/src/gallium/drivers/panfrost/pan_job.c
> index b0494af3482f..211e48bafd4e 100644
> --- a/src/gallium/drivers/panfrost/pan_job.c
> +++ b/src/gallium/drivers/panfrost/pan_job.c
> @@ -819,13 +819,6 @@ panfrost_batch_submit_ioctl(struct panfrost_batch *batch,
>  free(bo_handles);
>  free(in_syncs);
>  
> -/* Release the last batch fence if any, and retain the new one */
> -if (ctx->last_out_sync)
> -panfrost_batch_fence_unreference(ctx->last_out_sync);
> -
> -panfrost_batch_fence_reference(batch->out_sync);
> -ctx->last_out_sync = batch->out_sync;
> -
>  if (ret) {
>  fprintf(stderr, "Error submitting: %m\n");
>  return errno;
> @@ -884,15 +877,6 @@ panfrost_batch_submit(struct panfrost_batch *batch)
>   * to wait on it.
>   */
>  batch->out_sync->signaled = true;
> -
> -/* Release the last batch fence if any, and set 
> ->last_out_sync
> - * to NULL
> - */
> -if (ctx->last_out_sync) {
> -panfrost_batch_fence_unreference(ctx->last_out_sync);
> -  

Re: [Mesa-dev] [PATCH v3 06/17] panfrost: Start tracking inter-batch dependencies

2019-09-20 Thread Alyssa Rosenzweig
R-b. Nice work!

On Wed, Sep 18, 2019 at 03:24:28PM +0200, Boris Brezillon wrote:
> The idea is to track which BO are being accessed and the type of access
> to determine when a dependency exists. Thanks to that we can build a
> dependency graph that will allow us to flush batches in the correct
> order.
> 
> Signed-off-by: Boris Brezillon 
> ---
> Changes in v3:
> * Fix coding style issues
> * Do not check for batch presence in the reader array when updating
>   a BO access (we already have this information)
> * Add more comments to explain what we're doing and why we're doing
>   it like that
> ---
>  src/gallium/drivers/panfrost/pan_context.h |   3 +
>  src/gallium/drivers/panfrost/pan_job.c | 355 -
>  src/gallium/drivers/panfrost/pan_job.h |   3 +
>  3 files changed, 356 insertions(+), 5 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_context.h 
> b/src/gallium/drivers/panfrost/pan_context.h
> index ce3e0c899a4f..3b09952345cf 100644
> --- a/src/gallium/drivers/panfrost/pan_context.h
> +++ b/src/gallium/drivers/panfrost/pan_context.h
> @@ -114,6 +114,9 @@ struct panfrost_context {
>  struct panfrost_batch *batch;
>  struct hash_table *batches;
>  
> +/* panfrost_bo -> panfrost_bo_access */
> +struct hash_table *accessed_bos;
> +
>  /* Within a launch_grid call.. */
>  const struct pipe_grid_info *compute_grid;
>  
> diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> b/src/gallium/drivers/panfrost/pan_job.c
> index 872c846207bf..b0494af3482f 100644
> --- a/src/gallium/drivers/panfrost/pan_job.c
> +++ b/src/gallium/drivers/panfrost/pan_job.c
> @@ -36,6 +36,29 @@
>  #include "pan_util.h"
>  #include "pandecode/decode.h"
>  
> +/* panfrost_bo_access is here to help us keep track of batch accesses to BOs
> + * and build a proper dependency graph such that batches can be pipelined for
> + * better GPU utilization.
> + *
> + * Each accessed BO has a corresponding entry in the ->accessed_bos hash 
> table.
> + * A BO is either being written or read at any time, that's what the type 
> field
> + * encodes.
> + * When the last access is a write, the batch writing the BO might have read
> + * dependencies (readers that have not been executed yet and want to read the
> + * previous BO content), and when the last access is a read, all readers 
> might
> + * depend on another batch to push its results to memory. That's what the
> + * readers/writers keep track off.
> + * There can only be one writer at any given time, if a new batch wants to
> + * write to the same BO, a dependency will be added between the new writer 
> and
> + * the old writer (at the batch level), and panfrost_bo_access->writer will 
> be
> + * updated to point to the new writer.
> + */
> +struct panfrost_bo_access {
> +uint32_t type;
> +struct util_dynarray readers;
> +struct panfrost_batch_fence *writer;
> +};
> +
>  static struct panfrost_batch_fence *
>  panfrost_create_batch_fence(struct panfrost_batch *batch)
>  {
> @@ -92,6 +115,7 @@ panfrost_create_batch(struct panfrost_context *ctx,
>  
>  util_dynarray_init(>headers, batch);
>  util_dynarray_init(>gpu_headers, batch);
> +util_dynarray_init(>dependencies, batch);
>  batch->out_sync = panfrost_create_batch_fence(batch);
>  util_copy_framebuffer_state(>key, key);
>  
> @@ -151,6 +175,11 @@ panfrost_free_batch(struct panfrost_batch *batch)
>  hash_table_foreach(batch->bos, entry)
>  panfrost_bo_unreference((struct panfrost_bo *)entry->key);
>  
> +util_dynarray_foreach(>dependencies,
> +  struct panfrost_batch_fence *, dep) {
> +panfrost_batch_fence_unreference(*dep);
> +}
> +
>  /* The out_sync fence lifetime is different from the the batch one
>   * since other batches might want to wait on a fence of already
>   * submitted/signaled batch. All we need to do here is make sure the
> @@ -164,6 +193,56 @@ panfrost_free_batch(struct panfrost_batch *batch)
>  ralloc_free(batch);
>  }
>  
> +#ifndef NDEBUG
> +static bool
> +panfrost_dep_graph_contains_batch(struct panfrost_batch *root,
> +  struct panfrost_batch *batch)
> +{
> +if (!root)
> +return false;
> +
> +util_dynarray_foreach(>dependencies,
> +  struct panfrost_batch_fence *, dep) {
> +if ((*dep)->batch == batch ||
> +panfrost_dep_graph_contains_batch((*dep)->batch, batch))
> +return true;
> +}
> +
> +return false;
> +}
> +#endif
> +
> +static void
> +panfrost_batch_add_dep(struct panfrost_batch *batch,
> +   struct panfrost_batch_fence *newdep)
> +{
> +if (batch == newdep->batch)
> +return;
> +
> +/* We might want to turn ->dependencies into a set 

Re: [Mesa-dev] [PATCH v3 05/17] panfrost: Add a panfrost_freeze_batch() helper

2019-09-20 Thread Alyssa Rosenzweig
Still r-b, but please use the Collabora address :)
On Wed, Sep 18, 2019 at 03:24:27PM +0200, Boris Brezillon wrote:
> We'll soon need to freeze a batch not only when it's flushed, but also
> when another batch depends on us, so let's add a helper to avoid
> duplicating the logic.
> 
> Signed-off-by: Boris Brezillon 
> Reviewed-by: Alyssa Rosenzweig 
> ---
> Changes in v3:
> * Collect R-b
> ---
>  src/gallium/drivers/panfrost/pan_job.c | 62 ++
>  1 file changed, 44 insertions(+), 18 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> b/src/gallium/drivers/panfrost/pan_job.c
> index 55780dd3d9d6..872c846207bf 100644
> --- a/src/gallium/drivers/panfrost/pan_job.c
> +++ b/src/gallium/drivers/panfrost/pan_job.c
> @@ -98,22 +98,59 @@ panfrost_create_batch(struct panfrost_context *ctx,
>  return batch;
>  }
>  
> +static void
> +panfrost_freeze_batch(struct panfrost_batch *batch)
> +{
> +struct panfrost_context *ctx = batch->ctx;
> +struct hash_entry *entry;
> +
> +/* Remove the entry in the FBO -> batch hash table if the batch
> + * matches. This way, next draws/clears targeting this FBO will 
> trigger
> + * the creation of a new batch.
> + */
> +entry = _mesa_hash_table_search(ctx->batches, >key);
> +if (entry && entry->data == batch)
> +_mesa_hash_table_remove(ctx->batches, entry);
> +
> +/* If this is the bound batch, the panfrost_context parameters are
> + * relevant so submitting it invalidates those parameters, but if 
> it's
> + * not bound, the context parameters are for some other batch so we
> + * can't invalidate them.
> + */
> +if (ctx->batch == batch) {
> +panfrost_invalidate_frame(ctx);
> +ctx->batch = NULL;
> +}
> +}
> +
> +#ifndef NDEBUG
> +static bool panfrost_batch_is_frozen(struct panfrost_batch *batch)
> +{
> +struct panfrost_context *ctx = batch->ctx;
> +struct hash_entry *entry;
> +
> +entry = _mesa_hash_table_search(ctx->batches, >key);
> +if (entry && entry->data == batch)
> +return false;
> +
> +if (ctx->batch == batch)
> +return false;
> +
> +return true;
> +}
> +#endif
> +
>  static void
>  panfrost_free_batch(struct panfrost_batch *batch)
>  {
>  if (!batch)
>  return;
>  
> -struct panfrost_context *ctx = batch->ctx;
> +assert(panfrost_batch_is_frozen(batch));
>  
>  hash_table_foreach(batch->bos, entry)
>  panfrost_bo_unreference((struct panfrost_bo *)entry->key);
>  
> -_mesa_hash_table_remove_key(ctx->batches, >key);
> -
> -if (ctx->batch == batch)
> -ctx->batch = NULL;
> -
>  /* The out_sync fence lifetime is different from the the batch one
>   * since other batches might want to wait on a fence of already
>   * submitted/signaled batch. All we need to do here is make sure the
> @@ -529,19 +566,8 @@ panfrost_batch_submit(struct panfrost_batch *batch)
>  fprintf(stderr, "panfrost_batch_submit failed: %d\n", ret);
>  
>  out:
> -/* If this is the bound batch, the panfrost_context parameters are
> - * relevant so submitting it invalidates those paramaters, but if 
> it's
> - * not bound, the context parameters are for some other batch so we
> - * can't invalidate them.
> - */
> -if (ctx->batch == batch)
> -panfrost_invalidate_frame(ctx);
> -
> -/* The job has been submitted, let's invalidate the current FBO job
> - * cache.
> -  */
> +panfrost_freeze_batch(batch);
>  assert(!ctx->batch || batch == ctx->batch);
> -ctx->batch = NULL;
>  
>  /* We always stall the pipeline for correct results since pipelined
>   * rendering is quite broken right now (to be fixed by the 
> panfrost_job
> -- 
> 2.21.0

Re: [Mesa-dev] [PATCH v3 04/17] panfrost: Use the per-batch fences to wait on the last submitted batch

2019-09-20 Thread Alyssa Rosenzweig
R-b
On Wed, Sep 18, 2019 at 03:24:26PM +0200, Boris Brezillon wrote:
> We just replace the per-context out_sync object by a pointer to the
> the fence of the last last submitted batch. Pipelining of batches will
> come later.
> 
> Signed-off-by: Boris Brezillon 
> ---
> Alyssa, I dropped your R-b since the other changes you asked me to do
> in "panfrost: Add a batch fence" had some impact on this patch.
> 
> Changes in v3:
> * Make sure we don't try to wait on dummy batches (those with no
>   vertex/tiler/fragment jobs)
> ---
>  src/gallium/drivers/panfrost/pan_context.c |  6 
>  src/gallium/drivers/panfrost/pan_context.h |  3 +-
>  src/gallium/drivers/panfrost/pan_job.c | 35 ++
>  src/gallium/drivers/panfrost/pan_screen.c  | 18 +--
>  4 files changed, 47 insertions(+), 15 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_context.c 
> b/src/gallium/drivers/panfrost/pan_context.c
> index 65a6c7f8c5ae..312a9e93e455 100644
> --- a/src/gallium/drivers/panfrost/pan_context.c
> +++ b/src/gallium/drivers/panfrost/pan_context.c
> @@ -2702,12 +2702,6 @@ panfrost_create_context(struct pipe_screen *screen, 
> void *priv, unsigned flags)
>  panfrost_blend_context_init(gallium);
>  panfrost_compute_context_init(gallium);
>  
> -ASSERTED int ret;
> -
> -ret = drmSyncobjCreate(pscreen->fd, DRM_SYNCOBJ_CREATE_SIGNALED,
> -   >out_sync);
> -assert(!ret);
> -
>  /* XXX: leaks */
>  gallium->stream_uploader = u_upload_create_default(gallium);
>  gallium->const_uploader = gallium->stream_uploader;
> diff --git a/src/gallium/drivers/panfrost/pan_context.h 
> b/src/gallium/drivers/panfrost/pan_context.h
> index c145d589757e..ce3e0c899a4f 100644
> --- a/src/gallium/drivers/panfrost/pan_context.h
> +++ b/src/gallium/drivers/panfrost/pan_context.h
> @@ -191,7 +191,8 @@ struct panfrost_context {
>  /* True for t6XX, false for t8xx. */
>  bool is_t6xx;
>  
> -uint32_t out_sync;
> +/* The out sync fence of the last submitted batch. */
> +struct panfrost_batch_fence *last_out_sync;
>  };
>  
>  /* Corresponds to the CSO */
> diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> b/src/gallium/drivers/panfrost/pan_job.c
> index b6763da66a97..55780dd3d9d6 100644
> --- a/src/gallium/drivers/panfrost/pan_job.c
> +++ b/src/gallium/drivers/panfrost/pan_job.c
> @@ -425,11 +425,13 @@ panfrost_batch_submit_ioctl(struct panfrost_batch 
> *batch,
>  uint32_t *bo_handles;
>  int ret;
>  
> -submit.in_syncs = (u64) (uintptr_t) >out_sync;
> -submit.in_sync_count = 1;
>  
> -submit.out_sync = ctx->out_sync;
> +if (ctx->last_out_sync) {
> +submit.in_sync_count = 1;
> +submit.in_syncs = (uintptr_t)>last_out_sync->syncobj;
> +}
>  
> +submit.out_sync = batch->out_sync->syncobj;
>  submit.jc = first_job_desc;
>  submit.requirements = reqs;
>  
> @@ -445,6 +447,14 @@ panfrost_batch_submit_ioctl(struct panfrost_batch *batch,
>  submit.bo_handles = (u64) (uintptr_t) bo_handles;
>  ret = drmIoctl(screen->fd, DRM_IOCTL_PANFROST_SUBMIT, );
>  free(bo_handles);
> +
> +/* Release the last batch fence if any, and retain the new one */
> +if (ctx->last_out_sync)
> +panfrost_batch_fence_unreference(ctx->last_out_sync);
> +
> +panfrost_batch_fence_reference(batch->out_sync);
> +ctx->last_out_sync = batch->out_sync;
> +
>  if (ret) {
>  fprintf(stderr, "Error submitting: %m\n");
>  return errno;
> @@ -453,7 +463,8 @@ panfrost_batch_submit_ioctl(struct panfrost_batch *batch,
>  /* Trace the job if we're doing that */
>  if (pan_debug & PAN_DBG_TRACE) {
>  /* Wait so we can get errors reported back */
> -drmSyncobjWait(screen->fd, >out_sync, 1, INT64_MAX, 0, 
> NULL);
> +drmSyncobjWait(screen->fd, >out_sync->syncobj, 1,
> +   INT64_MAX, 0, NULL);
>  pandecode_jc(submit.jc, FALSE);
>  }
>  
> @@ -495,6 +506,15 @@ panfrost_batch_submit(struct panfrost_batch *batch)
>   * to wait on it.
>   */
>  batch->out_sync->signaled = true;
> +
> +/* Release the last batch fence if any, and set 
> ->last_out_sync
> + * to NULL
> + */
> +if (ctx->last_out_sync) {
> +panfrost_batch_fence_unreference(ctx->last_out_sync);
> +ctx->last_out_sync = NULL;
> +}
> +
>  goto out;
>  }
>  
> @@ -527,8 +547,11 @@ out:
>   * rendering is quite broken right now (to be fixed by the 
> panfrost_job
>   * refactor, just take the perf hit for correctness)
>   */
> -   

Re: [Mesa-dev] [PATCH v3 03/17] panfrost: Add a batch fence

2019-09-20 Thread Alyssa Rosenzweig
R-b

On Wed, Sep 18, 2019 at 03:24:25PM +0200, Boris Brezillon wrote:
> So we can implement fine-grained dependency tracking between batches.
> 
> Signed-off-by: Boris Brezillon 
> ---
> Changes in v3:
> * Fix typos
> * Do not initialize the syncobj in a signaled state, and set
>   fence->signaled to true when submitting a dummy batch (one with no
>   draw/clear queued)
> ---
>  src/gallium/drivers/panfrost/pan_job.c | 56 +-
>  src/gallium/drivers/panfrost/pan_job.h | 39 ++
>  2 files changed, 94 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> b/src/gallium/drivers/panfrost/pan_job.c
> index 785317dbd0b0..b6763da66a97 100644
> --- a/src/gallium/drivers/panfrost/pan_job.c
> +++ b/src/gallium/drivers/panfrost/pan_job.c
> @@ -36,6 +36,45 @@
>  #include "pan_util.h"
>  #include "pandecode/decode.h"
>  
> +static struct panfrost_batch_fence *
> +panfrost_create_batch_fence(struct panfrost_batch *batch)
> +{
> +struct panfrost_batch_fence *fence;
> +ASSERTED int ret;
> +
> +fence = rzalloc(NULL, struct panfrost_batch_fence);
> +assert(fence);
> +pipe_reference_init(>reference, 1);
> +fence->ctx = batch->ctx;
> +fence->batch = batch;
> +ret = drmSyncobjCreate(pan_screen(batch->ctx->base.screen)->fd, 0,
> +   >syncobj);
> +assert(!ret);
> +
> +return fence;
> +}
> +
> +static void
> +panfrost_free_batch_fence(struct panfrost_batch_fence *fence)
> +{
> +drmSyncobjDestroy(pan_screen(fence->ctx->base.screen)->fd,
> +  fence->syncobj);
> +ralloc_free(fence);
> +}
> +
> +void
> +panfrost_batch_fence_unreference(struct panfrost_batch_fence *fence)
> +{
> +if (pipe_reference(>reference, NULL))
> + panfrost_free_batch_fence(fence);
> +}
> +
> +void
> +panfrost_batch_fence_reference(struct panfrost_batch_fence *fence)
> +{
> +pipe_reference(NULL, >reference);
> +}
> +
>  static struct panfrost_batch *
>  panfrost_create_batch(struct panfrost_context *ctx,
>const struct pipe_framebuffer_state *key)
> @@ -53,6 +92,7 @@ panfrost_create_batch(struct panfrost_context *ctx,
>  
>  util_dynarray_init(>headers, batch);
>  util_dynarray_init(>gpu_headers, batch);
> +batch->out_sync = panfrost_create_batch_fence(batch);
>  util_copy_framebuffer_state(>key, key);
>  
>  return batch;
> @@ -74,6 +114,15 @@ panfrost_free_batch(struct panfrost_batch *batch)
>  if (ctx->batch == batch)
>  ctx->batch = NULL;
>  
> +/* The out_sync fence lifetime is different from the the batch one
> + * since other batches might want to wait on a fence of already
> + * submitted/signaled batch. All we need to do here is make sure the
> + * fence does not point to an invalid batch, which the core will
> + * interpret as 'batch is already submitted'.
> + */
> +batch->out_sync->batch = NULL;
> +panfrost_batch_fence_unreference(batch->out_sync);
> +
>  util_unreference_framebuffer_state(>key);
>  ralloc_free(batch);
>  }
> @@ -441,8 +490,13 @@ panfrost_batch_submit(struct panfrost_batch *batch)
>  int ret;
>  
>  /* Nothing to do! */
> -if (!batch->last_job.gpu && !batch->clear)
> +if (!batch->last_job.gpu && !batch->clear) {
> +/* Mark the fence as signaled so the fence logic does not try
> + * to wait on it.
> + */
> +batch->out_sync->signaled = true;
>  goto out;
> +}
>  
>  if (!batch->clear && batch->last_tiler.gpu)
>  panfrost_batch_draw_wallpaper(batch);
> diff --git a/src/gallium/drivers/panfrost/pan_job.h 
> b/src/gallium/drivers/panfrost/pan_job.h
> index 3f2cf1a999f3..88f1e4620fd0 100644
> --- a/src/gallium/drivers/panfrost/pan_job.h
> +++ b/src/gallium/drivers/panfrost/pan_job.h
> @@ -31,6 +31,36 @@
>  #include "pan_allocate.h"
>  #include "pan_resource.h"
>  
> +/* panfrost_batch_fence is the out fence of a batch that users or other 
> batches
> + * might want to wait on. The batch fence lifetime is different from the 
> batch
> + * one as want will certainly want to wait upon the fence after the batch has
> + * been submitted (which is when panfrost_batch objects are freed).
> + */
> +struct panfrost_batch_fence {
> +/* Refcounting object for the fence. */
> +struct pipe_reference reference;
> +
> +/* Batch that created this fence object. Will become NULL at batch
> + * submission time. This field is mainly here to know whether the
> + * batch has been flushed or not.
> + */
> +struct panfrost_batch *batch;
> +
> +/* Context this fence is attached to. We need both ctx and batch, as
> + * the batch will go away 

Re: [Mesa-dev] [PATCH v3 02/17] panfrost: Make panfrost_batch->bos a hash table

2019-09-20 Thread Alyssa Rosenzweig
R-b
On Wed, Sep 18, 2019 at 03:24:24PM +0200, Boris Brezillon wrote:
> So we can store the flags as data and keep the BO as a key. This way
> we keep track of the type of access done on BOs.
> 
> Signed-off-by: Boris Brezillon 
> ---
> Changes in v3:
> * None
> ---
>  src/gallium/drivers/panfrost/pan_job.c | 33 +-
>  src/gallium/drivers/panfrost/pan_job.h |  2 +-
>  2 files changed, 23 insertions(+), 12 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> b/src/gallium/drivers/panfrost/pan_job.c
> index 8e2703ae168c..785317dbd0b0 100644
> --- a/src/gallium/drivers/panfrost/pan_job.c
> +++ b/src/gallium/drivers/panfrost/pan_job.c
> @@ -44,9 +44,8 @@ panfrost_create_batch(struct panfrost_context *ctx,
>  
>  batch->ctx = ctx;
>  
> -batch->bos = _mesa_set_create(batch,
> -  _mesa_hash_pointer,
> -  _mesa_key_pointer_equal);
> +batch->bos = _mesa_hash_table_create(batch, _mesa_hash_pointer,
> + _mesa_key_pointer_equal);
>  
>  batch->minx = batch->miny = ~0;
>  batch->maxx = batch->maxy = 0;
> @@ -67,10 +66,8 @@ panfrost_free_batch(struct panfrost_batch *batch)
>  
>  struct panfrost_context *ctx = batch->ctx;
>  
> -set_foreach(batch->bos, entry) {
> -struct panfrost_bo *bo = (struct panfrost_bo *)entry->key;
> -panfrost_bo_unreference(bo);
> -}
> +hash_table_foreach(batch->bos, entry)
> +panfrost_bo_unreference((struct panfrost_bo *)entry->key);
>  
>  _mesa_hash_table_remove_key(ctx->batches, >key);
>  
> @@ -138,11 +135,25 @@ panfrost_batch_add_bo(struct panfrost_batch *batch, 
> struct panfrost_bo *bo,
>  if (!bo)
>  return;
>  
> -if (_mesa_set_search(batch->bos, bo))
> +struct hash_entry *entry;
> +uint32_t old_flags = 0;
> +
> +entry = _mesa_hash_table_search(batch->bos, bo);
> +if (!entry) {
> +entry = _mesa_hash_table_insert(batch->bos, bo,
> +(void *)(uintptr_t)flags);
> +panfrost_bo_reference(bo);
> + } else {
> +old_flags = (uintptr_t)entry->data;
> +}
> +
> +assert(entry);
> +
> +if (old_flags == flags)
>  return;
>  
> -panfrost_bo_reference(bo);
> -_mesa_set_add(batch->bos, bo);
> +flags |= old_flags;
> +entry->data = (void *)(uintptr_t)flags;
>  }
>  
>  void panfrost_batch_add_fbo_bos(struct panfrost_batch *batch)
> @@ -376,7 +387,7 @@ panfrost_batch_submit_ioctl(struct panfrost_batch *batch,
>  bo_handles = calloc(batch->bos->entries, sizeof(*bo_handles));
>  assert(bo_handles);
>  
> -set_foreach(batch->bos, entry) {
> +hash_table_foreach(batch->bos, entry) {
>  struct panfrost_bo *bo = (struct panfrost_bo *)entry->key;
>  assert(bo->gem_handle > 0);
>  bo_handles[submit.bo_handle_count++] = bo->gem_handle;
> diff --git a/src/gallium/drivers/panfrost/pan_job.h 
> b/src/gallium/drivers/panfrost/pan_job.h
> index 0b37a3131e86..3f2cf1a999f3 100644
> --- a/src/gallium/drivers/panfrost/pan_job.h
> +++ b/src/gallium/drivers/panfrost/pan_job.h
> @@ -98,7 +98,7 @@ struct panfrost_batch {
>  unsigned job_index;
>  
>  /* BOs referenced -- will be used for flushing logic */
> -struct set *bos;
> +struct hash_table *bos;
>  
>  /* Current transient BO */
>   struct panfrost_bo *transient_bo;
> -- 
> 2.21.0

Re: [Mesa-dev] [PATCH v3 01/17] panfrost: Extend the panfrost_batch_add_bo() API to pass access flags

2019-09-20 Thread Alyssa Rosenzweig
> @@ -1121,7 +1134,11 @@ panfrost_emit_for_draw(struct panfrost_context *ctx, 
> bool with_vertex_data)
>  
>  struct panfrost_shader_state *ss = 
> >variants[all->active_variant];
>  
> -panfrost_batch_add_bo(batch, ss->bo);
> +panfrost_batch_add_bo(batch, ss->bo,
> +  PAN_BO_ACCESS_PRIVATE |
> +  PAN_BO_ACCESS_READ |

> +  PAN_BO_ACCESS_VERTEX_TILER |
> +  PAN_BO_ACCESS_FRAGMENT);

I believe this should be just the access for the stage `i`

Although, actually, I am not at all sure what this batch_add_bo call is
accomplishing here?

I think this batch_add_bo should probably be dropped altogether? This loop
is dealing with constant buffers; the shaders themselves were added.
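
If the call does stay, "just the access for the stage" could look
roughly like this (a sketch only -- the helper name and exact mapping
are made up, not something in the patch):

    static uint32_t
    bo_access_for_stage(enum pipe_shader_type stage)
    {
            return PAN_BO_ACCESS_PRIVATE | PAN_BO_ACCESS_READ |
                   (stage == PIPE_SHADER_FRAGMENT ?
                    PAN_BO_ACCESS_FRAGMENT : PAN_BO_ACCESS_VERTEX_TILER);
    }

    panfrost_batch_add_bo(batch, ss->bo, bo_access_for_stage(i));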

>  void panfrost_batch_add_fbo_bos(struct panfrost_batch *batch)
>  {
> +uint32_t flags = PAN_BO_ACCESS_SHARED | PAN_BO_ACCESS_WRITE |
> + PAN_BO_ACCESS_VERTEX_TILER |
> + PAN_BO_ACCESS_FRAGMENT;

I think we can drop VERTEX_TILER here...? The buffers are written right
at the end of the FRAGMENT job, not touched before that.

If nothing else is broken, this should allow a nice perf boost with
pipelining, so the vertex/tiler from frame n+1 can run in parallel with
the fragment of frame n (rather than blocking on frame n finishing with
the FBOs).
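
Concretely, the hunk would become something like (sketch, assuming the
render targets really are only touched by the fragment job):

    uint32_t flags = PAN_BO_ACCESS_SHARED | PAN_BO_ACCESS_WRITE |
                     PAN_BO_ACCESS_FRAGMENT;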

Re: [Mesa-dev] [PATCH 3/3] panfrost: More tests are passing

2019-09-20 Thread Alyssa Rosenzweig
R-b with pleasure! Glad to see those tests fixed, those have stumped me
for a *long* time :D

Congratulations! Thank you!

On Fri, Sep 20, 2019 at 04:53:39PM +0200, Boris Brezillon wrote:
> Remove the tests that are now passing.
> 
> Signed-off-by: Boris Brezillon 
> ---
>  .../drivers/panfrost/ci/expected-failures.txt | 153 --
>  1 file changed, 153 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/ci/expected-failures.txt 
> b/src/gallium/drivers/panfrost/ci/expected-failures.txt
> index 7e7dbd62307b..91c1f14ce1a2 100644
> --- a/src/gallium/drivers/panfrost/ci/expected-failures.txt
> +++ b/src/gallium/drivers/panfrost/ci/expected-failures.txt
> @@ -1,10 +1,3 @@
> -dEQP-GLES2.functional.color_clear.masked_rgba Fail
> -dEQP-GLES2.functional.color_clear.masked_rgb Fail
> -dEQP-GLES2.functional.color_clear.masked_scissored_rgba Fail
> -dEQP-GLES2.functional.color_clear.masked_scissored_rgb Fail
> -dEQP-GLES2.functional.color_clear.scissored_rgba Fail
> -dEQP-GLES2.functional.color_clear.scissored_rgb Fail
> -dEQP-GLES2.functional.color_clear.short_scissored_rgb Fail
>  dEQP-GLES2.functional.depth_range.write.0_8_to_third Fail
>  dEQP-GLES2.functional.depth_range.write.clamp_both Fail
>  dEQP-GLES2.functional.depth_range.write.clamp_far Fail
> @@ -672,201 +665,55 @@ 
> dEQP-GLES2.functional.fragment_ops.depth_stencil.stencil_ops.zero_zero_zero 
> Fail
>  dEQP-GLES2.functional.fragment_ops.depth_stencil.write_mask.both Fail
>  dEQP-GLES2.functional.fragment_ops.depth_stencil.write_mask.depth Fail
>  dEQP-GLES2.functional.fragment_ops.depth_stencil.write_mask.stencil Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.0 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.10 Fail
>  dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.11 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.12 Fail
>  dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.13 Fail
>  dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.15 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.16 Fail
>  dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.17 Fail
>  dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.18 Fail
>  dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.19 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.1 Fail
>  dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.20 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.21 Fail
>  dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.22 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.23 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.24 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.25 Fail
>  dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.26 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.29 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.30 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.31 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.32 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.33 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.34 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.35 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.36 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.37 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.38 Fail
>  dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.39 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.3 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.40 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.41 Fail
>  dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.42 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.43 Fail
>  dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.44 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.46 Fail
>  dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.47 Fail
>  dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.48 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.49 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.50 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.51 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.52 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.53 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.54 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.55 Fail
> -dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.56 Fail
>  dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.57 Fail
> 

Re: [Mesa-dev] [PATCH 2/3] panfrost: Draw the wallpaper when only depth/stencil bufs are cleared

2019-09-20 Thread Alyssa Rosenzweig
R-b, nice fix :)

On Fri, Sep 20, 2019 at 04:53:38PM +0200, Boris Brezillon wrote:
> When only the depth/stencil bufs are cleared, we should make sure the
> color content is reloaded into the tile buffers if we want to preserve
> their content.
> 
> Signed-off-by: Boris Brezillon 
> ---
> There might be a more optimal solution to do that (like not passing the
> color bufs to the fragment job?), but this solution seems to fix a few
> deqp tests.
> ---
>  src/gallium/drivers/panfrost/pan_context.c |  2 +-
>  src/gallium/drivers/panfrost/pan_job.c | 16 ++--
>  2 files changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_context.c 
> b/src/gallium/drivers/panfrost/pan_context.c
> index b2f2a9da7a51..c99bf1b26ce7 100644
> --- a/src/gallium/drivers/panfrost/pan_context.c
> +++ b/src/gallium/drivers/panfrost/pan_context.c
> @@ -1333,7 +1333,7 @@ panfrost_queue_draw(struct panfrost_context *ctx)
>  
>  if (rasterizer_discard)
>  panfrost_scoreboard_queue_vertex_job(batch, vertex, FALSE);
> -else if (ctx->wallpaper_batch)
> +else if (ctx->wallpaper_batch && batch->first_tiler.gpu)
>  panfrost_scoreboard_queue_fused_job_prepend(batch, vertex, 
> tiler);
>  else
>  panfrost_scoreboard_queue_fused_job(batch, vertex, tiler);
> diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> b/src/gallium/drivers/panfrost/pan_job.c
> index 4ec2aa0565d7..a2df31f96f00 100644
> --- a/src/gallium/drivers/panfrost/pan_job.c
> +++ b/src/gallium/drivers/panfrost/pan_job.c
> @@ -698,10 +698,23 @@ panfrost_batch_get_tiler_dummy(struct panfrost_batch 
> *batch)
>  static void
>  panfrost_batch_draw_wallpaper(struct panfrost_batch *batch)
>  {
> +/* Color 0 is cleared, no need to draw the wallpaper.
> + * TODO: MRT wallpapers.
> + */
> +if (batch->clear & PIPE_CLEAR_COLOR0)
> +return;
> +
>  /* Nothing to reload? TODO: MRT wallpapers */
>  if (batch->key.cbufs[0] == NULL)
>  return;
>  
> +/* No draw calls, and no clear on the depth/stencil bufs.
> + * Drawing the wallpaper would be useless.
> + */
> +if (!batch->last_tiler.gpu &&
> +!(batch->clear & PIPE_CLEAR_DEPTHSTENCIL))
> +return;
> +
>  /* Check if the buffer has any content on it worth preserving */
>  
>  struct pipe_surface *surf = batch->key.cbufs[0];
> @@ -923,8 +936,7 @@ panfrost_batch_submit(struct panfrost_batch *batch)
>  goto out;
>  }
>  
> -if (!batch->clear && batch->last_tiler.gpu)
> -panfrost_batch_draw_wallpaper(batch);
> +panfrost_batch_draw_wallpaper(batch);
>  
>  panfrost_scoreboard_link_batch(batch);
>  
> -- 
> 2.21.0

Re: [Mesa-dev] [PATCH 1/3] panfrost: Make sure a clear does not re-use a pre-existing batch

2019-09-20 Thread Alyssa Rosenzweig
To be clear, if we have a batch and do the following operations:

clear red
draw 1
clear green
draw 2
flush

All we should see is #2 on a green background, which this patch handles
by the second clear invalidating all the clears/draws that came before
it (provided there is no flush in between). 
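
In GL terms, that sequence is roughly the following (illustrative only):

    glClearColor(1, 0, 0, 1);
    glClear(GL_COLOR_BUFFER_BIT);        /* clear red */
    glDrawArrays(GL_TRIANGLES, 0, 3);    /* draw 1 */
    glClearColor(0, 1, 0, 1);
    glClear(GL_COLOR_BUFFER_BIT);        /* clear green */
    glDrawArrays(GL_TRIANGLES, 0, 3);    /* draw 2 */
    glFlush();                           /* only #2 on green should remain */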

I might just be tripped up by the "freeze" name. That really means throw
away / free here, I guess?

Provided that's the idea (and we're not somehow saving the original draw
1), it's Reviewed-by A R 

On Fri, Sep 20, 2019 at 04:53:37PM +0200, Boris Brezillon wrote:
> glClear()s are expected to be the first thing GL apps do before drawing
> new things. If there's already an existing batch targetting the same
> FBO that has draws attached to it, we should make sure the new clear
> gets a new batch assigned to guaranteed that the FB content is actually
> cleared with the requested color/depth/stencil values.
> 
> We create a panfrost_get_fresh_batch_for_fbo() helper for that and
> call it from panfrost_clear().
> 
> Signed-off-by: Boris Brezillon 
> ---
>  src/gallium/drivers/panfrost/pan_context.c |  2 +-
>  src/gallium/drivers/panfrost/pan_job.c | 21 +
>  src/gallium/drivers/panfrost/pan_job.h |  3 +++
>  3 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_context.c 
> b/src/gallium/drivers/panfrost/pan_context.c
> index ac01461a07fe..b2f2a9da7a51 100644
> --- a/src/gallium/drivers/panfrost/pan_context.c
> +++ b/src/gallium/drivers/panfrost/pan_context.c
> @@ -162,7 +162,7 @@ panfrost_clear(
>  double depth, unsigned stencil)
>  {
>  struct panfrost_context *ctx = pan_context(pipe);
> -struct panfrost_batch *batch = panfrost_get_batch_for_fbo(ctx);
> +struct panfrost_batch *batch = panfrost_get_fresh_batch_for_fbo(ctx);
>  
>  panfrost_batch_add_fbo_bos(batch);
>  panfrost_batch_clear(batch, buffers, color, depth, stencil);
> diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> b/src/gallium/drivers/panfrost/pan_job.c
> index d8330bc133a6..4ec2aa0565d7 100644
> --- a/src/gallium/drivers/panfrost/pan_job.c
> +++ b/src/gallium/drivers/panfrost/pan_job.c
> @@ -298,6 +298,27 @@ panfrost_get_batch_for_fbo(struct panfrost_context *ctx)
>  return batch;
>  }
>  
> +struct panfrost_batch *
> +panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx)
> +{
> +struct panfrost_batch *batch;
> +
> +batch = panfrost_get_batch(ctx, >pipe_framebuffer);
> +
> +/* The batch has no draw/clear queued, let's return it directly.
> + * Note that it's perfectly fine to re-use a batch with an
> + * existing clear, we'll just update it with the new clear request.
> + */
> +if (!batch->last_job.gpu)
> +return batch;
> +
> +/* Otherwise, we need to freeze the existing one and instantiate a 
> new
> + * one.
> + */
> +panfrost_freeze_batch(batch);
> +return panfrost_get_batch(ctx, >pipe_framebuffer);
> +}
> +
>  static bool
>  panfrost_batch_fence_is_signaled(struct panfrost_batch_fence *fence)
>  {
> diff --git a/src/gallium/drivers/panfrost/pan_job.h 
> b/src/gallium/drivers/panfrost/pan_job.h
> index e1b1f56a2e64..0bd78bba267a 100644
> --- a/src/gallium/drivers/panfrost/pan_job.h
> +++ b/src/gallium/drivers/panfrost/pan_job.h
> @@ -172,6 +172,9 @@ panfrost_batch_fence_reference(struct 
> panfrost_batch_fence *batch);
>  struct panfrost_batch *
>  panfrost_get_batch_for_fbo(struct panfrost_context *ctx);
>  
> +struct panfrost_batch *
> +panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx);
> +
>  void
>  panfrost_batch_init(struct panfrost_context *ctx);
>  
> -- 
> 2.21.0

Re: [Mesa-dev] [PATCH 1/2] panfrost: Fix indexed draws

2019-09-18 Thread Alyssa Rosenzweig
R-b

On Wed, Sep 18, 2019 at 03:54:37PM +0200, Boris Brezillon wrote:
> ->padded_count should be large enough to cover all vertices pointed by
> the index array. Use the local vertex_count variable that contains the
> updated vertex_count value for the indexed draw case.
> 
> Signed-off-by: Boris Brezillon 
> ---
>  src/gallium/drivers/panfrost/pan_context.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_context.c 
> b/src/gallium/drivers/panfrost/pan_context.c
> index 08b799b66bf8..1b8558c1c2c1 100644
> --- a/src/gallium/drivers/panfrost/pan_context.c
> +++ b/src/gallium/drivers/panfrost/pan_context.c
> @@ -1601,7 +1601,7 @@ panfrost_draw_vbo(
>  
>  ctx->padded_count = pan_expand_shift_odd(so);
>  } else {
> -ctx->padded_count = ctx->vertex_count;
> +ctx->padded_count = vertex_count;
>  
>  /* Reset instancing state */
>  ctx->payloads[PIPE_SHADER_VERTEX].instance_shift = 0;
> -- 
> 2.21.0

Re: [Mesa-dev] [PATCH v2 00/37] panfrost: Support batch pipelining

2019-09-17 Thread Alyssa Rosenzweig
"Can't use pipe_framebuffer_state as a hash key" Is this still relevant?
I thought we did this.

You're right that the 32-batch bitset is something we can integrate in a
v2.

I guess that's it, never mind. Good stuff regardless :)

On Tue, Sep 17, 2019 at 12:08:34AM +0200, Boris Brezillon wrote:
> On Mon, 16 Sep 2019 16:29:05 -0400
> Alyssa Rosenzweig  wrote:
> 
> > > As a drive-by comment, in case you didn't know, the "standard"
> > > solution for avoiding flushing when BO's are written by the CPU (e.g.
> > > uniform buffer updates) as documented in ARM's performance guide is to
> > > add a copy-on-write mechanism, so that you have "shadow" BO's when the
> > > original BO is modified by the user. I believe this is implemented in
> > > freedreno, at least there was a talk about it at XDC a few years ago.  
> > 
> > Yes, this is implemented in freedreno. BO shadowing will be the next
> > step once this pipelining code settles down. For now, this series is
> > about eliminating the strict flushes between each and every frame of
> > each and every FBO that we currently occur now.
> > 
> > Boris, references on the freedreno model (which is the mesa gold
> > standard):
> > 
> > https://www.x.org/wiki/Events/XDC2016/Program/clark_ooo_rendering.pdf
> > https://bloggingthemonkey.blogspot.com/2016/07/dirty-tricks-for-moar-fps.html
> > 
> > The former presentation is definitely worth a read; evidently we've
> > already painted ourselves into some corners :p
> 
> I had a quick look at the presentation, and it looks pretty similar to
> what is being implemented in this series. The only difference I could
> spot is the limitation to 32 batches to avoid usage of sets in the BO
> access tracking logic, and that's still something I can change (I'd
> prefer to do that in a follow-up patch though).
> 
> What specific aspects do you think we got wrong?

Re: [Mesa-dev] [PATCH v2 37/37] panfrost/ci: New tests are passing

2019-09-17 Thread Alyssa Rosenzweig
R-b, nice :)

On Mon, Sep 16, 2019 at 11:37:15AM +0200, Boris Brezillon wrote:
> All dEQP-GLES2.functional.fbo.render.texsubimage.* tests are now
> passing.
> 
> Signed-off-by: Boris Brezillon 
> ---
>  src/gallium/drivers/panfrost/ci/expected-failures.txt | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/ci/expected-failures.txt 
> b/src/gallium/drivers/panfrost/ci/expected-failures.txt
> index b0fc872a3009..3c707230dd23 100644
> --- a/src/gallium/drivers/panfrost/ci/expected-failures.txt
> +++ b/src/gallium/drivers/panfrost/ci/expected-failures.txt
> @@ -53,10 +53,6 @@ 
> dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgb_depth_component16
>  
> dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb565_depth_component16
>  Fail
>  
> dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgba_depth_component16
>  Fail
>  
> dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgb_depth_component16
>  Fail
> -dEQP-GLES2.functional.fbo.render.texsubimage.after_render_tex2d_rgba Fail
> -dEQP-GLES2.functional.fbo.render.texsubimage.after_render_tex2d_rgb Fail
> -dEQP-GLES2.functional.fbo.render.texsubimage.between_render_tex2d_rgba Fail
> -dEQP-GLES2.functional.fbo.render.texsubimage.between_render_tex2d_rgb Fail
>  dEQP-GLES2.functional.fragment_ops.depth_stencil.random.0 Fail
>  dEQP-GLES2.functional.fragment_ops.depth_stencil.random.10 Fail
>  dEQP-GLES2.functional.fragment_ops.depth_stencil.random.11 Fail
> -- 
> 2.21.0

Re: [Mesa-dev] [PATCH v2 36/37] panfrost: Take draw call order into account

2019-09-17 Thread Alyssa Rosenzweig
Hmm, could you explain a bit why this is necessary?

I would think if there's no dependency, there's no dependency, and if
this fixes bugs, that's a dependency tracking issue that we're just
papering over.

(Also, I guess the r-b on the previous patch is retracted temporarily,
since it was a setup for this.)

On Mon, Sep 16, 2019 at 11:37:14AM +0200, Boris Brezillon wrote:
> This is not strictly required, but let's try to match the draw call
> orders, just in case the app had a reason to do it in this order.
> 
> Signed-off-by: Boris Brezillon 
> ---
>  src/gallium/drivers/panfrost/pan_context.h |  6 ++
>  src/gallium/drivers/panfrost/pan_job.c | 23 +++---
>  src/gallium/drivers/panfrost/pan_job.h |  3 +++
>  3 files changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_context.h 
> b/src/gallium/drivers/panfrost/pan_context.h
> index f13967f51b46..c6b53685b285 100644
> --- a/src/gallium/drivers/panfrost/pan_context.h
> +++ b/src/gallium/drivers/panfrost/pan_context.h
> @@ -114,6 +114,12 @@ struct panfrost_context {
>  struct panfrost_batch *batch;
>  struct hash_table *fbo_to_batch;
>  
> +/* A list containing all non-submitted batches since the last flush.
> + * This list is used to keep track of clear/draw order on batches 
> that
> + * don't have explicit dependencies between them.
> + */
> +struct list_head batch_queue;
> +
>  /* panfrost_bo -> panfrost_bo_access */
>  struct hash_table *accessed_bos;
>  
> diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> b/src/gallium/drivers/panfrost/pan_job.c
> index 13d7e8086e62..e030f8e98dad 100644
> --- a/src/gallium/drivers/panfrost/pan_job.c
> +++ b/src/gallium/drivers/panfrost/pan_job.c
> @@ -118,6 +118,7 @@ panfrost_create_batch(struct panfrost_context *ctx,
>  util_dynarray_init(>headers, batch);
>  util_dynarray_init(>gpu_headers, batch);
>  util_dynarray_init(>dependencies, batch);
> +list_inithead(>queue_node);
>  batch->out_sync = panfrost_create_batch_fence(batch);
>  util_copy_framebuffer_state(>key, key);
>  
> @@ -181,6 +182,9 @@ panfrost_free_batch(struct panfrost_batch *batch)
>struct panfrost_batch_fence *, dep)
>  panfrost_batch_fence_unreference(*dep);
>  
> +/* Remove the batch from the batch queue. */
> +list_del(>queue_node);
> +
>  /* The out_sync fence lifetime is different from the the batch one
>   * since other batches might want to wait on an fence of already
>   * submitted/signaled batch. All we need to do here is make sure the
> @@ -543,6 +547,13 @@ void panfrost_batch_add_fbo_bos(struct panfrost_batch 
> *batch)
>  struct panfrost_resource *rsrc = 
> pan_resource(batch->key.zsbuf->texture);
>  panfrost_batch_add_bo(batch, rsrc->bo, flags);
>  }
> +
> +/* We the batch was not already present in the queue, add it know.
> + * Should we move the batch at end of the queue when a new draw
> + * happens?
> + */
> +if (list_empty(>queue_node))
> +list_addtail(>queue_node, >ctx->batch_queue);
>  }
>  
>  struct panfrost_bo *
> @@ -878,10 +889,15 @@ panfrost_flush_all_batches(struct panfrost_context 
> *ctx, bool wait)
>  util_dynarray_init(, NULL);
>  }
>  
> -hash_table_foreach(ctx->fbo_to_batch, hentry) {
> -struct panfrost_batch *batch = hentry->data;
> +/* We can use the for_each_entry_safe() iterator here because the
> + * next element might be removed from the list when flushing the
> + * dependencies in panfrost_batch_submit().
> + */
> +while (!list_empty(>batch_queue)) {
> +struct panfrost_batch *batch;
>  
> -assert(batch);
> +batch = list_first_entry(>batch_queue,
> + struct panfrost_batch, queue_node);
>  
>  if (wait) {
>  panfrost_batch_fence_reference(batch->out_sync);
> @@ -1150,4 +1166,5 @@ panfrost_batch_init(struct panfrost_context *ctx)
>  panfrost_batch_compare);
>  ctx->accessed_bos = _mesa_hash_table_create(ctx, _mesa_hash_pointer,
>  _mesa_key_pointer_equal);
> +list_inithead(>batch_queue);
>  }
> diff --git a/src/gallium/drivers/panfrost/pan_job.h 
> b/src/gallium/drivers/panfrost/pan_job.h
> index d198864ce4f7..34926e30cdde 100644
> --- a/src/gallium/drivers/panfrost/pan_job.h
> +++ b/src/gallium/drivers/panfrost/pan_job.h
> @@ -71,6 +71,9 @@ struct panfrost_batch {
>  struct panfrost_context *ctx;
>  struct pipe_framebuffer_state key;
>  
> +/* Used to insert the batch in the batch queue */
> +

Re: [Mesa-dev] [PATCH v2 35/37] panfrost: Rename ctx->batches into ctx->fbo_to_batch

2019-09-17 Thread Alyssa Rosenzweig
R-b

On Mon, Sep 16, 2019 at 11:37:13AM +0200, Boris Brezillon wrote:
> We are about to add a batch queue to keep track of submission order.
> Let's rename the existing batches hash table (which is used to get the
> batch attached to an FBO) into fbo_to_batch to avoid confusion.
> 
> Signed-off-by: Boris Brezillon 
> ---
>  src/gallium/drivers/panfrost/pan_context.c |  2 +-
>  src/gallium/drivers/panfrost/pan_context.h |  2 +-
>  src/gallium/drivers/panfrost/pan_job.c | 21 +++--
>  3 files changed, 13 insertions(+), 12 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_context.c 
> b/src/gallium/drivers/panfrost/pan_context.c
> index 90c7512b105f..b8f653bf8e72 100644
> --- a/src/gallium/drivers/panfrost/pan_context.c
> +++ b/src/gallium/drivers/panfrost/pan_context.c
> @@ -1361,7 +1361,7 @@ panfrost_flush(
>   */
>  if (fence) {
>  util_dynarray_init(, NULL);
> -hash_table_foreach(ctx->batches, hentry) {
> +hash_table_foreach(ctx->fbo_to_batch, hentry) {
>  struct panfrost_batch *batch = hentry->data;
>  
>  panfrost_batch_fence_reference(batch->out_sync);
> diff --git a/src/gallium/drivers/panfrost/pan_context.h 
> b/src/gallium/drivers/panfrost/pan_context.h
> index d50ed57d5d8a..f13967f51b46 100644
> --- a/src/gallium/drivers/panfrost/pan_context.h
> +++ b/src/gallium/drivers/panfrost/pan_context.h
> @@ -112,7 +112,7 @@ struct panfrost_context {
>  
>  /* Bound job batch and map of panfrost_batch_key to job batches */
>  struct panfrost_batch *batch;
> -struct hash_table *batches;
> +struct hash_table *fbo_to_batch;
>  
>  /* panfrost_bo -> panfrost_bo_access */
>  struct hash_table *accessed_bos;
> diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> b/src/gallium/drivers/panfrost/pan_job.c
> index df92b791a1f2..13d7e8086e62 100644
> --- a/src/gallium/drivers/panfrost/pan_job.c
> +++ b/src/gallium/drivers/panfrost/pan_job.c
> @@ -134,9 +134,9 @@ panfrost_freeze_batch(struct panfrost_batch *batch)
>   * matches. This way, next draws/clears targeting this FBO will 
> trigger
>   * the creation of a new batch.
>   */
> -entry = _mesa_hash_table_search(ctx->batches, >key);
> +entry = _mesa_hash_table_search(ctx->fbo_to_batch, >key);
>   if (entry && entry->data == batch)
> -_mesa_hash_table_remove(ctx->batches, entry);
> +_mesa_hash_table_remove(ctx->fbo_to_batch, entry);
>  
>  /* If this is the bound batch, the panfrost_context parameters are
>   * relevant so submitting it invalidates those parameters, but if 
> it's
> @@ -155,7 +155,7 @@ static bool panfrost_batch_is_frozen(struct 
> panfrost_batch *batch)
>  struct panfrost_context *ctx = batch->ctx;
>  struct hash_entry *entry;
>  
> -entry = _mesa_hash_table_search(ctx->batches, >key);
> +entry = _mesa_hash_table_search(ctx->fbo_to_batch, >key);
>  if (entry && entry->data == batch)
>  return false;
>  
> @@ -245,7 +245,8 @@ panfrost_get_batch(struct panfrost_context *ctx,
> const struct pipe_framebuffer_state *key)
>  {
>  /* Lookup the job first */
> -struct hash_entry *entry = _mesa_hash_table_search(ctx->batches, 
> key);
> +struct hash_entry *entry = _mesa_hash_table_search(ctx->fbo_to_batch,
> +   key);
>  
>  if (entry)
>  return entry->data;
> @@ -255,7 +256,7 @@ panfrost_get_batch(struct panfrost_context *ctx,
>  struct panfrost_batch *batch = panfrost_create_batch(ctx, key);
>  
>  /* Save the created job */
> -_mesa_hash_table_insert(ctx->batches, >key, batch);
> +_mesa_hash_table_insert(ctx->fbo_to_batch, >key, batch);
>  
>  return batch;
>  }
> @@ -877,7 +878,7 @@ panfrost_flush_all_batches(struct panfrost_context *ctx, 
> bool wait)
>  util_dynarray_init(, NULL);
>  }
>  
> -hash_table_foreach(ctx->batches, hentry) {
> +hash_table_foreach(ctx->fbo_to_batch, hentry) {
>  struct panfrost_batch *batch = hentry->data;
>  
>  assert(batch);
> @@ -892,7 +893,7 @@ panfrost_flush_all_batches(struct panfrost_context *ctx, 
> bool wait)
>  panfrost_batch_submit(batch);
>  }
>  
> -assert(!ctx->batches->entries);
> +assert(!ctx->fbo_to_batch->entries);
>  
>  /* Collect batch fences before returning */
>  panfrost_gc_fences(ctx);
> @@ -1144,9 +1145,9 @@ panfrost_batch_is_scanout(struct panfrost_batch *batch)
>  void
>  panfrost_batch_init(struct panfrost_context *ctx)
>  {
> -ctx->batches = _mesa_hash_table_create(ctx,
> -   panfrost_batch_hash,
> -   

Re: [Mesa-dev] [PATCH v2 34/37] panfrost: Do fine-grained flushing when preparing BO for CPU accesses

2019-09-17 Thread Alyssa Rosenzweig
R-b

On Mon, Sep 16, 2019 at 11:37:12AM +0200, Boris Brezillon wrote:
> We don't have to flush all batches when we're only interested in
> reading/writing a specific BO. Thanks to the
> panfrost_flush_batches_accessing_bo() and panfrost_bo_wait() helpers
> we can now flush only the batches touching the BO we want to access
> from the CPU.
> 
> Signed-off-by: Boris Brezillon 
> ---
>  src/gallium/drivers/panfrost/pan_resource.c | 27 +
>  1 file changed, 12 insertions(+), 15 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_resource.c 
> b/src/gallium/drivers/panfrost/pan_resource.c
> index 1f7605adcd5d..d59529ff15b7 100644
> --- a/src/gallium/drivers/panfrost/pan_resource.c
> +++ b/src/gallium/drivers/panfrost/pan_resource.c
> @@ -578,10 +578,8 @@ panfrost_transfer_map(struct pipe_context *pctx,
>  is_bound |= fb->cbufs[c]->texture == resource;
>  }
>  
> -if (is_bound && (usage & PIPE_TRANSFER_READ)) {
> -assert(level == 0);
> -panfrost_flush_all_batches(ctx, true);
> -}
> +if (is_bound && (usage & PIPE_TRANSFER_READ))
> + assert(level == 0);
>  
>  /* TODO: Respect usage flags */
>  
> @@ -594,11 +592,11 @@ panfrost_transfer_map(struct pipe_context *pctx,
>  /* No flush for writes to uninitialized */
>  } else if (!(usage & PIPE_TRANSFER_UNSYNCHRONIZED)) {
>  if (usage & PIPE_TRANSFER_WRITE) {
> -/* STUB: flush reading */
> -//printf("debug: missed reading flush %d\n", 
> resource->target);
> +panfrost_flush_batches_accessing_bo(ctx, bo, 
> PAN_BO_GPU_ACCESS_RW);
> +panfrost_bo_wait(bo, INT64_MAX, 
> PAN_BO_GPU_ACCESS_RW);
>  } else if (usage & PIPE_TRANSFER_READ) {
> -/* STUB: flush writing */
> -//printf("debug: missed writing flush %d (%d-%d)\n", 
> resource->target, box->x, box->x + box->width);
> +panfrost_flush_batches_accessing_bo(ctx, bo, 
> PAN_BO_GPU_ACCESS_WRITE);
> +panfrost_bo_wait(bo, INT64_MAX, 
> PAN_BO_GPU_ACCESS_WRITE);
>  } else {
>  /* Why are you even mapping?! */
>  }
> @@ -748,11 +746,8 @@ panfrost_generate_mipmap(
>   * reorder-type optimizations in place. But for now prioritize
>   * correctness. */
>  
> -struct panfrost_batch *batch = panfrost_get_batch_for_fbo(ctx);
> -bool has_draws = batch->last_job.gpu;
> -
> -if (has_draws)
> -panfrost_flush_all_batches(ctx, true);
> +panfrost_flush_batches_accessing_bo(ctx, rsrc->bo, 
> PAN_BO_GPU_ACCESS_RW);
> +panfrost_bo_wait(rsrc->bo, INT64_MAX, PAN_BO_GPU_ACCESS_RW);
>  
>  /* We've flushed the original buffer if needed, now trigger a blit */
>  
> @@ -765,8 +760,10 @@ panfrost_generate_mipmap(
>  /* If the blit was successful, flush once more. If it wasn't, well, 
> let
>   * the state tracker deal with it. */
>  
> -if (blit_res)
> -panfrost_flush_all_batches(ctx, true);
> +if (blit_res) {
> +panfrost_flush_batches_accessing_bo(ctx, rsrc->bo, 
> PAN_BO_GPU_ACCESS_WRITE);
> +panfrost_bo_wait(rsrc->bo, INT64_MAX, 
> PAN_BO_GPU_ACCESS_WRITE);
> +}
>  
>  return blit_res;
>  }
> -- 
> 2.21.0

Re: [Mesa-dev] [PATCH v2 33/37] panfrost: Get rid of the flush in panfrost_set_framebuffer_state()

2019-09-17 Thread Alyssa Rosenzweig
R-b :D

On Mon, Sep 16, 2019 at 11:37:11AM +0200, Boris Brezillon wrote:
> Now that we have track inter-batch dependencies, the flush done in
> panfrost_set_framebuffer_state() is no longer needed. Let's get rid of
> it.
> 
> Signed-off-by: Boris Brezillon 
> ---
>  src/gallium/drivers/panfrost/pan_context.c | 46 ++
>  1 file changed, 3 insertions(+), 43 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_context.c 
> b/src/gallium/drivers/panfrost/pan_context.c
> index 56a76a230141..90c7512b105f 100644
> --- a/src/gallium/drivers/panfrost/pan_context.c
> +++ b/src/gallium/drivers/panfrost/pan_context.c
> @@ -2306,50 +2306,10 @@ panfrost_set_framebuffer_state(struct pipe_context 
> *pctx,
>  {
>  struct panfrost_context *ctx = pan_context(pctx);
>  
> -/* Flush when switching framebuffers, but not if the framebuffer
> - * state is being restored by u_blitter
> - */
> -
> -struct panfrost_batch *batch = panfrost_get_batch_for_fbo(ctx);
> -bool is_scanout = panfrost_batch_is_scanout(batch);
> -bool has_draws = batch->last_job.gpu;
> -
> -/* Bail out early when the current and new states are the same. */
> -if (util_framebuffer_state_equal(>pipe_framebuffer, fb))
> -return;
> -
> -/* The wallpaper logic sets a new FB state before doing the blit and
> - * restore the old one when it's done. Those FB states are reported 
> to
> - * be different because the surface they are pointing to are 
> different,
> - * but those surfaces actually point to the same cbufs/zbufs. In that
> - * case we definitely don't want new FB descs to be emitted/attached
> - * since the job is expected to be flushed just after the blit is 
> done,
> - * so let's just copy the new state and return here.
> - */
> -if (ctx->wallpaper_batch) {
> -util_copy_framebuffer_state(>pipe_framebuffer, fb);
> -return;
> -}
> -
> -if (!is_scanout || has_draws)
> -panfrost_flush_all_batches(ctx, true);
> -else
> -
> assert(!ctx->payloads[PIPE_SHADER_VERTEX].postfix.framebuffer &&
> -   
> !ctx->payloads[PIPE_SHADER_FRAGMENT].postfix.framebuffer);
> -
> -/* Invalidate the FBO job cache since we've just been assigned a new
> - * FB state.
> - */
> -ctx->batch = NULL;
> -
> +panfrost_hint_afbc(pan_screen(pctx->screen), fb);
>  util_copy_framebuffer_state(>pipe_framebuffer, fb);
> -
> -/* Given that we're rendering, we'd love to have compression */
> -struct panfrost_screen *screen = pan_screen(ctx->base.screen);
> -
> -panfrost_hint_afbc(screen, >pipe_framebuffer);
> -for (unsigned i = 0; i < PIPE_SHADER_TYPES; ++i)
> -ctx->payloads[i].postfix.framebuffer = 0;
> +ctx->batch = NULL;
> +panfrost_invalidate_frame(ctx);
>  }
>  
>  static void *
> -- 
> 2.21.0

Re: [Mesa-dev] [PATCH v2 32/37] panfrost: Kill the explicit serialization in panfrost_batch_submit()

2019-09-17 Thread Alyssa Rosenzweig
R-b with enthusiasm :)

On Mon, Sep 16, 2019 at 11:37:10AM +0200, Boris Brezillon wrote:
> Now that we have all the pieces in place to support pipelining batches
> we can get rid of the drmSyncobjWait() at the end of
> panfrost_batch_submit().
> 
> Signed-off-by: Boris Brezillon 
> ---
>  src/gallium/drivers/panfrost/pan_job.c | 9 -
>  1 file changed, 9 deletions(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> b/src/gallium/drivers/panfrost/pan_job.c
> index de2922a8366e..df92b791a1f2 100644
> --- a/src/gallium/drivers/panfrost/pan_job.c
> +++ b/src/gallium/drivers/panfrost/pan_job.c
> @@ -846,7 +846,6 @@ panfrost_batch_submit(struct panfrost_batch *batch)
>  panfrost_batch_submit((*dep)->batch);
>  }
>  
> -struct panfrost_context *ctx = batch->ctx;
>  int ret;
>  
>  /* Nothing to do! */
> @@ -865,15 +864,7 @@ panfrost_batch_submit(struct panfrost_batch *batch)
>  
>  out:
>  panfrost_freeze_batch(batch);
> -
> -/* We always stall the pipeline for correct results since pipelined
> - * rendering is quite broken right now (to be fixed by the 
> panfrost_job
> - * refactor, just take the perf hit for correctness)
> - */
> -drmSyncobjWait(pan_screen(ctx->base.screen)->fd,
> -   >out_sync->syncobj, 1, INT64_MAX, 0, NULL);
>  panfrost_free_batch(batch);
> -
>  }
>  
>  void
> -- 
> 2.21.0

Re: [Mesa-dev] [PATCH v2 28/37] panfrost: Start tracking inter-batch dependencies

2019-09-17 Thread Alyssa Rosenzweig
> > I'm wondering if batch->dependencies should be expressed as a set,
> > rather than a dynarray, such that testing whether a batch has a
> > given dependency is ideally O(1), not O(N).
> > 
> > In practice I don't know if the asymptotic complexity matters, esp. for
> > small numbers of batches, and in practice hash table iteration is slow
> > enough in mesa* that maybe it would be counterproductive.
> > 
> > Just something I thought I might throw out there.
> > 
> > * Set iteration ought to be no slower than array iteration, but constant
> >   factors are a thing.
> 
> I thought the number of deps would be small enough to not justify the
> use of a set, but maybe I'm wrong.

Mm, after all this lands we can profile and revisit; there are bigger
fish to fry.
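
For reference, the set-based variant would be along these lines (a
sketch only; the dep_set field name is invented and the series keeps
the dynarray):

    /* at batch creation */
    batch->dep_set = _mesa_set_create(batch, _mesa_hash_pointer,
                                      _mesa_key_pointer_equal);

    /* when adding a dependency: O(1)-ish membership test */
    if (!_mesa_set_search(batch->dep_set, dep)) {
            panfrost_batch_fence_reference(dep);
            _mesa_set_add(batch->dep_set, dep);
    }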

> > > +static void
> > > +panfrost_bo_access_gc_fences(struct panfrost_context *ctx,
> > > + struct panfrost_bo_access *access,
> > > +  const struct panfrost_bo *bo)  
> > 
> > Could you remind me what gc abbreviates? Sorry.
> 
> Garbage collect.
> 
> > 
> > I'm a little unclear on what the function's purpose is based on the
> > name.
> 
> Collect signaled fences to keep the kernel-side syncobj-map small. The
> idea is to collect those signaled fences at the end of each flush_all
> call. This function is likely to collect only fences from previous
> batch flushes, not the ones that have just been submitted and
> are probably still in flight when we trigger the garbage collection.
> Anyway, we need to do this garbage collection at some point if we don't
> want the BO access map to keep invalid entries around and retain
> syncobjs forever.

This definitely needs to be documented somewhere, then. Maybe paste the
"Collect...forever" paragraph into a comment above access_gc_fences?

> > Question: is it safe to remove entries while iterating the table?
> > (Not a review comment, I don't know the details of mesa's
> > implementation)
> 
> According to the doc, it is.

Good to know, thank you.

> > > -if (ctx->last_out_sync) {
> > > +if (reqs & PANFROST_JD_REQ_FS && batch->first_job.gpu)  
> > 
> > Could we add a `bool is_fragment` with this condition and then make this
> > as well as the if below simple if (is_fragment)?
> 
> Hm, the condition is actually is_fragment_and_has_draws, so I'm not
> sure is_fragment is a good name. Maybe depends_on_vertex_tiler.

is_fragment_shader?
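
i.e. roughly (just a sketch of the naming, nothing more):

    bool is_fragment_shader = (reqs & PANFROST_JD_REQ_FS) &&
                              batch->first_job.gpu;

    if (is_fragment_shader) {
            /* ... make the fragment job wait on the batch's
             * vertex/tiler chain ... */
    }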

> > > +/* Submit the dependencies first. */
> > > +util_dynarray_foreach(>dependencies,
> > > +  struct panfrost_batch_fence *, dep) {
> > > +if ((*dep)->batch)
> > > +panfrost_batch_submit((*dep)->batch);
> > > +}  
> > 
> > I assume this logic will be refined later in the series...?
> 
> It's not actually. Why do you think it should?

Oh, I was just assuming the dependency graph stuff would be more
explicit, I guess.

Re: [Mesa-dev] [PATCH v2 31/37] panfrost: Add a panfrost_flush_batches_accessing_bo() helper

2019-09-17 Thread Alyssa Rosenzweig
R-b

On Mon, Sep 16, 2019 at 11:37:09AM +0200, Boris Brezillon wrote:
> This will allow us to only flush batches touching a specific resource,
> which is particularly useful when the CPU needs to access a BO.
> 
> Signed-off-by: Boris Brezillon 
> ---
>  src/gallium/drivers/panfrost/pan_job.c | 31 ++
>  src/gallium/drivers/panfrost/pan_job.h |  4 
>  2 files changed, 35 insertions(+)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> b/src/gallium/drivers/panfrost/pan_job.c
> index e36f252e01fc..de2922a8366e 100644
> --- a/src/gallium/drivers/panfrost/pan_job.c
> +++ b/src/gallium/drivers/panfrost/pan_job.c
> @@ -921,6 +921,37 @@ panfrost_flush_all_batches(struct panfrost_context *ctx, 
> bool wait)
>  util_dynarray_fini();
>  }
>  
> +void
> +panfrost_flush_batches_accessing_bo(struct panfrost_context *ctx,
> +struct panfrost_bo *bo,
> +uint32_t access_type)
> +{
> +struct panfrost_bo_access *access;
> +struct hash_entry *hentry;
> +
> +/* It doesn't make any to flush only the readers. */
> +assert(access_type == PAN_BO_GPU_ACCESS_WRITE ||
> +   access_type == PAN_BO_GPU_ACCESS_RW);
> +
> +hentry = _mesa_hash_table_search(ctx->accessed_bos, bo);
> +access = hentry ? hentry->data : NULL;
> +if (!access)
> +return;
> +
> +if (access_type & PAN_BO_GPU_ACCESS_WRITE && access->writer &&
> +access->writer->batch)
> +panfrost_batch_submit(access->writer->batch);
> +
> +if (!(access_type & PAN_BO_GPU_ACCESS_READ))
> +return;
> +
> +util_dynarray_foreach(>readers, struct panfrost_batch_fence 
> *,
> +  reader) {
> +if (*reader && (*reader)->batch)
> +panfrost_batch_submit((*reader)->batch);
> +}
> +}
> +
>  void
>  panfrost_batch_set_requirements(struct panfrost_batch *batch)
>  {
> diff --git a/src/gallium/drivers/panfrost/pan_job.h 
> b/src/gallium/drivers/panfrost/pan_job.h
> index 5c9d5e3715d5..d198864ce4f7 100644
> --- a/src/gallium/drivers/panfrost/pan_job.h
> +++ b/src/gallium/drivers/panfrost/pan_job.h
> @@ -185,6 +185,10 @@ panfrost_batch_create_bo(struct panfrost_batch *batch, 
> size_t size,
>  void
>  panfrost_flush_all_batches(struct panfrost_context *ctx, bool wait);
>  
> +void
> +panfrost_flush_batches_accessing_bo(struct panfrost_context *ctx,
> +struct panfrost_bo *bo, uint32_t flags);
> +
>  void
>  panfrost_batch_set_requirements(struct panfrost_batch *batch);
>  
> -- 
> 2.21.0

Re: [Mesa-dev] [PATCH v2 25/37] panfrost: Add a batch fence

2019-09-17 Thread Alyssa Rosenzweig
> > > +/* Start in a signaled state so that even non-submitted batches
> > > + * (those that have no draw/clear) can be waited upon.
> > > + */  
> > 
> > When would this happen? Presumably if a batch does nothing whatsoever,
> > it doesn't make sense to wait on it.
> 
> Was just simpler to have all batch fences contain a syncobj on
> which we can call WAIT_SYNCOBJ even if no real fence is attached to it.
> 
> The other option being to set fence->signaled to true when a batch with
> no draw/clear is submitted, and then skip those entries when we build
> the array passed to the waitsyncobj() func.

I'm wondering if the latter option would be cleaner, rather than having
extra DRM objects floating around?
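
For what it's worth, the "skip signaled entries" option would look
roughly like this when building the wait array (sketch, assuming the
fence->signaled flag described above):

    unsigned nr = util_dynarray_num_elements(&fences,
                                             struct panfrost_batch_fence *);
    uint32_t *handles = calloc(nr, sizeof(*handles));
    unsigned count = 0;

    util_dynarray_foreach(&fences, struct panfrost_batch_fence *, fence) {
            if ((*fence)->signaled)
                    continue;
            handles[count++] = (*fence)->syncobj;
    }

    if (count)
            drmSyncobjWait(screen->fd, handles, count, INT64_MAX, 0, NULL);
    free(handles);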

> > >  #include "pan_resource.h"
> > >  
> > > +/* Batch that created this fence object. Will become NULL at 
> > > batch
> > > + * submission time. This field is mainly here to know whether the
> > > + * batch has been flushed or not.
> > > + */
> > > +struct panfrost_batch *batch;  
> > 
> > Could this be replaced by just a `bool flushed`, or do we actually use
> > the value in a later patch?
> 
> I actually use the value to flush deps when they're not already flushed.

Fair enough.

Re: [Mesa-dev] [PATCH v2 23/37] panfrost: Make panfrost_batch->bos a hash table

2019-09-17 Thread Alyssa Rosenzweig
Ah, perhaps, yes. My bad.

On Tue, Sep 17, 2019 at 12:18:17AM +0200, Boris Brezillon wrote:
> On Mon, 16 Sep 2019 15:26:12 -0400
> Alyssa Rosenzweig  wrote:
> 
> > Well, the hash tables strongly assume you're not using NULLs for things.
> > 
> > See _mesa_hash_table_set_deleted_key for how to change that behaviour.
> 
> Maybe I'm missing something, but AFAICT it's the key field that requires
> special care, not the data one.
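
For the record, a minimal sketch of the distinction (only the key needs
the reserved markers; NULL data is fine):

    struct hash_table *ht =
            _mesa_hash_table_create(NULL, _mesa_hash_pointer,
                                    _mesa_key_pointer_equal);

    /* NULL data is allowed... */
    _mesa_hash_table_insert(ht, bo, NULL);

    /* ...but keys are special: a NULL key marks a free entry, and
     * _mesa_hash_table_set_deleted_key() picks the deleted-entry
     * marker. */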
