Re: [Mesa-dev] [PATCH] i965: Use the predicate enable bit for conditional rendering without stalling
On Tue, Nov 11, 2014 at 11:13:28AM -0800, Kenneth Graunke wrote: On Tuesday, November 11, 2014 06:59:51 PM Neil Roberts wrote: Kenneth Graunke kenn...@whitecape.org writes: drm-intel-next must have the new software checker turned on, which disallows non-whitelisted register writes (along with libva, so it can't really be enabled upstream yet). For what it's worth, I get the EINVAL error even on the stock Fedora 20 kernel on Haswell (and presumably IvyBridge) so I can only assume the software checker is already upstream, unless I'm misunderstanding something. $ uname -r 3.16.7-200.fc20.x86_64 $ modinfo i915 | grep cmd_parser parm: enable_cmd_parser:Enable command parsing [...] (1=enabled [default], 0=disabled) (int) $ sudo cat /sys/module/i915/parameters/enable_cmd_parser 1 If I cat 0 to /sys/module/i915/parameters/enable_cmd_parser then I no longer get the EINVAL error. - Neil Huh. Yeah, I thought they turned it on by default in 3.16, which I don't understand at all. AFAIK the libva issue isn't fixed (or wasn't by then), so it sure seems like it would've broken userspace. Which would be a pretty clear kernel policy violation... We let libva pass. And in the latest patches from Brad if we detect libva tricks we'll still let it pass, just not with elevated privs needed for writing special registers. And the point of enabling the parser in 3.16 already was to have as much coverage early as possible to catch any userspace issues we've missed. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Use the predicate enable bit for conditional rendering without stalling
On Wednesday, November 12, 2014 11:28:15 AM Daniel Vetter wrote: On Tue, Nov 11, 2014 at 11:13:28AM -0800, Kenneth Graunke wrote: On Tuesday, November 11, 2014 06:59:51 PM Neil Roberts wrote: Kenneth Graunke kenn...@whitecape.org writes: drm-intel-next must have the new software checker turned on, which disallows non-whitelisted register writes (along with libva, so it can't really be enabled upstream yet). For what it's worth, I get the EINVAL error even on the stock Fedora 20 kernel on Haswell (and presumably IvyBridge) so I can only assume the software checker is already upstream, unless I'm misunderstanding something. $ uname -r 3.16.7-200.fc20.x86_64 $ modinfo i915 | grep cmd_parser parm: enable_cmd_parser:Enable command parsing [...] (1=enabled [default], 0=disabled) (int) $ sudo cat /sys/module/i915/parameters/enable_cmd_parser 1 If I cat 0 to /sys/module/i915/parameters/enable_cmd_parser then I no longer get the EINVAL error. - Neil Huh. Yeah, I thought they turned it on by default in 3.16, which I don't understand at all. AFAIK the libva issue isn't fixed (or wasn't by then), so it sure seems like it would've broken userspace. Which would be a pretty clear kernel policy violation... We let libva pass. And in the latest patches from Brad if we detect libva tricks we'll still let it pass, just not with elevated privs needed for writing special registers. And the point of enabling the parser in 3.16 already was to have as much coverage early as possible to catch any userspace issues we've missed. -Daniel Cool. I'd seen the discussiosn on making it work, and thought it hadn't actually happened...but I had my facts wrong. Sorry for the trouble. I actually haven't heard anyone complaining about breakage, so apparently it works pretty well. Nice! --Ken signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Use the predicate enable bit for conditional rendering without stalling
Kenneth Graunke kenn...@whitecape.org writes: drm-intel-next must have the new software checker turned on, which disallows non-whitelisted register writes (along with libva, so it can't really be enabled upstream yet). For what it's worth, I get the EINVAL error even on the stock Fedora 20 kernel on Haswell (and presumably IvyBridge) so I can only assume the software checker is already upstream, unless I'm misunderstanding something. $ uname -r 3.16.7-200.fc20.x86_64 $ modinfo i915 | grep cmd_parser parm: enable_cmd_parser:Enable command parsing [...] (1=enabled [default], 0=disabled) (int) $ sudo cat /sys/module/i915/parameters/enable_cmd_parser 1 If I cat 0 to /sys/module/i915/parameters/enable_cmd_parser then I no longer get the EINVAL error. - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Use the predicate enable bit for conditional rendering without stalling
On Tuesday, November 11, 2014 06:59:51 PM Neil Roberts wrote: Kenneth Graunke kenn...@whitecape.org writes: drm-intel-next must have the new software checker turned on, which disallows non-whitelisted register writes (along with libva, so it can't really be enabled upstream yet). For what it's worth, I get the EINVAL error even on the stock Fedora 20 kernel on Haswell (and presumably IvyBridge) so I can only assume the software checker is already upstream, unless I'm misunderstanding something. $ uname -r 3.16.7-200.fc20.x86_64 $ modinfo i915 | grep cmd_parser parm: enable_cmd_parser:Enable command parsing [...] (1=enabled [default], 0=disabled) (int) $ sudo cat /sys/module/i915/parameters/enable_cmd_parser 1 If I cat 0 to /sys/module/i915/parameters/enable_cmd_parser then I no longer get the EINVAL error. - Neil Huh. Yeah, I thought they turned it on by default in 3.16, which I don't understand at all. AFAIK the libva issue isn't fixed (or wasn't by then), so it sure seems like it would've broken userspace. Which would be a pretty clear kernel policy violation... I must have the facts wrong, I guess... --Ken signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Use the predicate enable bit for conditional rendering without stalling
On Fri, Nov 07, 2014 at 03:28:01PM +, Neil Roberts wrote: Unfortunately these two source registers are not in the whitelist of available registers in the kernel driver so this needs a kernel patch to work. This patch tries to check whether it is possible to write to this register by creating a DRM buffer with a single command to write to the register and then trying to exec it. The return value of the exec function (and hence the ioctl) is checked. There is a function with a similar goal just above to check whether the OACONTROL register can be used. This works by putting the test command in the regular batch buffer and trying to compare whether the value was successfully written. I couldn't get this approach to work because intel_batchbuffer_flush actually calls exit() if the ioctl fails. I am suspicious that this approach doesn't work for OACONTROL either. Daniel Vetter dan...@ffwll.ch writes: Atm the cmd parser for gen7 validates the batch and rejects it if there's something in there it doesn't like. But it doesn't grant any additional privs. Hence - OACONTROL passes since it's in the whitelist, but since the cmd parser doesn't grant the needed privs the writes would need. - your new regs aren't in the whitelist, so the execbuf fails with -EINVAL. We have a cmd parser version which we intend to bump every time we add new registers, so probably better to check that. And I guess we'd need one to indicate that the cmd parser actually does something useful. Probably best done with just another bump. And if we do that we could replace the current trick mesa uses with just a getparam query - the getparam is fixed so either returns -EINVAL on old kernels or the cmd parser version. I'm not really sure if I understand your reply correctly. Did you see that I also posted the corresponding kernel patch? I was testing my patch on top of the drm-intel-next branch with IvyBridge. If I don't apply the patch then I get EINVAL and the register load doesn't work. If I do apply it then everything seems to work and the Piglit tests pass. Therefore I'm assuming that the only thing that's stopping the register load from working without the patch is the whitelist in the command parser so adding the registers to the whitelist should be enough to get it working. It looks like the PRM for Haswell says that MI_LOAD_REGISTER_MEM is converted to no-op for non-privileged buffers. However I can't find any mention of this for IvyBridge. Does that mean it's allowed on IvyBridge but it won't work on Haswell? I haven't tested it on Haswell yet. However, as well as checking the command parser version, the patch for Mesa only enables using the predicate registers if it determines it can do a register write from an untrusted buffer. This is done via can_do_pipelined_register_writes which already existed to determine whether we can do indirect renders. So presumably checking for this and checking for version 2 of the command parser would be enough to ensure that we can load into the predicate register. Presumably it would also start working magically on Haswell if we did something to disable the hardware command parser and only relied on the kernel command parser. Is that the plan? The bit I mentioned about OACONTROL was just saying that the method of detecting whether we can write to OACONTROL specifically doesn't work. This is because writing to a register that is not in the whitelist returns EINVAL and Mesa calls exit(1) when drm_intel_gem_bo_context_exec fails. I used a different method to detect whether we can write to MI_PREDICATE_SRC0 by manually calling drm_intel_gem_bo_context_exec to avoid the call to exit(1). I think the code to check whether OACONTROL is accessible is wrong because the process will have already exited before it gets a chance to check whether the load worked. In v2 of the patch I made it check the cmd parser version as you suggested. We should probably do something similar for OACONTROL, but that is a separate issue. Regards, - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Use the predicate enable bit for conditional rendering without stalling
On Mon, Nov 10, 2014 at 2:57 PM, Neil Roberts n...@linux.intel.com wrote: The bit I mentioned about OACONTROL was just saying that the method of detecting whether we can write to OACONTROL specifically doesn't work. This is because writing to a register that is not in the whitelist returns EINVAL and Mesa calls exit(1) when drm_intel_gem_bo_context_exec fails. I used a different method to detect whether we can write to MI_PREDICATE_SRC0 by manually calling drm_intel_gem_bo_context_exec to avoid the call to exit(1). I think the code to check whether OACONTROL is accessible is wrong because the process will have already exited before it gets a chance to check whether the load worked. In v2 of the patch I made it check the cmd parser version as you suggested. We should probably do something similar for OACONTROL, but that is a separate issue. Hmm, that's awkward. I had been thinking we should aim remove the OACONTROL tricks from the cmd parser, considering that it's not currently possible to program the OA to get useful data this way, but it sounds like we'd actually start making Mesa apps exit(1) if we were to do that :-/ If we add more complete support for the OA via perf then it would be nice if we just had one place responsible for programming OACONTROL so I'd been thinking we could just drop it from the cmd parser white list - assuming older versions of Mesa would handle that gracefully. 'Guess it won't be quite so simple :-) Regards, - Robert ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Use the predicate enable bit for conditional rendering without stalling
Neil Roberts n...@linux.intel.com writes: It looks like the PRM for Haswell says that MI_LOAD_REGISTER_MEM is converted to no-op for non-privileged buffers. However I can't find any mention of this for IvyBridge. Does that mean it's allowed on IvyBridge but it won't work on Haswell? I haven't tested it on Haswell yet. Ok, I just tested it on Haswell and sure enough it doesn't work. It does however fail gracefully because can_do_pipelined_register_writes fails so it doesn't even try to use the predicate registers. I think in that case it wouldn't cause any problems to land the patches as they are. If we later find a way to make it work on Haswell we could bump the command parser version again as you say. We could also use that when deciding whether to enable indirect rendering to avoid having to do a trial load. - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Use the predicate enable bit for conditional rendering without stalling
On Monday, November 10, 2014 02:57:22 PM Neil Roberts wrote: I'm not really sure if I understand your reply correctly. Did you see that I also posted the corresponding kernel patch? I was testing my patch on top of the drm-intel-next branch with IvyBridge. That's why. Ivybridge has had the hardware checker disabled for a long time (basically, enabling PPGTT disables it), so you could write any registers you want. Haswell allows PPGTT + the hardware checker, so the kernel does that, and your register writes git turned into MI_NOOP. drm-intel-next must have the new software checker turned on, which disallows non-whitelisted register writes (along with libva, so it can't really be enabled upstream yet). --Ken signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Use the predicate enable bit for conditional rendering without stalling
Previously whenever a primitive is drawn the driver would call _mesa_check_conditional_render which blocks waiting for the result of the query to determine whether to render. On Gen7+ there is a bit in the 3DPRIMITIVE command which can be used to disable the primitive based on the value of a state bit. This state bit can be set based on whether two registers have different values using the MI_PREDICATE command. We can load these two registers with the pixel count values stored in the query begin and end to implement conditional rendering without stalling. Unfortunately these two source registers are not in the whitelist of available registers in the kernel driver so this needs a kernel patch to work. This patch tries to check whether it is possible to write to this register by creating a DRM buffer with a single command to write to the register and then trying to exec it. The return value of the exec function (and hence the ioctl) is checked. There is a function with a similar goal just above to check whether the OACONTROL register can be used. This works by putting the test command in the regular batch buffer and trying to compare whether the value was successfully written. I couldn't get this approach to work because intel_batchbuffer_flush actually calls exit() if the ioctl fails. I am suspicious that this approach doesn't work for OACONTROL either. The predicate enable bit is currently only used for drawing 3D primitives. For blits, clears, bitmaps, copypixels and drawpixels it still causes a stall. For most of these it would probably just work to call the new brw_check_conditional_render function instead of _mesa_check_conditional_render because they already work in terms of rendering primitives. However it's a bit trickier for blits because it can use the BLT ring or the blorp codepath. I think these operations are less useful for conditional rendering than rendering primitives so it might be best to leave it for a later patch. --- I'm not really sure whether I'm using the right cache domain to load the pixel count values into the predicate source values. Currently I'm using the instruction domain because that is what is used to write the values from a PIPE_CONTROL command but that doesn't seem to make much sense because the comment for that domain says it is for shaders. I've added the PIPE_CONTROL_FLUSH_ENABLE flag to the PIPE_CONTROL command used to save the depth pass count because the PRM says that you should do this. However it seems to work just as well without it and I'm not really sure what it does. Perhaps without the bit it can start the next command without waiting for the depth pass count to be written and that would be bad because the next command is likely to be the load into the predicate register. I'm not sure if it also implies a cache flush because I would expect that the PIPE_CONTROL command and the register load command would be using the same cache domain? I'll post the kernel patch to the intel-gfx list. I've also written a little demo using conditional rendering here: https://github.com/bpeel/glthing/tree/wip/conditional-render With the patch the FPS jumps from ~20 to ~50 although it's a bit of a silly test because if you just completely disable conditional rendering then the FPS goes all the way up to 270. src/mesa/drivers/dri/i965/Makefile.sources | 1 + src/mesa/drivers/dri/i965/brw_conditional_render.c | 167 + src/mesa/drivers/dri/i965/brw_context.c| 4 + src/mesa/drivers/dri/i965/brw_context.h| 21 +++ src/mesa/drivers/dri/i965/brw_defines.h| 1 + src/mesa/drivers/dri/i965/brw_draw.c | 16 +- src/mesa/drivers/dri/i965/brw_queryobj.c | 17 ++- src/mesa/drivers/dri/i965/intel_extensions.c | 48 ++ src/mesa/drivers/dri/i965/intel_reg.h | 23 +++ 9 files changed, 286 insertions(+), 12 deletions(-) create mode 100644 src/mesa/drivers/dri/i965/brw_conditional_render.c diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources index 711aabe..0adaf4d 100644 --- a/src/mesa/drivers/dri/i965/Makefile.sources +++ b/src/mesa/drivers/dri/i965/Makefile.sources @@ -38,6 +38,7 @@ i965_FILES = \ brw_clip_tri.c \ brw_clip_unfilled.c \ brw_clip_util.c \ + brw_conditional_render.c \ brw_context.c \ brw_cubemap_normalize.cpp \ brw_curbe.c \ diff --git a/src/mesa/drivers/dri/i965/brw_conditional_render.c b/src/mesa/drivers/dri/i965/brw_conditional_render.c new file mode 100644 index 000..e676aa0 --- /dev/null +++ b/src/mesa/drivers/dri/i965/brw_conditional_render.c @@ -0,0 +1,167 @@ +/* + * Copyright © 2014 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + *