Jan Vesely <jan.ves...@rutgers.edu> writes:
> On Sun, 2017-10-22 at 20:40 -0700, Francisco Jerez wrote:
>> Jan Vesely <jano.ves...@gmail.com> writes:
>>
>> > From: Jan Vesely <jan.ves...@rutgers.edu>
>> >
>> > v2: u
yInfo target_library_info;
> #endif
>
> + template<typename T, typename AS>
> + unsigned target_lang_address_space(const T& target, const AS
> lang_as) {
Can you name this "target_address_space" (to me lang address space means
the LangAS enum, i.e. the non-target-dependent representation, w
Jan Vesely writes:
> From: Jan Vesely
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103388
> Signed-off-by: Jan Vesely
> ---
> Hi,
>
> this is an alternative to Vedran's approach. it hides the logic behind a
>
Series is:
Reviewed-by: Francisco Jerez <curroje...@riseup.net>
Emil Velikov <emil.l.veli...@gmail.com> writes:
> From: Emil Velikov <emil.veli...@collabora.com>
>
> Nearly all the distributions* that build Mesa OpenCL, enable the ICD.
> Since building a
r
> newer LLVM.
>
Agree with Emil, a few of these compatibility definitions now become
trivial and could be folded into their uses. Still seems like a good
start, patch is:
Reviewed-by: Francisco Jerez <curroje...@riseup.net>
> -Emil
that.
>>> >
>>>
>>> Right, I'll update the commit message as follows and push it in a few hours.
>>
>> Thanks.
>> Acked-by: Jan Vesely <jan.ves...@rutgers.edu>
>>
>> you might want to get the maintainer's (Francisco) ack as well.
Jason Ekstrand writes:
> On Wed, Oct 4, 2017 at 5:29 PM, Connor Abbott wrote:
>
>> This won't completely solve the problem. For example, what if you
>> hoist the assignment to color2 outside the loop?
>>
>> vec4 color2;
>> while (1) {
>>vec4 color
Aaron Watry writes:
> On Fri, Sep 29, 2017 at 10:14 AM, Emil Velikov
> wrote:
>> Hi all,
>>
>> Currently nearly all the distributions I've seen* enable and use the ICD.
>> Only Gentoo does not use it, but manages the OpenCL.so conflicts via eselect.
Jan Vesely <jan.ves...@rutgers.edu> writes:
> On Tue, 2017-09-26 at 14:51 -0700, Francisco Jerez wrote:
>> Jan Vesely <jan.ves...@rutgers.edu> writes:
>>
>> > On Wed, 2017-09-20 at 19:10 -0500, Aaron Watry wrote:
>> > > [SNIP]
>> > >
y.
>
Wouldn't it be easier for Clover to check which libclc version it's
linking against and expose a subset of language versions and extensions
accordingly? That should only take a tiny bit of build system support
and wouldn't lock us to a single libclc release per API version (which
would make
Jan Vesely <jan.ves...@rutgers.edu> writes:
> Fixes build issues with llvm-3.6
> Fixes: 3115687f9b9830417c408228db2bc679e346bba6 (clover: Fix build after
> LLVM r313390)
>
> Signed-off-by: Jan Vesely <jan.ves...@rutgers.edu>
Reviewed-by: Francisco Jerez <curroj
Jan Vesely <jan.ves...@rutgers.edu> writes:
> On Fri, 2017-09-15 at 17:48 -0700, Francisco Jerez wrote:
>> Jan Vesely <jan.ves...@rutgers.edu> writes:
>>
>> > Signed-off-by: Jan Vesely <jan.ves...@rutgers.edu>
>> > ---
>> >
7 +89,7 @@ namespace {
> create_context(std::string _log) {
>init_targets();
>std::unique_ptr<LLVMContext> ctx { new LLVMContext };
> - ctx->setDiagnosticHandler(diagnostic_handler, _log);
> + compat::set_diagnostic_handler(ctx.get(), diagnostic_handler, _log);
Would rat
Jan Vesely <jan.ves...@rutgers.edu> writes:
> On Mon, 2017-09-04 at 13:23 -0700, Francisco Jerez wrote:
>> Jan Vesely <jan.ves...@rutgers.edu> writes:
>>
>> > v2: wait in map_buffer and map_image as well
>> > v3: use event::wait instead of wait (
Chema Casanova <jmcasan...@igalia.com> writes:
> El 08/09/17 a las 11:06, Alejandro Piñeiro escribió:
>> On 08/09/17 02:50, Francisco Jerez wrote:
>>> Currently the liveness analysis pass would extend a live interval up
>>> to the top of the program w
Jason Ekstrand writes:
> On Wed, Sep 6, 2017 at 3:58 PM, Chema Casanova
> wrote:
>
>> Hi Connor and Curro,
>>
>> On 28/08/17 12:24, Alejandro Piñeiro wrote:
>> > On 27/08/17 20:24, Connor Abbott wrote:
>> >> Hi,
>> >>
>> >> On Aug 25, 2017 9:28 AM,
Currently the liveness analysis pass would extend a live interval up
to the top of the program when no unconditional and complete
definition of the variable is found that dominates all of its uses.
This can lead to a serious performance problem in shaders containing
many partial writes, like
cpp
> index c62b8ba6140..19dd960be3a 100644
> --- a/src/intel/compiler/brw_shader.cpp
> +++ b/src/intel/compiler/brw_shader.cpp
> @@ -486,6 +486,9 @@ brw_instruction_name(const struct gen_device_info
> *devinfo, enum opcode op)
>return "tes_add_indirect_urb_offset"
alls "wait_signalled()"), but I
> suppose calling a non-virtual function is preferable. if not, feel free
> to use v3.
>
Yeah, I find v4 more readable than calling the base class'
implementation of wait(). Patch is:
Reviewed-by: Francisco Jerez <curroje...@riseup.net>
Jan Vesely <jan.ves...@rutgers.edu> writes:
> Signed-off-by: Jan Vesely <jan.ves...@rutgers.edu>
With the spelling fixed up (s/has_halfs/has_halves/) patch is:
Reviewed-by: Francisco Jerez <curroje...@riseup.net>
> ---
>
> src/gallium/state_tracke
Francisco Jerez <curroje...@riseup.net> writes:
> Chema Casanova <jmcasan...@igalia.com> writes:
>
>> El 25/08/17 a las 20:09, Francisco Jerez escribió:
>>> Alejandro Piñeiro <apinhe...@igalia.com> writes:
>>>
>>>> Although it is possibl
Chema Casanova <jmcasan...@igalia.com> writes:
> El 25/08/17 a las 20:09, Francisco Jerez escribió:
>> Alejandro Piñeiro <apinhe...@igalia.com> writes:
>>
>>> Although it is possible to emit them directly as AND/OR on brw_fs_nir,
>>> having specifi
Alejandro Piñeiro writes:
> On 24/08/17 21:07, Connor Abbott wrote:
>>
>> Hi Alejandro,
>
> Hi Connor,
>
>>
>> This seems really suspicious. If the live ranges are really
>> independent, then the register allocator should be able to assign the
>> two virtual registers to
Alejandro Piñeiro writes:
> Although it is possible to emit them directly as AND/OR on brw_fs_nir,
> having specific opcodes makes it easier to remove duplicate settings
> later.
>
> Signed-off-by: Alejandro Piñeiro
> Signed-off-by: Jose Maria
Jan Vesely <jan.ves...@rutgers.edu> writes:
> On Fri, 2017-08-18 at 14:19 -0700, Francisco Jerez wrote:
>> Jan Vesely <jan.ves...@rutgers.edu> writes:
>>
>> > v2: wait in map_buffer and map_image as well
>> > v3: use event::wait ins
Jan Vesely writes:
> v2: wait in map_buffer and map_image as well
> v3: use event::wait instead of wait (skips fence wait for hard_event)
>
Unfortunately this won't wait for the event action to be executed, only
for all dependencies of the event to become signalled, so
The anv_execbuf_add_bo() call can actually fail in practice, which
should cause the QueueSubmit operation to fail. Reported by Coverity.
CID: 1416606: Unchecked return value (CHECKED_RETURN)
---
src/intel/vulkan/anv_batch_chain.c | 22 +-
1 file changed, 13 insertions(+), 9
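
The kind of fix described above (checking a fallible helper's return value and letting the failure propagate to the submit operation) can be sketched roughly as follows. This is only an illustration, not the actual anv code: Result, ExecBuf, add_bo() and submit() are hypothetical stand-ins for VkResult, the execbuf structure, anv_execbuf_add_bo() and the QueueSubmit path.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical status codes standing in for VkResult values.
enum class Result { Success, OutOfHostMemory };

// Hypothetical stand-in for the execbuf object. add_bo() can fail, here
// simulated by a fixed capacity limit instead of a failed allocation.
struct ExecBuf {
    std::vector<int> bos;
    std::size_t capacity_limit;

    Result add_bo(int bo) {
        if (bos.size() >= capacity_limit)
            return Result::OutOfHostMemory;
        bos.push_back(bo);
        return Result::Success;
    }
};

// Hypothetical submit path: every add_bo() result is checked and the first
// failure is returned to the caller instead of being silently dropped.
Result submit(ExecBuf &execbuf, const std::vector<int> &bos) {
    for (int bo : bos) {
        const Result r = execbuf.add_bo(bo);
        if (r != Result::Success)
            return r;
    }
    return Result::Success;
}
```

The point of the sketch is only that the caller stops at the first failure and reports it, which is what an unchecked-return-value warning is asking for.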
Probably harmless, but will overwrite errno with a failure status
code. Reported by coverity.
CID 1416600: Argument cannot be negative (NEGATIVE_RETURNS)
---
src/intel/vulkan/anv_batch_chain.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git
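
For illustration, assuming the flagged call is something like close() being handed a descriptor that may still hold a -1 sentinel, a guard along these lines avoids clobbering errno. The helper name close_if_valid() is hypothetical and not taken from the patch.

```cpp
#include <cerrno>
#include <unistd.h>

// Hypothetical helper: only call close() on a descriptor that is actually
// valid, so a -1 sentinel never reaches close() and errno is left alone.
inline void close_if_valid(int fd) {
    if (fd >= 0)
        close(fd);
}
```

Calling close(-1) directly fails with EBADF, which is how a previously stored errno value would get overwritten.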
Jan Vesely <jan.ves...@rutgers.edu> writes:
> On Tue, 2017-08-15 at 12:00 -0700, Francisco Jerez wrote:
>> Jan Vesely <jan.ves...@rutgers.edu> writes:
>>
>> > On Sat, 2017-08-12 at 20:14 -0700, Francisco Jerez wrote:
>> > > Jan Vesely <jan.
Mark Janes writes:
> This series resolves
> https://bugs.freedesktop.org/show_bug.cgi?id=101985, currently blocking
> 17.2 release.
>
I have doubts this series is ready for production, though I don't think
it makes a ton of sense for Gen7 fp64 vec4 spilling to be
Jan Vesely <jan.ves...@rutgers.edu> writes:
> On Sat, 2017-08-12 at 20:14 -0700, Francisco Jerez wrote:
>> Jan Vesely <jan.ves...@rutgers.edu> writes:
>>
>> > On Sat, 2017-08-05 at 12:34 -0700, Francisco Jerez wrote:
>> > > Francisco Jerez <
pfn_event_notify is NULL or if
> command_exec_callback_type is
> not CL_SUBMITTED , CL_RUNNING or CL_COMPLETE .
>
> Fixes: OpenCL CTS test_conformance/events/test_events callbacks
>
> Signed-off-by: Aaron Watry <awa...@gmail.com>
> Cc: Francisco Jerez <cu...@riseup.net
Aaron Watry <awa...@gmail.com> writes:
> On Sat, Aug 12, 2017 at 10:14 PM, Francisco Jerez <curroje...@riseup.net>
> wrote:
>> Jan Vesely <jan.ves...@rutgers.edu> writes:
>>
>>> On Sat, 2017-08-05 at 12:34 -0700, Francisco Jerez wrote:
>>
Jan Vesely <jan.ves...@rutgers.edu> writes:
> On Sat, 2017-08-05 at 12:34 -0700, Francisco Jerez wrote:
>> Francisco Jerez <curroje...@riseup.net> writes:
>>
>> > Jan Vesely <jan.ves...@rutgers.edu> writes:
>> >
>> > > Hi,
Francisco Jerez <curroje...@riseup.net> writes:
> Jan Vesely <jan.ves...@rutgers.edu> writes:
>
>> Hi,
>>
>> thanks for detailed explanation. I indeed missed the writeBuffer part
>> in specs.
>>
>> On Wed, 2017-08-02 at 15:05 -0700, Francisco
Jan Vesely <jan.ves...@rutgers.edu> writes:
> Hi,
>
> thanks for detailed explanation. I indeed missed the writeBuffer part
> in specs.
>
> On Wed, 2017-08-02 at 15:05 -0700, Francisco Jerez wrote:
>> These changes are somewhat redundant and potentially
>>
> return map;
>
> @@ -695,7 +717,11 @@ clEnqueueMapImage(cl_command_queue d_q, cl_mem d_mem,
> cl_bool blocking,
>
> void *map = img.resource(q).add_map(q, flags, blocking, origin, region);
>
> - ret_object(rd_ev, create<hard_event>(q, CL_COMMAND_MAP_IMAGE, deps));
> +
"Marathe, Yogesh" <yogesh.mara...@intel.com> writes:
> Francisco,
>
>> -Original Message-
>> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf
>> Of Francisco Jerez
>> Sent: Thursday, July 20, 2017 10:51 PM
>
aravindan.muthuku...@intel.com writes:
> From: Aravindan Muthukumar
>
> This patch improves CPI Rate(Cycles per Instruction)
> and branch mispredict for i965. The function check_state()
> was showing CPI retired rate.
>
> Performance stats with android:
> CPI
Matt Turner writes:
> The implementations of the ARB_shader_ballot intrinsics will explicitly
> read the flag as a source register.
> ---
> src/intel/compiler/brw_fs.cpp | 18 ++
> 1 file changed, 14 insertions(+), 4 deletions(-)
>
> diff --git
Matt Turner writes:
> This function will be used to implement read_invocation (by specifying a
> specific channel) and read_first_invocation (by not specifying a
> channel).
> ---
> src/intel/compiler/brw_fs_builder.h | 9 ++---
> 1 file changed, 6 insertions(+), 3
Matt Turner writes:
> The implementations of the ARB_shader_group_vote intrinsics will
> explicitly write the flag as the destination register.
> ---
> src/intel/compiler/brw_fs.cpp | 12 ++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git
le clang
> -Wmissing-field-initializers [2]." - Emil
>
> This change works around that and will silence such warnings. It is both
> a GCC and a clang extension.
>
...and it's standard C++. Patch is:
Reviewed-by: Francisco Jerez <curroje...@riseup.net>
> v2:
>
if]{v,}
> push-pop-texture-state
> texwrap 1d
> texwrap 1d proj
> texwrap 2d proj
> texwrap formats
>
> All told, 49 more tests pass on NV20 (10de:0201).
>
> No changes on Intel CI run or RV250 (1002:4c66).
>
> Signed-off-by: Ian Romanick <ian.d
Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> On Fri, 2017-06-23 at 11:06 -0700, Francisco Jerez wrote:
>> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>>
>> > On Thu, 2017-06-22 at 16:25 -0700, Francisco Jerez wrote:
>
Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> On Thu, 2017-06-22 at 16:25 -0700, Francisco Jerez wrote:
>> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>>
>> > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia
Samuel Iglesias Gonsálvez writes:
> Hello,
>
> As mentioned in the patch series that implemented Ivybridge support
> ARB_gpu_shader_fp64 [0], the only missing feature in that series was
> register spilling of 64-bit data and, because of that, about ~39 fp64
> piglit tests
Samuel Iglesias Gonsálvez writes:
> Signed-off-by: Samuel Iglesias Gonsálvez
> ---
> src/intel/compiler/brw_eu_defines.h | 2 +
> src/intel/compiler/brw_shader.cpp| 5 +
> src/intel/compiler/brw_vec4.cpp | 7 ++
---
src/intel/compiler/brw_fs_bank_conflicts.cpp | 274 ++-
1 file changed, 188 insertions(+), 86 deletions(-)
diff --git a/src/intel/compiler/brw_fs_bank_conflicts.cpp
b/src/intel/compiler/brw_fs_bank_conflicts.cpp
index 0225c70..dc88cac 100644
---
Unnecessary GRF bank conflicts increase the issue time of ternary
instructions (the overwhelmingly most common of which is MAD) by
roughly 50%, leading to reduced ALU throughput. This pass attempts to
minimize the number of bank conflicts by rearranging the layout of the
GRF space post-register
Samuel Iglesias Gonsálvez writes:
> Signed-off-by: Samuel Iglesias Gonsálvez
> ---
> src/intel/compiler/brw_eu.h | 18 ++--
> src/intel/compiler/brw_eu_emit.c| 38
> +
>
Anuj Phogat <anuj.pho...@gmail.com> writes:
> From: Ben Widawsky <benjamin.widaw...@intel.com>
>
> V2 (Anuj):
> Squash the changes in one patch rebase on master.
> Address the review comments made by Francisco Jerez.
> Do the URB allocation per slice (not per
Anuj Phogat <anuj.pho...@gmail.com> writes:
> On Mon, Jun 19, 2017 at 2:18 PM, Francisco Jerez <curroje...@riseup.net>
> wrote:
>> Anuj Phogat <anuj.pho...@gmail.com> writes:
>>
>>> Adding min_size_increment_per_bank variable better explains the
>
Anuj Phogat <anuj.pho...@gmail.com> writes:
> From: Ben Widawsky <benjamin.widaw...@intel.com>
>
> V2 (Anuj):
> Squash the changes in one patch rebase on master.
> Address the review comments made by Francisco Jerez.
You seem to have missed half of my review comme
Anuj Phogat <anuj.pho...@gmail.com> writes:
> Adding min_size_increment_per_bank variable better explains the
> computation of L3 way size in the function.
>
> V2: Use const variable for min_size_increment_per_bank.
>
> Signed-off-by: Anuj Phogat <anuj.pho...@gmail
Anuj Phogat <anuj.pho...@gmail.com> writes:
> The new table added in this patch matches with the table
> in gfxspecs. We were programming the wrong values earlier.
>
> V2: Update the comment.
>
> Signed-off-by: Anuj Phogat <anuj.pho...@gmail.com>
> Cc: Francisco J
Jonas Kulla <nyocu...@gmail.com> writes:
> Valid values for URBAllocation start at 32, so subtract that
> before programming the register.
>
> This was missed when porting from the GL driver.
Reviewed-by: Francisco Jerez <curroje...@riseup.net>
> ---
> src/inte
Aaron Watry <awa...@gmail.com> writes:
> Humble ping for this one.
>
Thanks for CC'ing me on this -- Patch is:
Reviewed-by: Francisco Jerez <curroje...@riseup.net>
> --Aaron
>
> On Sun, Jun 4, 2017 at 7:32 PM, Aaron Watry <awa...@gmail.com> wrote:
>> c
Anuj Phogat <anuj.pho...@gmail.com> writes:
> On Mon, Jun 12, 2017 at 12:22 PM, Francisco Jerez <curroje...@riseup.net>
> wrote:
>> Anuj Phogat <anuj.pho...@gmail.com> writes:
>>
>>> On Mon, Jun 12, 2017 at 11:10 AM, Francisco Jerez <curroje...@ri
Anuj Phogat <anuj.pho...@gmail.com> writes:
> On Mon, Jun 12, 2017 at 11:10 AM, Francisco Jerez <curroje...@riseup.net>
> wrote:
>> Anuj Phogat <anuj.pho...@gmail.com> writes:
>>
>>> The new table added in this patch matches with the table
>>
Anuj Phogat <anuj.pho...@gmail.com> writes:
> Adding this variable better explains the computation for L3 way
> size in the function.
>
> Signed-off-by: Anuj Phogat <anuj.pho...@gmail.com>
> Cc: Francisco Jerez <curroje...@riseup.net>
> ---
>
Anuj Phogat <anuj.pho...@gmail.com> writes:
> The new table added in this patch matches with the table
> in gfxspecs. We were programming the wrong values earlier.
>
> Signed-off-by: Anuj Phogat <anuj.pho...@gmail.com>
> Cc: Francisco Jerez <curroje...@rise
mbers are initialized to in C when
an explicit initialization is missing from an aggregate initializer.
With the redundant initializers dropped patch is:
Reviewed-by: Francisco Jerez <curroje...@riseup.net>
> Suggested-by: Francisco Jerez <curroje...@riseup.net>
> Signed-off-by: A
Anuj Phogat <anuj.pho...@gmail.com> writes:
> By making use of l3_banks field in gen_device_info struct
> l3_way_size for gen7+ = 2 * l3_banks.
>
> V2: Keep the get_l3_way_size() function.
>
> Suggested-by: Francisco Jerez <curroje...@riseup.net>
> Sig
Anuj Phogat <anuj.pho...@gmail.com> writes:
> By making use of l3_banks field in gen_device_info struct
> l3_way_size for gen7+ = 2 * l3_banks.
>
> Suggested-by: Francisco Jerez <curroje...@riseup.net>
> Signed-off-by: Anuj Phogat <anuj.pho...@gmail.com>
Anuj Phogat <anuj.pho...@gmail.com> writes:
> This new field helps simplify l3 way size computations
> in next patch.
>
> Suggested-by: Francisco Jerez <curroje...@riseup.net>
> Signed-off-by: Anuj Phogat <anuj.pho...@gmail.com>
> Cc: Francisco Jerez <cu
Ilia Mirkin writes:
> I kinda see it both ways - yeah, the functions are the same and it's
> all shared, so your patch makes sense. OTOH, all of these functions
> (which do anything) have a nv04/nv10/nv20 prefix, which makes it
> easier to separate stuff out by generation
Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> Kind reminder that patches 1 and 3 are still unreviewed.
>
Seems terrible, this makes me feel like deleting the vec4 back-end
instead. But okay, series is:
Acked-by: Francisco Jerez <curroje...@riseup.net>
> Sam
>
Francisco Jerez <curroje...@riseup.net> writes:
> Anuj Phogat <anuj.pho...@gmail.com> writes:
>
>> Cherryview and Broxton are always gt1. So, remove the redundant checks.
>>
>> Signed-off-by: Anuj Phogat <anuj.pho...@gmail.com>
>> ---
>>
Anuj Phogat writes:
> Cherryview and Broxton are always gt1. So, remove the redundant checks.
>
> Signed-off-by: Anuj Phogat
> ---
> src/intel/common/gen_l3_config.c | 10 --
> 1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git
Pierre Moreau writes:
> Besides parsing all the opcodes until reaching the EOF character, there
> is no way to compute the size of a SPIR-V binary. Therefore, it is
> easier to pass it along the SPIR-V binary in pipe_compute_state.
>
LLVM IR programs use
Samuel Iglesias Gonsálvez writes:
> Reorder the uniforms to load first the dvec4-aligned variables
> in the push constant buffer and then push the vec4-aligned ones.
>
> This fixes a bug where the dvec3/4 might be loaded one part on a GRF and
> the rest in next GRF, so the
Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> El Viernes, 28 de abril de 2017 16:08:35 Francisco Jerez escribió:
>> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>> > It was setting XYWZ swizzle and writemask to all uniforms, no matter if
>
Anuj Phogat <anuj.pho...@gmail.com> writes:
> On Mon, Apr 24, 2017 at 9:15 PM, Ben Widawsky <b...@bwidawsk.net> wrote:
>
>> On 17-04-18 18:18:39, Francisco Jerez wrote:
>>
>> Most, if not all of the unrelated changes that snuck in were due to rebase.
>>
Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> On Mon, 2017-05-01 at 14:55 +0200, Samuel Iglesias Gonsálvez wrote:
>> El Viernes, 28 de abril de 2017 16:27:56 Francisco Jerez escribió:
>> > Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>
s strictly speaking wrong for any non-identity regions.
With that clarified patch is:
Reviewed-by: Francisco Jerez <curroje...@riseup.net>
> Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> Cc: "17.1" <mesa-sta...@lists.freedesktop.org>
> ---
>
Samuel Iglesias Gonsálvez writes:
> On gen7, the swizzles used in DF align16 instructions work for an element
> size of 32 bits, so we can address only 2 consecutive DFs. As we assumed that
> in the rest of the code and prepare the instructions for this
> (scalarize_df()),
>
nst) && inst->exec_size == inv_cvt(src.width +
> 1)) {
'cvt(inst->exec_size) - 1 == src.width'
> + const unsigned width = inv_cvt(src.width + 1);
> +const unsigned hstride = inv_cvt(src.hstride);
You can drop these two lines.
> +
> are aligned.
>
> v2:
> - Fix 'shift' calculation (Curro)
> - Set both swizzle and writemask.
> - Add assert(shift == 0) for the indirect case.
>
> Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> Cc: "17.1" <mesa-sta...@lists.freedesk
const clang::InputKind default_ik = clang::IK_OpenCL;
"ik_opencl" seems like a better name for this, with that fixed:
Reviewed-by: Francisco Jerez <curroje...@riseup.net>
> #endif
>
> inline void
> diff --git a/src/gallium/state_trackers/clover/llvm/invoc
e square root. At any rate, it's still
better than manhattan. Series is:
Reviewed-by: Francisco Jerez <curroje...@riseup.net>
> The G45 documentation indicates that the old manhattan distance setting
> is "only for debug purposes" and should never be used. The Ironlake
> doc
Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> On Mon, 2017-04-24 at 11:22 -0700, Francisco Jerez wrote:
>> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>>
>> > On Fri, 2017-04-21 at 10:23 -0700, Francisco Jerez wrote:
>
Ben Widawsky <b...@bwidawsk.net> writes:
> On 17-04-18 18:18:39, Francisco Jerez wrote:
>
> Most, if not all of the unrelated changes that snuck in were due to rebase.
> Anuj, would you mind fixing those? I tried my best to address the rest, but
> I'm
> admittedly stumb
Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> On Fri, 2017-04-21 at 10:23 -0700, Francisco Jerez wrote:
>> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>>
>> > On Thu, 2017-04-20 at 10:26 -0700, Francisco Jerez wrote:
>
Kenneth Graunke <kenn...@whitecape.org> writes:
> Curro pointed out that I should not just check for MACH, but use
> the reads_accumulator_implicitly() helper, which would also prevent
> the same bug with MAC and SADA2 (if we ever decide to use them).
>
> Cc: Fra
Kenneth Graunke writes:
> opt_register_coalesce() was optimizing sequences such as:
>
>mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
>mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D
>mov(8) m4.zw:F, vgrf5.xxxy:F
>
> into:
>
>mul(8) acc0:D, attr18.xyyy:D,
This is what we use later on to compute the number of registers that
will actually get spilled to memory, so it's more likely to match
reality than the current open-coded approximation.
Cc:
---
src/intel/compiler/brw_fs_reg_allocate.cpp | 3 +--
1 file
Until now the spilling cost calculation was neglecting the amount of
data read from the register.
This caused it to make suboptimal decisions in some cases leading to
higher memory bandwidth usage than necessary.
Improves Unigine Heaven performance by ~4% on
Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> On Thu, 2017-04-20 at 10:26 -0700, Francisco Jerez wrote:
>> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>>
>> > It was setting XYWZ swizzle to all uniforms, no matter if they were
>>
Samuel Iglesias Gonsálvez writes:
> It was setting XYWZ swizzle to all uniforms, no matter if they were
> a vector or not.
>
> Signed-off-by: Samuel Iglesias Gonsálvez
> Cc: curroje...@riseup.net
Don't you need to CC mesa-stable here and in the next
Hi Pam, looks good overall, a couple of comments below,
Plamena Manolova writes:
> Adds support for ARB_fragment_shader_interlock. We achieve
> the interlock and fragment ordering by issuing a memory fence
> via sendc.
>
> Signed-off-by: Plamena Manolova
Anuj Phogat writes:
> From: Ben Widawsky
>
> V2: Squash the changes in one patch and rebased on master (Anuj).
>
> Signed-off-by: Ben Widawsky
> Signed-off-by: Anuj Phogat
> ---
>
stem, although the Travis-CI
> instance [1] is less forgiving. I'm not too happy on the patch hence the
> HACK - suggestions are greatly appreciated.
>
> Cc: Francisco Jerez <curroje...@riseup.net>
> Cc: Jan Vesely <jan.ves...@rutgers.edu>
> Cc: Aaron Watry <awa...@gmai
Jan Vesely <jan.ves...@rutgers.edu> writes:
> Fixes build failure with LLVM 4
>
> Fixes: a981e68c26dc4079a335101da0033185030207f6
> (clover: Fix build against clang SVN >= r299965)
>
> Signed-off-by: Jan Vesely <jan.ves...@rutgers.edu>
Reviewed-by: Franc
compatibility preprocessor
conditionals mid-expression, you could instead add a conditional define
into the llvm/compat.hpp file along the lines of:
+#if HAVE_LLVM >= 0x0500
+ const auto lang_as_offset = 0;
+#else
+ const auto lang_as_offset = clang::LangAS
The individual branches of an if/else/endif construct will be executed
some unknown number of times between 0 and 1 relative to the parent
block. Use some factor in between as weight while approximating the
cost of spill/fill instructions within a conditional if-else branch.
This favors spilling
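
A rough sketch of the weighting idea described above, not the actual register allocator code: the opcode set, the Inst layout, the loop factor of 10 and the branch factor of 0.5 are all assumptions picked for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified opcode set.
enum class Op { Other, Do, While, If, Endif };

// Hypothetical instruction: 'reg' is the virtual register touched (negative
// for control flow), 'cost' the spill/fill cost it would add if spilled.
struct Inst { Op op; int reg; float cost; };

// Accumulate a per-register spill cost, scaling each contribution by an
// estimate of how often the enclosing block runs relative to the top level.
std::vector<float> spill_costs(const std::vector<Inst> &prog,
                               std::size_t num_regs) {
    std::vector<float> costs(num_regs, 0.0f);
    float scale = 1.0f;

    for (const Inst &inst : prog) {
        switch (inst.op) {
        case Op::Do:    scale *= 10.0f; break;  // assumed loop weight
        case Op::While: scale /= 10.0f; break;
        case Op::If:    scale *= 0.5f;  break;  // assumed branch weight
        case Op::Endif: scale /= 0.5f;  break;
        case Op::Other:
            if (inst.reg >= 0)
                costs[static_cast<std::size_t>(inst.reg)] += inst.cost * scale;
            break;
        }
    }
    return costs;
}
```

With a factor between 0 and 1 inside conditionals, a register used only within an if/else branch ends up cheaper to spill than one used unconditionally at the same loop depth.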
Aaron Watry <awa...@gmail.com> writes:
> Fixes: 3dfe61e ("gallium: decrease the size of pipe_box - 24 -> 16 bytes")
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100569
> Signed-off-by: Aaron Watry <awa...@gmail.com>
Thanks!
Reviewed-by: Fran
"Pohjolainen, Topi" writes:
> Jason, Curro, do you have any opinion if this is worth pursuing?
> I need something for blorp blits at least - using blorp for texture
> uploads on top of current excessive flushing regresses perf.
>
I wouldn't be surprised if it
This fixes the stripes of garbage rendered on the floor of the vehicle
assembly building among other rendering issues. The reason for the
misrendering seems to be that some of the GLSL shaders used by the
application use variables before initializing them, incorrectly
assuming that they will be
false;
> + /* As it is a strided destination, we write n-times more being
> n the
> + * size ratio between source and destination types. Update
> + * size_written accordingly.
> + */
> + inst->size_written = inst->dst.compone