Re: [Mesa-dev] [PATCH 16/20] radeonsi: add FMASK texture binding slots and resource setup
Am 08.08.2013 02:20, schrieb Marek Olšák: FMASK is bound as a separate texture. For every texture, there can be an FMASK. Therefore a separate array of resource slots has to be added. This adds a new mechanism for emitting resource descriptors, its features are: - resource descriptors are stored in an ordinary buffer (not in a CS) Having resource descriptors outside of the CS has two problems that we need to solve first: 1. Fine grained descriptor updates doesn't work, I already tried that. The problem is that unlike previous asics descriptors are now a memory block, so no longer part of the CP context. So when we (for example) have a draw command executing and the next draw command is using new resources for a specific slot we would either block until the first draw command is finished (which is bad for performance) or change the descriptors while they are still in use (which results in VM faults). 2. If my understand is correct when they are embedded the descriptors are preloaded into the caches while executing the IB, so to archive the same speed with descriptors outside of the IB you need to add additional commands to the constant IB which is new to SI and we currently doesn't support in the CS interface. Regards, Christian. - descriptors of disabled resources are set to zeros - fine-grained resource updates (it can update one resource slot while not touching the other slots) - updates are done with the WRITE_DATA packet - it implements the si_atom interface for packet emission - only used for FMASK textures right now The primary motivation for this is that FMASK textures naturally need fine-grained resource updates and I also need to query in the shader if a resource is NULL. --- src/gallium/drivers/radeonsi/Makefile.sources | 1 + src/gallium/drivers/radeonsi/r600_hw_context.c | 3 + src/gallium/drivers/radeonsi/r600_resource.h | 1 + src/gallium/drivers/radeonsi/r600_texture.c| 1 + src/gallium/drivers/radeonsi/radeonsi_pipe.c | 9 +- src/gallium/drivers/radeonsi/radeonsi_pipe.h | 6 +- src/gallium/drivers/radeonsi/radeonsi_pm4.c| 7 + src/gallium/drivers/radeonsi/radeonsi_pm4.h| 2 + src/gallium/drivers/radeonsi/si_descriptors.c | 188 + src/gallium/drivers/radeonsi/si_state.c| 58 +++- src/gallium/drivers/radeonsi/si_state.h| 36 + 11 files changed, 305 insertions(+), 7 deletions(-) create mode 100644 src/gallium/drivers/radeonsi/si_descriptors.c diff --git a/src/gallium/drivers/radeonsi/Makefile.sources b/src/gallium/drivers/radeonsi/Makefile.sources index b3ffa72..68c8282 100644 --- a/src/gallium/drivers/radeonsi/Makefile.sources +++ b/src/gallium/drivers/radeonsi/Makefile.sources @@ -10,6 +10,7 @@ C_SOURCES := \ r600_translate.c \ radeonsi_pm4.c \ radeonsi_compute.c \ + si_descriptors.c \ si_state.c \ si_state_streamout.c \ si_state_draw.c \ diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c b/src/gallium/drivers/radeonsi/r600_hw_context.c index 7ed7496..b595477 100644 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c @@ -289,6 +289,9 @@ void si_context_flush(struct r600_context *ctx, unsigned flags) * next draw command */ si_pm4_reset_emitted(ctx); + + si_sampler_views_begin_new_cs(ctx, ctx-fmask_sampler_views[PIPE_SHADER_VERTEX]); + si_sampler_views_begin_new_cs(ctx, ctx-fmask_sampler_views[PIPE_SHADER_FRAGMENT]); } void si_context_emit_fence(struct r600_context *ctx, struct si_resource *fence_bo, unsigned offset, unsigned value) diff --git a/src/gallium/drivers/radeonsi/r600_resource.h b/src/gallium/drivers/radeonsi/r600_resource.h index e5dd36a..ab5c7b7 100644 --- a/src/gallium/drivers/radeonsi/r600_resource.h +++ b/src/gallium/drivers/radeonsi/r600_resource.h @@ -44,6 +44,7 @@ struct r600_fmask_info { unsigned offset; unsigned size; unsigned alignment; + unsigned pitch; unsigned bank_height; unsigned slice_tile_max; unsigned tile_mode_index; diff --git a/src/gallium/drivers/radeonsi/r600_texture.c b/src/gallium/drivers/radeonsi/r600_texture.c index cd3d1aa..b613564 100644 --- a/src/gallium/drivers/radeonsi/r600_texture.c +++ b/src/gallium/drivers/radeonsi/r600_texture.c @@ -463,6 +463,7 @@ static void r600_texture_get_fmask_info(struct r600_screen *rscreen, out-slice_tile_max -= 1; out-tile_mode_index = fmask.tiling_index[0]; + out-pitch = fmask.level[0].nblk_x; out-bank_height = fmask.bankh; out-alignment = MAX2(256, fmask.bo_alignment); out-size = fmask.bo_size; diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.c b/src/gallium/drivers/radeonsi/radeonsi_pipe.c index ad955e3..3112124 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_pipe.c +++
Re: [Mesa-dev] [PATCH 18/20] radeonsi: port texture improvements from r600g
Am 08.08.2013 02:20, schrieb Marek Olšák: This started as an attempt to add support for MSAA texture transfers and MSAA depth-stencil decompression for the DB-CB copy path. It has gotten a bit out of control, but it's for the greater good. Some changes do not make much sense, they are there just to make it look like the other driver. With a few cosmetic modifications, r600_texture.c can be shared with a symlink. The patch itself looks good to me. The original plan was to put that stuff under drivers/radeon, and only have the SI specific parts under radeonsi. Didn't worked as intended and we should try to share that code between r600 and radeonsi again. Christian. --- src/gallium/drivers/r600/r600_blit.c | 3 - src/gallium/drivers/r600/r600_texture.c | 3 +- src/gallium/drivers/radeonsi/r600_blit.c | 53 ++- src/gallium/drivers/radeonsi/r600_resource.h | 7 +- src/gallium/drivers/radeonsi/r600_texture.c | 528 --- src/gallium/drivers/radeonsi/radeonsi_pipe.c | 8 +- src/gallium/drivers/radeonsi/radeonsi_pipe.h | 11 +- src/gallium/drivers/radeonsi/si_state.c | 21 +- 8 files changed, 365 insertions(+), 269 deletions(-) diff --git a/src/gallium/drivers/r600/r600_blit.c b/src/gallium/drivers/r600/r600_blit.c index 2230e7b..1c22a75 100644 --- a/src/gallium/drivers/r600/r600_blit.c +++ b/src/gallium/drivers/r600/r600_blit.c @@ -171,9 +171,6 @@ void r600_blit_decompress_depth(struct pipe_context *ctx, zsurf = ctx-create_surface(ctx, texture-resource.b.b, surf_tmpl); surf_tmpl.format = flushed_depth_texture-resource.b.b.format; - surf_tmpl.u.tex.level = level; - surf_tmpl.u.tex.first_layer = layer; - surf_tmpl.u.tex.last_layer = layer; cbsurf = ctx-create_surface(ctx, flushed_depth_texture-resource.b.b, surf_tmpl); diff --git a/src/gallium/drivers/r600/r600_texture.c b/src/gallium/drivers/r600/r600_texture.c index 36cca17..742e982 100644 --- a/src/gallium/drivers/r600/r600_texture.c +++ b/src/gallium/drivers/r600/r600_texture.c @@ -458,8 +458,7 @@ r600_texture_create_object(struct pipe_screen *screen, rtex-is_depth = util_format_has_depth(util_format_description(rtex-resource.b.b.format)); rtex-surface = *surface; - r = r600_setup_surface(screen, rtex, - pitch_in_bytes_override); + r = r600_setup_surface(screen, rtex, pitch_in_bytes_override); if (r) { FREE(rtex); return NULL; diff --git a/src/gallium/drivers/radeonsi/r600_blit.c b/src/gallium/drivers/radeonsi/r600_blit.c index 2db952c..d5ed87a 100644 --- a/src/gallium/drivers/radeonsi/r600_blit.c +++ b/src/gallium/drivers/radeonsi/r600_blit.c @@ -102,23 +102,31 @@ static void r600_blitter_end(struct pipe_context *ctx) r600_context_queries_resume(rctx); } -void si_blit_uncompress_depth(struct pipe_context *ctx, +static unsigned u_max_sample(struct pipe_resource *r) +{ + return r-nr_samples ? r-nr_samples - 1 : 0; +} + +void r600_blit_decompress_depth(struct pipe_context *ctx, struct r600_texture *texture, struct r600_texture *staging, unsigned first_level, unsigned last_level, - unsigned first_layer, unsigned last_layer) + unsigned first_layer, unsigned last_layer, + unsigned first_sample, unsigned last_sample) { struct r600_context *rctx = (struct r600_context *)ctx; - unsigned layer, level, checked_last_layer, max_layer; + unsigned layer, level, sample, checked_last_layer, max_layer, max_sample; float depth = 1.0f; const struct util_format_description *desc; - void *custom_dsa; + void **custom_dsa; struct r600_texture *flushed_depth_texture = staging ? staging : texture-flushed_depth_texture; if (!staging !texture-dirty_level_mask) return; + max_sample = u_max_sample(texture-resource.b.b); + desc = util_format_description(flushed_depth_texture-resource.b.b.format); switch (util_format_has_depth(desc) | util_format_has_stencil(desc) 1) { default: @@ -144,30 +152,35 @@ void si_blit_uncompress_depth(struct pipe_context *ctx, checked_last_layer = last_layer max_layer ? last_layer : max_layer; for (layer = first_layer; layer = checked_last_layer; layer++) { - struct pipe_surface *zsurf, *cbsurf, surf_tmpl; + for (sample = first_sample; sample = last_sample; sample++) { + struct pipe_surface *zsurf, *cbsurf, surf_tmpl; - surf_tmpl.format = texture-real_format; - surf_tmpl.u.tex.level = level; -
Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders
On Don, 2013-08-08 at 02:20 +0200, Marek Olšák wrote: diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c index 746ace6..4208fa7 100644 --- a/src/gallium/drivers/radeonsi/si_state_draw.c +++ b/src/gallium/drivers/radeonsi/si_state_draw.c @@ -241,6 +241,7 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s /* Last 2 reserved SGPRs are used for VCC */ num_sgprs = num_user_sgprs + 2; } + num_sgprs += 1; /* XXX this fixes VM faults */ assert(num_sgprs = 104); You can't just increment num_sgprs unconditionally here, or the assertion will spuriously fail again when num_sgprs was 104 to begin with. More fundamentally, I'd really like us to understand why / under what circumstances the value needs to be incremented at all. You mentioned on IRC that you were using an LLVM 3.3 tree, is this change still necessary with LLVM master / trunk? If yes, it would be good to look into at least some example shaders where the change is really necessary, to try and understand the circumstances. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/20] radeonsi: implement MSAA colorbuffer compression for rendering
On Don, 2013-08-08 at 02:20 +0200, Marek Olšák wrote: diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c b/src/gallium/drivers/radeonsi/r600_hw_context.c index 25c972b..382382b 100644 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c @@ -179,6 +179,17 @@ static void r600_flush_framebuffer(struct r600_context *ctx) si_pm4_emit(ctx, pm4); si_pm4_free_state(ctx, pm4, ~0); + /* flush CB_META */ + pm4 = si_pm4_alloc_state(ctx); + + if (pm4 == NULL) + return; + + si_cmd_flush_and_inv_cb_meta(pm4); + si_pm4_emit(ctx, pm4); + si_pm4_free_state(ctx, pm4, ~0); + ctx-flush_and_inv_cb_meta = false; + ctx-flags = ~R600_CONTEXT_DST_CACHES_DIRTY; } Can't you add to the existing pm4 here, instead of allocating a new one? -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 15/20] radeonsi: add basic infrastructure for atom-based states
Am 08.08.2013 02:20, schrieb Marek Olšák: It's the same as in r600g. Look how simple it is. That concept has the problem that we don't necessary know in which order the state is emitted. Why not just add an emit callback to si_pm4_state for the short term instead? For the long term we should probably teach the kernel interface to accept a bunch of pointers to smaller IB fragments instead of one large IB. That would allow us to not only save the copy of commands for the CSO states, but also allows us to emit optional IB fragments that are only inserted if the we had a context switch in between. Christian. --- src/gallium/drivers/radeonsi/r600_hw_context.c | 8 src/gallium/drivers/radeonsi/radeonsi_pipe.h | 9 + src/gallium/drivers/radeonsi/si_state.h| 10 ++ src/gallium/drivers/radeonsi/si_state_draw.c | 8 +++- 4 files changed, 34 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c b/src/gallium/drivers/radeonsi/r600_hw_context.c index 382382b..7ed7496 100644 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c @@ -114,9 +114,17 @@ err: void si_need_cs_space(struct r600_context *ctx, unsigned num_dw, boolean count_draw_in) { + int i; + /* The number of dwords we already used in the CS so far. */ num_dw += ctx-cs-cdw; + for (i = 0; i ctx-num_atoms; i++) { + if (ctx-atoms[i]-dirty) { + num_dw += ctx-atoms[i]-num_dw; + } + } + if (count_draw_in) { /* The number of dwords all the dirty states would take. */ num_dw += ctx-pm4_dirty_cdwords; diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.h b/src/gallium/drivers/radeonsi/radeonsi_pipe.h index e370149..5fa9bdc 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_pipe.h +++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.h @@ -145,6 +145,10 @@ struct r600_context { void*custom_blend_decompress; struct r600_screen *screen; struct radeon_winsys*ws; + + struct si_atom *atoms[SI_MAX_ATOMS]; + unsignednum_atoms; + struct si_vertex_element*vertex_elements; struct pipe_framebuffer_state framebuffer; unsignedfb_log_samples; @@ -329,4 +333,9 @@ static INLINE uint64_t r600_resource_va(struct pipe_screen *screen, struct pipe_ return rscreen-ws-buffer_get_virtual_address(rresource-cs_buf); } +static INLINE void si_add_atom(struct r600_context *rctx, struct si_atom *atom) +{ + rctx-atoms[rctx-num_atoms++] = atom; +} + #endif diff --git a/src/gallium/drivers/radeonsi/si_state.h b/src/gallium/drivers/radeonsi/si_state.h index b01fbf2..4aabdef 100644 --- a/src/gallium/drivers/radeonsi/si_state.h +++ b/src/gallium/drivers/radeonsi/si_state.h @@ -29,6 +29,16 @@ #include radeonsi_pm4.h +#define SI_MAX_ATOMS 2 + +/* This encapsulates a state or an operation which can emitted into the GPU + * command stream. */ +struct si_atom { + void (*emit)(struct r600_context *ctx, struct si_atom *state); + unsignednum_dw; + booldirty; +}; + struct si_state_blend { struct si_pm4_state pm4; uint32_tcb_target_mask; diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c index 4208fa7..bcae778 100644 --- a/src/gallium/drivers/radeonsi/si_state_draw.c +++ b/src/gallium/drivers/radeonsi/si_state_draw.c @@ -664,7 +664,7 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info) { struct r600_context *rctx = (struct r600_context *)ctx; struct pipe_index_buffer ib = {}; - uint32_t cp_coher_cntl; + uint32_t cp_coher_cntl, i; if (!info-count (info-indexed || !info-count_from_stream_output)) return; @@ -728,6 +728,12 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info) si_need_cs_space(rctx, 0, TRUE); + for (i = 0; i rctx-num_atoms; i++) { + if (rctx-atoms[i]-dirty) { + rctx-atoms[i]-emit(rctx, rctx-atoms[i]); + } + } + si_pm4_emit_dirty(rctx); rctx-pm4_dirty_cdwords = 0; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders
Am 08.08.2013 10:30, schrieb Michel Dänzer: On Don, 2013-08-08 at 02:20 +0200, Marek Olšák wrote: diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c index 746ace6..4208fa7 100644 --- a/src/gallium/drivers/radeonsi/si_state_draw.c +++ b/src/gallium/drivers/radeonsi/si_state_draw.c @@ -241,6 +241,7 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s /* Last 2 reserved SGPRs are used for VCC */ num_sgprs = num_user_sgprs + 2; } + num_sgprs += 1; /* XXX this fixes VM faults */ assert(num_sgprs = 104); You can't just increment num_sgprs unconditionally here, or the assertion will spuriously fail again when num_sgprs was 104 to begin with. More fundamentally, I'd really like us to understand why / under what circumstances the value needs to be incremented at all. You mentioned on IRC that you were using an LLVM 3.3 tree, is this change still necessary with LLVM master / trunk? If yes, it would be good to look into at least some example shaders where the change is really necessary, to try and understand the circumstances. Additional to that I'm not sure if we should add another sgpr pointer to the shader, the docs speak of quite a performance penalty for each. Since they are (optional) attributes to the textures anyway they could probably be interleaved with the texture resources. Christian. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 00/20] MSAA support for Radeon SI and more
On Don, 2013-08-08 at 02:21 +0200, Marek Olšák wrote: This is MSAA support for the radeonsi driver. It implements MSAA rendering, texturing, and colorbuffer compression. The only missing feature is the MSAA fast color clear; other than that, it operates at maximum performance the hardware is capable of. Nice! There are also some minor changes which are not related to MSAA. Apart from the issues being discussed, the series looks good to me. Also the whole r600_texture.c file from r600g is ported to radeonsi. Great. As Christian mentioned, we've been meaning to share code between radeonsi and r600g, feel free to take any opportunities you see for that anytime. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] dri: Choose a decent global driNConfigOptions.
Previously, we were asserting that each driver specified an NConfigOptions exactly equal to the number of options they supplied, leading to frequent bugs when people would forget to adjust the value when adjusting driver options. Instead, just overallocate the table by a bit and leave sanity checking to the assert in findOption(). --- src/gallium/state_trackers/dri/common/dri_screen.c | 5 +-- src/mesa/drivers/dri/common/dri_util.c | 4 +- src/mesa/drivers/dri/common/xmlconfig.c| 44 -- src/mesa/drivers/dri/common/xmlconfig.h| 4 +- src/mesa/drivers/dri/i915/intel_screen.c | 5 +-- src/mesa/drivers/dri/i965/intel_screen.c | 5 +-- src/mesa/drivers/dri/radeon/radeon_screen.c| 5 +-- 7 files changed, 14 insertions(+), 58 deletions(-) diff --git a/src/gallium/state_trackers/dri/common/dri_screen.c b/src/gallium/state_trackers/dri/common/dri_screen.c index 3b42b5a..779741e 100644 --- a/src/gallium/state_trackers/dri/common/dri_screen.c +++ b/src/gallium/state_trackers/dri/common/dri_screen.c @@ -74,8 +74,6 @@ PUBLIC const char __driConfigOptions[] = #define false 0 -static const uint __driNConfigOptions = 13; - static const __DRIconfig ** dri_fill_in_modes(struct dri_screen *screen) { @@ -417,8 +415,7 @@ dri_init_screen_helper(struct dri_screen *screen, else screen-target = PIPE_TEXTURE_RECT; - driParseOptionInfo(screen-optionCacheDefaults, - __driConfigOptions, __driNConfigOptions); + driParseOptionInfo(screen-optionCacheDefaults, __driConfigOptions); driParseConfigFiles(screen-optionCache, screen-optionCacheDefaults, diff --git a/src/mesa/drivers/dri/common/dri_util.c b/src/mesa/drivers/dri/common/dri_util.c index 9ed9df4..fa520ea 100644 --- a/src/mesa/drivers/dri/common/dri_util.c +++ b/src/mesa/drivers/dri/common/dri_util.c @@ -52,8 +52,6 @@ PUBLIC const char __dri2ConfigOptions[] = DRI_CONF_SECTION_END DRI_CONF_END; -static const uint __dri2NConfigOptions = 1; - /*/ /** \name Screen handling functions */ /*/ @@ -112,7 +110,7 @@ dri2CreateNewScreen(int scrn, int fd, return NULL; } -driParseOptionInfo(psp-optionInfo, __dri2ConfigOptions, __dri2NConfigOptions); +driParseOptionInfo(psp-optionInfo, __dri2ConfigOptions); driParseConfigFiles(psp-optionCache, psp-optionInfo, psp-myNum, dri2); return psp; diff --git a/src/mesa/drivers/dri/common/xmlconfig.c b/src/mesa/drivers/dri/common/xmlconfig.c index 5c97c20..b95e452 100644 --- a/src/mesa/drivers/dri/common/xmlconfig.c +++ b/src/mesa/drivers/dri/common/xmlconfig.c @@ -132,16 +132,6 @@ static GLuint findOption (const driOptionCache *cache, const char *name) { return hash; } -/** \brief Count the real number of options in an option cache */ -static GLuint countOptions (const driOptionCache *cache) { -GLuint size = 1 cache-tableSize; -GLuint i, count = 0; -for (i = 0; i size; ++i) - if (cache-info[i].name) - count++; -return count; -} - /** \brief Like strdup but using malloc and with error checking. */ #define XSTRDUP(dest,source) do { \ GLuint len = strlen (source); \ @@ -685,25 +675,18 @@ static void optInfoEndElem (void *userData, const XML_Char *name) { } } -void driParseOptionInfo (driOptionCache *info, -const char *configOptions, GLuint nConfigOptions) { +void driParseOptionInfo (driOptionCache *info, const char *configOptions) { XML_Parser p; int status; struct OptInfoData userData; struct OptInfoData *data = userData; -GLuint realNoptions; - - /* determine hash table size and allocate memory: - * 3/2 of the number of options, rounded up, so there remains always - * at least one free entry. This is needed for detecting undefined - * options in configuration files without getting a hash table overflow. - * Round this up to a power of two. */ -GLuint minSize = (nConfigOptions*3 + 1) / 2; -GLuint size, log2size; -for (size = 1, log2size = 0; size minSize; size = 1, ++log2size); -info-tableSize = log2size; -info-info = calloc(size, sizeof (driOptionInfo)); -info-values = calloc(size, sizeof (driOptionValue)); + +/* Make the hash table big enough to fit more than the maximum number of + * config options we've ever seen in a driver. + */ +info-tableSize = 6; +info-info = calloc(1 info-tableSize, sizeof (driOptionInfo)); +info-values = calloc(1 info-tableSize, sizeof (driOptionValue)); if (info-info == NULL || info-values == NULL) { fprintf (stderr, %s: %d: out of memory.\n, __FILE__, __LINE__); abort(); @@ -728,17 +711,6 @@ void driParseOptionInfo (driOptionCache *info, XML_FATAL (%s.,
Re: [Mesa-dev] [PATCH 12/20] radeonsi: implement MSAA colorbuffer compression for rendering
Do you mean the surface_sync pm4 or the one in si_state? Marek On Thu, Aug 8, 2013 at 10:31 AM, Michel Dänzer mic...@daenzer.net wrote: On Don, 2013-08-08 at 02:20 +0200, Marek Olšák wrote: diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c b/src/gallium/drivers/radeonsi/r600_hw_context.c index 25c972b..382382b 100644 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c @@ -179,6 +179,17 @@ static void r600_flush_framebuffer(struct r600_context *ctx) si_pm4_emit(ctx, pm4); si_pm4_free_state(ctx, pm4, ~0); + /* flush CB_META */ + pm4 = si_pm4_alloc_state(ctx); + + if (pm4 == NULL) + return; + + si_cmd_flush_and_inv_cb_meta(pm4); + si_pm4_emit(ctx, pm4); + si_pm4_free_state(ctx, pm4, ~0); + ctx-flush_and_inv_cb_meta = false; + ctx-flags = ~R600_CONTEXT_DST_CACHES_DIRTY; } Can't you add to the existing pm4 here, instead of allocating a new one? -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders
Interleaving might not be a good idea, but they could be in the same array, like this: 0..15: textures 16..31: FMASK textures I'll test LLVM master, but we should probably bump the LLVM version requirement in configure.ac to prevent users from using LLVM 3.3. Marek On Thu, Aug 8, 2013 at 10:37 AM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 10:30, schrieb Michel Dänzer: On Don, 2013-08-08 at 02:20 +0200, Marek Olšák wrote: diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c index 746ace6..4208fa7 100644 --- a/src/gallium/drivers/radeonsi/si_state_draw.c +++ b/src/gallium/drivers/radeonsi/si_state_draw.c @@ -241,6 +241,7 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s /* Last 2 reserved SGPRs are used for VCC */ num_sgprs = num_user_sgprs + 2; } + num_sgprs += 1; /* XXX this fixes VM faults */ assert(num_sgprs = 104); You can't just increment num_sgprs unconditionally here, or the assertion will spuriously fail again when num_sgprs was 104 to begin with. More fundamentally, I'd really like us to understand why / under what circumstances the value needs to be incremented at all. You mentioned on IRC that you were using an LLVM 3.3 tree, is this change still necessary with LLVM master / trunk? If yes, it would be good to look into at least some example shaders where the change is really necessary, to try and understand the circumstances. Additional to that I'm not sure if we should add another sgpr pointer to the shader, the docs speak of quite a performance penalty for each. Since they are (optional) attributes to the textures anyway they could probably be interleaved with the texture resources. Christian. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 15/20] radeonsi: add basic infrastructure for atom-based states
Why is the order so important? Note that cache flushes shouldn't be treated like states and neither should the draw packets and any state that comes from pipe_draw_info. The only things left are the register updates and resource updates where the order doesn't matter, does it? Anyway, the order is determined by the order of si_add_atom calls, which should be done once in create_context. Marek On Thu, Aug 8, 2013 at 10:32 AM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 02:20, schrieb Marek Olšák: It's the same as in r600g. Look how simple it is. That concept has the problem that we don't necessary know in which order the state is emitted. Why not just add an emit callback to si_pm4_state for the short term instead? For the long term we should probably teach the kernel interface to accept a bunch of pointers to smaller IB fragments instead of one large IB. That would allow us to not only save the copy of commands for the CSO states, but also allows us to emit optional IB fragments that are only inserted if the we had a context switch in between. Christian. --- src/gallium/drivers/radeonsi/r600_hw_context.c | 8 src/gallium/drivers/radeonsi/radeonsi_pipe.h | 9 + src/gallium/drivers/radeonsi/si_state.h| 10 ++ src/gallium/drivers/radeonsi/si_state_draw.c | 8 +++- 4 files changed, 34 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c b/src/gallium/drivers/radeonsi/r600_hw_context.c index 382382b..7ed7496 100644 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c @@ -114,9 +114,17 @@ err: void si_need_cs_space(struct r600_context *ctx, unsigned num_dw, boolean count_draw_in) { + int i; + /* The number of dwords we already used in the CS so far. */ num_dw += ctx-cs-cdw; + for (i = 0; i ctx-num_atoms; i++) { + if (ctx-atoms[i]-dirty) { + num_dw += ctx-atoms[i]-num_dw; + } + } + if (count_draw_in) { /* The number of dwords all the dirty states would take. */ num_dw += ctx-pm4_dirty_cdwords; diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.h b/src/gallium/drivers/radeonsi/radeonsi_pipe.h index e370149..5fa9bdc 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_pipe.h +++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.h @@ -145,6 +145,10 @@ struct r600_context { void*custom_blend_decompress; struct r600_screen *screen; struct radeon_winsys*ws; + + struct si_atom *atoms[SI_MAX_ATOMS]; + unsignednum_atoms; + struct si_vertex_element*vertex_elements; struct pipe_framebuffer_state framebuffer; unsignedfb_log_samples; @@ -329,4 +333,9 @@ static INLINE uint64_t r600_resource_va(struct pipe_screen *screen, struct pipe_ return rscreen-ws-buffer_get_virtual_address(rresource-cs_buf); } +static INLINE void si_add_atom(struct r600_context *rctx, struct si_atom *atom) +{ + rctx-atoms[rctx-num_atoms++] = atom; +} + #endif diff --git a/src/gallium/drivers/radeonsi/si_state.h b/src/gallium/drivers/radeonsi/si_state.h index b01fbf2..4aabdef 100644 --- a/src/gallium/drivers/radeonsi/si_state.h +++ b/src/gallium/drivers/radeonsi/si_state.h @@ -29,6 +29,16 @@ #include radeonsi_pm4.h +#define SI_MAX_ATOMS 2 + +/* This encapsulates a state or an operation which can emitted into the GPU + * command stream. */ +struct si_atom { + void (*emit)(struct r600_context *ctx, struct si_atom *state); + unsignednum_dw; + booldirty; +}; + struct si_state_blend { struct si_pm4_state pm4; uint32_tcb_target_mask; diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c index 4208fa7..bcae778 100644 --- a/src/gallium/drivers/radeonsi/si_state_draw.c +++ b/src/gallium/drivers/radeonsi/si_state_draw.c @@ -664,7 +664,7 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info) { struct r600_context *rctx = (struct r600_context *)ctx; struct pipe_index_buffer ib = {}; - uint32_t cp_coher_cntl; + uint32_t cp_coher_cntl, i; if (!info-count (info-indexed || !info-count_from_stream_output)) return; @@ -728,6 +728,12 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info) si_need_cs_space(rctx, 0, TRUE); + for (i = 0; i rctx-num_atoms; i++) { + if (rctx-atoms[i]-dirty) { + rctx-atoms[i]-emit(rctx, rctx-atoms[i]); + }
Re: [Mesa-dev] [PATCH 16/20] radeonsi: add FMASK texture binding slots and resource setup
.On Thu, Aug 8, 2013 at 9:47 AM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 02:20, schrieb Marek Olšák: FMASK is bound as a separate texture. For every texture, there can be an FMASK. Therefore a separate array of resource slots has to be added. This adds a new mechanism for emitting resource descriptors, its features are: - resource descriptors are stored in an ordinary buffer (not in a CS) Having resource descriptors outside of the CS has two problems that we need to solve first: 1. Fine grained descriptor updates doesn't work, I already tried that. The problem is that unlike previous asics descriptors are now a memory block, so no longer part of the CP context. So when we (for example) have a draw command executing and the next draw command is using new resources for a specific slot we would either block until the first draw command is finished (which is bad for performance) or change the descriptors while they are still in use (which results in VM faults). So what would the proper solution be here? Do I need to flush some caches or would moving the descriptor updates to the constant IB fix that? 2. If my understand is correct when they are embedded the descriptors are preloaded into the caches while executing the IB, so to archive the same speed with descriptors outside of the IB you need to add additional commands to the constant IB which is new to SI and we currently doesn't support in the CS interface. There seems to be support for the constant IB. The CS ioctl chunk ID is RADEON_CHUNK_ID_CONST_IB and the allowed packets are listed in si_vm_packet3_ce_check. Is there anything missing? Marek Regards, Christian. - descriptors of disabled resources are set to zeros - fine-grained resource updates (it can update one resource slot while not touching the other slots) - updates are done with the WRITE_DATA packet - it implements the si_atom interface for packet emission - only used for FMASK textures right now The primary motivation for this is that FMASK textures naturally need fine-grained resource updates and I also need to query in the shader if a resource is NULL. --- src/gallium/drivers/radeonsi/Makefile.sources | 1 + src/gallium/drivers/radeonsi/r600_hw_context.c | 3 + src/gallium/drivers/radeonsi/r600_resource.h | 1 + src/gallium/drivers/radeonsi/r600_texture.c| 1 + src/gallium/drivers/radeonsi/radeonsi_pipe.c | 9 +- src/gallium/drivers/radeonsi/radeonsi_pipe.h | 6 +- src/gallium/drivers/radeonsi/radeonsi_pm4.c| 7 + src/gallium/drivers/radeonsi/radeonsi_pm4.h| 2 + src/gallium/drivers/radeonsi/si_descriptors.c | 188 + src/gallium/drivers/radeonsi/si_state.c| 58 +++- src/gallium/drivers/radeonsi/si_state.h| 36 + 11 files changed, 305 insertions(+), 7 deletions(-) create mode 100644 src/gallium/drivers/radeonsi/si_descriptors.c diff --git a/src/gallium/drivers/radeonsi/Makefile.sources b/src/gallium/drivers/radeonsi/Makefile.sources index b3ffa72..68c8282 100644 --- a/src/gallium/drivers/radeonsi/Makefile.sources +++ b/src/gallium/drivers/radeonsi/Makefile.sources @@ -10,6 +10,7 @@ C_SOURCES := \ r600_translate.c \ radeonsi_pm4.c \ radeonsi_compute.c \ + si_descriptors.c \ si_state.c \ si_state_streamout.c \ si_state_draw.c \ diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c b/src/gallium/drivers/radeonsi/r600_hw_context.c index 7ed7496..b595477 100644 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c @@ -289,6 +289,9 @@ void si_context_flush(struct r600_context *ctx, unsigned flags) * next draw command */ si_pm4_reset_emitted(ctx); + + si_sampler_views_begin_new_cs(ctx, ctx-fmask_sampler_views[PIPE_SHADER_VERTEX]); + si_sampler_views_begin_new_cs(ctx, ctx-fmask_sampler_views[PIPE_SHADER_FRAGMENT]); } void si_context_emit_fence(struct r600_context *ctx, struct si_resource *fence_bo, unsigned offset, unsigned value) diff --git a/src/gallium/drivers/radeonsi/r600_resource.h b/src/gallium/drivers/radeonsi/r600_resource.h index e5dd36a..ab5c7b7 100644 --- a/src/gallium/drivers/radeonsi/r600_resource.h +++ b/src/gallium/drivers/radeonsi/r600_resource.h @@ -44,6 +44,7 @@ struct r600_fmask_info { unsigned offset; unsigned size; unsigned alignment; + unsigned pitch; unsigned bank_height; unsigned slice_tile_max; unsigned tile_mode_index; diff --git a/src/gallium/drivers/radeonsi/r600_texture.c b/src/gallium/drivers/radeonsi/r600_texture.c index cd3d1aa..b613564 100644 --- a/src/gallium/drivers/radeonsi/r600_texture.c +++ b/src/gallium/drivers/radeonsi/r600_texture.c @@ -463,6 +463,7 @@ static void r600_texture_get_fmask_info(struct r600_screen *rscreen,
Re: [Mesa-dev] [PATCH 15/20] radeonsi: add basic infrastructure for atom-based states
Am 08.08.2013 14:02, schrieb Marek Olšák: Why is the order so important? Note that cache flushes shouldn't be treated like states and neither should the draw packets and any state that comes from pipe_draw_info. The only things left are the register updates and resource updates where the order doesn't matter, does it? Unfortunately it does. In theory the CP context should prevent us from dealing with such details and allow us to emit the resource updates and register writes in any order, but AFAIK in practice we have some erratas on this (that reminds me to remind Alex to show you the documents on that). Anyway, the order is determined by the order of si_add_atom calls, which should be done once in create_context. Then why don't you just use a constant structure for this? Christian. Marek On Thu, Aug 8, 2013 at 10:32 AM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 02:20, schrieb Marek Olšák: It's the same as in r600g. Look how simple it is. That concept has the problem that we don't necessary know in which order the state is emitted. Why not just add an emit callback to si_pm4_state for the short term instead? For the long term we should probably teach the kernel interface to accept a bunch of pointers to smaller IB fragments instead of one large IB. That would allow us to not only save the copy of commands for the CSO states, but also allows us to emit optional IB fragments that are only inserted if the we had a context switch in between. Christian. --- src/gallium/drivers/radeonsi/r600_hw_context.c | 8 src/gallium/drivers/radeonsi/radeonsi_pipe.h | 9 + src/gallium/drivers/radeonsi/si_state.h| 10 ++ src/gallium/drivers/radeonsi/si_state_draw.c | 8 +++- 4 files changed, 34 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c b/src/gallium/drivers/radeonsi/r600_hw_context.c index 382382b..7ed7496 100644 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c @@ -114,9 +114,17 @@ err: void si_need_cs_space(struct r600_context *ctx, unsigned num_dw, boolean count_draw_in) { + int i; + /* The number of dwords we already used in the CS so far. */ num_dw += ctx-cs-cdw; + for (i = 0; i ctx-num_atoms; i++) { + if (ctx-atoms[i]-dirty) { + num_dw += ctx-atoms[i]-num_dw; + } + } + if (count_draw_in) { /* The number of dwords all the dirty states would take. */ num_dw += ctx-pm4_dirty_cdwords; diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.h b/src/gallium/drivers/radeonsi/radeonsi_pipe.h index e370149..5fa9bdc 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_pipe.h +++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.h @@ -145,6 +145,10 @@ struct r600_context { void*custom_blend_decompress; struct r600_screen *screen; struct radeon_winsys*ws; + + struct si_atom *atoms[SI_MAX_ATOMS]; + unsignednum_atoms; + struct si_vertex_element*vertex_elements; struct pipe_framebuffer_state framebuffer; unsignedfb_log_samples; @@ -329,4 +333,9 @@ static INLINE uint64_t r600_resource_va(struct pipe_screen *screen, struct pipe_ return rscreen-ws-buffer_get_virtual_address(rresource-cs_buf); } +static INLINE void si_add_atom(struct r600_context *rctx, struct si_atom *atom) +{ + rctx-atoms[rctx-num_atoms++] = atom; +} + #endif diff --git a/src/gallium/drivers/radeonsi/si_state.h b/src/gallium/drivers/radeonsi/si_state.h index b01fbf2..4aabdef 100644 --- a/src/gallium/drivers/radeonsi/si_state.h +++ b/src/gallium/drivers/radeonsi/si_state.h @@ -29,6 +29,16 @@ #include radeonsi_pm4.h +#define SI_MAX_ATOMS 2 + +/* This encapsulates a state or an operation which can emitted into the GPU + * command stream. */ +struct si_atom { + void (*emit)(struct r600_context *ctx, struct si_atom *state); + unsignednum_dw; + booldirty; +}; + struct si_state_blend { struct si_pm4_state pm4; uint32_tcb_target_mask; diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c index 4208fa7..bcae778 100644 --- a/src/gallium/drivers/radeonsi/si_state_draw.c +++ b/src/gallium/drivers/radeonsi/si_state_draw.c @@ -664,7 +664,7 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info) { struct r600_context *rctx = (struct r600_context *)ctx; struct pipe_index_buffer ib = {}; - uint32_t cp_coher_cntl; + uint32_t cp_coher_cntl, i; if (!info-count (info-indexed ||
Re: [Mesa-dev] [PATCH 16/20] radeonsi: add FMASK texture binding slots and resource setup
Am 08.08.2013 14:38, schrieb Marek Olšák: .On Thu, Aug 8, 2013 at 9:47 AM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 02:20, schrieb Marek Olšák: FMASK is bound as a separate texture. For every texture, there can be an FMASK. Therefore a separate array of resource slots has to be added. This adds a new mechanism for emitting resource descriptors, its features are: - resource descriptors are stored in an ordinary buffer (not in a CS) Having resource descriptors outside of the CS has two problems that we need to solve first: 1. Fine grained descriptor updates doesn't work, I already tried that. The problem is that unlike previous asics descriptors are now a memory block, so no longer part of the CP context. So when we (for example) have a draw command executing and the next draw command is using new resources for a specific slot we would either block until the first draw command is finished (which is bad for performance) or change the descriptors while they are still in use (which results in VM faults). So what would the proper solution be here? Do I need to flush some caches or would moving the descriptor updates to the constant IB fix that? Actually the current implementation worked better than anything else I tried. When you really need the resource descriptors in a separate buffer you need to use one buffer for each draw call and always write the full buffer contents (no partial updates). Flushing anything won't really help either. The only solution I see using one buffer is to block until the last draw call is finished with WAIT_REG_MEM, but that would be quite disastrous for performance. 2. If my understand is correct when they are embedded the descriptors are preloaded into the caches while executing the IB, so to archive the same speed with descriptors outside of the IB you need to add additional commands to the constant IB which is new to SI and we currently doesn't support in the CS interface. There seems to be support for the constant IB. The CS ioctl chunk ID is RADEON_CHUNK_ID_CONST_IB and the allowed packets are listed in si_vm_packet3_ce_check. Is there anything missing? The userspace side seems to be missing and except for throwing NOP packets into it we never tested it. I know from the closed source side that it actually was quite tricky for them to get working. Additional to that please note that I'm not 100% sure that just putting the descriptors into the IB is really helping here. It was just the most simplest solution to avoid allocating a new buffer on each draw call. Cheers, Christian. Marek Regards, Christian. - descriptors of disabled resources are set to zeros - fine-grained resource updates (it can update one resource slot while not touching the other slots) - updates are done with the WRITE_DATA packet - it implements the si_atom interface for packet emission - only used for FMASK textures right now The primary motivation for this is that FMASK textures naturally need fine-grained resource updates and I also need to query in the shader if a resource is NULL. --- src/gallium/drivers/radeonsi/Makefile.sources | 1 + src/gallium/drivers/radeonsi/r600_hw_context.c | 3 + src/gallium/drivers/radeonsi/r600_resource.h | 1 + src/gallium/drivers/radeonsi/r600_texture.c| 1 + src/gallium/drivers/radeonsi/radeonsi_pipe.c | 9 +- src/gallium/drivers/radeonsi/radeonsi_pipe.h | 6 +- src/gallium/drivers/radeonsi/radeonsi_pm4.c| 7 + src/gallium/drivers/radeonsi/radeonsi_pm4.h| 2 + src/gallium/drivers/radeonsi/si_descriptors.c | 188 + src/gallium/drivers/radeonsi/si_state.c| 58 +++- src/gallium/drivers/radeonsi/si_state.h| 36 + 11 files changed, 305 insertions(+), 7 deletions(-) create mode 100644 src/gallium/drivers/radeonsi/si_descriptors.c diff --git a/src/gallium/drivers/radeonsi/Makefile.sources b/src/gallium/drivers/radeonsi/Makefile.sources index b3ffa72..68c8282 100644 --- a/src/gallium/drivers/radeonsi/Makefile.sources +++ b/src/gallium/drivers/radeonsi/Makefile.sources @@ -10,6 +10,7 @@ C_SOURCES := \ r600_translate.c \ radeonsi_pm4.c \ radeonsi_compute.c \ + si_descriptors.c \ si_state.c \ si_state_streamout.c \ si_state_draw.c \ diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c b/src/gallium/drivers/radeonsi/r600_hw_context.c index 7ed7496..b595477 100644 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c @@ -289,6 +289,9 @@ void si_context_flush(struct r600_context *ctx, unsigned flags) * next draw command */ si_pm4_reset_emitted(ctx); + + si_sampler_views_begin_new_cs(ctx, ctx-fmask_sampler_views[PIPE_SHADER_VERTEX]); + si_sampler_views_begin_new_cs(ctx, ctx-fmask_sampler_views[PIPE_SHADER_FRAGMENT]); } void
Re: [Mesa-dev] [PATCH 15/20] radeonsi: add basic infrastructure for atom-based states
On Thu, Aug 8, 2013 at 2:55 PM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 14:02, schrieb Marek Olšák: Why is the order so important? Note that cache flushes shouldn't be treated like states and neither should the draw packets and any state that comes from pipe_draw_info. The only things left are the register updates and resource updates where the order doesn't matter, does it? Unfortunately it does. In theory the CP context should prevent us from dealing with such details and allow us to emit the resource updates and register writes in any order, but AFAIK in practice we have some erratas on this (that reminds me to remind Alex to show you the documents on that). Anyway, the order is determined by the order of si_add_atom calls, which should be done once in create_context. Then why don't you just use a constant structure for this? Yes, it indeed looks like a good idea to do so for clarity now that I know the order does matter. Marek Christian. Marek On Thu, Aug 8, 2013 at 10:32 AM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 02:20, schrieb Marek Olšák: It's the same as in r600g. Look how simple it is. That concept has the problem that we don't necessary know in which order the state is emitted. Why not just add an emit callback to si_pm4_state for the short term instead? For the long term we should probably teach the kernel interface to accept a bunch of pointers to smaller IB fragments instead of one large IB. That would allow us to not only save the copy of commands for the CSO states, but also allows us to emit optional IB fragments that are only inserted if the we had a context switch in between. Christian. --- src/gallium/drivers/radeonsi/r600_hw_context.c | 8 src/gallium/drivers/radeonsi/radeonsi_pipe.h | 9 + src/gallium/drivers/radeonsi/si_state.h| 10 ++ src/gallium/drivers/radeonsi/si_state_draw.c | 8 +++- 4 files changed, 34 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c b/src/gallium/drivers/radeonsi/r600_hw_context.c index 382382b..7ed7496 100644 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c @@ -114,9 +114,17 @@ err: void si_need_cs_space(struct r600_context *ctx, unsigned num_dw, boolean count_draw_in) { + int i; + /* The number of dwords we already used in the CS so far. */ num_dw += ctx-cs-cdw; + for (i = 0; i ctx-num_atoms; i++) { + if (ctx-atoms[i]-dirty) { + num_dw += ctx-atoms[i]-num_dw; + } + } + if (count_draw_in) { /* The number of dwords all the dirty states would take. */ num_dw += ctx-pm4_dirty_cdwords; diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.h b/src/gallium/drivers/radeonsi/radeonsi_pipe.h index e370149..5fa9bdc 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_pipe.h +++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.h @@ -145,6 +145,10 @@ struct r600_context { void*custom_blend_decompress; struct r600_screen *screen; struct radeon_winsys*ws; + + struct si_atom *atoms[SI_MAX_ATOMS]; + unsignednum_atoms; + struct si_vertex_element*vertex_elements; struct pipe_framebuffer_state framebuffer; unsignedfb_log_samples; @@ -329,4 +333,9 @@ static INLINE uint64_t r600_resource_va(struct pipe_screen *screen, struct pipe_ return rscreen-ws-buffer_get_virtual_address(rresource-cs_buf); } +static INLINE void si_add_atom(struct r600_context *rctx, struct si_atom *atom) +{ + rctx-atoms[rctx-num_atoms++] = atom; +} + #endif diff --git a/src/gallium/drivers/radeonsi/si_state.h b/src/gallium/drivers/radeonsi/si_state.h index b01fbf2..4aabdef 100644 --- a/src/gallium/drivers/radeonsi/si_state.h +++ b/src/gallium/drivers/radeonsi/si_state.h @@ -29,6 +29,16 @@ #include radeonsi_pm4.h +#define SI_MAX_ATOMS 2 + +/* This encapsulates a state or an operation which can emitted into the GPU + * command stream. */ +struct si_atom { + void (*emit)(struct r600_context *ctx, struct si_atom *state); + unsignednum_dw; + booldirty; +}; + struct si_state_blend { struct si_pm4_state pm4; uint32_tcb_target_mask; diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c index 4208fa7..bcae778 100644 --- a/src/gallium/drivers/radeonsi/si_state_draw.c +++ b/src/gallium/drivers/radeonsi/si_state_draw.c @@ -664,7 +664,7 @@ void si_draw_vbo(struct
Re: [Mesa-dev] [PATCH 12/20] radeonsi: implement MSAA colorbuffer compression for rendering
On Don, 2013-08-08 at 13:44 +0200, Marek Olšák wrote: Do you mean the surface_sync pm4 or the one in si_state? I mean this: diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c b/src/gallium/drivers/radeonsi/r600_hw_context.c index 25c972b..d9fc151 100644 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c @@ -176,6 +176,7 @@ static void r600_flush_framebuffer(struct r600_context *ctx) S_0085F0_CB7_DEST_BASE_ENA(1) | S_0085F0_DB_ACTION_ENA(1) | S_0085F0_DB_DEST_BASE_ENA(1)); + si_cmd_flush_and_inv_cb_meta(pm4); si_pm4_emit(ctx, pm4); si_pm4_free_state(ctx, pm4, ~0); On Thu, Aug 8, 2013 at 10:31 AM, Michel Dänzer mic...@daenzer.net wrote: On Don, 2013-08-08 at 02:20 +0200, Marek Olšák wrote: diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c b/src/gallium/drivers/radeonsi/r600_hw_context.c index 25c972b..382382b 100644 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c @@ -179,6 +179,17 @@ static void r600_flush_framebuffer(struct r600_context *ctx) si_pm4_emit(ctx, pm4); si_pm4_free_state(ctx, pm4, ~0); + /* flush CB_META */ + pm4 = si_pm4_alloc_state(ctx); + + if (pm4 == NULL) + return; + + si_cmd_flush_and_inv_cb_meta(pm4); + si_pm4_emit(ctx, pm4); + si_pm4_free_state(ctx, pm4, ~0); + ctx-flush_and_inv_cb_meta = false; + ctx-flags = ~R600_CONTEXT_DST_CACHES_DIRTY; } Can't you add to the existing pm4 here, instead of allocating a new one? -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders
On Don, 2013-08-08 at 13:51 +0200, Marek Olšák wrote: I'll test LLVM master, but we should probably bump the LLVM version requirement in configure.ac to prevent users from using LLVM 3.3. That might be a good idea for the Mesa release after 9.2 anyway, but another possibility might be to only enable features requiring LLVM changes when Mesa is built against LLVM 3.4 or newer (there are already some examples of that, guarded by #if HAVE_LLVM = 0x0304), and fixing any problems with running against LLVM 3.3. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 16/20] radeonsi: add FMASK texture binding slots and resource setup
On Thu, Aug 8, 2013 at 3:09 PM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 14:38, schrieb Marek Olšák: .On Thu, Aug 8, 2013 at 9:47 AM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 02:20, schrieb Marek Olšák: FMASK is bound as a separate texture. For every texture, there can be an FMASK. Therefore a separate array of resource slots has to be added. This adds a new mechanism for emitting resource descriptors, its features are: - resource descriptors are stored in an ordinary buffer (not in a CS) Having resource descriptors outside of the CS has two problems that we need to solve first: 1. Fine grained descriptor updates doesn't work, I already tried that. The problem is that unlike previous asics descriptors are now a memory block, so no longer part of the CP context. So when we (for example) have a draw command executing and the next draw command is using new resources for a specific slot we would either block until the first draw command is finished (which is bad for performance) or change the descriptors while they are still in use (which results in VM faults). So what would the proper solution be here? Do I need to flush some caches or would moving the descriptor updates to the constant IB fix that? Actually the current implementation worked better than anything else I tried. When you really need the resource descriptors in a separate buffer you need to use one buffer for each draw call and always write the full buffer contents (no partial updates). Flushing anything won't really help either. The only solution I see using one buffer is to block until the last draw call is finished with WAIT_REG_MEM, but that would be quite disastrous for performance. 2. If my understand is correct when they are embedded the descriptors are preloaded into the caches while executing the IB, so to archive the same speed with descriptors outside of the IB you need to add additional commands to the constant IB which is new to SI and we currently doesn't support in the CS interface. There seems to be support for the constant IB. The CS ioctl chunk ID is RADEON_CHUNK_ID_CONST_IB and the allowed packets are listed in si_vm_packet3_ce_check. Is there anything missing? The userspace side seems to be missing and except for throwing NOP packets into it we never tested it. I know from the closed source side that it actually was quite tricky for them to get working. Additional to that please note that I'm not 100% sure that just putting the descriptors into the IB is really helping here. It was just the most simplest solution to avoid allocating a new buffer on each draw call. I understand. I don't really need to have resource descriptors in a separate buffer, all I need is these 3 basic features a gallium driver should support: - fine-grained resource updates (mainly for performance, see below) - ability to unbind resources (e.g. by setting IMG_RSRC_WORD1 to 0) - no GPU crash if a shader is using SAMPLER[15] but there are no samplers bound FYI, partial sampler view and sampler state updates are coming to gallium, Brian Paul already has some patches, it's just a matter of time now. Vertex and constant buffer states already support partial updates. Marek Cheers, Christian. Marek Regards, Christian. - descriptors of disabled resources are set to zeros - fine-grained resource updates (it can update one resource slot while not touching the other slots) - updates are done with the WRITE_DATA packet - it implements the si_atom interface for packet emission - only used for FMASK textures right now The primary motivation for this is that FMASK textures naturally need fine-grained resource updates and I also need to query in the shader if a resource is NULL. --- src/gallium/drivers/radeonsi/Makefile.sources | 1 + src/gallium/drivers/radeonsi/r600_hw_context.c | 3 + src/gallium/drivers/radeonsi/r600_resource.h | 1 + src/gallium/drivers/radeonsi/r600_texture.c| 1 + src/gallium/drivers/radeonsi/radeonsi_pipe.c | 9 +- src/gallium/drivers/radeonsi/radeonsi_pipe.h | 6 +- src/gallium/drivers/radeonsi/radeonsi_pm4.c| 7 + src/gallium/drivers/radeonsi/radeonsi_pm4.h| 2 + src/gallium/drivers/radeonsi/si_descriptors.c | 188 + src/gallium/drivers/radeonsi/si_state.c| 58 +++- src/gallium/drivers/radeonsi/si_state.h| 36 + 11 files changed, 305 insertions(+), 7 deletions(-) create mode 100644 src/gallium/drivers/radeonsi/si_descriptors.c diff --git a/src/gallium/drivers/radeonsi/Makefile.sources b/src/gallium/drivers/radeonsi/Makefile.sources index b3ffa72..68c8282 100644 --- a/src/gallium/drivers/radeonsi/Makefile.sources +++ b/src/gallium/drivers/radeonsi/Makefile.sources @@ -10,6 +10,7 @@ C_SOURCES := \ r600_translate.c \ radeonsi_pm4.c \
Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders
On Thu, Aug 08, 2013 at 01:51:39PM +0200, Marek Olšák wrote: Interleaving might not be a good idea, but they could be in the same array, like this: 0..15: textures 16..31: FMASK textures I'll test LLVM master, but we should probably bump the LLVM version requirement in configure.ac to prevent users from using LLVM 3.3. What is it that works with LLVM master, but doesn't work with LLVM 3.3? -Tom Marek On Thu, Aug 8, 2013 at 10:37 AM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 10:30, schrieb Michel Dänzer: On Don, 2013-08-08 at 02:20 +0200, Marek Olšák wrote: diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c index 746ace6..4208fa7 100644 --- a/src/gallium/drivers/radeonsi/si_state_draw.c +++ b/src/gallium/drivers/radeonsi/si_state_draw.c @@ -241,6 +241,7 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s /* Last 2 reserved SGPRs are used for VCC */ num_sgprs = num_user_sgprs + 2; } + num_sgprs += 1; /* XXX this fixes VM faults */ assert(num_sgprs = 104); You can't just increment num_sgprs unconditionally here, or the assertion will spuriously fail again when num_sgprs was 104 to begin with. More fundamentally, I'd really like us to understand why / under what circumstances the value needs to be incremented at all. You mentioned on IRC that you were using an LLVM 3.3 tree, is this change still necessary with LLVM master / trunk? If yes, it would be good to look into at least some example shaders where the change is really necessary, to try and understand the circumstances. Additional to that I'm not sure if we should add another sgpr pointer to the shader, the docs speak of quite a performance penalty for each. Since they are (optional) attributes to the textures anyway they could probably be interleaved with the texture resources. Christian. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders
On Thu, Aug 08, 2013 at 02:20:54AM +0200, Marek Olšák wrote: --- src/gallium/drivers/radeonsi/radeonsi_shader.c | 7 ++-- src/gallium/drivers/radeonsi/radeonsi_shader.h | 58 ++ src/gallium/drivers/radeonsi/si_state_draw.c | 1 + 3 files changed, 36 insertions(+), 30 deletions(-) diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c b/src/gallium/drivers/radeonsi/radeonsi_shader.c index 18dde61..2806045 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_shader.c +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c @@ -1209,7 +1209,7 @@ static void create_function(struct si_shader_context *si_shader_ctx) { struct lp_build_tgsi_context *bld_base = si_shader_ctx-radeon_bld.soa.bld_base; struct gallivm_state *gallivm = bld_base-base.gallivm; - LLVMTypeRef params[20], f32, i8, i32, v2i32, v3i32; + LLVMTypeRef params[SI_MAX_PARAMS], f32, i8, i32, v2i32, v3i32; unsigned i; i8 = LLVMInt8TypeInContext(gallivm-context); @@ -1221,6 +1221,7 @@ static void create_function(struct si_shader_context *si_shader_ctx) params[SI_PARAM_CONST] = LLVMPointerType(LLVMVectorType(i8, 16), CONST_ADDR_SPACE); params[SI_PARAM_SAMPLER] = params[SI_PARAM_CONST]; params[SI_PARAM_RESOURCE] = LLVMPointerType(LLVMVectorType(i8, 32), CONST_ADDR_SPACE); + params[SI_PARAM_FMASK_RESOURCE] = params[SI_PARAM_RESOURCE]; if (si_shader_ctx-type == TGSI_PROCESSOR_VERTEX) { params[SI_PARAM_VERTEX_BUFFER] = params[SI_PARAM_SAMPLER]; @@ -1229,7 +1230,7 @@ static void create_function(struct si_shader_context *si_shader_ctx) params[SI_PARAM_DUMMY_0] = i32; params[SI_PARAM_DUMMY_1] = i32; params[SI_PARAM_INSTANCE_ID] = i32; - radeon_llvm_create_func(si_shader_ctx-radeon_bld, params, 9); + radeon_llvm_create_func(si_shader_ctx-radeon_bld, params, 10); } else { params[SI_PARAM_PRIM_MASK] = i32; @@ -1249,7 +1250,7 @@ static void create_function(struct si_shader_context *si_shader_ctx) params[SI_PARAM_ANCILLARY] = f32; params[SI_PARAM_SAMPLE_COVERAGE] = f32; params[SI_PARAM_POS_FIXED_PT] = f32; - radeon_llvm_create_func(si_shader_ctx-radeon_bld, params, 20); + radeon_llvm_create_func(si_shader_ctx-radeon_bld, params, 21); } radeon_llvm_shader_type(si_shader_ctx-radeon_bld.main_fn, si_shader_ctx-type); diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.h b/src/gallium/drivers/radeonsi/radeonsi_shader.h index 2ce34b9..836b144 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_shader.h +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.h @@ -34,43 +34,47 @@ #define SI_SGPR_CONST0 #define SI_SGPR_SAMPLER 2 #define SI_SGPR_RESOURCE 4 -#define SI_SGPR_VERTEX_BUFFER6 -#define SI_SGPR_START_INSTANCE 8 +#define SI_SGPR_FMASK_RESOURCE 6 +#define SI_SGPR_VERTEX_BUFFER8 +#define SI_SGPR_START_INSTANCE 10 -#define SI_VS_NUM_USER_SGPR 9 -#define SI_PS_NUM_USER_SGPR 6 +#define SI_VS_NUM_USER_SGPR 11 +#define SI_PS_NUM_USER_SGPR 8 /* LLVM function parameter indices */ #define SI_PARAM_CONST 0 #define SI_PARAM_SAMPLER 1 #define SI_PARAM_RESOURCE2 +#define SI_PARAM_FMASK_RESOURCE 3 /* VS only parameters */ -#define SI_PARAM_VERTEX_BUFFER 3 -#define SI_PARAM_START_INSTANCE 4 -#define SI_PARAM_VERTEX_ID 5 -#define SI_PARAM_DUMMY_0 6 -#define SI_PARAM_DUMMY_1 7 -#define SI_PARAM_INSTANCE_ID 8 +#define SI_PARAM_VERTEX_BUFFER 4 +#define SI_PARAM_START_INSTANCE 5 +#define SI_PARAM_VERTEX_ID 6 +#define SI_PARAM_DUMMY_0 7 +#define SI_PARAM_DUMMY_1 8 +#define SI_PARAM_INSTANCE_ID 9 /* PS only parameters */ -#define SI_PARAM_PRIM_MASK 3 -#define SI_PARAM_PERSP_SAMPLE4 -#define SI_PARAM_PERSP_CENTER5 -#define SI_PARAM_PERSP_CENTROID 6 -#define SI_PARAM_PERSP_PULL_MODEL7 -#define SI_PARAM_LINEAR_SAMPLE 8 -#define SI_PARAM_LINEAR_CENTER 9 -#define SI_PARAM_LINEAR_CENTROID 10 -#define SI_PARAM_LINE_STIPPLE_TEX11 -#define SI_PARAM_POS_X_FLOAT 12 -#define SI_PARAM_POS_Y_FLOAT 13 -#define SI_PARAM_POS_Z_FLOAT 14 -#define SI_PARAM_POS_W_FLOAT 15 -#define SI_PARAM_FRONT_FACE 16 -#define SI_PARAM_ANCILLARY 17 -#define SI_PARAM_SAMPLE_COVERAGE 18 -#define SI_PARAM_POS_FIXED_PT19 +#define SI_PARAM_PRIM_MASK 4 +#define SI_PARAM_PERSP_SAMPLE5 +#define SI_PARAM_PERSP_CENTER6 +#define SI_PARAM_PERSP_CENTROID 7 +#define SI_PARAM_PERSP_PULL_MODEL8 +#define SI_PARAM_LINEAR_SAMPLE 9 +#define
Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders
On Thu, Aug 08, 2013 at 02:20:54AM +0200, Marek Olšák wrote: --- src/gallium/drivers/radeonsi/radeonsi_shader.c | 7 ++-- src/gallium/drivers/radeonsi/radeonsi_shader.h | 58 ++ src/gallium/drivers/radeonsi/si_state_draw.c | 1 + 3 files changed, 36 insertions(+), 30 deletions(-) diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c b/src/gallium/drivers/radeonsi/radeonsi_shader.c index 18dde61..2806045 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_shader.c +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c @@ -1209,7 +1209,7 @@ static void create_function(struct si_shader_context *si_shader_ctx) { struct lp_build_tgsi_context *bld_base = si_shader_ctx-radeon_bld.soa.bld_base; struct gallivm_state *gallivm = bld_base-base.gallivm; - LLVMTypeRef params[20], f32, i8, i32, v2i32, v3i32; + LLVMTypeRef params[SI_MAX_PARAMS], f32, i8, i32, v2i32, v3i32; unsigned i; i8 = LLVMInt8TypeInContext(gallivm-context); @@ -1221,6 +1221,7 @@ static void create_function(struct si_shader_context *si_shader_ctx) params[SI_PARAM_CONST] = LLVMPointerType(LLVMVectorType(i8, 16), CONST_ADDR_SPACE); params[SI_PARAM_SAMPLER] = params[SI_PARAM_CONST]; params[SI_PARAM_RESOURCE] = LLVMPointerType(LLVMVectorType(i8, 32), CONST_ADDR_SPACE); + params[SI_PARAM_FMASK_RESOURCE] = params[SI_PARAM_RESOURCE]; if (si_shader_ctx-type == TGSI_PROCESSOR_VERTEX) { params[SI_PARAM_VERTEX_BUFFER] = params[SI_PARAM_SAMPLER]; @@ -1229,7 +1230,7 @@ static void create_function(struct si_shader_context *si_shader_ctx) params[SI_PARAM_DUMMY_0] = i32; params[SI_PARAM_DUMMY_1] = i32; params[SI_PARAM_INSTANCE_ID] = i32; - radeon_llvm_create_func(si_shader_ctx-radeon_bld, params, 9); + radeon_llvm_create_func(si_shader_ctx-radeon_bld, params, 10); } else { params[SI_PARAM_PRIM_MASK] = i32; @@ -1249,7 +1250,7 @@ static void create_function(struct si_shader_context *si_shader_ctx) params[SI_PARAM_ANCILLARY] = f32; params[SI_PARAM_SAMPLE_COVERAGE] = f32; params[SI_PARAM_POS_FIXED_PT] = f32; - radeon_llvm_create_func(si_shader_ctx-radeon_bld, params, 20); + radeon_llvm_create_func(si_shader_ctx-radeon_bld, params, 21); } radeon_llvm_shader_type(si_shader_ctx-radeon_bld.main_fn, si_shader_ctx-type); diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.h b/src/gallium/drivers/radeonsi/radeonsi_shader.h index 2ce34b9..836b144 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_shader.h +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.h @@ -34,43 +34,47 @@ #define SI_SGPR_CONST0 #define SI_SGPR_SAMPLER 2 #define SI_SGPR_RESOURCE 4 -#define SI_SGPR_VERTEX_BUFFER6 -#define SI_SGPR_START_INSTANCE 8 +#define SI_SGPR_FMASK_RESOURCE 6 +#define SI_SGPR_VERTEX_BUFFER8 +#define SI_SGPR_START_INSTANCE 10 -#define SI_VS_NUM_USER_SGPR 9 -#define SI_PS_NUM_USER_SGPR 6 +#define SI_VS_NUM_USER_SGPR 11 +#define SI_PS_NUM_USER_SGPR 8 /* LLVM function parameter indices */ #define SI_PARAM_CONST 0 #define SI_PARAM_SAMPLER 1 #define SI_PARAM_RESOURCE2 +#define SI_PARAM_FMASK_RESOURCE 3 /* VS only parameters */ -#define SI_PARAM_VERTEX_BUFFER 3 -#define SI_PARAM_START_INSTANCE 4 -#define SI_PARAM_VERTEX_ID 5 -#define SI_PARAM_DUMMY_0 6 -#define SI_PARAM_DUMMY_1 7 -#define SI_PARAM_INSTANCE_ID 8 +#define SI_PARAM_VERTEX_BUFFER 4 +#define SI_PARAM_START_INSTANCE 5 +#define SI_PARAM_VERTEX_ID 6 +#define SI_PARAM_DUMMY_0 7 +#define SI_PARAM_DUMMY_1 8 +#define SI_PARAM_INSTANCE_ID 9 /* PS only parameters */ -#define SI_PARAM_PRIM_MASK 3 -#define SI_PARAM_PERSP_SAMPLE4 -#define SI_PARAM_PERSP_CENTER5 -#define SI_PARAM_PERSP_CENTROID 6 -#define SI_PARAM_PERSP_PULL_MODEL7 -#define SI_PARAM_LINEAR_SAMPLE 8 -#define SI_PARAM_LINEAR_CENTER 9 -#define SI_PARAM_LINEAR_CENTROID 10 -#define SI_PARAM_LINE_STIPPLE_TEX11 -#define SI_PARAM_POS_X_FLOAT 12 -#define SI_PARAM_POS_Y_FLOAT 13 -#define SI_PARAM_POS_Z_FLOAT 14 -#define SI_PARAM_POS_W_FLOAT 15 -#define SI_PARAM_FRONT_FACE 16 -#define SI_PARAM_ANCILLARY 17 -#define SI_PARAM_SAMPLE_COVERAGE 18 -#define SI_PARAM_POS_FIXED_PT19 +#define SI_PARAM_PRIM_MASK 4 +#define SI_PARAM_PERSP_SAMPLE5 +#define SI_PARAM_PERSP_CENTER6 +#define SI_PARAM_PERSP_CENTROID 7 +#define SI_PARAM_PERSP_PULL_MODEL8 +#define SI_PARAM_LINEAR_SAMPLE 9 +#define
Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders
On Don, 2013-08-08 at 08:02 -0700, Tom Stellard wrote: On Thu, Aug 08, 2013 at 01:51:39PM +0200, Marek Olšák wrote: Interleaving might not be a good idea, but they could be in the same array, like this: 0..15: textures 16..31: FMASK textures I'll test LLVM master, but we should probably bump the LLVM version requirement in configure.ac to prevent users from using LLVM 3.3. What is it that works with LLVM master, but doesn't work with LLVM 3.3? Apart from MSAA, at least (looking at #if HAVE_LLVM = 0x0304 guards in radeonsi_shader.c) TGSI_OPCODE_DDX/Y and TGSI_OPCODE_TXD. So, I do wonder if it's worth allowing the R600 backend from LLVM 3.3 for the Mesa release after 9.2? -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] gallivm: set non-existing values really to zero in size queries for d3d10
From: Roland Scheidegger srol...@vmware.com My previous attempt at doing so double-failed miserably (minification of zero still gives one, and even if it would not the value was never written anyway). While here also rename the confusingly named int_vec bld as we have int vecs of different sizes, and rename need_nr_mips (as this also changes out-of-bounds behavior) to is_sviewinfo too. --- src/gallium/auxiliary/draw/draw_llvm_sample.c |4 +-- src/gallium/auxiliary/gallivm/lp_bld_sample.h |2 +- src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 34 ++--- src/gallium/drivers/llvmpipe/lp_tex_sample.c |4 +-- 4 files changed, 22 insertions(+), 22 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_llvm_sample.c b/src/gallium/auxiliary/draw/draw_llvm_sample.c index f10cba3..97b0255 100644 --- a/src/gallium/auxiliary/draw/draw_llvm_sample.c +++ b/src/gallium/auxiliary/draw/draw_llvm_sample.c @@ -271,7 +271,7 @@ draw_llvm_sampler_soa_emit_size_query(const struct lp_build_sampler_soa *base, struct lp_type type, unsigned texture_unit, unsigned target, - boolean need_nr_mips, + boolean is_sviewinfo, boolean scalar_lod, LLVMValueRef explicit_lod, /* optional */ LLVMValueRef *sizes_out) @@ -286,7 +286,7 @@ draw_llvm_sampler_soa_emit_size_query(const struct lp_build_sampler_soa *base, type, texture_unit, target, - need_nr_mips, + is_sviewinfo, scalar_lod, explicit_lod, sizes_out); diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.h b/src/gallium/auxiliary/gallivm/lp_bld_sample.h index db3ea1d..75e8c59 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample.h +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.h @@ -498,7 +498,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm, struct lp_type int_type, unsigned texture_unit, unsigned target, -boolean need_nr_mips, +boolean is_viewinfo, boolean scalar_lod, LLVMValueRef explicit_lod, LLVMValueRef *sizes_out); diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c index e403ac8..65d6e7b 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c @@ -1944,7 +1944,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm, struct lp_type int_type, unsigned texture_unit, unsigned target, -boolean need_nr_mips, +boolean is_sviewinfo, boolean scalar_lod, LLVMValueRef explicit_lod, LLVMValueRef *sizes_out) @@ -1954,7 +1954,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm, int dims, i; boolean has_array; unsigned num_lods = 1; - struct lp_build_context bld_int_vec; + struct lp_build_context bld_int_vec4; /* * Do some sanity verification about bound texture and shader dcl target. @@ -1997,24 +1997,19 @@ lp_build_size_query_soa(struct gallivm_state *gallivm, assert(!int_type.floating); - lp_build_context_init(bld_int_vec, gallivm, lp_type_int_vec(32, 128)); + lp_build_context_init(bld_int_vec4, gallivm, lp_type_int_vec(32, 128)); if (explicit_lod) { /* FIXME: this needs to honor per-element lod */ lod = LLVMBuildExtractElement(gallivm-builder, explicit_lod, lp_build_const_int32(gallivm, 0), ); first_level = dynamic_state-first_level(dynamic_state, gallivm, texture_unit); level = LLVMBuildAdd(gallivm-builder, lod, first_level, level); - lod = lp_build_broadcast_scalar(bld_int_vec, level); + lod = lp_build_broadcast_scalar(bld_int_vec4, level); } else { - lod = bld_int_vec.zero; + lod = bld_int_vec4.zero; } - if (need_nr_mips) { - size = bld_int_vec.zero; - } - else { - size = bld_int_vec.undef; - } + size = bld_int_vec4.undef; size = LLVMBuildInsertElement(gallivm-builder, size, dynamic_state-width(dynamic_state, gallivm, texture_unit), @@ -2032,7 +2027,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm, lp_build_const_int32(gallivm, 2), ); } - size =
Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders
On Don, 2013-08-08 at 08:00 -0700, Tom Stellard wrote: On Thu, Aug 08, 2013 at 02:20:54AM +0200, Marek Olšák wrote: diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c index 746ace6..4208fa7 100644 --- a/src/gallium/drivers/radeonsi/si_state_draw.c +++ b/src/gallium/drivers/radeonsi/si_state_draw.c @@ -241,6 +241,7 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s /* Last 2 reserved SGPRs are used for VCC */ num_sgprs = num_user_sgprs + 2; } + num_sgprs += 1; /* XXX this fixes VM faults */ One problem is that the compiler is under reporting the number of SGPRs, when there are unused USER_SGPRs in the shader. It should always be reporting a number greater than or equal to the number of USER_SGPRs. That itself shouldn't be a problem, as the radeonsi code ensures this. I think Michel mentioned this earlier, but there may also be a problem with the way we determine usage of the VCC register in the shader, maybe it is being used for more instructions than we realize. I was just worrying that AMDGPUAsmPrinter::EmitProgramInfoSI() might not take implicit access to VCC into account, but Christian confirmed it should. AFAICT we should be encoding any such implicit access correctly (except for V_DIV_SCALE_F* and V_DIV_FMAS_F*, which aren't used yet). -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 16/20] radeonsi: add FMASK texture binding slots and resource setup
Am 08.08.2013 16:33, schrieb Marek Olšák: On Thu, Aug 8, 2013 at 3:09 PM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 14:38, schrieb Marek Olšák: .On Thu, Aug 8, 2013 at 9:47 AM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 02:20, schrieb Marek Olšák: FMASK is bound as a separate texture. For every texture, there can be an FMASK. Therefore a separate array of resource slots has to be added. This adds a new mechanism for emitting resource descriptors, its features are: - resource descriptors are stored in an ordinary buffer (not in a CS) Having resource descriptors outside of the CS has two problems that we need to solve first: 1. Fine grained descriptor updates doesn't work, I already tried that. The problem is that unlike previous asics descriptors are now a memory block, so no longer part of the CP context. So when we (for example) have a draw command executing and the next draw command is using new resources for a specific slot we would either block until the first draw command is finished (which is bad for performance) or change the descriptors while they are still in use (which results in VM faults). So what would the proper solution be here? Do I need to flush some caches or would moving the descriptor updates to the constant IB fix that? Actually the current implementation worked better than anything else I tried. When you really need the resource descriptors in a separate buffer you need to use one buffer for each draw call and always write the full buffer contents (no partial updates). Flushing anything won't really help either.. The only solution I see using one buffer is to block until the last draw call is finished with WAIT_REG_MEM, but that would be quite disastrous for performance. 2. If my understand is correct when they are embedded the descriptors are preloaded into the caches while executing the IB, so to archive the same speed with descriptors outside of the IB you need to add additional commands to the constant IB which is new to SI and we currently doesn't support in the CS interface. There seems to be support for the constant IB. The CS ioctl chunk ID is RADEON_CHUNK_ID_CONST_IB and the allowed packets are listed in si_vm_packet3_ce_check. Is there anything missing? The userspace side seems to be missing and except for throwing NOP packets into it we never tested it. I know from the closed source side that it actually was quite tricky for them to get working. Additional to that please note that I'm not 100% sure that just putting the descriptors into the IB is really helping here. It was just the most simplest solution to avoid allocating a new buffer on each draw call. I understand. I don't really need to have resource descriptors in a separate buffer, all I need is these 3 basic features a gallium driver should support: - fine-grained resource updates (mainly for performance, see below) - ability to unbind resources (e.g. by setting IMG_RSRC_WORD1 to 0) - no GPU crash if a shader is using SAMPLER[15] but there are no samplers bound FYI, partial sampler view and sampler state updates are coming to gallium, Brian Paul already has some patches, it's just a matter of time now. Vertex and constant buffer states already support partial updates. That shouldn't be to much off a problem. Just allocate a state at startup and initialize it with the proper pm4 commands for 16 samplers, then update the resource descriptors in that state when we change the bound textures/samplers/views/constants/whatever. All we need to do then is setting the emitted state to NULL so that it gets re-emitted in the next draw command. Christian. Marek Cheers, Christian. Marek Regards, Christian. - descriptors of disabled resources are set to zeros - fine-grained resource updates (it can update one resource slot while not touching the other slots) - updates are done with the WRITE_DATA packet - it implements the si_atom interface for packet emission - only used for FMASK textures right now The primary motivation for this is that FMASK textures naturally need fine-grained resource updates and I also need to query in the shader if a resource is NULL. --- src/gallium/drivers/radeonsi/Makefile.sources | 1 + src/gallium/drivers/radeonsi/r600_hw_context.c | 3 + src/gallium/drivers/radeonsi/r600_resource.h | 1 + src/gallium/drivers/radeonsi/r600_texture.c| 1 + src/gallium/drivers/radeonsi/radeonsi_pipe.c | 9 +- src/gallium/drivers/radeonsi/radeonsi_pipe.h | 6 +- src/gallium/drivers/radeonsi/radeonsi_pm4.c| 7 + src/gallium/drivers/radeonsi/radeonsi_pm4.h| 2 + src/gallium/drivers/radeonsi/si_descriptors.c | 188 + src/gallium/drivers/radeonsi/si_state.c| 58 +++- src/gallium/drivers/radeonsi/si_state.h| 36 + 11 files changed, 305 insertions(+), 7 deletions(-) create mode 100644
[Mesa-dev] [Bug 67910] New: make check fail on lp_test_arit test
https://bugs.freedesktop.org/show_bug.cgi?id=67910 Priority: medium Bug ID: 67910 Assignee: mesa-dev@lists.freedesktop.org Summary: make check fail on lp_test_arit test Severity: normal Classification: Unclassified OS: Linux (All) Reporter: lordhea...@gmail.com Hardware: x86-64 (AMD64) Status: NEW Version: git Component: Mesa core Product: Mesa * llvm-3.4svn r187791 * mesa-git trunk 43076a55c2c914a619b8b3a3f93e15a7eee6d2ad --output of 'make check'-- make[5]: Entering directory `/home/lordh/archdevel/mesa/git/mesatst/src/gallium/drivers/llvmpipe' PASS: lp_test_format FAIL: lp_test_arit PASS: lp_test_blend PASS: lp_test_conv PASS: lp_test_printf make[6]: Entering directory `/home/lordh/archdevel/mesa/git/mesatst/src/gallium/drivers/llvmpipe' make[6]: Nothing to be done for `all'. make[6]: Leaving directory `/home/lordh/archdevel/mesa/git/mesatst/src/gallium/drivers/llvmpipe' Testsuite summary for Mesa 9.3.0-devel # TOTAL: 5 # PASS: 4 # SKIP: 0 # XFAIL: 0 # FAIL: 1 # XPASS: 0 # ERROR: 0 See src/gallium/drivers/llvmpipe/test-suite.log Please report to https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa make[5]: *** [test-suite.log] Error 1 make[5]: Leaving directory `/home/lordh/archdevel/mesa/git/mesatst/src/gallium/drivers/llvmpipe' --test-suite.log file-- === Mesa 9.3.0-devel: src/gallium/drivers/llvmpipe/test-suite.log === # TOTAL: 5 # PASS: 4 # SKIP: 0 # XFAIL: 0 # FAIL: 1 # XPASS: 0 # ERROR: 0 .. contents:: :depth: 2 FAIL: lp_test_arit == floor(-0): ref = -1, out = 0, precision = -0.00 bits, FAIL ceil(0): ref = 1, out = 0, precision = -0.00 bits, FAIL fract(-0): ref = 0.9994, out = -0, precision = -0.00 bits, FAIL -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 16/20] radeonsi: add FMASK texture binding slots and resource setup
On Thu, Aug 8, 2013 at 6:57 PM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 16:33, schrieb Marek Olšák: On Thu, Aug 8, 2013 at 3:09 PM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 14:38, schrieb Marek Olšák: .On Thu, Aug 8, 2013 at 9:47 AM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 02:20, schrieb Marek Olšák: FMASK is bound as a separate texture. For every texture, there can be an FMASK. Therefore a separate array of resource slots has to be added. This adds a new mechanism for emitting resource descriptors, its features are: - resource descriptors are stored in an ordinary buffer (not in a CS) Having resource descriptors outside of the CS has two problems that we need to solve first: 1. Fine grained descriptor updates doesn't work, I already tried that. The problem is that unlike previous asics descriptors are now a memory block, so no longer part of the CP context. So when we (for example) have a draw command executing and the next draw command is using new resources for a specific slot we would either block until the first draw command is finished (which is bad for performance) or change the descriptors while they are still in use (which results in VM faults). So what would the proper solution be here? Do I need to flush some caches or would moving the descriptor updates to the constant IB fix that? Actually the current implementation worked better than anything else I tried. When you really need the resource descriptors in a separate buffer you need to use one buffer for each draw call and always write the full buffer contents (no partial updates). Flushing anything won't really help either.. The only solution I see using one buffer is to block until the last draw call is finished with WAIT_REG_MEM, but that would be quite disastrous for performance. 2. If my understand is correct when they are embedded the descriptors are preloaded into the caches while executing the IB, so to archive the same speed with descriptors outside of the IB you need to add additional commands to the constant IB which is new to SI and we currently doesn't support in the CS interface. There seems to be support for the constant IB. The CS ioctl chunk ID is RADEON_CHUNK_ID_CONST_IB and the allowed packets are listed in si_vm_packet3_ce_check. Is there anything missing? The userspace side seems to be missing and except for throwing NOP packets into it we never tested it. I know from the closed source side that it actually was quite tricky for them to get working. Additional to that please note that I'm not 100% sure that just putting the descriptors into the IB is really helping here. It was just the most simplest solution to avoid allocating a new buffer on each draw call. I understand. I don't really need to have resource descriptors in a separate buffer, all I need is these 3 basic features a gallium driver should support: - fine-grained resource updates (mainly for performance, see below) - ability to unbind resources (e.g. by setting IMG_RSRC_WORD1 to 0) - no GPU crash if a shader is using SAMPLER[15] but there are no samplers bound FYI, partial sampler view and sampler state updates are coming to gallium, Brian Paul already has some patches, it's just a matter of time now. Vertex and constant buffer states already support partial updates. That shouldn't be to much off a problem. Just allocate a state at startup and initialize it with the proper pm4 commands for 16 samplers, then update the resource descriptors in that state when we change the bound textures/samplers/views/constants/whatever. All we need to do then is setting the emitted state to NULL so that it gets re-emitted in the next draw command. That would re-emit all 16 shader resources even if just one of them needs to be changed. I was trying to avoid this inefficiency. Is it really impossible to emit just one resource descriptor and keep the others unchanged? This is a basic D3D10/11 feature, for example: void ID3D11DeviceContext::VSSetShaderResources( [in] UINT StartSlot, [in] UINT NumViews, [in] ID3D11ShaderResourceView *const *ppShaderResourceViews ); If the constant engine is required to implement this interface efficiently, then I'd like to work on constant IB support. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 67910] make check fail on lp_test_arit test
https://bugs.freedesktop.org/show_bug.cgi?id=67910 Roland Scheidegger srol...@vmware.com changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |DUPLICATE --- Comment #1 from Roland Scheidegger srol...@vmware.com --- *** This bug has been marked as a duplicate of bug 67672 *** -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 67672] 9.2 git, Test failure in src/gallium/drivers/llvmpipe
https://bugs.freedesktop.org/show_bug.cgi?id=67672 Roland Scheidegger srol...@vmware.com changed: What|Removed |Added CC||lordhea...@gmail.com --- Comment #14 from Roland Scheidegger srol...@vmware.com --- *** Bug 67910 has been marked as a duplicate of this bug. *** -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Remember to call intel_prepare_render() before blitting.
On 08/06/2013 02:36 PM, Kenneth Graunke wrote: Otherwise, blits to the window system buffer may cause crashes, since dst_irb-mt may be NULL. This code is lifted straight out of brw_blorp_framebuffer()'s try_blorp_blit() helper. Fixes crashes in Piglit's fbo-sys-blit on systems without BLORP. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65919 Signed-off-by: Kenneth Graunke kenn...@whitecape.org Cc: 9.2 mesa-sta...@lists.freedesktop.org Reviewed-by: Ian Romanick ian.d.roman...@intel.com --- src/mesa/drivers/dri/i965/intel_fbo.c | 5 + 1 file changed, 5 insertions(+) diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c b/src/mesa/drivers/dri/i965/intel_fbo.c index fb0bb71..1692325 100644 --- a/src/mesa/drivers/dri/i965/intel_fbo.c +++ b/src/mesa/drivers/dri/i965/intel_fbo.c @@ -695,6 +695,11 @@ intel_blit_framebuffer_with_blitter(struct gl_context *ctx, { struct brw_context *brw = brw_context(ctx); + /* Sync up the state of window system buffers. We need to do this before +* we go looking for the buffers. +*/ + intel_prepare_render(brw); + if (mask GL_COLOR_BUFFER_BIT) { GLint i; const struct gl_framebuffer *drawFb = ctx-DrawBuffer; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Remember to call intel_prepare_render() before blitting.
On 08/06/2013 02:36 PM, Kenneth Graunke wrote: Otherwise, blits to the window system buffer may cause crashes, since dst_irb-mt may be NULL. This code is lifted straight out of brw_blorp_framebuffer()'s try_blorp_blit() helper. Fixes crashes in Piglit's fbo-sys-blit on systems without BLORP. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65919 Signed-off-by: Kenneth Graunke kenn...@whitecape.org Cc: 9.2 mesa-sta...@lists.freedesktop.org --- src/mesa/drivers/dri/i965/intel_fbo.c | 5 + 1 file changed, 5 insertions(+) Reviewed-by: Chad Versace chad.vers...@linux.intel.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders
On Thu, Aug 08, 2013 at 05:36:09PM +0200, Michel Dänzer wrote: On Don, 2013-08-08 at 08:00 -0700, Tom Stellard wrote: On Thu, Aug 08, 2013 at 02:20:54AM +0200, Marek Olšák wrote: diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c index 746ace6..4208fa7 100644 --- a/src/gallium/drivers/radeonsi/si_state_draw.c +++ b/src/gallium/drivers/radeonsi/si_state_draw.c @@ -241,6 +241,7 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s /* Last 2 reserved SGPRs are used for VCC */ num_sgprs = num_user_sgprs + 2; } + num_sgprs += 1; /* XXX this fixes VM faults */ One problem is that the compiler is under reporting the number of SGPRs, when there are unused USER_SGPRs in the shader. It should always be reporting a number greater than or equal to the number of USER_SGPRs. That itself shouldn't be a problem, as the radeonsi code ensures this. The way radeonsi checks for this on pixel shaders won't work in all cases. For example, if there are 8 USER_SGPR and the shader uses VCC, then the backend should report 10 SGPRs used. This value is eventually rounded up to the next multiple of 8, so what is actually stored in shader-num_sgprs is 16. Now lets say that the shader doesn't see 2 of the USER_SGPRs, it will report 8 SGPRs (6 USER_SGPRs + 2 for VCC). This value is rounded up to the nearest multiple of 8, which is still 8, and stored in shader-num_sgprs. Since the number of SGPRs reported matches the number of USER_SGPRs, the code in si_pipe_shader_ps() will accepted the reported number of SGPRs as valid, even though it isn't. -Tom I think Michel mentioned this earlier, but there may also be a problem with the way we determine usage of the VCC register in the shader, maybe it is being used for more instructions than we realize. I was just worrying that AMDGPUAsmPrinter::EmitProgramInfoSI() might not take implicit access to VCC into account, but Christian confirmed it should. AFAICT we should be encoding any such implicit access correctly (except for V_DIV_SCALE_F* and V_DIV_FMAS_F*, which aren't used yet). -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 67740] render2.c: In function '__indirect_glMap1d'
https://bugs.freedesktop.org/show_bug.cgi?id=67740 Ian Romanick i...@freedesktop.org changed: What|Removed |Added Status|NEW |NEEDINFO CC||cwo...@cworth.org, ||i...@freedesktop.org --- Comment #1 from Ian Romanick i...@freedesktop.org --- Are you building from GIT or from the 9.1.6 release tarball? Does this occur with 9.1.5? What options are you passing to configure? -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] dri: Choose a decent global driNConfigOptions.
On 08/08/2013 03:59 AM, Eric Anholt wrote: Previously, we were asserting that each driver specified an NConfigOptions exactly equal to the number of options they supplied, leading to frequent bugs when people would forget to adjust the value when adjusting driver options. Instead, just overallocate the table by a bit and leave sanity checking to the assert in findOption(). This makes sense. I thought part of the reason for __driNConfigOptions being exact was that it was used by external driconf GUI tools. Will this change break those tools? --- src/gallium/state_trackers/dri/common/dri_screen.c | 5 +-- src/mesa/drivers/dri/common/dri_util.c | 4 +- src/mesa/drivers/dri/common/xmlconfig.c| 44 -- src/mesa/drivers/dri/common/xmlconfig.h| 4 +- src/mesa/drivers/dri/i915/intel_screen.c | 5 +-- src/mesa/drivers/dri/i965/intel_screen.c | 5 +-- src/mesa/drivers/dri/radeon/radeon_screen.c| 5 +-- 7 files changed, 14 insertions(+), 58 deletions(-) diff --git a/src/gallium/state_trackers/dri/common/dri_screen.c b/src/gallium/state_trackers/dri/common/dri_screen.c index 3b42b5a..779741e 100644 --- a/src/gallium/state_trackers/dri/common/dri_screen.c +++ b/src/gallium/state_trackers/dri/common/dri_screen.c @@ -74,8 +74,6 @@ PUBLIC const char __driConfigOptions[] = #define false 0 -static const uint __driNConfigOptions = 13; - static const __DRIconfig ** dri_fill_in_modes(struct dri_screen *screen) { @@ -417,8 +415,7 @@ dri_init_screen_helper(struct dri_screen *screen, else screen-target = PIPE_TEXTURE_RECT; - driParseOptionInfo(screen-optionCacheDefaults, - __driConfigOptions, __driNConfigOptions); + driParseOptionInfo(screen-optionCacheDefaults, __driConfigOptions); driParseConfigFiles(screen-optionCache, screen-optionCacheDefaults, diff --git a/src/mesa/drivers/dri/common/dri_util.c b/src/mesa/drivers/dri/common/dri_util.c index 9ed9df4..fa520ea 100644 --- a/src/mesa/drivers/dri/common/dri_util.c +++ b/src/mesa/drivers/dri/common/dri_util.c @@ -52,8 +52,6 @@ PUBLIC const char __dri2ConfigOptions[] = DRI_CONF_SECTION_END DRI_CONF_END; -static const uint __dri2NConfigOptions = 1; - /*/ /** \name Screen handling functions */ /*/ @@ -112,7 +110,7 @@ dri2CreateNewScreen(int scrn, int fd, return NULL; } -driParseOptionInfo(psp-optionInfo, __dri2ConfigOptions, __dri2NConfigOptions); +driParseOptionInfo(psp-optionInfo, __dri2ConfigOptions); driParseConfigFiles(psp-optionCache, psp-optionInfo, psp-myNum, dri2); return psp; diff --git a/src/mesa/drivers/dri/common/xmlconfig.c b/src/mesa/drivers/dri/common/xmlconfig.c index 5c97c20..b95e452 100644 --- a/src/mesa/drivers/dri/common/xmlconfig.c +++ b/src/mesa/drivers/dri/common/xmlconfig.c @@ -132,16 +132,6 @@ static GLuint findOption (const driOptionCache *cache, const char *name) { return hash; } -/** \brief Count the real number of options in an option cache */ -static GLuint countOptions (const driOptionCache *cache) { -GLuint size = 1 cache-tableSize; -GLuint i, count = 0; -for (i = 0; i size; ++i) - if (cache-info[i].name) - count++; -return count; -} - /** \brief Like strdup but using malloc and with error checking. */ #define XSTRDUP(dest,source) do { \ GLuint len = strlen (source); \ @@ -685,25 +675,18 @@ static void optInfoEndElem (void *userData, const XML_Char *name) { } } -void driParseOptionInfo (driOptionCache *info, -const char *configOptions, GLuint nConfigOptions) { +void driParseOptionInfo (driOptionCache *info, const char *configOptions) { XML_Parser p; int status; struct OptInfoData userData; struct OptInfoData *data = userData; -GLuint realNoptions; - - /* determine hash table size and allocate memory: - * 3/2 of the number of options, rounded up, so there remains always - * at least one free entry. This is needed for detecting undefined - * options in configuration files without getting a hash table overflow. - * Round this up to a power of two. */ -GLuint minSize = (nConfigOptions*3 + 1) / 2; -GLuint size, log2size; -for (size = 1, log2size = 0; size minSize; size = 1, ++log2size); -info-tableSize = log2size; -info-info = calloc(size, sizeof (driOptionInfo)); -info-values = calloc(size, sizeof (driOptionValue)); + +/* Make the hash table big enough to fit more than the maximum number of + * config options we've ever seen in a driver. + */ +info-tableSize = 6; +info-info = calloc(1 info-tableSize, sizeof (driOptionInfo)); +info-values = calloc(1 info-tableSize, sizeof
Re: [Mesa-dev] [i965] i965/draw: Move constant formation outside of for loop and use an enum.
On Tue, Aug 6, 2013 at 12:31 PM, Ian Romanick i...@freedesktop.org wrote: On 08/06/2013 12:10 PM, mmueller wrote:Signed-off-by: mmueller markkmuel...@gmail.com --- src/mesa/drivers/dri/i965/brw_**draw.c | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/src/mesa/drivers/dri/i965/**brw_draw.c b/src/mesa/drivers/dri/i965/**brw_draw.c index 6170d07..e11d0d8 100644 --- a/src/mesa/drivers/dri/i965/**brw_draw.c +++ b/src/mesa/drivers/dri/i965/**brw_draw.c @@ -367,6 +367,15 @@ static bool brw_try_draw_prims( struct gl_context *ctx, bool retval = true; GLuint i; bool fail_next = false; + enum { + estimated_max_prim_size = + 512 + /* batchbuffer commands */ + ((BRW_MAX_TEX_UNIT * (sizeof(struct brw_sampler_state) + sizeof(struct gen5_sampler_default_color + + 1024 + /* gen6 VS push constants */ + 1024 + /* gen6 WM push constants */ + 512 /* misc. pad */ + }; + I think this would be better as 'const int estimated_max_prim_size = ...'. Using an enum is mostly the same (but also enforces that it's a compile-time constant), but it looks weird. :) OK. A new patch is on the way *sigh*, I knew this day was coming for the last 15 years, but I don't feel that was enough time to prepare. Now I'm forced to come out of the closet - I'm a lifelong member of PETA*. All too often I clearly communicate something to the compiler yet it still goes off and does something else. Before I started using enums to replace #defines and const PODs in my code, I too thought enums were weird and ugly. After time I came to realized that it was an effective means to get the compiler to do some useful things. After a while they don't look so weird. Obviously in this case it's of no concern. Cheers *PETA - Promoting Enums Through Awareness (www.PE.TA, not to be confused with People Eating Tasty Animals: PETA.org) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] dri: Choose a decent global driNConfigOptions.
Ian Romanick i...@freedesktop.org writes: On 08/08/2013 03:59 AM, Eric Anholt wrote: Previously, we were asserting that each driver specified an NConfigOptions exactly equal to the number of options they supplied, leading to frequent bugs when people would forget to adjust the value when adjusting driver options. Instead, just overallocate the table by a bit and leave sanity checking to the assert in findOption(). This makes sense. I thought part of the reason for __driNConfigOptions being exact was that it was used by external driconf GUI tools. Will this change break those tools? I spent an embarassingly long time trying to find how the external tool used the config symbols, only to find out that it doesn't. driconf calls out to xdriinfo, which does a glXGetProcAddress(glXGetDriverConfig), and parses the XML returned From calling the returned function. It's like a sane interface! (except for not checking for an extension string trying all this) pgpLliH2uKKgC.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 16/20] radeonsi: add FMASK texture binding slots and resource setup
On Thu, Aug 8, 2013 at 1:34 PM, Marek Olšák mar...@gmail.com wrote: On Thu, Aug 8, 2013 at 6:57 PM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 16:33, schrieb Marek Olšák: On Thu, Aug 8, 2013 at 3:09 PM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 14:38, schrieb Marek Olšák: .On Thu, Aug 8, 2013 at 9:47 AM, Christian König deathsim...@vodafone.de wrote: Am 08.08.2013 02:20, schrieb Marek Olšák: FMASK is bound as a separate texture. For every texture, there can be an FMASK. Therefore a separate array of resource slots has to be added. This adds a new mechanism for emitting resource descriptors, its features are: - resource descriptors are stored in an ordinary buffer (not in a CS) Having resource descriptors outside of the CS has two problems that we need to solve first: 1. Fine grained descriptor updates doesn't work, I already tried that. The problem is that unlike previous asics descriptors are now a memory block, so no longer part of the CP context. So when we (for example) have a draw command executing and the next draw command is using new resources for a specific slot we would either block until the first draw command is finished (which is bad for performance) or change the descriptors while they are still in use (which results in VM faults). So what would the proper solution be here? Do I need to flush some caches or would moving the descriptor updates to the constant IB fix that? Actually the current implementation worked better than anything else I tried. When you really need the resource descriptors in a separate buffer you need to use one buffer for each draw call and always write the full buffer contents (no partial updates). Flushing anything won't really help either.. The only solution I see using one buffer is to block until the last draw call is finished with WAIT_REG_MEM, but that would be quite disastrous for performance. 2. If my understand is correct when they are embedded the descriptors are preloaded into the caches while executing the IB, so to archive the same speed with descriptors outside of the IB you need to add additional commands to the constant IB which is new to SI and we currently doesn't support in the CS interface. There seems to be support for the constant IB. The CS ioctl chunk ID is RADEON_CHUNK_ID_CONST_IB and the allowed packets are listed in si_vm_packet3_ce_check. Is there anything missing? The userspace side seems to be missing and except for throwing NOP packets into it we never tested it. I know from the closed source side that it actually was quite tricky for them to get working. Additional to that please note that I'm not 100% sure that just putting the descriptors into the IB is really helping here. It was just the most simplest solution to avoid allocating a new buffer on each draw call. I understand. I don't really need to have resource descriptors in a separate buffer, all I need is these 3 basic features a gallium driver should support: - fine-grained resource updates (mainly for performance, see below) - ability to unbind resources (e.g. by setting IMG_RSRC_WORD1 to 0) - no GPU crash if a shader is using SAMPLER[15] but there are no samplers bound FYI, partial sampler view and sampler state updates are coming to gallium, Brian Paul already has some patches, it's just a matter of time now. Vertex and constant buffer states already support partial updates. That shouldn't be to much off a problem. Just allocate a state at startup and initialize it with the proper pm4 commands for 16 samplers, then update the resource descriptors in that state when we change the bound textures/samplers/views/constants/whatever. All we need to do then is setting the emitted state to NULL so that it gets re-emitted in the next draw command. That would re-emit all 16 shader resources even if just one of them needs to be changed. I was trying to avoid this inefficiency. Is it really impossible to emit just one resource descriptor and keep the others unchanged? This is a basic D3D10/11 feature, for example: void ID3D11DeviceContext::VSSetShaderResources( [in] UINT StartSlot, [in] UINT NumViews, [in] ID3D11ShaderResourceView *const *ppShaderResourceViews ); If the constant engine is required to implement this interface efficiently, then I'd like to work on constant IB support. You'll need to either store them in memory or re-emit them if you store them in the IB. The CE is mainly there so that it can prime the TC in parallel with the command stream processing. Alex ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] draw: cleanup the extra attribs
Before inserting new front face and prim id outputs cleanup the old extra outputs, otherwise our cache will use previous output slots which will break as soon as outputs of the current shader don't match the last. Signed-off-by: Zack Rusin za...@vmware.com --- src/gallium/auxiliary/draw/draw_context.c |1 + 1 file changed, 1 insertion(+) diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c index af9caee..2dc6772 100644 --- a/src/gallium/auxiliary/draw/draw_context.c +++ b/src/gallium/auxiliary/draw/draw_context.c @@ -555,6 +555,7 @@ draw_get_shader_info(const struct draw_context *draw) void draw_prepare_shader_outputs(struct draw_context *draw) { + draw_remove_extra_vertex_attribs(draw); draw_ia_prepare_outputs(draw, draw-pipeline.ia); draw_unfilled_prepare_outputs(draw, draw-pipeline.unfilled); } -- 1.7.10.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] draw: reset the vertex id when injecting new primitive id
Without reseting the vertex id, with primitives where the same vertex is used with different primitives (e.g. tri/lines strips) our vbuf module won't re-emit those vertices with the changed primitive id. So lets reset the vertex id whenever injecting new primitive id to make sure that the vertex data is correctly emitted. Signed-off-by: Zack Rusin za...@vmware.com --- src/gallium/auxiliary/draw/draw_pipe_ia.c |9 + 1 file changed, 9 insertions(+) diff --git a/src/gallium/auxiliary/draw/draw_pipe_ia.c b/src/gallium/auxiliary/draw/draw_pipe_ia.c index ecbb233..d64f19b 100644 --- a/src/gallium/auxiliary/draw/draw_pipe_ia.c +++ b/src/gallium/auxiliary/draw/draw_pipe_ia.c @@ -68,6 +68,15 @@ inject_primid(struct draw_stage *stage, for (i = 0; i num_verts; ++i) { struct vertex_header *v = header-v[i]; + /* We have to reset the vertex_id because it's used by + * vbuf to figure out if the vertex had already been + * emitted. For line/tri strips the first vertex of + * subsequent primitives would already be emitted, + * but since we're changing the primitive id on the vertex + * we want to make sure it's reemitted with the correct + * data. + */ + v-vertex_id = UNDEFINED_VERTEX_ID; memcpy(v-data[slot][0], primid, sizeof(primid)); memcpy(v-data[slot][1], primid, sizeof(primid)); memcpy(v-data[slot][2], primid, sizeof(primid)); -- 1.7.10.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] draw: rewrite primitive assembler
We can't be injecting the primitive id's in the pipeline because by that time the primitives have already been decomposed. To properly number the primitives we need to handle the adjacency primitives by hand. This patch moves the prim id injection into the original primitive assembler and completely removes the useless pipeline stage. Signed-off-by: Zack Rusin za...@vmware.com --- src/gallium/auxiliary/Makefile.sources |1 - src/gallium/auxiliary/draw/draw_context.c|8 +- src/gallium/auxiliary/draw/draw_pipe.c |4 - src/gallium/auxiliary/draw/draw_pipe.h |7 - src/gallium/auxiliary/draw/draw_pipe_ia.c| 259 -- src/gallium/auxiliary/draw/draw_pipe_validate.c | 14 -- src/gallium/auxiliary/draw/draw_prim_assembler.c | 168 +- src/gallium/auxiliary/draw/draw_prim_assembler.h | 12 + src/gallium/auxiliary/draw/draw_private.h|4 +- 9 files changed, 180 insertions(+), 297 deletions(-) delete mode 100644 src/gallium/auxiliary/draw/draw_pipe_ia.c diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources index b0172de..acbcef7 100644 --- a/src/gallium/auxiliary/Makefile.sources +++ b/src/gallium/auxiliary/Makefile.sources @@ -13,7 +13,6 @@ C_SOURCES := \ draw/draw_pipe_clip.c \ draw/draw_pipe_cull.c \ draw/draw_pipe_flatshade.c \ -draw/draw_pipe_ia.c \ draw/draw_pipe_offset.c \ draw/draw_pipe_pstipple.c \ draw/draw_pipe_stipple.c \ diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c index 2dc6772..2d4843e 100644 --- a/src/gallium/auxiliary/draw/draw_context.c +++ b/src/gallium/auxiliary/draw/draw_context.c @@ -40,6 +40,7 @@ #include util/u_prim.h #include draw_context.h #include draw_pipe.h +#include draw_prim_assembler.h #include draw_vs.h #include draw_gs.h @@ -95,6 +96,10 @@ draw_create_context(struct pipe_context *pipe, boolean try_llvm) if (!draw_init(draw)) goto err_destroy; + draw-ia = draw_prim_assembler_create(draw); + if (!draw-ia) + goto err_destroy; + return draw; err_destroy: @@ -206,6 +211,7 @@ void draw_destroy( struct draw_context *draw ) draw-render-destroy( draw-render ); */ + draw_prim_assembler_destroy(draw-ia); draw_pipeline_destroy( draw ); draw_pt_destroy( draw ); draw_vs_destroy( draw ); @@ -556,7 +562,7 @@ void draw_prepare_shader_outputs(struct draw_context *draw) { draw_remove_extra_vertex_attribs(draw); - draw_ia_prepare_outputs(draw, draw-pipeline.ia); + draw_prim_assembler_prepare_outputs(draw-ia); draw_unfilled_prepare_outputs(draw, draw-pipeline.unfilled); } diff --git a/src/gallium/auxiliary/draw/draw_pipe.c b/src/gallium/auxiliary/draw/draw_pipe.c index 8140299..f1ee6cb 100644 --- a/src/gallium/auxiliary/draw/draw_pipe.c +++ b/src/gallium/auxiliary/draw/draw_pipe.c @@ -49,7 +49,6 @@ boolean draw_pipeline_init( struct draw_context *draw ) draw-pipeline.clip = draw_clip_stage( draw ); draw-pipeline.flatshade = draw_flatshade_stage( draw ); draw-pipeline.cull = draw_cull_stage( draw ); - draw-pipeline.ia= draw_ia_stage( draw ); draw-pipeline.validate = draw_validate_stage( draw ); draw-pipeline.first = draw-pipeline.validate; @@ -62,7 +61,6 @@ boolean draw_pipeline_init( struct draw_context *draw ) !draw-pipeline.clip || !draw-pipeline.flatshade || !draw-pipeline.cull || - !draw-pipeline.ia || !draw-pipeline.validate) return FALSE; @@ -97,8 +95,6 @@ void draw_pipeline_destroy( struct draw_context *draw ) draw-pipeline.flatshade-destroy( draw-pipeline.flatshade ); if (draw-pipeline.cull) draw-pipeline.cull-destroy( draw-pipeline.cull ); - if (draw-pipeline.ia) - draw-pipeline.ia-destroy( draw-pipeline.ia ); if (draw-pipeline.validate) draw-pipeline.validate-destroy( draw-pipeline.validate ); if (draw-pipeline.aaline) diff --git a/src/gallium/auxiliary/draw/draw_pipe.h b/src/gallium/auxiliary/draw/draw_pipe.h index 70822a4..7c9ed6c 100644 --- a/src/gallium/auxiliary/draw/draw_pipe.h +++ b/src/gallium/auxiliary/draw/draw_pipe.h @@ -91,10 +91,6 @@ extern struct draw_stage *draw_stipple_stage( struct draw_context *context ); extern struct draw_stage *draw_wide_line_stage( struct draw_context *context ); extern struct draw_stage *draw_wide_point_stage( struct draw_context *context ); extern struct draw_stage *draw_validate_stage( struct draw_context *context ); -extern struct draw_stage *draw_ia_stage(struct draw_context *context); - -boolean draw_ia_stage_required(const struct draw_context *context, - unsigned prim); extern void draw_free_temp_verts( struct draw_stage *stage ); extern boolean draw_alloc_temp_verts( struct draw_stage *stage, unsigned nr ); @@ -108,9 +104,6 @@ void
Re: [Mesa-dev] [PATCH 2/3] draw: reset the vertex id when injecting new primitive id
Don't worry about this one too much. The next patch removes draw_pipe_ia.c anyway... - Original Message - Without reseting the vertex id, with primitives where the same vertex is used with different primitives (e.g. tri/lines strips) our vbuf module won't re-emit those vertices with the changed primitive id. So lets reset the vertex id whenever injecting new primitive id to make sure that the vertex data is correctly emitted. Signed-off-by: Zack Rusin za...@vmware.com --- src/gallium/auxiliary/draw/draw_pipe_ia.c |9 + 1 file changed, 9 insertions(+) diff --git a/src/gallium/auxiliary/draw/draw_pipe_ia.c b/src/gallium/auxiliary/draw/draw_pipe_ia.c index ecbb233..d64f19b 100644 --- a/src/gallium/auxiliary/draw/draw_pipe_ia.c +++ b/src/gallium/auxiliary/draw/draw_pipe_ia.c @@ -68,6 +68,15 @@ inject_primid(struct draw_stage *stage, for (i = 0; i num_verts; ++i) { struct vertex_header *v = header-v[i]; + /* We have to reset the vertex_id because it's used by + * vbuf to figure out if the vertex had already been + * emitted. For line/tri strips the first vertex of + * subsequent primitives would already be emitted, + * but since we're changing the primitive id on the vertex + * we want to make sure it's reemitted with the correct + * data. + */ + v-vertex_id = UNDEFINED_VERTEX_ID; memcpy(v-data[slot][0], primid, sizeof(primid)); memcpy(v-data[slot][1], primid, sizeof(primid)); memcpy(v-data[slot][2], primid, sizeof(primid)); -- 1.7.10.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] gallivm: use texture target from shader instead of static state for size query
Series looks good to me. Reviewed-by: Zack Rusin za...@vmware.com - Original Message - From: Roland Scheidegger srol...@vmware.com d3d10 has no notion of distinct array resources neither at the resource nor sampler view level. However, shader dcl of resources certainly has, and d3d10 expects resinfo to return the values according to that - in particular a resource might have been a 1d texture with some array layers, then the sampler view might have only used 1 layer so it can be accessed both as 1d or 1d array texture (I think - the former definitely works). resinfo of a resource decleared as array needs to return number of array layers but non-array resource needs to return 0 (and not 1). Hence fix this by passing the target from the shader decl to emit_size_query and use that (in case of OpenGL the target will come from the instruction itself). Could probably do the same for actual sampling, though it may not matter there (as the bogus components will essentially get clamped away), possibly could wreak havoc though if it REALLY doesn't match (which is of course an error but still). --- src/gallium/auxiliary/draw/draw_llvm_sample.c |2 + src/gallium/auxiliary/gallivm/lp_bld_sample.h |1 + src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 32 ++- src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |1 + src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 43 - src/gallium/drivers/llvmpipe/lp_tex_sample.c |2 + 6 files changed, 77 insertions(+), 4 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_llvm_sample.c b/src/gallium/auxiliary/draw/draw_llvm_sample.c index 3016d7c..f10cba3 100644 --- a/src/gallium/auxiliary/draw/draw_llvm_sample.c +++ b/src/gallium/auxiliary/draw/draw_llvm_sample.c @@ -270,6 +270,7 @@ draw_llvm_sampler_soa_emit_size_query(const struct lp_build_sampler_soa *base, struct gallivm_state *gallivm, struct lp_type type, unsigned texture_unit, + unsigned target, boolean need_nr_mips, boolean scalar_lod, LLVMValueRef explicit_lod, /* optional */ @@ -284,6 +285,7 @@ draw_llvm_sampler_soa_emit_size_query(const struct lp_build_sampler_soa *base, sampler-dynamic_state.base, type, texture_unit, + target, need_nr_mips, scalar_lod, explicit_lod, diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.h b/src/gallium/auxiliary/gallivm/lp_bld_sample.h index dff8be2..db3ea1d 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample.h +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.h @@ -497,6 +497,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm, struct lp_sampler_dynamic_state *dynamic_state, struct lp_type int_type, unsigned texture_unit, +unsigned target, boolean need_nr_mips, boolean scalar_lod, LLVMValueRef explicit_lod, diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c index b0bb58b..e403ac8 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c @@ -1943,6 +1943,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm, struct lp_sampler_dynamic_state *dynamic_state, struct lp_type int_type, unsigned texture_unit, +unsigned target, boolean need_nr_mips, boolean scalar_lod, LLVMValueRef explicit_lod, @@ -1955,9 +1956,36 @@ lp_build_size_query_soa(struct gallivm_state *gallivm, unsigned num_lods = 1; struct lp_build_context bld_int_vec; - dims = texture_dims(static_state-target); + /* +* Do some sanity verification about bound texture and shader dcl target. +* Not entirely sure what's possible but assume array/non-array +* always compatible (probably not ok for OpenGL but d3d10 has no +* distinction of arrays at the resource level). +* Everything else looks bogus (though not entirely sure about rect/2d). +* Currently disabled because it causes assertion failures if there's +* nothing bound (or rather a dummy texture, not that this case would +* return the right values). +*/ + if (0
Re: [Mesa-dev] [PATCH] gallivm: set non-existing values really to zero in size queries for d3d10
Looks good. Reviewed-by: Zack Rusin za...@vmware.com - Original Message - From: Roland Scheidegger srol...@vmware.com My previous attempt at doing so double-failed miserably (minification of zero still gives one, and even if it would not the value was never written anyway). While here also rename the confusingly named int_vec bld as we have int vecs of different sizes, and rename need_nr_mips (as this also changes out-of-bounds behavior) to is_sviewinfo too. --- src/gallium/auxiliary/draw/draw_llvm_sample.c |4 +-- src/gallium/auxiliary/gallivm/lp_bld_sample.h |2 +- src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 34 ++--- src/gallium/drivers/llvmpipe/lp_tex_sample.c |4 +-- 4 files changed, 22 insertions(+), 22 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_llvm_sample.c b/src/gallium/auxiliary/draw/draw_llvm_sample.c index f10cba3..97b0255 100644 --- a/src/gallium/auxiliary/draw/draw_llvm_sample.c +++ b/src/gallium/auxiliary/draw/draw_llvm_sample.c @@ -271,7 +271,7 @@ draw_llvm_sampler_soa_emit_size_query(const struct lp_build_sampler_soa *base, struct lp_type type, unsigned texture_unit, unsigned target, - boolean need_nr_mips, + boolean is_sviewinfo, boolean scalar_lod, LLVMValueRef explicit_lod, /* optional */ LLVMValueRef *sizes_out) @@ -286,7 +286,7 @@ draw_llvm_sampler_soa_emit_size_query(const struct lp_build_sampler_soa *base, type, texture_unit, target, - need_nr_mips, + is_sviewinfo, scalar_lod, explicit_lod, sizes_out); diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.h b/src/gallium/auxiliary/gallivm/lp_bld_sample.h index db3ea1d..75e8c59 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample.h +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.h @@ -498,7 +498,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm, struct lp_type int_type, unsigned texture_unit, unsigned target, -boolean need_nr_mips, +boolean is_viewinfo, boolean scalar_lod, LLVMValueRef explicit_lod, LLVMValueRef *sizes_out); diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c index e403ac8..65d6e7b 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c @@ -1944,7 +1944,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm, struct lp_type int_type, unsigned texture_unit, unsigned target, -boolean need_nr_mips, +boolean is_sviewinfo, boolean scalar_lod, LLVMValueRef explicit_lod, LLVMValueRef *sizes_out) @@ -1954,7 +1954,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm, int dims, i; boolean has_array; unsigned num_lods = 1; - struct lp_build_context bld_int_vec; + struct lp_build_context bld_int_vec4; /* * Do some sanity verification about bound texture and shader dcl target. @@ -1997,24 +1997,19 @@ lp_build_size_query_soa(struct gallivm_state *gallivm, assert(!int_type.floating); - lp_build_context_init(bld_int_vec, gallivm, lp_type_int_vec(32, 128)); + lp_build_context_init(bld_int_vec4, gallivm, lp_type_int_vec(32, 128)); if (explicit_lod) { /* FIXME: this needs to honor per-element lod */ lod = LLVMBuildExtractElement(gallivm-builder, explicit_lod, lp_build_const_int32(gallivm, 0), ); first_level = dynamic_state-first_level(dynamic_state, gallivm, texture_unit); level = LLVMBuildAdd(gallivm-builder, lod, first_level, level); - lod = lp_build_broadcast_scalar(bld_int_vec, level); + lod = lp_build_broadcast_scalar(bld_int_vec4, level); } else { - lod = bld_int_vec.zero; + lod = bld_int_vec4.zero; } - if (need_nr_mips) { - size = bld_int_vec.zero; - } - else { - size = bld_int_vec.undef; - } + size = bld_int_vec4.undef; size = LLVMBuildInsertElement(gallivm-builder, size,
[Mesa-dev] [i965][V2] i965/draw: Move constant formation outside of for loop and use an enum.
Signed-off-by: Mark Mueller markkmuel...@gmail.com --- src/mesa/drivers/dri/i965/brw_draw.c | 16 ++-- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index 6170d07..1b5ed55 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -367,6 +367,12 @@ static bool brw_try_draw_prims( struct gl_context *ctx, bool retval = true; GLuint i; bool fail_next = false; + const int estimated_max_prim_size = + 512 + /* batchbuffer commands */ + ((BRW_MAX_TEX_UNIT * (sizeof(struct brw_sampler_state) + sizeof(struct gen5_sampler_default_color + + 1024 + /* gen6 VS push constants */ + 1024 + /* gen6 WM push constants */ + 512; /* misc. pad */ if (ctx-NewState) _mesa_update_state( ctx ); @@ -405,16 +411,6 @@ static bool brw_try_draw_prims( struct gl_context *ctx, brw-state.dirty.brw |= BRW_NEW_VERTICES; for (i = 0; i nr_prims; i++) { - int estimated_max_prim_size; - - estimated_max_prim_size = 512; /* batchbuffer commands */ - estimated_max_prim_size += (BRW_MAX_TEX_UNIT * - (sizeof(struct brw_sampler_state) + - sizeof(struct gen5_sampler_default_color))); - estimated_max_prim_size += 1024; /* gen6 VS push constants */ - estimated_max_prim_size += 1024; /* gen6 WM push constants */ - estimated_max_prim_size += 512; /* misc. pad */ - /* Flush the batch if it's approaching full, so that we don't wrap while * we've got validated state that needs to be in the same batch as the * primitives. -- 1.8.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa: Enable LTO by default on i965/libdricore release builds.
We can't just smash it on globally due to (probably resolvable) issues with the asm in glapi. And we don't want to penalize developers with longer build times for their normal debug environment. Due to libdricore making almost all of our symbols public, the effect is very small -- cairo-gl with INTEL_NO_HW=1 shows -0.798709% +/- 0.333703% change in runtime (n=30). --- If we were to avoid dricore, there's an additional 5% improvement available (see the megadriver branch of my tree). configure.ac | 25 + src/mesa/Makefile.am | 4 ++-- src/mesa/drivers/dri/i965/Makefile.am | 1 + src/mesa/libdricore/Makefile.am | 8 +++- src/mesa/program/Makefile.am | 4 ++-- 5 files changed, 37 insertions(+), 5 deletions(-) diff --git a/configure.ac b/configure.ac index 62d06e0..26c230d 100644 --- a/configure.ac +++ b/configure.ac @@ -314,6 +314,7 @@ AC_ARG_ENABLE([debug], [enable_debug=$enableval], [enable_debug=no] ) +enable_lto=yes if test x$enable_debug = xyes; then DEFINES_FOR_BUILD=$DEFINES_FOR_BUILD -DDEBUG if test x$GCC_FOR_BUILD = xyes; then @@ -330,7 +331,31 @@ if test x$enable_debug = xyes; then if test x$GXX = xyes; then CXXFLAGS=$CXXFLAGS -g -O0 fi + +# Disable LTO by default on debug builds, since it's so expensive at +# compile time. +enable_lto=no +fi + +AC_ARG_ENABLE([lto], +[AS_HELP_STRING([--disable-lto], +[Enable link-time optimization @:@default=auto, disabled if debug is enabled@:@])], +[enable_lto=$enableval]) + +if test x$enable_lto = xyes; then +# Enable LTO if the compiler supports it and not disabled by the user, +# which allows the compiler to inline Mesa's many small, +# non-manually-inlined helper functions. +save_CFLAGS=$CFLAGS +AC_MSG_CHECKING([whether $CC supports LTO]) +LTOFLAGS=-flto -fuse-linker-plugin +CFLAGS=$CFLAGS $LTOFLAGS +AC_LINK_IFELSE([AC_LANG_PROGRAM()], + [AC_MSG_RESULT([yes])], + [AC_MSG_RESULT([no]); LTOFLAGS=]); +CFLAGS=$save_CFLAGS fi +AC_SUBST([LTOFLAGS]) dnl dnl library names diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am index e9c16e7..ad04cc6 100644 --- a/src/mesa/Makefile.am +++ b/src/mesa/Makefile.am @@ -94,8 +94,8 @@ BUILDDIR = $(top_builddir)/src/mesa/ include Makefile.sources AM_CPPFLAGS = $(DEFINES) $(INCLUDE_DIRS) -AM_CFLAGS = $(LLVM_CFLAGS) $(VISIBILITY_CFLAGS) -AM_CXXFLAGS = $(LLVM_CFLAGS) $(VISIBILITY_CXXFLAGS) +AM_CFLAGS = $(LLVM_CFLAGS) $(VISIBILITY_CFLAGS) $(LTOFLAGS) +AM_CXXFLAGS = $(LLVM_CFLAGS) $(VISIBILITY_CXXFLAGS) $(LTOFLAGS) MESA_ASM_FILES_FOR_ARCH = diff --git a/src/mesa/drivers/dri/i965/Makefile.am b/src/mesa/drivers/dri/i965/Makefile.am index 27c67d1..5b3e9cc 100644 --- a/src/mesa/drivers/dri/i965/Makefile.am +++ b/src/mesa/drivers/dri/i965/Makefile.am @@ -36,6 +36,7 @@ AM_CFLAGS = \ -I$(top_builddir)/src/mesa/drivers/dri/common \ $(DEFINES) \ $(VISIBILITY_CFLAGS) \ + $(LTOFLAGS) \ $(INTEL_CFLAGS) if HAVE_OPENGL_ES1 diff --git a/src/mesa/libdricore/Makefile.am b/src/mesa/libdricore/Makefile.am index 56ceeb7..22a358d 100644 --- a/src/mesa/libdricore/Makefile.am +++ b/src/mesa/libdricore/Makefile.am @@ -31,6 +31,9 @@ AM_CPPFLAGS = \ $(DEFINES) \ -DUSE_DRICORE +AM_CFLAGS = $(LTOFLAGS) +AM_CXXFLAGS = $(LTOFLAGS) + libdricore@VERSION@_la_SOURCES = \ $(MESA_FILES) \ $(LIBGLCPP_GENERATED_FILES) \ @@ -38,7 +41,10 @@ libdricore@VERSION@_la_SOURCES = \ $(LIBGLSL_FILES) \ $(BUILTIN_COMPILER_GENERATED_CXX_FILES) \ $(top_builddir)/src/glsl/builtin_function.cpp -libdricore@VERSION@_la_LDFLAGS = -version-number 1:0 +libdricore@VERSION@_la_LDFLAGS = \ +-version-number 1:0 \ +$(LTOFLAGS) \ +$() libdricore@VERSION@_la_LIBADD = \ ../program/libdricore_program.la \ $() diff --git a/src/mesa/program/Makefile.am b/src/mesa/program/Makefile.am index ab565e2..f6b7c94 100644 --- a/src/mesa/program/Makefile.am +++ b/src/mesa/program/Makefile.am @@ -24,8 +24,8 @@ include ../Makefile.sources AM_CPPFLAGS = $(DEFINES) $(INCLUDE_DIRS) AM_CFLAGS = $(VISIBILITY_CFLAGS) AM_CXXFLAGS = $(VISIBILITY_CXXFLAGS) -libdricore_program_la_CFLAGS = $(NOVISIBILITY_CFLAGS) -libdricore_program_la_CXXFLAGS = $(NOVISIBILITY_CXXFLAGS) +libdricore_program_la_CFLAGS = $(NOVISIBILITY_CFLAGS) $(LTOFLAGS) +libdricore_program_la_CXXFLAGS = $(NOVISIBILITY_CXXFLAGS) $(LTOFLAGS) SRCDIR = $(top_srcdir)/src/mesa/ BUILDDIR = $(top_builddir)/src/mesa/ -- 1.8.4.rc1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] dri: Choose a decent global driNConfigOptions.
On 08/08/2013 03:59 AM, Eric Anholt wrote: Previously, we were asserting that each driver specified an NConfigOptions exactly equal to the number of options they supplied, leading to frequent bugs when people would forget to adjust the value when adjusting driver options. Instead, just overallocate the table by a bit and leave sanity checking to the assert in findOption(). Looks reasonable to me. It'll be nice to not have to change this all the time. I didn't verify that the assertions you mention actually catch things, though. Reviewed-by: Kenneth Graunke kenn...@whitecape.org ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [i965][V2] i965/draw: Move constant formation outside of for loop and use an enum.
Mark Mueller markkmuel...@gmail.com writes: Signed-off-by: Mark Mueller markkmuel...@gmail.com --- src/mesa/drivers/dri/i965/brw_draw.c | 16 ++-- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index 6170d07..1b5ed55 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -367,6 +367,12 @@ static bool brw_try_draw_prims( struct gl_context *ctx, bool retval = true; GLuint i; bool fail_next = false; + const int estimated_max_prim_size = + 512 + /* batchbuffer commands */ + ((BRW_MAX_TEX_UNIT * (sizeof(struct brw_sampler_state) + sizeof(struct gen5_sampler_default_color + + 1024 + /* gen6 VS push constants */ + 1024 + /* gen6 WM push constants */ + 512; /* misc. pad */ What's the point of this change? Moving loop invariants out of loops is something basic that your compiler does, and you're moving the definition farther from its use. pgpMfzBtdnQ0v.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] egl: Do not export private symbols
On 08/06/2013 05:30 PM, Jakob Bornecrantz wrote: On Wed, Aug 7, 2013 at 2:00 AM, Chad Versace chad.vers...@linux.intel.comwrote: On 08/06/2013 03:02 PM, Jakob Bornecrantz wrote: On Wed, Aug 7, 2013 at 12:00 AM, Jakob Bornecrantz wallbra...@gmail.com wrote: On Tue, Aug 6, 2013 at 9:21 PM, Chad Versace chad.vers...@linux.intel.com wrote: libEGL was incorrectly exporting *all* symbols, public and private. This patch adds -fvisibility=hidden to libEGL's linker flags to ensure that only symbols annotated with __attribute__((visibility(** default))) get exported. Sanity-checked on X11/EGL with Piglit, and on Wayland by running weston-gears. You need to test this patch on a non-inbuilt driver, egl_gallium or egl_glx, none of which require you to have any hardware. Or force egl_dri2 to not be built in. Do that and you have my Reviewed-by: Jakob Bornecrantz ja...@vmware.com Ok. I'll run this on egl_gallium or egl_glx, and then report back. Either is fine, tho I have no idea if egl_glx even works. All looks good. I tested es2gears_x11 using egl_gallium with Gallium's swrast driver. I also verified with gdb that those drivers were actually used. I'll commit to master now. By the way, I talked to krh today, and he suggested that we delete egl_glx rather than allow it to bitrot. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 67921] New: [bisected commit 883987] crosscompiling fails with util/u_cpu_detect.c:247:4: error: 'asm' undeclared (first use in this function)
https://bugs.freedesktop.org/show_bug.cgi?id=67921 Priority: medium Bug ID: 67921 Assignee: mesa-dev@lists.freedesktop.org Summary: [bisected commit 883987] crosscompiling fails with util/u_cpu_detect.c:247:4: error: 'asm' undeclared (first use in this function) Severity: normal Classification: Unclassified OS: All Reporter: alexandre.f.dem...@gmail.com Hardware: Other Status: NEW Version: git Component: Other Product: Mesa Commit 883987503fc79691398eb024f37480ff083805a3 breaks crosscompilation when building with the following: baseExec=./autogen.sh --prefix=/usr \ --enable-debug \ --enable-osmesa \ --enable-gbm \ --enable-xvmc \ --enable-vdpau \ --enable-gles1 \ --enable-gles2 \ --enable-openvg \ --enable-xorg \ --enable-xa \ --enable-egl \ --enable-gallium-egl \ --enable-glx-tls \ --enable-texture-float \ --enable-wgl \ --with-gallium-drivers=r600,swrast,svga \ --with-dri-drivers= \ --with-egl-platforms=x11,drm,wayland export CFLAGS='-m32' export CXXFLAGS='-m32' export LLVM_CONFIG='/usr/bin/llvm-config32' export PKG_CONFIG_PATH=/usr/lib/i386-linux-gnu/pkgconfig Exec32=$baseExec --enable-32-bit --build=x86_64-pc-linux-gnu --host=i686-pc-linux-gnu --libdir=/usr/lib/i386-linux-gnu --includedir=/usr/include/i386-linux-gnu $Exec32 --- It fails with: CC util/u_cpu_detect.lo util/u_cpu_detect.c: In function 'sse2_has_daz': util/u_cpu_detect.c:247:4: error: 'asm' undeclared (first use in this function) asm volatile (fxsave %0 :: m (fxarea)); ^ util/u_cpu_detect.c:247:4: note: each undeclared identifier is reported only once for each function it appears in util/u_cpu_detect.c:247:8: error: expected ';' before 'volatile' asm volatile (fxsave %0 :: m (fxarea)); ^ make[3]: *** [util/u_cpu_detect.lo] Error 1 --- However, compiling normally for my 64bit architecture works fine. Bad commit 883987503fc79691398eb024f37480ff083805a3 util: try much harder to set DAZ flag Author Roland Scheideggersrol...@vmware.com Author date 8/5/13 9:30 PM -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 67921] [bisected commit 883987] crosscompiling fails with util/u_cpu_detect.c:247:4: error: 'asm' undeclared (first use in this function)
https://bugs.freedesktop.org/show_bug.cgi?id=67921 Alexandre Demers alexandre.f.dem...@gmail.com changed: What|Removed |Added CC||srol...@vmware.com -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] draw: rewrite primitive assembler
Am 08.08.2013 21:46, schrieb Zack Rusin: We can't be injecting the primitive id's in the pipeline because by that time the primitives have already been decomposed. To properly number the primitives we need to handle the adjacency primitives by hand. This patch moves the prim id injection into the original primitive assembler and completely removes the useless pipeline stage. Signed-off-by: Zack Rusin za...@vmware.com --- src/gallium/auxiliary/Makefile.sources |1 - src/gallium/auxiliary/draw/draw_context.c|8 +- src/gallium/auxiliary/draw/draw_pipe.c |4 - src/gallium/auxiliary/draw/draw_pipe.h |7 - src/gallium/auxiliary/draw/draw_pipe_ia.c| 259 -- src/gallium/auxiliary/draw/draw_pipe_validate.c | 14 -- src/gallium/auxiliary/draw/draw_prim_assembler.c | 168 +- src/gallium/auxiliary/draw/draw_prim_assembler.h | 12 + src/gallium/auxiliary/draw/draw_private.h|4 +- 9 files changed, 180 insertions(+), 297 deletions(-) delete mode 100644 src/gallium/auxiliary/draw/draw_pipe_ia.c diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources index b0172de..acbcef7 100644 --- a/src/gallium/auxiliary/Makefile.sources +++ b/src/gallium/auxiliary/Makefile.sources @@ -13,7 +13,6 @@ C_SOURCES := \ draw/draw_pipe_clip.c \ draw/draw_pipe_cull.c \ draw/draw_pipe_flatshade.c \ -draw/draw_pipe_ia.c \ draw/draw_pipe_offset.c \ draw/draw_pipe_pstipple.c \ draw/draw_pipe_stipple.c \ diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c index 2dc6772..2d4843e 100644 --- a/src/gallium/auxiliary/draw/draw_context.c +++ b/src/gallium/auxiliary/draw/draw_context.c @@ -40,6 +40,7 @@ #include util/u_prim.h #include draw_context.h #include draw_pipe.h +#include draw_prim_assembler.h #include draw_vs.h #include draw_gs.h @@ -95,6 +96,10 @@ draw_create_context(struct pipe_context *pipe, boolean try_llvm) if (!draw_init(draw)) goto err_destroy; + draw-ia = draw_prim_assembler_create(draw); + if (!draw-ia) + goto err_destroy; + return draw; err_destroy: @@ -206,6 +211,7 @@ void draw_destroy( struct draw_context *draw ) draw-render-destroy( draw-render ); */ + draw_prim_assembler_destroy(draw-ia); draw_pipeline_destroy( draw ); draw_pt_destroy( draw ); draw_vs_destroy( draw ); @@ -556,7 +562,7 @@ void draw_prepare_shader_outputs(struct draw_context *draw) { draw_remove_extra_vertex_attribs(draw); - draw_ia_prepare_outputs(draw, draw-pipeline.ia); + draw_prim_assembler_prepare_outputs(draw-ia); draw_unfilled_prepare_outputs(draw, draw-pipeline.unfilled); } diff --git a/src/gallium/auxiliary/draw/draw_pipe.c b/src/gallium/auxiliary/draw/draw_pipe.c index 8140299..f1ee6cb 100644 --- a/src/gallium/auxiliary/draw/draw_pipe.c +++ b/src/gallium/auxiliary/draw/draw_pipe.c @@ -49,7 +49,6 @@ boolean draw_pipeline_init( struct draw_context *draw ) draw-pipeline.clip = draw_clip_stage( draw ); draw-pipeline.flatshade = draw_flatshade_stage( draw ); draw-pipeline.cull = draw_cull_stage( draw ); - draw-pipeline.ia= draw_ia_stage( draw ); draw-pipeline.validate = draw_validate_stage( draw ); draw-pipeline.first = draw-pipeline.validate; @@ -62,7 +61,6 @@ boolean draw_pipeline_init( struct draw_context *draw ) !draw-pipeline.clip || !draw-pipeline.flatshade || !draw-pipeline.cull || - !draw-pipeline.ia || !draw-pipeline.validate) return FALSE; @@ -97,8 +95,6 @@ void draw_pipeline_destroy( struct draw_context *draw ) draw-pipeline.flatshade-destroy( draw-pipeline.flatshade ); if (draw-pipeline.cull) draw-pipeline.cull-destroy( draw-pipeline.cull ); - if (draw-pipeline.ia) - draw-pipeline.ia-destroy( draw-pipeline.ia ); if (draw-pipeline.validate) draw-pipeline.validate-destroy( draw-pipeline.validate ); if (draw-pipeline.aaline) diff --git a/src/gallium/auxiliary/draw/draw_pipe.h b/src/gallium/auxiliary/draw/draw_pipe.h index 70822a4..7c9ed6c 100644 --- a/src/gallium/auxiliary/draw/draw_pipe.h +++ b/src/gallium/auxiliary/draw/draw_pipe.h @@ -91,10 +91,6 @@ extern struct draw_stage *draw_stipple_stage( struct draw_context *context ); extern struct draw_stage *draw_wide_line_stage( struct draw_context *context ); extern struct draw_stage *draw_wide_point_stage( struct draw_context *context ); extern struct draw_stage *draw_validate_stage( struct draw_context *context ); -extern struct draw_stage *draw_ia_stage(struct draw_context *context); - -boolean draw_ia_stage_required(const struct draw_context *context, - unsigned prim); extern void
Re: [Mesa-dev] [PATCH 3/3] draw: rewrite primitive assembler
Series looks good though I'm unsure why the pipeline stage doesn't work. Where does that decomposition happen? Is that something like GS outputting multiple prims in the same topology which all need the same id? No, it's because the pipeline stage is ran on the decomposed primitives. The issue is that the pipeline stage is ran after stream output and stream output requires decomposed primitives, meaning that by the time we get to the pipeline we lost the original primitive info. The d3d10 wants the primitive id's to be injected into vertices but in the order in which they are traversed on the original (striped) primitives, so we need to do it when doing the original decomposition where we have access to the original topology and can number the vertices correctly. z ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] draw: rewrite primitive assembler
Am 09.08.2013 00:40, schrieb Zack Rusin: Series looks good though I'm unsure why the pipeline stage doesn't work. Where does that decomposition happen? Is that something like GS outputting multiple prims in the same topology which all need the same id? No, it's because the pipeline stage is ran on the decomposed primitives. The issue is that the pipeline stage is ran after stream output and stream output requires decomposed primitives, meaning that by the time we get to the pipeline we lost the original primitive info. The d3d10 wants the primitive id's to be injected into vertices but in the order in which they are traversed on the original (striped) primitives, so we need to do it when doing the original decomposition where we have access to the original topology and can number the vertices correctly. z I see I totally forgot stream out needs decomposed primitives, and I guess stream out (and prim assembler) can't run as an ordinary pipeline stage? Roland ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: Reject VertexAttribPointer() with GL_BGRA and !normalized.
On 07/30/2013 09:57 AM, Ian Romanick wrote: On 07/23/2013 03:50 PM, Matt Turner wrote: On Mon, Jul 15, 2013 at 6:37 AM, Fredrik Höglund fred...@kde.org wrote: On Monday 15 July 2013, Kenneth Graunke wrote: Fixes Piglit's ARB_vertex_attrib_bgra/api-errors test. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/main/varray.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c index 529d933..48f15bd 100644 --- a/src/mesa/main/varray.c +++ b/src/mesa/main/varray.c @@ -463,6 +463,16 @@ _mesa_VertexAttribPointer(GLuint index, GLint size, GLenum type, return; } + /* From the GL_ARB_vertex_array_bgra specification: +* The error INVALID_VALUE is generated by VertexAttribPointer if +* size is BGRA and normalized is FALSE. +*/ + if (size == GL_BGRA !normalized) { + _mesa_error(ctx, GL_INVALID_VALUE, + glVertexAttribPointer(BGRA and !normalized)); + return; + } + I think this code belongs in update_array(), since it already handles other BGRA errors, and also checks if the extension is supported. This also reminds me that we need to decide if we should make the error code depend on the GL version and entry point. Both OpenGL 3.3 and GL_ARB_vertex_attrib_binding changes the error code to GL_INVALID_OPERATION. The extension only does it for the entry points it adds. I personally don't think it's worth adding the additional complexity for that since the conditions under which the errors to be generated haven't changed. Ian, I remember you mentioning when we were showing the interns how to write piglit tests that the return value seemed wrong. Can you elaborate here? This is holding up Fredrik's ARB_vertex_attrib_binding patches. I believe the ARB_vertex_array_bgra should have specified INVALID_OPERATION as the error code. Section 2.5 (GL Errors) provides the general guidance that INVALID_VALUE is generated for Numeric argument out of range. This is error is that a GLboolean is false instead of true. Clearly false is in range for GLboolean, so INVALID_VALUE is not appropriate. Since OpenGL 3.3 corrects the error code, we should follow that behavior. Any chance of an updated patch in time for 9.2? :) Thanks, Matt ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] OpenGL ES only configuration (without desktop OpenGL support)
On 08/06/2013 02:13 PM, Siarhei Siamashka wrote: Hello, Some months ago, the commit configure.ac: Allow OpenGL ES1 and ES2 only with enabled OpenGL dropped support for the OpenGL-free configuration. http://lists.freedesktop.org/archives/mesa-dev/2013-February/033909.html http://lists.freedesktop.org/archives/mesa-commit/2013-February/041708.html Could this be possibly reverted to allow me to continue shooting myself in the foot? The support for OpenGL ES is pretty horrible in the open source software. One nice exception is Qt5 which is doing pretty well. But the rest of the software does not generally work out of the box without patches or tweaks. You can also hardly find a problem-free OpenGL ES compatible open source game (other than Quake3). I have an open feature request for Gentoo, which is a very configurable Linux distribution and should not have any troubles working either with or without OpenGL (the choice is up to the user): https://bugs.gentoo.org/show_bug.cgi?id=476524 But if upstream Mesa treats this configuration as unsupported, then I also don't see it progressing anywhere in Gentoo. So could you please re-consider this decision? We've removed all of the #ifdef code inside Mesa that would have made any difference. It was a nightmare to maintain, and we almost always got it wrong... because nobody was testing that configuration. The only thing this is possibly going to gain you is a trivial amount of build time (by not building libGL, etc.). Thanks. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 67924] New: [softpipe] piglit tesselation quad_strip wireframe regression
https://bugs.freedesktop.org/show_bug.cgi?id=67924 Priority: medium Bug ID: 67924 Keywords: regression CC: za...@vmware.com Assignee: mesa-dev@lists.freedesktop.org Summary: [softpipe] piglit tesselation quad_strip wireframe regression Severity: normal Classification: Unclassified OS: Linux (All) Reporter: v...@freedesktop.org Hardware: x86-64 (AMD64) Status: NEW Version: git Component: Other Product: Mesa mesa: 2c2e64edaba0f6aeb181ca5b51eb8dea8e9b39f9 (master) $ ./bin/ext_transform_feedback-tessellation quad_strip wireframe -auto Probe at (48,129) Expected: 0.00 0.00 0.00 0.00 Observed: 0.501961 0.501961 0.501961 0.501961 PIGLIT: {'result': 'fail' } d6b3a193d4d525c5048ebf793e6a63fd98f92d64 is the first bad commit commit d6b3a193d4d525c5048ebf793e6a63fd98f92d64 Author: Zack Rusin za...@vmware.com Date: Wed Jul 31 07:34:49 2013 -0400 draw: inject frontface info into wireframe outputs Draw module can decompose primitives into wireframe models, which is a fancy word for 'lines', unfortunately that decomposition means that we weren't able to preserve the original front-face info which could be derived from the original primitives (lines don't have a 'face'). To fix it allow draw module to inject a fake face semantic into outputs from which the backends can figure out the original frontfacing info of the primitives. Signed-off-by: Zack Rusin za...@vmware.com Reviewed-by: Roland Scheidegger srol...@vmware.com Reviewed-by: Jose Fonseca jfons...@vmware.com :04 04 c03c3c58f6b2d41a576aa2c1388323bc4fdfb21e cb98c963ea0f7797496053122ad6b5c3f56a1861 Msrc bisect run success -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] meta: Don't call _mesa_Ortho with width or height of 0
From: Ian Romanick ian.d.roman...@intel.com Fixes failures in oglconform fbo mipmap.manual.color, mipmap.manual.colorAndDepth, mipmap.automatic, and mipmap.manualIterateTexTargets subtests. Signed-off-by: Ian Romanick ian.d.roman...@intel.com Cc: 9.2 mesa-sta...@lists.freedesktop.org --- src/mesa/drivers/common/meta.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c index c62927c..69b06ed 100644 --- a/src/mesa/drivers/common/meta.c +++ b/src/mesa/drivers/common/meta.c @@ -704,9 +704,14 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state) _mesa_LoadIdentity(); _mesa_MatrixMode(GL_PROJECTION); _mesa_LoadIdentity(); - _mesa_Ortho(0.0, ctx-DrawBuffer-Width, - 0.0, ctx-DrawBuffer-Height, - -1.0, 1.0); + + /* glOrtho with width = 0 or height = 0 generates GL_INVALID_VALUE. + * This can occur when there is no draw buffer. + */ + if (ctx-DrawBuffer-Width != 0 ctx-DrawBuffer-Height != 0) + _mesa_Ortho(0.0, ctx-DrawBuffer-Width, + 0.0, ctx-DrawBuffer-Height, + -1.0, 1.0); } if (state MESA_META_CLIP) { -- 1.8.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] mesa: Make detach_renderbuffer available outside fbobject.c
From: Ian Romanick ian.d.roman...@intel.com Also add a return value indicating whether any work was done. This will be used by the next patch. Signed-off-by: Ian Romanick ian.d.roman...@intel.com Cc: 9.2 mesa-sta...@lists.freedesktop.org --- src/mesa/main/fbobject.c | 42 +- src/mesa/main/fbobject.h | 6 ++ 2 files changed, 39 insertions(+), 9 deletions(-) diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c index 74f294c..d121167 100644 --- a/src/mesa/main/fbobject.c +++ b/src/mesa/main/fbobject.c @@ -1227,19 +1227,43 @@ _mesa_BindRenderbufferEXT(GLenum target, GLuint renderbuffer) * the renderbuffer. * This is used when a renderbuffer object is deleted. * The spec calls for unbinding. + * + * \returns + * \c true if the renderbuffer was detached from an attachment point. \c + * false otherwise. */ -static void -detach_renderbuffer(struct gl_context *ctx, -struct gl_framebuffer *fb, -struct gl_renderbuffer *rb) +bool +_mesa_detach_renderbuffer(struct gl_context *ctx, + struct gl_framebuffer *fb, + const void *att) { - GLuint i; + unsigned i; + bool progress = false; + for (i = 0; i BUFFER_COUNT; i++) { - if (fb-Attachment[i].Renderbuffer == rb) { + if (fb-Attachment[i].Texture == att + || fb-Attachment[i].Renderbuffer == att) { _mesa_remove_attachment(ctx, fb-Attachment[i]); + progress = true; } } - invalidate_framebuffer(fb); + + /* Section 4.4.4 (Framebuffer Completeness), subsection Whole Framebuffer +* Completeness, of the OpenGL 3.1 spec says: +* +* Performing any of the following actions may change whether the +* framebuffer is considered complete or incomplete: +* +* ... +* +*- Deleting, with DeleteTextures or DeleteRenderbuffers, an object +* containing an image that is attached to a framebuffer object +* that is bound to the framebuffer. +*/ + if (progress) + invalidate_framebuffer(fb); + + return progress; } @@ -1264,11 +1288,11 @@ _mesa_DeleteRenderbuffers(GLsizei n, const GLuint *renderbuffers) } if (_mesa_is_user_fbo(ctx-DrawBuffer)) { - detach_renderbuffer(ctx, ctx-DrawBuffer, rb); + _mesa_detach_renderbuffer(ctx, ctx-DrawBuffer, rb); } if (_mesa_is_user_fbo(ctx-ReadBuffer) ctx-ReadBuffer != ctx-DrawBuffer) { - detach_renderbuffer(ctx, ctx-ReadBuffer, rb); + _mesa_detach_renderbuffer(ctx, ctx-ReadBuffer, rb); } /* Remove from hash table immediately, to free the ID. diff --git a/src/mesa/main/fbobject.h b/src/mesa/main/fbobject.h index 0a2a5cc..ab138cf 100644 --- a/src/mesa/main/fbobject.h +++ b/src/mesa/main/fbobject.h @@ -28,6 +28,7 @@ #include compiler.h #include glheader.h +#include stdbool.h struct gl_context; struct gl_texture_object; @@ -113,6 +114,11 @@ _mesa_is_legal_color_format(const struct gl_context *ctx, GLenum baseFormat); extern GLenum _mesa_base_fbo_format(struct gl_context *ctx, GLenum internalFormat); +extern bool +_mesa_detach_renderbuffer(struct gl_context *ctx, + struct gl_framebuffer *fb, + const void *att); + extern GLboolean GLAPIENTRY _mesa_IsRenderbuffer(GLuint renderbuffer); -- 1.8.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] mesa: Use _mesa_detach_renderbuffer when deleting a texture
From: Ian Romanick ian.d.roman...@intel.com The functional change is that now invalidate_framebuffer is called if the texture is actually detached from one of the currently bound FBOs. Previously this was only done for renderbuffers. The remaining changes make the texture delete path look more similar to the renderbuffer delete path. This includes adding relevant spec quotations to justify the behavior. Fixes piglit fbo-incomplete delete texture of bound FBO test. Signed-off-by: Ian Romanick ian.d.roman...@intel.com Cc: 9.2 mesa-sta...@lists.freedesktop.org --- src/mesa/main/fbobject.c | 23 +++ src/mesa/main/texobj.c | 42 +++--- 2 files changed, 46 insertions(+), 19 deletions(-) diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c index d121167..1034c7a 100644 --- a/src/mesa/main/fbobject.c +++ b/src/mesa/main/fbobject.c @@ -1223,10 +1223,8 @@ _mesa_BindRenderbufferEXT(GLenum target, GLuint renderbuffer) /** - * If the given renderbuffer is anywhere attached to the framebuffer, detach - * the renderbuffer. - * This is used when a renderbuffer object is deleted. - * The spec calls for unbinding. + * Remove the specified renderbuffer or texture from any attachment point in + * the framebuffer. * * \returns * \c true if the renderbuffer was detached from an attachment point. \c @@ -1287,6 +1285,23 @@ _mesa_DeleteRenderbuffers(GLsizei n, const GLuint *renderbuffers) _mesa_BindRenderbuffer(GL_RENDERBUFFER_EXT, 0); } +/* Section 4.4.2 (Attaching Images to Framebuffer Objects), + * subsection Attaching Renderbuffer Images to a Framebuffer, of + * the OpenGL 3.1 spec says: + * + * If a renderbuffer object is deleted while its image is + * attached to one or more attachment points in the currently + * bound framebuffer, then it is as if FramebufferRenderbuffer + * had been called, with a renderbuffer of 0, for each + * attachment point to which this image was attached in the + * currently bound framebuffer. In other words, this + * renderbuffer image is first detached from all attachment + * points in the currently bound framebuffer. Note that the + * renderbuffer image is specifically not detached from any + * non-bound framebuffers. Detaching the image from any + * non-bound framebuffers is the responsibility of the + * application. + */ if (_mesa_is_user_fbo(ctx-DrawBuffer)) { _mesa_detach_renderbuffer(ctx, ctx-DrawBuffer, rb); } diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c index 1666a66..7c8f04d 100644 --- a/src/mesa/main/texobj.c +++ b/src/mesa/main/texobj.c @@ -1041,23 +1041,35 @@ static void unbind_texobj_from_fbo(struct gl_context *ctx, struct gl_texture_object *texObj) { - const GLuint n = (ctx-DrawBuffer == ctx-ReadBuffer) ? 1 : 2; - GLuint i; + bool progress = false; - for (i = 0; i n; i++) { - struct gl_framebuffer *fb = (i == 0) ? ctx-DrawBuffer : ctx-ReadBuffer; - if (_mesa_is_user_fbo(fb)) { - GLuint j; - for (j = 0; j BUFFER_COUNT; j++) { -if (fb-Attachment[j].Type == GL_TEXTURE -fb-Attachment[j].Texture == texObj) { - /* Vertices are already flushed by _mesa_DeleteTextures */ - ctx-NewState |= _NEW_BUFFERS; - _mesa_remove_attachment(ctx, fb-Attachment + j); -} - } - } + /* Section 4.4.2 (Attaching Images to Framebuffer Objects), subsection +* Attaching Texture Images to a Framebuffer, of the OpenGL 3.1 spec +* says: +* +* If a texture object is deleted while its image is attached to one +* or more attachment points in the currently bound framebuffer, then +* it is as if FramebufferTexture* had been called, with a texture of +* zero, for each attachment point to which this image was attached in +* the currently bound framebuffer. In other words, this texture image +* is first detached from all attachment points in the currently bound +* framebuffer. Note that the texture image is specifically not +* detached from any other framebuffer objects. Detaching the texture +* image from any other framebuffer objects is the responsibility of +* the application. +*/ + if (_mesa_is_user_fbo(ctx-DrawBuffer)) { + progress = _mesa_detach_renderbuffer(ctx, ctx-DrawBuffer, texObj); } + if (_mesa_is_user_fbo(ctx-ReadBuffer) +ctx-ReadBuffer != ctx-DrawBuffer) { + progress = _mesa_detach_renderbuffer(ctx, ctx-ReadBuffer, texObj) + || progress; + } + + if (progress) + /* Vertices are
Re: [Mesa-dev] Mesa CVS: compilation fails in u_cpu_detect.c
Am 09.08.2013 01:05, schrieb Dieter Nützel: After git pull 8. Aug 19:18 compilation fails in src/gallium/auxiliary/util/u_cpu_detect.c gmake[3]: Entering directory `/opt/mesa/src/gallium/auxiliary' CC util/u_cpu_detect.lo util/u_cpu_detect.c: In function 'sse2_has_daz': util/u_cpu_detect.c:247:4: error: 'asm' undeclared (first use in this function) util/u_cpu_detect.c:247:4: note: each undeclared identifier is reported only once for each function it appears in util/u_cpu_detect.c:247:8: error: expected ';' before 'volatile' gmake[3]: *** [util/u_cpu_detect.lo] Fehler 1 gmake[3]: Leaving directory `/opt/mesa/src/gallium/auxiliary' gmake[2]: *** [all-recursive] Fehler 1 gmake[2]: Leaving directory `/opt/mesa/src/gallium/auxiliary' gmake[1]: *** [all-recursive] Fehler 1 gmake[1]: Leaving directory `/opt/mesa/src' make: *** [all-recursive] Fehler 1 2.482u 0.628s 0:05.66 54.7% 0+0k 0+16io 0pf+0w ./configure --prefix=/usr --with-dri-drivers= --with-gallium-drivers=r600 --enable-vdpau --enable-texture-float --enable-r600-llvm-compiler This helps: --- src/gallium/auxiliary/util/u_cpu_detect.c.orig 2013-08-08 19:18:16.113550440 +0200 +++ src/gallium/auxiliary/util/u_cpu_detect.c 2013-08-09 00:08:20.681470540 +0200 @@ -244,7 +244,7 @@ fxarea.mxcsr_mask = 0; #if (defined(PIPE_CC_GCC) || defined(PIPE_CC_SUNPRO)) - asm volatile (fxsave %0 :: m (fxarea)); + __asm __volatile (fxsave %0 :: m (fxarea)); #elif (defined(PIPE_CC_MSVC) || defined(PIPE_CC_ICL)) _fxsave(fxarea); #endif Regards, Dieter Fixed thanks! Roland ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 67925] New: ARB_vertex_array_bgra api-errors test fails when normalized is false
https://bugs.freedesktop.org/show_bug.cgi?id=67925 Priority: medium Bug ID: 67925 CC: co...@octayn.net, fred...@kde.org, i...@freedesktop.org, kenn...@whitecape.org Assignee: mesa-dev@lists.freedesktop.org Blocks: 67224 Summary: ARB_vertex_array_bgra api-errors test fails when normalized is false Severity: normal Classification: Unclassified OS: All Reporter: i...@freedesktop.org Hardware: Other Status: NEW Version: git Component: Mesa core Product: Mesa There are two patches for this out on the list. Sort it out and commit something. :) http://lists.freedesktop.org/archives/mesa-dev/2013-July/041774.html http://lists.freedesktop.org/archives/mesa-dev/2013-July/042487.html -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 67921] [bisected commit 883987] crosscompiling fails with util/u_cpu_detect.c:247:4: error: 'asm' undeclared (first use in this function)
https://bugs.freedesktop.org/show_bug.cgi?id=67921 Roland Scheidegger srol...@vmware.com changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #1 from Roland Scheidegger srol...@vmware.com --- (In reply to comment #0) However, compiling normally for my 64bit architecture works fine. Yes, the code in question is never compiled on x64. Weird though when I hacked around the ifdefs to test it on x64 it worked just fine. Anyway fixed with 8f40fa0e7f47093d6e93ca4dd12569a6f948dae6. -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 67926] New: [softpipe] piglit tesselation polygon wireframe regression
https://bugs.freedesktop.org/show_bug.cgi?id=67926 Priority: medium Bug ID: 67926 Keywords: have-backtrace, regression CC: za...@vmware.com Assignee: mesa-dev@lists.freedesktop.org Summary: [softpipe] piglit tesselation polygon wireframe regression Severity: critical Classification: Unclassified OS: Linux (All) Reporter: v...@freedesktop.org Hardware: x86-64 (AMD64) Status: NEW Version: git Component: Mesa core Product: Mesa [softpipe] piglit tesselation polygon wireframe regression mesa: 2c2e64edaba0f6aeb181ca5b51eb8dea8e9b39f9 (master) $ ./bin/ext_transform_feedback-tessellation polygon wireframe -auto Segmentation fault (core dumped) (gdb) bt #0 0x7fb5df2ba92e in sp_setup_line (setup=0x9ed330, v0=0xac4280, v1=0xbc21c0) at sp_setup.c:1074 #1 0x7fb5df2a8f42 in sp_vbuf_draw_elements (vbr=0x9ed290, indices=0x9effd0, nr=6) at sp_prim_vbuf.c:180 #2 0x7fb5df18f6c8 in vbuf_flush_vertices (vbuf=0x9eff10) at draw/draw_pipe_vbuf.c:327 #3 0x7fb5df18f8ca in vbuf_flush (stage=0x9eff10, flags=4) at draw/draw_pipe_vbuf.c:392 #4 0x7fb5df18ddde in unfilled_flush (stage=0x933860, flags=4) at draw/draw_pipe_unfilled.c:209 #5 0x7fb5df189303 in cull_flush (stage=0x9e7210, flags=4) at draw/draw_pipe_cull.c:231 #6 0x7fb5df188acd in clip_flush (stage=0x9e0b10, flags=4) at draw/draw_pipe_clip.c:748 #7 0x7fb5df182226 in draw_pipeline_flush (draw=0x9dc800, flags=4) at draw/draw_pipe.c:353 #8 0x7fb5df17b9b6 in draw_do_flush (draw=0x9dc800, flags=4) at draw/draw_context.c:722 #9 0x7fb5df17ad9e in draw_flush (draw=0x9dc800) at draw/draw_context.c:225 #10 0x7fb5df2a8b2a in softpipe_draw_vbo (pipe=0x945080, info=0x7056a420) at sp_draw_arrays.c:145 #11 0x7fb5df17968a in cso_draw_vbo (cso=0xa775a0, info=0x7056a420) at cso_cache/cso_context.c:1413 #12 0x7fb5df084526 in st_draw_vbo (ctx=0xa0b920, prims=0x7056a4e0, nr_prims=1, ib=0x0, index_bounds_valid=1 '\001', min_index=0, max_index=4, tfb_vertcount=0x0) at ../../src/mesa/state_tracker/st_draw.c:286 #13 0x7fb5df04082d in vbo_draw_arrays (ctx=0xa0b920, mode=9, start=0, count=5, numInstances=1, baseInstance=0) at ../../src/mesa/vbo/vbo_exec_array.c:624 #14 0x7fb5df04127e in vbo_exec_DrawArrays (mode=9, start=0, count=5) at ../../src/mesa/vbo/vbo_exec_array.c:776 #15 0x7fb5e2c43d9c in stub_glDrawArrays (mode=9, first=0, count=5) at piglit/tests/util/generated_dispatch.c:6223 #16 0x004021b0 in draw (prog=3, use_xfb=false, y_offset=64, mode=9, num_vertices=5) at piglit/tests/spec/ext_transform_feedback/tessellation.c:446 #17 0x00402836 in piglit_display () at piglit/tests/spec/ext_transform_feedback/tessellation.c:618 #18 0x7fb5e2c35e1c in display () at piglit/tests/util/piglit-framework-gl/piglit_glut_framework.c:60 #19 0x7fb5e23d3fc4 in fghRedrawWindow (window=0x91a140) at freeglut_main.c:210 #20 fghcbDisplayWindow (window=0x91a140, enumerator=0x7056a9c0) at freeglut_main.c:227 #21 0x7fb5e23d7719 in fgEnumWindows (enumCallback=enumCallback@entry=0x7fb5e23d3f20 fghcbDisplayWindow, enumerator=enumerator@entry=0x7056a9c0) at freeglut_structure.c:394 #22 0x7fb5e23d445c in fghDisplayAll () at freeglut_main.c:249 #23 glutMainLoopEvent () at freeglut_main.c:1450 #24 0x7fb5e23d4d81 in glutMainLoop () at freeglut_main.c:1498 #25 0x7fb5e2c3604b in run_test (gl_fw=0x7fb5e2f1e040 glut_fw, argc=3, argv=0x7056ad78) at piglit/tests/util/piglit-framework-gl/piglit_glut_framework.c:142 #26 0x7fb5e2c340e9 in piglit_gl_test_run (argc=3, argv=0x7056ad78, config=0x7056ac60) at piglit/tests/util/piglit-framework-gl.c:141 #27 0x00401a81 in main (argc=3, argv=0x7056ad78) at piglit/tests/spec/ext_transform_feedback/tessellation.c:131 (gdb) frame 0 #0 0x7fb5df2ba92e in sp_setup_line (setup=0x9ed330, v0=0xac4280, v1=0xbc21c0) at sp_setup.c:1074 1074 int x1 = (int) v1[0][0]; (gdb) print v1[0] Cannot access memory at address 0xbc21c0 d6b3a193d4d525c5048ebf793e6a63fd98f92d64 is the first bad commit commit d6b3a193d4d525c5048ebf793e6a63fd98f92d64 Author: Zack Rusin za...@vmware.com Date: Wed Jul 31 07:34:49 2013 -0400 draw: inject frontface info into wireframe outputs Draw module can decompose primitives into wireframe models, which is a fancy word for 'lines', unfortunately that decomposition means that we weren't able to preserve the original front-face info which could be derived from the original primitives (lines don't have a 'face'). To fix it allow draw module to inject a fake face semantic into outputs from which the backends can figure out the original frontfacing info of the primitives. Signed-off-by: Zack Rusin za...@vmware.com Reviewed-by: Roland Scheidegger srol...@vmware.com
[Mesa-dev] [PATCH] glsl: Don't allow const on out or inout function parameters
From: Ian Romanick ian.d.roman...@intel.com Fixes piglit tests const-inout-parameter.frag and const-out-parameter.frag. Signed-off-by: Ian Romanick ian.d.roman...@intel.com Cc: 9.2 mesa-sta...@lists.freedesktop.org Cc: Matt Turner matts...@gmail.com --- This currently causes a regression in relaxed-parameter-qualifier-ordering.vert. I believe that test is incorrect (by using const with out and inout parameters), and a patch has been sent to the piglit list already. src/glsl/ast_to_hir.cpp | 15 +++ 1 file changed, 15 insertions(+) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 482ab3c..6266456 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -1967,6 +1967,21 @@ apply_type_qualifier_to_variable(const struct ast_type_qualifier *qual, _mesa_glsl_shader_target_name(state-target)); } + /* Section 6.1.1 (Function Calling Conventions) of the GLSL 1.10 spec says: +* +* However, the const qualifier cannot be used with out or inout. +* +* The same section of the GLSL 4.40 spec further clarifies this saying: +* +* The const qualifier cannot be used with out or inout, or a +* compile-time error results. +*/ + if (is_parameter qual-flags.q.constant qual-flags.q.out) { + _mesa_glsl_error(loc, state, + `const' may not be applied to `out' or `inout' + function parameters); + } + /* If there is no qualifier that changes the mode of the variable, leave * the setting alone. */ -- 1.8.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 67925] ARB_vertex_array_bgra api-errors test fails when normalized is false
https://bugs.freedesktop.org/show_bug.cgi?id=67925 --- Comment #1 from Fredrik Höglund fred...@kde.org --- I have yet a third version in my arb-vertex-attrib-binding branch: http://cgit.freedesktop.org/~fredrik/mesa/commit/?h=arb-vertex-attrib-bindingid=df98f9f8 -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] draw: rewrite primitive assembler
Am 09.08.2013 00:40, schrieb Zack Rusin: Series looks good though I'm unsure why the pipeline stage doesn't work. Where does that decomposition happen? Is that something like GS outputting multiple prims in the same topology which all need the same id? No, it's because the pipeline stage is ran on the decomposed primitives. The issue is that the pipeline stage is ran after stream output and stream output requires decomposed primitives, meaning that by the time we get to the pipeline we lost the original primitive info. The d3d10 wants the primitive id's to be injected into vertices but in the order in which they are traversed on the original (striped) primitives, so we need to do it when doing the original decomposition where we have access to the original topology and can number the vertices correctly. z I see I totally forgot stream out needs decomposed primitives, and I guess stream out (and prim assembler) can't run as an ordinary pipeline stage? I was thinking about that when I was doing it and I thought it should be possible to rewrite SO as a pipeline stage, but we'd need to change the interface to include some sort of a prepare stage and then redo the code in so. Once so would be in a pipeline then we could think about primitive assembler, but that would require also more changes to the pipeline because we want to know if the primitives are adjacency primitives and pipeline stages get only tris/lines/points... and this was the point at which I went screw it, i'm injecting prim ids in the primitive assembler. z ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 67928] New: [softpipe] piglit tesselation quads wireframe regression
https://bugs.freedesktop.org/show_bug.cgi?id=67928 Priority: medium Bug ID: 67928 Keywords: regression CC: za...@vmware.com Assignee: mesa-dev@lists.freedesktop.org Summary: [softpipe] piglit tesselation quads wireframe regression Severity: critical Classification: Unclassified OS: Linux (All) Reporter: v...@freedesktop.org Hardware: x86-64 (AMD64) Status: NEW Version: git Component: Mesa core Product: Mesa mesa: 2c2e64edaba0f6aeb181ca5b51eb8dea8e9b39f9 (master) $ ./bin/ext_transform_feedback-tessellation quads wireframe -auto *** Error in `./bin/ext_transform_feedback-tessellation': double free or corruption (!prev): 0x01bc1910 *** d6b3a193d4d525c5048ebf793e6a63fd98f92d64 is the first bad commit commit d6b3a193d4d525c5048ebf793e6a63fd98f92d64 Author: Zack Rusin za...@vmware.com Date: Wed Jul 31 07:34:49 2013 -0400 draw: inject frontface info into wireframe outputs Draw module can decompose primitives into wireframe models, which is a fancy word for 'lines', unfortunately that decomposition means that we weren't able to preserve the original front-face info which could be derived from the original primitives (lines don't have a 'face'). To fix it allow draw module to inject a fake face semantic into outputs from which the backends can figure out the original frontfacing info of the primitives. Signed-off-by: Zack Rusin za...@vmware.com Reviewed-by: Roland Scheidegger srol...@vmware.com Reviewed-by: Jose Fonseca jfons...@vmware.com :04 04 c03c3c58f6b2d41a576aa2c1388323bc4fdfb21e cb98c963ea0f7797496053122ad6b5c3f56a1861 Msrc bisect run success -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: Don't allow const on out or inout function parameters
On Thu, Aug 8, 2013 at 4:54 PM, Ian Romanick i...@freedesktop.org wrote: From: Ian Romanick ian.d.roman...@intel.com Fixes piglit tests const-inout-parameter.frag and const-out-parameter.frag. Signed-off-by: Ian Romanick ian.d.roman...@intel.com Cc: 9.2 mesa-sta...@lists.freedesktop.org Cc: Matt Turner matts...@gmail.com --- This currently causes a regression in relaxed-parameter-qualifier-ordering.vert. I believe that test is incorrect (by using const with out and inout parameters), and a patch has been sent to the piglit list already. Reviewed-by: Matt Turner matts...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] r600g: disable GPUVM by default
Cayman and trinity systems still seem to suffer from stability problems with GPUVM. This also fixes compute on these asics. It can still be enabled for testing by setting env var RADEON_VA=true. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=65958 Signed-off-by: Alex Deucher alexander.deuc...@amd.com CC: 9.2 mesa-sta...@lists.freedesktop.org CC: 9.1 mesa-sta...@lists.freedesktop.org --- src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c index 033e78f..69c42a0 100644 --- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c +++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c @@ -404,7 +404,7 @@ static boolean do_winsys_init(struct radeon_drm_winsys *ws) ws-info.r600_ib_vm_max_size)) ws-info.r600_virtual_address = FALSE; } - if (ws-gen == DRV_R600 !debug_get_bool_option(RADEON_VA, TRUE)) + if (ws-gen == DRV_R600 !debug_get_bool_option(RADEON_VA, FALSE)) ws-info.r600_virtual_address = FALSE; } -- 1.8.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [i965][V2] i965/draw: Move constant formation outside of for loop and use an enum.
On Thu, Aug 8, 2013 at 2:19 PM, Eric Anholt e...@anholt.net wrote: Mark Mueller markkmuel...@gmail.com writes: Signed-off-by: Mark Mueller markkmuel...@gmail.com --- src/mesa/drivers/dri/i965/brw_draw.c | 16 ++-- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index 6170d07..1b5ed55 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -367,6 +367,12 @@ static bool brw_try_draw_prims( struct gl_context *ctx, bool retval = true; GLuint i; bool fail_next = false; + const int estimated_max_prim_size = + 512 + /* batchbuffer commands */ + ((BRW_MAX_TEX_UNIT * (sizeof(struct brw_sampler_state) + sizeof(struct gen5_sampler_default_color + + 1024 + /* gen6 VS push constants */ + 1024 + /* gen6 WM push constants */ + 512; /* misc. pad */ What's the point of this change? Moving loop invariants out of loops is something basic that your compiler does, Is that universally true for the code as it looked originally (see below)? I've worked on embedded Atom and other systems with heavily dumbed down gcc or other cross compilers. For instance there is a good chance that the compilers from vehicle infotainment systems that I've worked on recently would generate assembly for each line of code below inside the loop. original code: int estimated_max_prim_size; estimated_max_prim_size = 512; /* batchbuffer commands */ estimated_max_prim_size += (BRW_MAX_TEX_UNIT * (sizeof(struct brw_sampler_state) + sizeof(struct gen5_sampler_default_color))); estimated_max_prim_size += 1024; /* gen6 VS push constants */ estimated_max_prim_size += 1024; /* gen6 WM push constants */ estimated_max_prim_size += 512; /* misc. pad */ . . and you're moving the definition farther from its use. Sure, yet it's in the company of fail_next with a similar problem. What about keeping the definition inside the for loop but adding the const keyword and adding all of the immediates as one operation? for (i = 0; i nr_prims; i++) { const int estimated_max_prim_size = 512 + /* batchbuffer commands */ ((BRW_MAX_TEX_UNIT * (sizeof(struct brw_sampler_state) + sizeof(struct gen5_sampler_default_color + 1024 + /* gen6 VS push constants */ 1024 + /* gen6 WM push constants */ 512; /* misc. pad */ I'm also fine with dropping the whole thing and moving on as the real point of this was to calibrate myself to the contribution process. It's been good for that. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev