Re: [Mesa-dev] [PATCH] r600g: remove one pointless flush
On Sat, 2011-10-29 at 02:32 +0200, Marek Olšák wrote: On Sat, Oct 29, 2011 at 12:58 AM, Vadim Girlin vadimgir...@gmail.com wrote: On Sat, 2011-10-29 at 01:29 +0400, Vadim Girlin wrote: On Fri, 2011-10-28 at 23:16 +0200, Marek Olšák wrote: On Fri, Oct 28, 2011 at 10:52 PM, Vadim Girlin vadimgir...@gmail.com wrote: On Fri, 2011-10-28 at 19:47 +0200, Marek Olšák wrote: I've got no idea what the flush was good for, but it's useless from the looks of it. The rest of the patch is just a cleanup resulting from some of the variables being no longer used for anything useful. There are no piglit regressions. It was intended to handle multiple interleaved query and conditional render calls with single query object (in this case in theory we may have multiple outstanding queries in current CS and separate data block in the buffer for each query, with possible buffer overflow). I wasn't Do you mean this? while (1) { begin_query(q); draw(); end_query(q); render_condition(q); draw(); render_condition(NULL); } begin_query always resets query results to 0 anyway, so in theory, we could re-use the same data block over and over again. I think it's possible to run this loop without flushes, so we'll have multiple queries queued in current cs. From the driver point of view these queries will be executed simultaneously after flush, that's why we need to reserve and initialize separate data blocks for them. Probably we also need to use PIPE_TRANSFER_UNSYCHRONIZED to avoid the flush when mapping the buffer for data block initialization in the r600_query_begin. It seems I missed this, or the mapping semantics were changed with winsys change. Ah I see. I entirely missed the map/unmap part. Of course there is an implicit sync in the winsys (the flush is not so expensive, but the sync is). UNSYCHRONIZED would be dangerous in this particular case, because the GPU may still use the previous buffer data. UNSYCHRONIZED can only be used if we're 100% sure the GPU doesn't use the buffer range we're going to change. I see only two ways out of this: 1) If the buffer is full, we can allocate another one and use that. I don't think we have any other choice with current Mesa master. 2) We can program the GPU to memset the buffer. This would be very easy with transform feedback. I prefer (2). (2) looks interesting, though currently I know next to nothing about transform feedback on radeon hw. I also have some ideas about possible improvements for this code, but I never found the time to try them, and probably it will require more work than your way. I was thinking about the following: 1) We can pre-initialize all data blocks in the buffer at once, e.g. with memcpy, to avoid map/init/unmap overhead for every data block. 2) We can use common data buffer(s) for all occlusion queries to reduce allocation overhead and to avoid unnecessary initialization of multiple data blocks for one-shot queries. Common buffer will be used sequentially, similar to existing code. 3) We can use two big enough common buffers in turn, to reduce/avoid sync overhead. With high probability (depending on the size) one buffer will be completely processed by the GPU while we are emitting queries with another. When current buffer is over, we'll initialize and use another one. Probably having e.g. 32Kb buffers should be enough to avoid waiting for GPU completely in most cases. 4) We can offload next buffer initialization to another thread. 5) We can adjust the size/number of the buffers dynamically, depending on the usage. Anyway, I'm OK with your solution. Vadim ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: remove one pointless flush
On Fri, 2011-10-28 at 19:47 +0200, Marek Olšák wrote: I've got no idea what the flush was good for, but it's useless from the looks of it. The rest of the patch is just a cleanup resulting from some of the variables being no longer used for anything useful. There are no piglit regressions. It was intended to handle multiple interleaved query and conditional render calls with single query object (in this case in theory we may have multiple outstanding queries in current CS and separate data block in the buffer for each query, with possible buffer overflow). I wasn't sure if this is possible to do in one CS, so I used this additional check to be on the safe side. Probably such situation is not possible currently due to the flushes from the other paths, but might be possible in the future after removing all unnecessary flushes. If you think it won't be possible at all, I'm OK with this patch. Though I don't like existing solution in any case, so even if we'll need this check later, probably it should be done in some better way. IIRR this case isn't covered by the piglit. Probably I'll try to create/modify some test for this. Vadim --- src/gallium/drivers/r600/r600.h|6 +- src/gallium/drivers/r600/r600_blit.c |2 +- src/gallium/drivers/r600/r600_hw_context.c | 21 ++--- 3 files changed, 4 insertions(+), 25 deletions(-) diff --git a/src/gallium/drivers/r600/r600.h b/src/gallium/drivers/r600/r600.h index f58549a..9367651 100644 --- a/src/gallium/drivers/r600/r600.h +++ b/src/gallium/drivers/r600/r600.h @@ -176,10 +176,6 @@ struct r600_query { unsignedresults_end; /* Size of the result */ unsignedresult_size; - /* Count of new queries started in one stream without flushing */ - unsignedqueries_emitted; - /* State flags */ - boolean flushed; /* The buffer where query results are stored. It's used as a ring, * data blocks for current query are stored sequentially from * results_start to results_end, with wrapping on the buffer end */ @@ -258,7 +254,7 @@ boolean r600_context_query_result(struct r600_context *ctx, void r600_query_begin(struct r600_context *ctx, struct r600_query *query); void r600_query_end(struct r600_context *ctx, struct r600_query *query); void r600_context_queries_suspend(struct r600_context *ctx); -void r600_context_queries_resume(struct r600_context *ctx, boolean flushed); +void r600_context_queries_resume(struct r600_context *ctx); void r600_query_predication(struct r600_context *ctx, struct r600_query *query, int operation, int flag_wait); void r600_context_emit_fence(struct r600_context *ctx, struct r600_resource *fence, diff --git a/src/gallium/drivers/r600/r600_blit.c b/src/gallium/drivers/r600/r600_blit.c index 3eba0ad..9326dc6 100644 --- a/src/gallium/drivers/r600/r600_blit.c +++ b/src/gallium/drivers/r600/r600_blit.c @@ -96,7 +96,7 @@ static void r600_blitter_end(struct pipe_context *ctx) rctx-saved_render_cond_mode); rctx-saved_render_cond = NULL; } - r600_context_queries_resume(rctx-ctx, FALSE); + r600_context_queries_resume(rctx-ctx); } static unsigned u_num_layers(struct pipe_resource *r, unsigned level) diff --git a/src/gallium/drivers/r600/r600_hw_context.c b/src/gallium/drivers/r600/r600_hw_context.c index a7d7ce6..1332748 100644 --- a/src/gallium/drivers/r600/r600_hw_context.c +++ b/src/gallium/drivers/r600/r600_hw_context.c @@ -1521,7 +1521,7 @@ void r600_context_flush(struct r600_context *ctx, unsigned flags) r600_init_cs(ctx); /* resume queries */ - r600_context_queries_resume(ctx, TRUE); + r600_context_queries_resume(ctx); /* set all valid group as dirty so they get reemited on * next draw command @@ -1619,18 +1619,6 @@ void r600_query_begin(struct r600_context *ctx, struct r600_query *query) r600_context_flush(ctx, RADEON_FLUSH_ASYNC); } - if (query-type == PIPE_QUERY_OCCLUSION_COUNTER) { - /* Count queries emitted without flushes, and flush if more than - * half of buffer used, to avoid overwriting results which may be - * still in use. */ - if (query-flushed) { - query-queries_emitted = 1; - } else { - if (++query-queries_emitted query-buffer-b.b.b.width0 / query-result_size / 2) - r600_context_flush(ctx, RADEON_FLUSH_ASYNC); - } - } - new_results_end = query-results_end + query-result_size; if (new_results_end = query-buffer-b.b.b.width0) new_results_end = 0; @@ -1713,8 +1701,6 @@ void
Re: [Mesa-dev] [PATCH] r600g: remove one pointless flush
On Fri, Oct 28, 2011 at 10:52 PM, Vadim Girlin vadimgir...@gmail.com wrote: On Fri, 2011-10-28 at 19:47 +0200, Marek Olšák wrote: I've got no idea what the flush was good for, but it's useless from the looks of it. The rest of the patch is just a cleanup resulting from some of the variables being no longer used for anything useful. There are no piglit regressions. It was intended to handle multiple interleaved query and conditional render calls with single query object (in this case in theory we may have multiple outstanding queries in current CS and separate data block in the buffer for each query, with possible buffer overflow). I wasn't Do you mean this? while (1) { begin_query(q); draw(); end_query(q); render_condition(q); draw(); render_condition(NULL); } begin_query always resets query results to 0 anyway, so in theory, we could re-use the same data block over and over again. I think the only case where buffer overflow may happen is: begin_query(q); while (1) { draw(); suspend_queries(); // flush or u_blitter resume_queries(); } end_query(q); Correct? Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: remove one pointless flush
On Fri, 2011-10-28 at 23:16 +0200, Marek Olšák wrote: On Fri, Oct 28, 2011 at 10:52 PM, Vadim Girlin vadimgir...@gmail.com wrote: On Fri, 2011-10-28 at 19:47 +0200, Marek Olšák wrote: I've got no idea what the flush was good for, but it's useless from the looks of it. The rest of the patch is just a cleanup resulting from some of the variables being no longer used for anything useful. There are no piglit regressions. It was intended to handle multiple interleaved query and conditional render calls with single query object (in this case in theory we may have multiple outstanding queries in current CS and separate data block in the buffer for each query, with possible buffer overflow). I wasn't Do you mean this? while (1) { begin_query(q); draw(); end_query(q); render_condition(q); draw(); render_condition(NULL); } begin_query always resets query results to 0 anyway, so in theory, we could re-use the same data block over and over again. I think it's possible to run this loop without flushes, so we'll have multiple queries queued in current cs. From the driver point of view these queries will be executed simultaneously after flush, that's why we need to reserve and initialize separate data blocks for them. Vadim I think the only case where buffer overflow may happen is: begin_query(q); while (1) { draw(); suspend_queries(); // flush or u_blitter resume_queries(); } end_query(q); Correct? Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: remove one pointless flush
On Sat, 2011-10-29 at 01:29 +0400, Vadim Girlin wrote: On Fri, 2011-10-28 at 23:16 +0200, Marek Olšák wrote: On Fri, Oct 28, 2011 at 10:52 PM, Vadim Girlin vadimgir...@gmail.com wrote: On Fri, 2011-10-28 at 19:47 +0200, Marek Olšák wrote: I've got no idea what the flush was good for, but it's useless from the looks of it. The rest of the patch is just a cleanup resulting from some of the variables being no longer used for anything useful. There are no piglit regressions. It was intended to handle multiple interleaved query and conditional render calls with single query object (in this case in theory we may have multiple outstanding queries in current CS and separate data block in the buffer for each query, with possible buffer overflow). I wasn't Do you mean this? while (1) { begin_query(q); draw(); end_query(q); render_condition(q); draw(); render_condition(NULL); } begin_query always resets query results to 0 anyway, so in theory, we could re-use the same data block over and over again. I think it's possible to run this loop without flushes, so we'll have multiple queries queued in current cs. From the driver point of view these queries will be executed simultaneously after flush, that's why we need to reserve and initialize separate data blocks for them. Probably we also need to use PIPE_TRANSFER_UNSYCHRONIZED to avoid the flush when mapping the buffer for data block initialization in the r600_query_begin. It seems I missed this, or the mapping semantics were changed with winsys change. Vadim Vadim I think the only case where buffer overflow may happen is: begin_query(q); while (1) { draw(); suspend_queries(); // flush or u_blitter resume_queries(); } end_query(q); Correct? Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: remove one pointless flush
On Sat, Oct 29, 2011 at 12:58 AM, Vadim Girlin vadimgir...@gmail.com wrote: On Sat, 2011-10-29 at 01:29 +0400, Vadim Girlin wrote: On Fri, 2011-10-28 at 23:16 +0200, Marek Olšák wrote: On Fri, Oct 28, 2011 at 10:52 PM, Vadim Girlin vadimgir...@gmail.com wrote: On Fri, 2011-10-28 at 19:47 +0200, Marek Olšák wrote: I've got no idea what the flush was good for, but it's useless from the looks of it. The rest of the patch is just a cleanup resulting from some of the variables being no longer used for anything useful. There are no piglit regressions. It was intended to handle multiple interleaved query and conditional render calls with single query object (in this case in theory we may have multiple outstanding queries in current CS and separate data block in the buffer for each query, with possible buffer overflow). I wasn't Do you mean this? while (1) { begin_query(q); draw(); end_query(q); render_condition(q); draw(); render_condition(NULL); } begin_query always resets query results to 0 anyway, so in theory, we could re-use the same data block over and over again. I think it's possible to run this loop without flushes, so we'll have multiple queries queued in current cs. From the driver point of view these queries will be executed simultaneously after flush, that's why we need to reserve and initialize separate data blocks for them. Probably we also need to use PIPE_TRANSFER_UNSYCHRONIZED to avoid the flush when mapping the buffer for data block initialization in the r600_query_begin. It seems I missed this, or the mapping semantics were changed with winsys change. Ah I see. I entirely missed the map/unmap part. Of course there is an implicit sync in the winsys (the flush is not so expensive, but the sync is). UNSYCHRONIZED would be dangerous in this particular case, because the GPU may still use the previous buffer data. UNSYCHRONIZED can only be used if we're 100% sure the GPU doesn't use the buffer range we're going to change. I see only two ways out of this: 1) If the buffer is full, we can allocate another one and use that. I don't think we have any other choice with current Mesa master. 2) We can program the GPU to memset the buffer. This would be very easy with transform feedback. I prefer (2). Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev