Re: [Mesa-dev] [PATCH] r600g: order atom emission

2012-09-07 Thread Jerome Glisse
On Thu, Sep 6, 2012 at 11:32 AM, Alex Deucher alexdeuc...@gmail.com wrote:
 On Thu, Sep 6, 2012 at 10:54 AM, Jerome Glisse j.gli...@gmail.com wrote:
 On Thu, Sep 6, 2012 at 6:20 AM, Dave Airlie airl...@gmail.com wrote:
 On Thu, Sep 6, 2012 at 5:21 PM, Philipp Klaus Krause p...@spth.de wrote:
 On 06.09.2012 07:35, j.gli...@gmail.com wrote:
 From: Jerome Glisse jgli...@redhat.com

 To avoid GPU lockup registers must be emited in a specific order
 (no kidding ...). This patch rework atom emission so order in which
 atom are emited in respect to each other is always the same. We
 don't have any informations on what is the correct order so order
 will need to be infered from fglrx command stream.

 Shouldn't this be stated in comments, so the next person who comes along
 and makes a change in this code doesn't inadvertently change the order?

 Also a comment on what ordering matters most, like I suspect this is
 just hiding a real issue.

 Dave.

 No it's not hiding an issue, afaict it's how the hw works. The hw do
 what some amd document call states validations. So here is how i
 understand how things happen and i can be completely wrong. Hw process
 register write in order it receive them and to avoid postponing state
 validation the hw do state validation while processing register. That
 means if writing register A trigger state validation that use some
 field of register B the hw might not redo state validation when
 register B is latter written. ie only some register trigger the state
 validation no matter on what they depends on. I believe state
 validation is only use as pipeline optimization by the hw, so the hw
 knows it can take some short cut. But in some rare case if short cut
 are taken for wrong reasons we end up in GPU lockup.

 No matter if my guess is right or wrong, i know for a fact that
 register order is important in some situation, that's the hard bottom
 line, no matter what is the reasons inside the hw.

 This patch is far from having all the order right, it's just a first
 step, i am atomizing everything and it's what needed to go forward
 without regression.

 I've talked to the internal hw and sw guys and they said there isn't
 any specific ordering required and the closed driver doesn't impose
 any specific order.  The pipeline doesn't get kicked off until a draw
 command is issued, so I don't see why the state update order would
 matter.  It's possible there are subtle ordering requirements and the
 closed driver just happened to get it right.  There are dependencies
 and hw bug workarounds however.  E.g., some blocks snoop registers
 from other blocks so you need to make sure those dependant registers
 have been initialized before drawing.  I don't know if it's the
 ordering so much as making sure we emit all the necessary state when
 needed.  The closed driver tends to update a lot more state the is
 minimally required for a lot of things.  That said, it probably
 wouldn't hurt to mirror the closed driver more closely.

 Alex

I don't know what the reason is, but which registers are emitted, and
alongside which other registers, definitely matters. All the files I
refer to in this mail are located at:
http://people.freedesktop.org/~glisse/registerposition/

So if you apply:
0001-r600g-FORCE-LOCKUP-BY-EMITTING-OR-NOT-REGISTER.patch

and run the piglit tests as in lockup-longprim.sh, you will lock up the
GPU (I have only tested on r6xx and r7xx so far).

I double-checked with automated tools that every register written by
the command stream of the longprim piglit test is properly reprogrammed
by the fbo test (if you have the constant buffer size patch I sent
earlier).

The only difference between the two command streams is that one emits
R_02881C_PA_CL_VS_OUT_CNTL with each draw and the other emits it only
once. When it is emitted with each draw, the GPU locks up.
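
A rough way to check the "emitted with each draw" property on a decoded
stream, sketched in C. The event layout below is hypothetical and is not
the format of the dump files listed in this mail; only the offset 0x02881C
comes from the register name above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Offset taken from the register name R_02881C_PA_CL_VS_OUT_CNTL. */
#define R_02881C_PA_CL_VS_OUT_CNTL 0x02881C

/* Hypothetical decoded command-stream event, not the real dump format. */
enum cs_event_type { CS_REG_WRITE, CS_DRAW };

struct cs_event {
	enum cs_event_type type;
	uint32_t reg;	/* register offset, only meaningful for CS_REG_WRITE */
};

/* Count the draws that had a write to `reg` since the previous draw. */
static unsigned draws_preceded_by_reg(const struct cs_event *ev, size_t n,
				      uint32_t reg)
{
	unsigned count = 0;
	bool seen = false;

	assert(ev != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (ev[i].type == CS_REG_WRITE) {
			if (ev[i].reg == reg)
				seen = true;
		} else {
			/* A draw: record whether the register was written
			 * since the last draw, then start a new window. */
			if (seen)
				count++;
			seen = false;
		}
	}
	return count;
}
```

If the result equals the number of draws in the stream, the register was
written before every draw; a smaller value means some draws reused the
earlier write.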

bad command stream:  r600g-long-prim-simple-b.txt
good command stream: r600g-long-prim-simple-g.txt
diff:                r600g-long-prim-simple-d.txt

Because the bad one emits more registers, some draw commands are moved
to the second cs.

Emitting some other registers along with PA_CL_VS_OUT_CNTL fixes the
lockup (I don't have a short list), but many other registers behave the
same way as PA_CL_VS_OUT_CNTL. So if order does not matter, then register
grouping definitely does. I really wish the hw were less picky about how
command streams are supposed to be formatted. Anyhow, given that we have
no information on which registers need to be emitted together, mimicking
fglrx sounds like the way to go.
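
To make "emitted together" concrete, here is a minimal C sketch of a
register group that is always written out as a whole. All names are
hypothetical and this is not the r600g atom code; which registers actually
belong in one group is exactly what is unknown, so the second offset used
in any example is a placeholder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GROUP_MAX_REGS 4
#define CS_MAX_WRITES  32

struct reg_write {
	uint32_t offset;
	uint32_t value;
};

/* A group of registers that is only ever emitted as a whole. */
struct reg_group {
	struct reg_write regs[GROUP_MAX_REGS];
	unsigned num_regs;
	bool dirty;
};

/* Stand-in for a command stream: just a log of register writes. */
struct cmd_stream {
	struct reg_write writes[CS_MAX_WRITES];
	unsigned num_writes;
};

/* Update one register of the group and mark the whole group dirty. */
static void group_set(struct reg_group *g, uint32_t offset, uint32_t value)
{
	for (unsigned i = 0; i < g->num_regs; i++) {
		if (g->regs[i].offset == offset) {
			g->regs[i].value = value;
			g->dirty = true;
			return;
		}
	}
	assert(0 && "register is not a member of this group");
}

/* Emit every register of a dirty group, in the group's fixed order. */
static void group_emit(struct reg_group *g, struct cmd_stream *cs)
{
	if (!g->dirty)
		return;
	assert(cs->num_writes + g->num_regs <= CS_MAX_WRITES);
	for (unsigned i = 0; i < g->num_regs; i++)
		cs->writes[cs->num_writes++] = g->regs[i];
	g->dirty = false;
}
```

Changing a single member re-emits its companions with their current
values, even though those values did not change.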

Cheers,
Jerome
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600g: order atom emission

2012-09-06 Thread Philipp Klaus Krause
On 06.09.2012 07:35, j.gli...@gmail.com wrote:
 From: Jerome Glisse jgli...@redhat.com
 
 To avoid GPU lockup registers must be emited in a specific order
 (no kidding ...). This patch rework atom emission so order in which
 atom are emited in respect to each other is always the same. We
 don't have any informations on what is the correct order so order
 will need to be infered from fglrx command stream.

Shouldn't this be stated in comments, so the next person who comes along
and makes a change in this code doesn't inadvertently change the order?

Philipp


Re: [Mesa-dev] [PATCH] r600g: order atom emission

2012-09-06 Thread Alex Deucher
On Thu, Sep 6, 2012 at 10:54 AM, Jerome Glisse j.gli...@gmail.com wrote:
 On Thu, Sep 6, 2012 at 6:20 AM, Dave Airlie airl...@gmail.com wrote:
 On Thu, Sep 6, 2012 at 5:21 PM, Philipp Klaus Krause p...@spth.de wrote:
 On 06.09.2012 07:35, j.gli...@gmail.com wrote:
 From: Jerome Glisse jgli...@redhat.com

 To avoid GPU lockup registers must be emited in a specific order
 (no kidding ...). This patch rework atom emission so order in which
 atom are emited in respect to each other is always the same. We
 don't have any informations on what is the correct order so order
 will need to be infered from fglrx command stream.

 Shouldn't this be stated in comments, so the next person who comes along
 and makes a change in this code doesn't inadvertently change the order?

 Also a comment on what ordering matters most, like I suspect this is
 just hiding a real issue.

 Dave.

 No it's not hiding an issue, afaict it's how the hw works. The hw do
 what some amd document call states validations. So here is how i
 understand how things happen and i can be completely wrong. Hw process
 register write in order it receive them and to avoid postponing state
 validation the hw do state validation while processing register. That
 means if writing register A trigger state validation that use some
 field of register B the hw might not redo state validation when
 register B is latter written. ie only some register trigger the state
 validation no matter on what they depends on. I believe state
 validation is only use as pipeline optimization by the hw, so the hw
 knows it can take some short cut. But in some rare case if short cut
 are taken for wrong reasons we end up in GPU lockup.

 No matter if my guess is right or wrong, i know for a fact that
 register order is important in some situation, that's the hard bottom
 line, no matter what is the reasons inside the hw.

 This patch is far from having all the order right, it's just a first
 step, i am atomizing everything and it's what needed to go forward
 without regression.

I've talked to the internal hw and sw guys and they said there isn't
any specific ordering required and the closed driver doesn't impose
any specific order.  The pipeline doesn't get kicked off until a draw
command is issued, so I don't see why the state update order would
matter.  It's possible there are subtle ordering requirements and the
closed driver just happened to get it right.  There are dependencies
and hw bug workarounds however.  E.g., some blocks snoop registers
from other blocks so you need to make sure those dependent registers
have been initialized before drawing.  I don't know if it's the
ordering so much as making sure we emit all the necessary state when
needed.  The closed driver tends to update a lot more state than is
minimally required for a lot of things.  That said, it probably
wouldn't hurt to mirror the closed driver more closely.

Alex


 Note that i have been told that in the r100/r200 days same issue came
 up and registers needed to be written in some specific order (well
 only some register matter but i doubt we have a good knowledge on
 that).

 Cheers,
 Jerome


Re: [Mesa-dev] [PATCH] r600g: order atom emission

2012-09-06 Thread Jerome Glisse
On Thu, Sep 6, 2012 at 11:32 AM, Alex Deucher alexdeuc...@gmail.com wrote:
 On Thu, Sep 6, 2012 at 10:54 AM, Jerome Glisse j.gli...@gmail.com wrote:
 On Thu, Sep 6, 2012 at 6:20 AM, Dave Airlie airl...@gmail.com wrote:
 On Thu, Sep 6, 2012 at 5:21 PM, Philipp Klaus Krause p...@spth.de wrote:
 On 06.09.2012 07:35, j.gli...@gmail.com wrote:
 From: Jerome Glisse jgli...@redhat.com

 To avoid GPU lockup registers must be emited in a specific order
 (no kidding ...). This patch rework atom emission so order in which
 atom are emited in respect to each other is always the same. We
 don't have any informations on what is the correct order so order
 will need to be infered from fglrx command stream.

 Shouldn't this be stated in comments, so the next person who comes along
 and makes a change in this code doesn't inadvertently change the order?

 Also a comment on what ordering matters most, like I suspect this is
 just hiding a real issue.

 Dave.

 No it's not hiding an issue, afaict it's how the hw works. The hw do
 what some amd document call states validations. So here is how i
 understand how things happen and i can be completely wrong. Hw process
 register write in order it receive them and to avoid postponing state
 validation the hw do state validation while processing register. That
 means if writing register A trigger state validation that use some
 field of register B the hw might not redo state validation when
 register B is latter written. ie only some register trigger the state
 validation no matter on what they depends on. I believe state
 validation is only use as pipeline optimization by the hw, so the hw
 knows it can take some short cut. But in some rare case if short cut
 are taken for wrong reasons we end up in GPU lockup.

 No matter if my guess is right or wrong, i know for a fact that
 register order is important in some situation, that's the hard bottom
 line, no matter what is the reasons inside the hw.

 This patch is far from having all the order right, it's just a first
 step, i am atomizing everything and it's what needed to go forward
 without regression.

 I've talked to the internal hw and sw guys and they said there isn't
 any specific ordering required and the closed driver doesn't impose
 any specific order.  The pipeline doesn't get kicked off until a draw
 command is issued, so I don't see why the state update order would
 matter.  It's possible there are subtle ordering requirements and the
 closed driver just happened to get it right.  There are dependencies
 and hw bug workarounds however.  E.g., some blocks snoop registers
 from other blocks so you need to make sure those dependant registers
 have been initialized before drawing.  I don't know if it's the
 ordering so much as making sure we emit all the necessary state when
 needed.  The closed driver tends to update a lot more state the is
 minimally required for a lot of things.  That said, it probably
 wouldn't hurt to mirror the closed driver more closely.

 Alex


Yeah, it's possible that it's also related to some registers needing to
be re-emitted. I often see fglrx re-emitting a register even though it
emitted it with the same value just before, and some registers are
emitted several times around other register blocks.

Anyhow, this patch is a first step to atomize everything and match
fglrx's register pattern more closely.

Cheers,
Jerome


Re: [Mesa-dev] [PATCH] r600g: order atom emission

2012-09-06 Thread Roland Scheidegger
Am 06.09.2012 16:54, schrieb Jerome Glisse:
 On Thu, Sep 6, 2012 at 6:20 AM, Dave Airlie airl...@gmail.com wrote:
 On Thu, Sep 6, 2012 at 5:21 PM, Philipp Klaus Krause p...@spth.de wrote:
 On 06.09.2012 07:35, j.gli...@gmail.com wrote:
 From: Jerome Glisse jgli...@redhat.com

 To avoid GPU lockup registers must be emited in a specific order
 (no kidding ...). This patch rework atom emission so order in which
 atom are emited in respect to each other is always the same. We
 don't have any informations on what is the correct order so order
 will need to be infered from fglrx command stream.

 Shouldn't this be stated in comments, so the next person who comes along
 and makes a change in this code doesn't inadvertently change the order?

 Also a comment on what ordering matters most, like I suspect this is
 just hiding a real issue.

 Dave.
 
 No it's not hiding an issue, afaict it's how the hw works. The hw do
 what some amd document call states validations. So here is how i
 understand how things happen and i can be completely wrong. Hw process
 register write in order it receive them and to avoid postponing state
 validation the hw do state validation while processing register. That
 means if writing register A trigger state validation that use some
 field of register B the hw might not redo state validation when
 register B is latter written. ie only some register trigger the state
 validation no matter on what they depends on. I believe state
 validation is only use as pipeline optimization by the hw, so the hw
 knows it can take some short cut. But in some rare case if short cut
 are taken for wrong reasons we end up in GPU lockup.
 
 No matter if my guess is right or wrong, i know for a fact that
 register order is important in some situation, that's the hard bottom
 line, no matter what is the reasons inside the hw.
 
 This patch is far from having all the order right, it's just a first
 step, i am atomizing everything and it's what needed to go forward
 without regression.
 
 Note that i have been told that in the r100/r200 days same issue came
 up and registers needed to be written in some specific order (well
 only some register matter but i doubt we have a good knowledge on
 that).

Yes, we had a similar problem with r200/r100, though IIRC it only
affected hw vp (TCL). Though I never saw lockups due to that, some tris
had color flickering. It was eventually fixed by fixed order emission of
the atoms, though we never figured out the reason why it was needed (or
what the order really should be). Quite possible some registers had some
dependencies on others.
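
For readers unfamiliar with the idea, fixed-order emission can be sketched
like this in C. The names and the atom count are hypothetical; this is not
the r200 or r600g code, only an illustration that the emission order
depends on the atom's position and not on the order in which state was
changed.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_ATOMS 4	/* hypothetical atom count */

struct atom_table {
	bool dirty[NUM_ATOMS];	/* the index is the emission position */
};

/* Mark an atom as needing emission; the call order does not matter. */
static void atom_mark_dirty(struct atom_table *t, unsigned id)
{
	assert(id < NUM_ATOMS);
	t->dirty[id] = true;
}

/*
 * Emit all dirty atoms in ascending id order. The ids of the emitted
 * atoms are stored in `order` (standing in for real register writes)
 * and the number of emitted atoms is returned.
 */
static unsigned atoms_emit(struct atom_table *t, unsigned order[NUM_ATOMS])
{
	unsigned n = 0;

	for (unsigned id = 0; id < NUM_ATOMS; id++) {
		if (!t->dirty[id])
			continue;
		order[n++] = id;
		t->dirty[id] = false;
	}
	return n;
}
```

Marking atoms 3, 1 and 2 dirty in that order still emits them as 1, 2, 3.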

Roland



Re: [Mesa-dev] [PATCH] r600g: order atom emission v2

2012-09-06 Thread Marek Olšák
On Thu, Sep 6, 2012 at 8:34 PM, Jerome Glisse j.gli...@gmail.com wrote:
 On Thu, Sep 6, 2012 at 2:29 PM, Marek Olšák mar...@gmail.com wrote:
 This looks good to me. It's funny to see the r300g architecture being
 re-implemented in r600g. :)

 There's one optimization that r300g has that this patch doesn't. r300g
 keeps the index of the first and the last dirty atom and the loops
 over the list of atoms look like this:
 for (i = first_dirty; i <= last_dirty; i++)

 And after emission:
 first_dirty = some large number;
 last_dirty= 0;

 The atoms should be ordered according to how frequently they are
 updated (except when the ordering is required by the hw). But most
 importantly, if there are no state changes, the loops are trivially
 skipped.

 Marek

 Don't think this optimization is worth it, there won't be much more
 than 32 atom in the end and it definitely can't be ordered from most
 frequent to less frequent as some of the stuff need to be at the last
 being emitted and they are frequent one (primitive type for instance).

I didn't say all atoms *must* be sorted. I meant that some (most?)
atoms can be sorted, i.e. you can have some atoms at fixed positions
(like the primitive type or the seamless cubemap state), but you
always have at least *some* freedom where you put the rest. The ordering I
had in mind was actually from the least frequent to the most frequent,
in other words, from the framebuffer (least frequent) to shaders to
textures to constant buffers to vertex buffers (most frequent).

Of course, the code should document which atoms must have fixed
positions along with an explanation. The comment that all atom
positions must not be changed isn't enough, because it's not true.
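
A minimal C sketch of the first/last dirty index bookkeeping discussed in
this thread. The struct and function names are hypothetical and this is
not the r300g source; the emit counter merely stands in for writing
registers to the command stream.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_ATOMS 8	/* hypothetical atom count */

struct atom {
	bool dirty;
	unsigned emit_count;	/* stands in for real register writes */
};

struct context {
	struct atom atoms[NUM_ATOMS];	/* the index is the emission position */
	unsigned first_dirty;
	unsigned last_dirty;
};

static void context_init(struct context *ctx)
{
	for (unsigned i = 0; i < NUM_ATOMS; i++) {
		ctx->atoms[i].dirty = false;
		ctx->atoms[i].emit_count = 0;
	}
	/* Empty range: first > last, so the emit loop body never runs. */
	ctx->first_dirty = NUM_ATOMS;
	ctx->last_dirty = 0;
}

/* Mark an atom dirty and widen the dirty range to include it. */
static void atom_dirty(struct context *ctx, unsigned id)
{
	assert(id < NUM_ATOMS);
	ctx->atoms[id].dirty = true;
	if (id < ctx->first_dirty)
		ctx->first_dirty = id;
	if (id > ctx->last_dirty)
		ctx->last_dirty = id;
}

/* Emit dirty atoms in index order; returns how many slots were visited. */
static unsigned emit_dirty_atoms(struct context *ctx)
{
	unsigned visited = 0;

	for (unsigned i = ctx->first_dirty; i <= ctx->last_dirty; i++) {
		visited++;
		if (ctx->atoms[i].dirty) {
			ctx->atoms[i].emit_count++;
			ctx->atoms[i].dirty = false;
		}
	}
	/* Reset to the empty range after emission. */
	ctx->first_dirty = NUM_ATOMS;
	ctx->last_dirty = 0;
	return visited;
}
```

With no state changes between two draws the loop visits nothing, and with
a single dirty atom it visits exactly one slot, whatever the total number
of atoms.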

Marek


Re: [Mesa-dev] [PATCH] r600g: order atom emission v2

2012-09-06 Thread Jerome Glisse
On Thu, Sep 6, 2012 at 4:10 PM, Marek Olšák mar...@gmail.com wrote:
 On Thu, Sep 6, 2012 at 8:34 PM, Jerome Glisse j.gli...@gmail.com wrote:
 On Thu, Sep 6, 2012 at 2:29 PM, Marek Olšák mar...@gmail.com wrote:
 This looks good to me. It's funny to see the r300g architecture being
 re-implemented in r600g. :)

 There's one optimization that r300g has that this patch doesn't. r300g
 keeps the index of the first and the last dirty atom and the loops
 over the list of atoms look like this:
 for (i = first_dirty; i <= last_dirty; i++)

 And after emission:
 first_dirty = some large number;
 last_dirty= 0;

 The atoms should be ordered according to how frequently they are
 updated (except when the ordering is required by the hw). But most
 importantly, if there are no state changes, the loops are trivially
 skipped.

 Marek

 Don't think this optimization is worth it, there won't be much more
 than 32 atom in the end and it definitely can't be ordered from most
 frequent to less frequent as some of the stuff need to be at the last
 being emitted and they are frequent one (primitive type for instance).

 I didn't say all atoms *must* be sorted. I meant that some (most?)
 atoms can be sorted, i.e. you can have some atoms at fixed positions
 (like the primitype type or the seamless cubemap state), but you have
 always at least *some* freedom where you put the rest. The ordering I
 had in mind was actually from the least frequent to the most frequent,
 in other words, from the framebuffer (least frequent) to shaders to
 textures to constant buffers to vertex buffers (most frequent).

 Of course, the code should document which atoms must have fixed
 positions along with an explanation. The comment that all atom
 positions must not be changed isn't enough, because it's not true.

 Marek

I won't try to find which atoms can have a completely floating position.
I am just grouping together registers that are always emitted together by
fglrx, and then positioning these groups relative to each other according
to their position in the fglrx command stream. That means all atoms are
always emitted in a specific order, so there won't be any freedom. The
only freedom I can think of is between two position-forced atoms, and
that makes the sorting completely useless and complicated.

Cheers,
Jerome


Re: [Mesa-dev] [PATCH] r600g: order atom emission v2

2012-09-06 Thread Marek Olšák
On Fri, Sep 7, 2012 at 12:05 AM, Jerome Glisse j.gli...@gmail.com wrote:
 On Thu, Sep 6, 2012 at 4:10 PM, Marek Olšák mar...@gmail.com wrote:
 On Thu, Sep 6, 2012 at 8:34 PM, Jerome Glisse j.gli...@gmail.com wrote:
 On Thu, Sep 6, 2012 at 2:29 PM, Marek Olšák mar...@gmail.com wrote:
 This looks good to me. It's funny to see the r300g architecture being
 re-implemented in r600g. :)

 There's one optimization that r300g has that this patch doesn't. r300g
 keeps the index of the first and the last dirty atom and the loops
 over the list of atoms look like this:
 for (i = first_dirty; i <= last_dirty; i++)

 And after emission:
 first_dirty = some large number;
 last_dirty= 0;

 The atoms should be ordered according to how frequently they are
 updated (except when the ordering is required by the hw). But most
 importantly, if there are no state changes, the loops are trivially
 skipped.

 Marek

 Don't think this optimization is worth it, there won't be much more
 than 32 atom in the end and it definitely can't be ordered from most
 frequent to less frequent as some of the stuff need to be at the last
 being emitted and they are frequent one (primitive type for instance).

 I didn't say all atoms *must* be sorted. I meant that some (most?)
 atoms can be sorted, i.e. you can have some atoms at fixed positions
 (like the primitype type or the seamless cubemap state), but you have
 always at least *some* freedom where you put the rest. The ordering I
 had in mind was actually from the least frequent to the most frequent,
 in other words, from the framebuffer (least frequent) to shaders to
 textures to constant buffers to vertex buffers (most frequent).

 Of course, the code should document which atoms must have fixed
 positions along with an explanation. The comment that all atom
 positions must not be changed isn't enough, because it's not true.

 Marek

 I won't try to find which atom can have complete floating position, i
 am just grouping together register that are always emitted together in
 fglrx and then i position this group relative to each other according
 to fglrx position. That means all atom are always emitted in a
 specific order. So there won't be any freedom. The only freedom i can
 think of is btw 2 position forced atom and that make the sorting
 completely useless and complicated.

I'll add the optimization anyway (without sorting). Draw operations
without state changes or with only one state update are quite common.

Anyway, it was said in the v1 thread that the hardware doesn't need
any specific ordering for proper functioning. While it may be
beneficial to emit one or two registers earlier than the others,
insisting on fixed ordering of all of them is not only limiting, it
seems useless and a waste of time as well. What I don't understand: Why
do you blindly copy everything fglrx *seems* to be doing without any
real reason? It does not fix any bug, it does not improve performance,
it does not clean up the code... so why? I am all ears.

Marek


[Mesa-dev] [PATCH] r600g: order atom emission

2012-09-05 Thread j . glisse
From: Jerome Glisse jgli...@redhat.com

To avoid GPU lockups, registers must be emitted in a specific order
(no kidding ...). This patch reworks atom emission so that the order in
which atoms are emitted with respect to each other is always the same. We
don't have any information on what the correct order is, so the order
will need to be inferred from the fglrx command stream.

Signed-off-by: Jerome Glisse jgli...@redhat.com
---
 src/gallium/drivers/r600/evergreen_compute.c |  2 +-
 src/gallium/drivers/r600/evergreen_state.c   | 53 +---
 src/gallium/drivers/r600/r600_hw_context.c   | 10 +++---
 src/gallium/drivers/r600/r600_pipe.c         |  1 -
 src/gallium/drivers/r600/r600_pipe.h         | 33 +
 src/gallium/drivers/r600/r600_state.c        | 43 +-
 src/gallium/drivers/r600/r600_state_common.c | 36 ++-
 7 files changed, 96 insertions(+), 82 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c
index acf91ba..3533312 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -583,7 +583,7 @@ void evergreen_init_atom_start_compute_cs(struct r600_context *ctx)
 	/* since all required registers are initialised in the
 	 * start_compute_cs_cmd atom, we can EMIT_EARLY here.
 	 */
-	r600_init_command_buffer(cb, 256, EMIT_EARLY);
+	r600_init_command_buffer(ctx, cb, 1, 256);
 	cb->pkt_flags = RADEON_CP_PACKET3_COMPUTE_MODE;
 
 	switch (ctx->family) {
diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index bda8ed5..695c647 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -2161,27 +2161,40 @@ static void cayman_emit_sample_mask(struct r600_context *rctx, struct r600_atom
 
 void evergreen_init_state_functions(struct r600_context *rctx)
 {
-	r600_init_atom(&rctx->cb_misc_state.atom, evergreen_emit_cb_misc_state, 0, 0);
-	r600_atom_dirty(rctx, &rctx->cb_misc_state.atom);
-	r600_init_atom(&rctx->db_misc_state.atom, evergreen_emit_db_misc_state, 7, 0);
-	r600_atom_dirty(rctx, &rctx->db_misc_state.atom);
-	r600_init_atom(&rctx->vertex_buffer_state.atom, evergreen_fs_emit_vertex_buffers, 0, 0);
-	r600_init_atom(&rctx->cs_vertex_buffer_state.atom, evergreen_cs_emit_vertex_buffers, 0, 0);
-	r600_init_atom(&rctx->vs_constbuf_state.atom, evergreen_emit_vs_constant_buffers, 0, 0);
-	r600_init_atom(&rctx->ps_constbuf_state.atom, evergreen_emit_ps_constant_buffers, 0, 0);
-	r600_init_atom(&rctx->vs_samplers.views.atom, evergreen_emit_vs_sampler_views, 0, 0);
-	r600_init_atom(&rctx->ps_samplers.views.atom, evergreen_emit_ps_sampler_views, 0, 0);
-	r600_init_atom(&rctx->cs_shader_state.atom, evergreen_emit_cs_shader, 0, 0);
-	r600_init_atom(&rctx->vs_samplers.atom_sampler, evergreen_emit_vs_sampler, 0, 0);
-	r600_init_atom(&rctx->ps_samplers.atom_sampler, evergreen_emit_ps_sampler, 0, 0);
-
-	if (rctx->chip_class == EVERGREEN)
-		r600_init_atom(&rctx->sample_mask.atom, evergreen_emit_sample_mask, 3, 0);
-	else
-		r600_init_atom(&rctx->sample_mask.atom, cayman_emit_sample_mask, 4, 0);
+	unsigned id = 4;
+
+	/* shader const */
+	r600_init_atom(rctx, &rctx->vs_constbuf_state.atom, id++, evergreen_emit_vs_constant_buffers, 0);
+	r600_init_atom(rctx, &rctx->ps_constbuf_state.atom, id++, evergreen_emit_ps_constant_buffers, 0);
+	/* shader program */
+	r600_init_atom(rctx, &rctx->cs_shader_state.atom, id++, evergreen_emit_cs_shader, 0);
+	/* sampler */
+	r600_init_atom(rctx, &rctx->vs_samplers.atom_sampler, id++, evergreen_emit_vs_sampler, 0);
+	r600_init_atom(rctx, &rctx->ps_samplers.atom_sampler, id++, evergreen_emit_ps_sampler, 0);
+	/* resources */
+	r600_init_atom(rctx, &rctx->vertex_buffer_state.atom, id++, evergreen_fs_emit_vertex_buffers, 0);
+	r600_init_atom(rctx, &rctx->cs_vertex_buffer_state.atom, id++, evergreen_cs_emit_vertex_buffers, 0);
+	r600_init_atom(rctx, &rctx->vs_samplers.views.atom, id++, evergreen_emit_vs_sampler_views, 0);
+	r600_init_atom(rctx, &rctx->ps_samplers.views.atom, id++, evergreen_emit_ps_sampler_views, 0);
+
+	if (rctx->chip_class == EVERGREEN) {
+		r600_init_atom(rctx, &rctx->sample_mask.atom, id++, evergreen_emit_sample_mask, 3);
+	} else {
+		r600_init_atom(rctx, &rctx->sample_mask.atom, id++, cayman_emit_sample_mask, 4);
+	}
 	rctx->sample_mask.sample_mask = ~0;
 	r600_atom_dirty(rctx, &rctx->sample_mask.atom);
 
+	r600_init_atom(rctx, &rctx->cb_misc_state.atom, id++, evergreen_emit_cb_misc_state, 0);
+	r600_atom_dirty(rctx, &rctx->cb_misc_state.atom);
+
+	r600_init_atom(rctx, &rctx->alphatest_state.atom, id++, r600_emit_alphatest_state, 3);
+