Woo hoo! I have successfully implemented a fully GPU-animated
particle system (60fps with 40,000 particles, each moving in a different
direction, rotating and scaling up while fading out) that uses
AGAL for all simulation - not the CPU.
http://www.mcfunkypants.com/molehill/particles/

Right now, for efficiency, it does NOT use any away3d classes. It is
all hand-coded AS3. Forgive my laziness - I might try to incorporate
it into away3d once I finish my book on Molehill next month. But just
for reference, the good news is that I can confirm beyond any shadow
of a doubt that the "all done in AGAL + batched vertex buffer +
reusable pool" particle technique outlined in the first post on this
thread WORKS GREAT.

Is there any interest in my working this into away3d?  I don't want to
duplicate anyone's efforts, and away3d is such a moving target right
now that anything I add might break in the future, so I'm hesitant.
Let me know if you think I should dive in.

Anyway, by way of explanation: apart from doing all simulation on the
GPU, the other optimization technique used here is a "particle pool",
so that inactive particles are reused - this avoids any GC issues. We
don't "spawn" new particles each time there is a new explosion; we
reuse old ones and only create new ones if no inactive ones are
available.  This means the FPS hiccups for the first few seconds while
new particles are being created, but after you hold down the spacebar
for 2-3 seconds it stays near 60fps.
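
A minimal sketch of the pool idea described above, in AS3. The class and
member names (PooledParticle, acquire, release, totalCreated) are
hypothetical and are not taken from the demo source; the only thing it is
meant to show is "reuse an inactive particle first, allocate only when
none is free".

```actionscript
package
{
    /**
     * Illustrative pooled particle (hypothetical names, not the demo's classes).
     * Inactive instances wait on a static free list and are handed out again
     * before any new instance is allocated.
     */
    public class PooledParticle
    {
        // free list of inactive particles, shared by all callers
        private static var inactive:Vector.<PooledParticle> =
            new Vector.<PooledParticle>();

        // number of real allocations so far (handy for checking reuse)
        public static var totalCreated:int = 0;

        public var active:Boolean = false;
        public var age:Number = 0;

        // Hand out an inactive particle if there is one, else allocate.
        public static function acquire():PooledParticle
        {
            var p:PooledParticle;
            if (inactive.length > 0)
            {
                p = inactive.pop();
            }
            else
            {
                p = new PooledParticle();
                totalCreated++;
            }
            p.active = true;
            p.age = 0;
            return p;
        }

        // Return a finished particle to the free list (no-op if already idle).
        public function release():void
        {
            if (!active) return;
            active = false;
            inactive.push(this);
        }
    }
}
```

In use, each explosion would call PooledParticle.acquire() once per
particle and release() when the particle has faded out. After the first
few explosions totalCreated stops growing, which matches the warm-up
hiccup described above.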

For now, to help you out, since I may never get around to implementing
it inside the dependency-rich away3d, here is the AGAL and scene setup
for the technique outlined above (two vertex buffers, precalculated:
one holds the start pos and one the end pos of each vertex).  None of
the animation is done in AS3 apart from calculating two constants that
go from 0..1 over time.  It is all done in AGAL, and particles are
"batched" in chunks - I'm using 336 polys per drawTriangles call but
this could be any number.  The key here is to NOT render each
particle separately.
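
For illustration, here is one way the per-frame constants could be
computed on the CPU. This is a sketch under assumptions: the post only
says they are derived from sin and elapsed time, so the quarter-sine
easing, the linear fade, and every name below (ParticleTimeScale,
progress, writeAgeScale, writeRgbaScale) are hypothetical. The helpers
write into vectors allocated once up front, so nothing is created per
frame.

```actionscript
package
{
    /**
     * Hypothetical helper that fills the two per-frame shader constants.
     * The easing curve is an assumption, not the demo's exact formula.
     */
    public class ParticleTimeScale
    {
        // Normalised age in 0..1 from elapsed time and lifetime (milliseconds).
        public static function progress(elapsedMs:Number, lifetimeMs:Number):Number
        {
            if (lifetimeMs <= 0) return 1;
            return Math.max(0, Math.min(1, elapsedMs / lifetimeMs));
        }

        // Vertex constant: [startWeight, endWeight, 1, 1], from [1,0,1,1] to [0,1,1,1].
        // "out" must already hold 4 elements.
        public static function writeAgeScale(t:Number, out:Vector.<Number>):void
        {
            // quarter sine wave: fast at first, easing to a stop
            var eased:Number = Math.sin(t * Math.PI * 0.5);
            out[0] = 1 - eased;
            out[1] = eased;
            out[2] = 1;
            out[3] = 1;
        }

        // Fragment constant: [r, g, b, a] multiplier, from [1,1,1,1] to [0,0,0,0].
        // "out" must already hold 4 elements.
        public static function writeRgbaScale(t:Number, out:Vector.<Number>):void
        {
            var fade:Number = 1 - t;
            out[0] = fade;
            out[1] = fade;
            out[2] = fade;
            out[3] = fade;
        }
    }
}
```

A caller would allocate two fixed vectors once, e.g. new Vector.<Number>(4, true),
refill them each frame, and pass them to setProgramConstantsFromVector.
The two position weights always sum to 1; as far as I know a FLOAT_3
attribute is padded with w = 1, so this also keeps the interpolated w
at 1 in the two-frame shader.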

With some more work I am confident that 100,000+ particles are fully
achievable at 60fps with room to spare for all your other scene
models, since we aren't using the CPU for any of the heavy lifting.

Enjoy!

trace("Compiling the TWO FRAME particle shader...");
vertexShader.assemble
(
        Context3DProgramType.VERTEX,
        // scale the starting position
        "mul vt0, va0, vc4.xxxx\n" +
        // scale the ending position
        "mul vt1, va2, vc4.yyyy\n" +
        // interpolate the two positions
        "add vt2, vt0, vt1\n" +
        // 4x4 matrix multiply by the model view projection matrix (vc0-vc3)
        "m44 op, vt2, vc0\n" +
        // tell fragment shader about UV
        "mov v1, va1"
);

// textured using UV coordinates
fragmentShader.assemble
(
        Context3DProgramType.FRAGMENT,
        // grab the texture color from texture 0
        // and uv coordinates from varying register 1
        // and store the interpolated value in ft0
        "tex ft0, v1, fs0 <2d,linear,repeat,miplinear>\n" +
        // multiply by "fade" color register (fc0)
        "mul ft0, ft0, fc0\n" +
        // move this value to the output color
        "mov oc, ft0\n"
);

// RENDER LOOP:
// notes:
// age_scale is a Vector.<Number> that goes from [1,0,1,1] to [0,1,1,1]
// rgba_scale is a Vector.<Number> that goes from [1,1,1,1] to [0,0,0,0]
// these are calculated each frame using sin and elapsed time
// what the vertex shader above does is
// - multiply the vertex position of frame one by age_scale[0]
// - multiply the vertex position of frame two by age_scale[1]
// - and add them together for the interpolated position

// Set the vertex program register vc0 to our model view projection matrix
context.setProgramConstantsFromMatrix(
        Context3DProgramType.VERTEX, 0, matrix, true);

// Set the vertex program register vc4 to our time scale from (0..1)
// used to interpolate vertex positions over time
context.setProgramConstantsFromVector(
        Context3DProgramType.VERTEX, 4, age_scale);

// Set the fragment program register fc0 to our fade scale (1..0)
// used to fade colors and transparency out over time
context.setProgramConstantsFromVector(
        Context3DProgramType.FRAGMENT, 0, rgba_scale);

// Set the AGAL program
context.setProgram(shader);

// Set the fragment sampler register fs0 to a texture
context.setTextureAt(0,texture);

// starting position (va0)
context.setVertexBufferAt(0, mesh.positionsBuffer,
        0, Context3DVertexBufferFormat.FLOAT_3);
// tex coords (va1)
context.setVertexBufferAt(1, mesh.uvBuffer,
        0, Context3DVertexBufferFormat.FLOAT_2);
// final position (va2)
context.setVertexBufferAt(2, mesh2.positionsBuffer,
        0, Context3DVertexBufferFormat.FLOAT_3);

context.setBlendFactors(blend_src, blend_dst);
context.setDepthTest(depth_test,depth_test_mode);
context.setCulling(culling_mode);

// render it
context.drawTriangles(mesh.indexBuffer,
        0, mesh.indexBufferCount);
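
The listing above assumes the two position buffers already exist. As a
rough sketch of how one batch might be precalculated, the class below
fills plain vectors for the start positions (va0), end positions (va2),
UVs (va1) and indices. Everything in it is an assumption made for
illustration: the ParticleBatchData name, the flat quads in the XY plane,
the random cube spread, and the size parameters are not from the demo.

```actionscript
package
{
    /**
     * Hypothetical builder for one batch of particle quads (two triangles
     * each). Frame one is a small quad at the origin; frame two is a
     * larger, spun quad moved to a random point.
     */
    public class ParticleBatchData
    {
        public var startPositions:Vector.<Number> = new Vector.<Number>(); // va0: x,y,z
        public var endPositions:Vector.<Number> = new Vector.<Number>();   // va2: x,y,z
        public var uvs:Vector.<Number> = new Vector.<Number>();            // va1: u,v
        public var indices:Vector.<uint> = new Vector.<uint>();

        // corner offsets (x,y) and matching texture coordinates (u,v)
        private static const CORNERS:Array = [-1, -1, 1, -1, 1, 1, -1, 1];
        private static const CORNER_UVS:Array = [0, 1, 1, 1, 1, 0, 0, 0];

        public function ParticleBatchData(quadCount:int,
            startSize:Number = 0.1, endSize:Number = 1.0, spread:Number = 10.0)
        {
            for (var q:int = 0; q < quadCount; q++)
            {
                // random end centre inside a cube of half-width "spread"
                var cx:Number = (Math.random() * 2 - 1) * spread;
                var cy:Number = (Math.random() * 2 - 1) * spread;
                var cz:Number = (Math.random() * 2 - 1) * spread;

                // random spin baked into the end frame
                var angle:Number = Math.random() * Math.PI * 2;
                var cosA:Number = Math.cos(angle);
                var sinA:Number = Math.sin(angle);

                for (var c:int = 0; c < 4; c++)
                {
                    var ox:Number = CORNERS[c * 2];
                    var oy:Number = CORNERS[c * 2 + 1];

                    // start frame: small quad centred on the origin
                    startPositions.push(ox * startSize, oy * startSize, 0);

                    // end frame: scaled up, rotated in its plane, moved outward
                    endPositions.push(
                        cx + (ox * cosA - oy * sinA) * endSize,
                        cy + (ox * sinA + oy * cosA) * endSize,
                        cz);

                    uvs.push(CORNER_UVS[c * 2], CORNER_UVS[c * 2 + 1]);
                }

                // two triangles per quad, sharing the 0-2 diagonal
                var base:uint = q * 4;
                indices.push(base, base + 1, base + 2, base, base + 2, base + 3);
            }
        }
    }
}
```

With 168 quads this gives the 336 polys per drawTriangles call mentioned
earlier. The vectors would then be uploaded once with
VertexBuffer3D.uploadFromVector and IndexBuffer3D.uploadFromVector, and
never touched again by the CPU while the batch animates.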
