Hello Gerrit and Carsten,

I would really like to see support for uniform blocks (UBO) in OpenSG.

After thinking about it and learning the OpenGL side of UBOs, I came to 
the conclusion that a first step could be much simpler if OpenSG handled 
only the buffer and the application handled the buffer layout and 
content.

To be more explicit about what I have in mind...

1. General
==========
The shader defines a std140 block. The individual member names don't 
matter at all. The block member types are used according to the std140 
specification by the host application to define and fill a memory buffer 
which is handed to OpenSG. The shader uniform block name is used to 
determine the required memory buffer space. Additionally, the binding 
point must be specified.

I would not rely on specifying the binding point in the shader 
("layout(std140, binding=2)") because that requires OpenGL 4.2 or the 
ARB_shading_language_420pack extension. This may optionally be provided.

2. Details
==========
i) class UniformBufferObjectChunk:
----------------------------------
This class should abstract the memory block (buffer) and must know the 
binding point (bind_pnt) and the update hint (hint) for the buffer. It 
is a state chunk and should bind the buffer to the GL_UNIFORM_BUFFER 
target at activation time. It is independent of any shader program.

a) initialization time
     glGenBuffers(1, &_ubo_id);
     glBindBuffer(GL_UNIFORM_BUFFER, _ubo_id);
     glBufferData(GL_UNIFORM_BUFFER, buffer.size(), &buffer[0], hint);

     glBindBufferBase(GL_UNIFORM_BUFFER, bind_pnt, _ubo_id);
     glBindBuffer(GL_UNIFORM_BUFFER, 0);

The buffer is provided from outside. The binding point must be valid 
according to the hardware caps (i.e. less than 
GL_MAX_UNIFORM_BUFFER_BINDINGS). The hint comes from the set 
{GL_STATIC_DRAW, GL_DYNAMIC_DRAW, ...}.

b) update time
     glBindBuffer(GL_UNIFORM_BUFFER, _ubo_id);
     GLubyte* pBuffer = static_cast<GLubyte*>(
         glMapBuffer(GL_UNIFORM_BUFFER, GL_WRITE_ONLY));

     // overwrite pBuffer by buffer content with memcpy...

     glUnmapBuffer(GL_UNIFORM_BUFFER);
     glBindBuffer(GL_UNIFORM_BUFFER, 0);

c) deinitialization
     glDeleteBuffers(1, &_ubo_id);

ii) class UniformBlocks:
------------------------
The responsibility of this class is to bind the shader uniform blocks to 
the binding points (bind_pnt). It should have a map of block names to 
UniformBufferObjectChunk objects providing the appropriate binding 
points. An instance of this class can be added to a ShaderProgramChunk 
object. On usage of such a program the uniform block binding must be 
established:

     GLuint index = glGetUniformBlockIndex(
                         _shader.GetProgram(), block_name);
     glUniformBlockBinding(_shader.GetProgram(), index, bind_pnt);
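
To make the idea a bit more concrete, here is a rough sketch of the 
bookkeeping part only. The class name UniformBlockBindings and its 
methods are hypothetical and not existing OpenSG API. The two callbacks 
are an assumption made so that the sketch does not depend on a GL 
context; blocks that the program does not contain (the query returns 
GL_INVALID_INDEX, i.e. 0xFFFFFFFF) are skipped.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical sketch, not OpenSG API: stores block name -> binding point
// and applies the bindings through caller-supplied callbacks.
class UniformBlockBindings
{
public:
    typedef std::function<unsigned int(const std::string&)> IndexQuery;
    typedef std::function<void(unsigned int, unsigned int)> BindCall;

    void set(const std::string& block_name, unsigned int bind_pnt)
    {
        assert(!block_name.empty());
        _bindings[block_name] = bind_pnt;
    }

    // Calls bind(index, bind_pnt) for every block the program knows.
    // Returns the number of blocks that were actually bound.
    int apply(const IndexQuery& query, const BindCall& bind) const
    {
        const unsigned int invalid_index = 0xFFFFFFFFu; // GL_INVALID_INDEX
        int bound = 0;
        for (const auto& entry : _bindings) {
            const unsigned int index = query(entry.first);
            if (index == invalid_index) continue; // block not in program
            bind(index, entry.second);
            ++bound;
        }
        return bound;
    }

private:
    std::map<std::string, unsigned int> _bindings;
};
```

In a real renderer the query callback would wrap 
glGetUniformBlockIndex(program, name.c_str()) and the bind callback 
would wrap glUniformBlockBinding(program, index, bind_pnt).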


iii) the host application:
--------------------------
The host application can easily set up the memory buffer according to 
std140 with the help of the following helper function:

static int align_offset(int base_alignment, int base_offset)
{
     if (base_offset == 0) return 0;

     int n = 1;
     while (n * base_alignment < base_offset) ++n;
     return n * base_alignment;
}
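
The loop above can also be written in closed form. The following is 
only a sketch: the name align_offset_fast is hypothetical, and it 
assumes base_alignment > 0 and base_offset >= 0. The original function 
is repeated so that both can be compared side by side.

```cpp
#include <cassert>

// Reference version from above: smallest multiple of base_alignment
// that is greater than or equal to base_offset.
static int align_offset(int base_alignment, int base_offset)
{
    if (base_offset == 0) return 0;

    int n = 1;
    while (n * base_alignment < base_offset) ++n;
    return n * base_alignment;
}

// Hypothetical closed-form variant: rounds up with integer arithmetic
// instead of looping.
static int align_offset_fast(int base_alignment, int base_offset)
{
    assert(base_alignment > 0 && base_offset >= 0);
    return ((base_offset + base_alignment - 1) / base_alignment)
           * base_alignment;
}
```

For example, align_offset_fast(16, 28) yields 32 and 
align_offset_fast(8, 4) yields 8, the same values as the loop version.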

At the end (*) you can find some examples for usage...

iv) implementation
------------------
Ok, at this point I need help because I'm still not familiar enough with 
the OpenSG nuts and bolts to implement this idea properly.

Could you please review this proposal and check if it

a) is sound and feasible
b) has limitations
c) is desirable

And if so, I would appreciate some mentoring for implementing this in 
OpenSG.


Best,
Johannes




(*) Examples for std140 buffer setup with helper function
=========================================================
For instance the std140 specification provides an example which can be
described as follows:

int ao = 0; // aligned offset
int bo = 0; // base offset
// Examples
//
//    The following example illustrates the rules specified by the "std140"
//    layout.
//
// layout(std140) uniform Example {
//
//                 // Base types below consume 4 basic machine units
//                 //
//                 //       base   base  align
//                 // rule  align  off.  off.  bytes used
//                 // ----  ------ ----  ----  -----------------------
//   float a;      //  1       4     0    0    0..3
ao = align_offset( 4,  bo);
// fill buffer at position ao with content of a
bo = ao + sizeof(float);

//   vec2 b;       //  2       8     4    8    8..15
ao = align_offset( 8,  bo); // fill buffer at...
bo = ao + sizeof(glm::vec2);

//   vec3 c;       //  3      16    16   16    16..27
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::vec3);

//   struct {      //  9      16    28   32    (align begin)
ao = align_offset( 16, bo);
bo = ao;

//     int d;      //  1       4    32   32    32..35
ao = align_offset(  4, bo); // fill buffer at...
bo = ao + sizeof(int);

//     bvec2 e;    //  2       8    36   40    40..47
ao = align_offset(  8, bo); // fill buffer at...
bo = ao + 2 * sizeof(float);

//   } f;          //  9      16    48   48    (pad end)
ao = align_offset( 16, bo);
bo = ao;

//   float g;      //  1       4    48   48    48..51
ao = align_offset(  4, bo); // fill buffer at...
bo = ao + sizeof(float);

//   float h[2];   //  4      16    52   64    64..67 (h[0])
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(float);

//                 //         16    68   80    80..83 (h[1])
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(float);

//                 //  4      16    84   96    (pad end of h)
ao = align_offset( 16, bo);
bo = ao;

//   mat2x3 i;     // 5/4     16    96   96    96..107 (i, column 0)
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::vec3);

//                 //         16   108  112    112..123 (i, column 1)
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::vec3);

//                 // 5/4     16   124  128    (pad end of i)
ao = align_offset( 16, bo);
bo = ao;

//   struct {      //  10     16   128  128    (align begin)
ao = align_offset( 16, bo);
bo = ao;

//     uvec3 j;    //  3      16   128  128    128..139 (o[0].j)
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::uvec3);

//     vec2 k;     //  2       8   140  144    144..151 (o[0].k)
ao = align_offset(  8, bo); // fill buffer at...
bo = ao + sizeof(glm::vec2);

//     float l[2]; //  4      16   152  160    160..163 (o[0].l[0])
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(float);

//                 //         16   164  176    176..179 (o[0].l[1])
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(float);

//                 //  4      16   180  192    (pad end of o[0].l)
ao = align_offset( 16, bo);
bo = ao;

//     vec2 m;     //  2       8   192  192    192..199 (o[0].m)
ao = align_offset(  8, bo); // fill buffer at...
bo = ao + sizeof(glm::vec2);

//     mat3 n[2];  // 6/4     16   200  208    208..219 (o[0].n[0], column 0)
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::vec3);

//                 //         16   220  224    224..235 (o[0].n[0], column 1)
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::vec3);

//                 //         16   236  240    240..251 (o[0].n[0], column 2)
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::vec3);

//                 //         16   252  256    256..267 (o[0].n[1], column 0)
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::vec3);

//                 //         16   268  272    272..283 (o[0].n[1], column 1)
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::vec3);

//                 //         16   284  288    288..299 (o[0].n[1], column 2)
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::vec3);

//                 // 6/4     16   300  304    (pad end of o[0].n)
ao = align_offset( 16, bo);
bo = ao;

//                 //  9      16   304  304    (pad end of o[0])
ao = align_offset( 16, bo);
bo = ao;

//                 //  3      16   304  304    304..315 (o[1].j)
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::uvec3);

//                 //  2       8   316  320    320..327 (o[1].k)
ao = align_offset(  8, bo); // fill buffer at...
bo = ao + sizeof(glm::vec2);

//                 //  4      16   328  336    336..339 (o[1].l[0])
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(float);

//                 //         16   340  352    352..355 (o[1].l[1])
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(float);

//                 //  4      16   356  368    (pad end of o[1].l)
ao = align_offset( 16, bo);
bo = ao;

//                 //  2       8   368  368    368..375 (o[1].m)
ao = align_offset(  8, bo); // fill buffer at...
bo = ao + sizeof(glm::vec2);

//                 // 6/4     16   376  384    384..395 (o[1].n[0], column 0)
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::vec3);

//                 //         16   396  400    400..411 (o[1].n[0], column 1)
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::vec3);

//                 //         16   412  416    416..427 (o[1].n[0], column 2)
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::vec3);

//                 //         16   428  432    432..443 (o[1].n[1], column 0)
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::vec3);

//                 //         16   444  448    448..459 (o[1].n[1], column 1)
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::vec3);

//                 //         16   460  464    464..475 (o[1].n[1], column 2)
ao = align_offset( 16, bo); // fill buffer at...
bo = ao + sizeof(glm::vec3);

//                 // 6/4     16   476  480    (pad end of o[1].n)
ao = align_offset( 16, bo);
bo = ao;

//                 //  9      16   480  480    (pad end of o[1])
ao = align_offset( 16, bo);
bo = ao;

//   } o[2];
// };
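
The repeated ao/bo bookkeeping above could be wrapped in a small 
helper. This is only a sketch; the class name Std140Cursor and its 
methods are hypothetical and not part of OpenSG or OpenGL. The caller 
still has to supply the base alignment and size of each member 
according to the std140 rules.

```cpp
#include <cassert>

// Hypothetical helper: tracks the base offset while walking the members
// of a std140 block.
class Std140Cursor
{
public:
    // Reserves `size` bytes aligned to `base_alignment` and returns the
    // aligned offset at which the member has to be written.
    int place(int base_alignment, int size)
    {
        assert(base_alignment > 0 && size >= 0);
        const int ao = round_up(base_alignment, _bo);
        _bo = ao + size;
        return ao;
    }

    // Pads to `base_alignment` without storing anything (begin or end of
    // a struct, end of an array or matrix).
    int pad(int base_alignment) { return place(base_alignment, 0); }

    // Current base offset, i.e. the first byte after the last member.
    int offset() const { return _bo; }

private:
    static int round_up(int alignment, int offset)
    {
        return ((offset + alignment - 1) / alignment) * alignment;
    }

    int _bo = 0;
};
```

With this helper the first members of the example read 
cursor.place(4, 4) for a, cursor.place(8, 8) for b and 
cursor.place(16, 12) for c, which return the aligned offsets 0, 8 and 
16 from the table.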

And a more explicit example

a) shader
layout (std140) uniform TransformBlock
{
     float scale;
     vec3  translation;
     float rotation[3];
     mat4  projection_matrix;
} transform;

b) C++:

struct TransformBlock
{
     TransformBlock()
     : scale(7.f)
     , translation(0.f, 0.3f, 12.6f)
     , projection_matrix(glm::mat4(1))
     {
         rotation[0] = 4.f;
         rotation[1] = 3.2f;
         rotation[2] = 0.5f;

         for (int i = 0; i < 4; ++i)
             for (int j = 0; j < 4; ++j)
                 projection_matrix[i][j] = static_cast<float>(4*i + j);
     }
     float      scale;
     glm::vec3  translation;
     float      rotation[3];
     glm::mat4  projection_matrix;
}  _dummy_transform;


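// size: required block size in bytes, e.g. queried with
// glGetActiveUniformBlockiv(..., GL_UNIFORM_BLOCK_DATA_SIZE, ...);
// the std140 layout filled in below occupies 144 bytes
std::vector<unsigned char> buffer(size);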

int ao = 0; // aligned offset
int bo = 0; // base offset

// handle TransformBlock.scale -> float
// rule 1 => base alignment = 4
ao = align_offset( 4, bo);
*(reinterpret_cast<float*>(&buffer[0] + ao)) = _dummy_transform.scale;
bo = ao + sizeof(float);

// handle TransformBlock.translation -> vec3
// rule 3 => base alignment = 4*4 = 16
ao = align_offset(16, bo);
memcpy(&buffer[0] + ao, glm::value_ptr(_dummy_transform.translation), 
sizeof(glm::vec3));
bo = ao + sizeof(glm::vec3);

// handle TransformBlock.rotation -> float[3]
// rule 4 => base alignment = 4*4 = 16
for (int i = 0; i < 3; ++i) {
     ao = align_offset(16, bo);
     *reinterpret_cast<float*>(&buffer[0] + ao) = 
_dummy_transform.rotation[i];
     bo = ao + sizeof(float);
}
ao = align_offset( 16, bo); bo = ao;

// handle TransformBlock.projection_matrix -> mat4 -> column-major
// rule 5 => base alignment = 4*4 = 16
for (int i = 0; i < 4; ++i) {
     ao = align_offset(16, bo);
     const glm::vec4& column = _dummy_transform.projection_matrix[i];
     memcpy(&buffer[0] + ao, glm::value_ptr(column), sizeof(glm::vec4));
     bo = ao + sizeof(glm::vec4);
}
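
As a cross-check of the offsets used above, here is a glm-free sketch 
that packs the same TransformBlock layout with std::memcpy. The 
function name pack_transform_block, the helper align_up and the 
plain-array parameters are assumptions made for illustration only. The 
hard-coded size of 144 bytes follows from the std140 rules for this 
particular block; a real application should still compare it with the 
size the driver reports.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Rounds `offset` up to the next multiple of `alignment`.
static int align_up(int alignment, int offset)
{
    return ((offset + alignment - 1) / alignment) * alignment;
}

// Hypothetical helper: packs scale, translation, rotation[3] and a
// column-major 4x4 matrix into a std140 byte buffer.
static std::vector<unsigned char> pack_transform_block(
    float scale, const float translation[3],
    const float rotation[3], const float projection[16])
{
    const int F = static_cast<int>(sizeof(float));
    std::vector<unsigned char> buffer(144, 0);
    int ao = 0; // aligned offset
    int bo = 0; // base offset

    ao = align_up(4, bo);                 // float, rule 1
    std::memcpy(&buffer[ao], &scale, F);
    bo = ao + F;

    ao = align_up(16, bo);                // vec3, rule 3
    std::memcpy(&buffer[ao], translation, 3 * F);
    bo = ao + 3 * F;

    for (int i = 0; i < 3; ++i) {         // float[3], rule 4
        ao = align_up(16, bo);
        std::memcpy(&buffer[ao], &rotation[i], F);
        bo = ao + F;
    }
    bo = align_up(16, bo);                // pad end of array

    for (int i = 0; i < 4; ++i) {         // mat4 columns, rule 5
        ao = align_up(16, bo);
        std::memcpy(&buffer[ao], projection + 4 * i, 4 * F);
        bo = ao + 4 * F;
    }

    assert(bo == static_cast<int>(buffer.size()));
    return buffer;
}
```

The members end up at byte offsets 0 (scale), 16 (translation), 32, 48 
and 64 (rotation elements) and 80, 96, 112 and 128 (matrix columns).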



