On Wed, Dec 10, 2008 at 08:05:46AM -0700, Brian Paul wrote:
> Ian Romanick wrote:
>> I did an
>> implementation of Marc Olano's noise function[1].  It was quite fast on
>> a Geforce 8, but it choked Mesa's GLSL compiler.
>
> I'd be interested in seeing the shader, Ian, and fixing the compiler.  I  
> know we'll transition to a new compiler, but I want to fix bugs in the  
> current one too.

Unfortunately (or fortunately, depending on your point of view) the
current version seems to work just fine with Mesa's compiler.  I
haven't touched this code in months, so my memory is a bit fuzzy.  I
seem to recall that an earlier version used a small look-up table in
the grad function, and that was the source of the errors in Mesa.

There were two problems, IIRC.  The first was that GLSL 1.20
array initializers were not supported (such as below).  I think there
was also some problem with variable indexing into an array.

uniform vec2 grad_table[4] = vec2 [4] (vec2( 1.0,  1.0),
                                       vec2(-1.0,  1.0),
                                       vec2( 1.0, -1.0),
                                       vec2(-1.0, -1.0));

I worked around those cases so that it would work with the software
rasterizer.  With the work-arounds that I had in place (using
conditionals instead of a variable-indexed array), it worked with
software but not with i965.  It used too many registers and too many
instructions.  D'oh!

I made some changes yesterday, and the code is quite a bit more
optimized.  There are no conditionals and no arrays.  It's all just
straight-through code.  With those changes, it even works with i965
hardware.  I've attached the shader, a trivially modified version of
glslnoise.c, and a Perl script to convert the shader to a C header file.
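
For illustration, here is a rough CPU-side sketch in C of how a table like
grad_table can be replaced by straight-line arithmetic.  It is not taken
from the attached shader: the names grad_lookup, grad_arith, parity_sign,
floor_nonneg, mod2 and check_grad_arith are made up for this example, and it
assumes the index is a small non-negative number.  The X sign comes from the
parity of the index and the Y sign from the parity of the index divided by
two, which is the same floor/mod trick the grad functions in inoise.glsl use.

```c
#include <assert.h>

struct vec2 { float x, y; };

/* Same values as the grad_table initializer quoted earlier in this mail. */
static const struct vec2 grad_table[4] = {
    {  1.0f,  1.0f },
    { -1.0f,  1.0f },
    {  1.0f, -1.0f },
    { -1.0f, -1.0f },
};

/* Hypothetical helper: the variable-indexed look-up. */
static struct vec2 grad_lookup(int i)
{
    return grad_table[i & 3];
}

/* floor() for non-negative inputs only (assumption of this sketch). */
static float floor_nonneg(float v)
{
    return (float) (int) v;
}

/* mod(v, 2.0) written the way GLSL defines it: v - 2 * floor(v / 2). */
static float mod2(float v)
{
    return v - 2.0f * floor_nonneg(v * 0.5f);
}

/* +1.0 if floor(v) is even, -1.0 if it is odd.  No branches, no table. */
static float parity_sign(float v)
{
    return 1.0f - 2.0f * mod2(floor_nonneg(v));
}

/* Hypothetical helper: the same four gradients, computed arithmetically. */
static struct vec2 grad_arith(float i)
{
    struct vec2 g = { parity_sign(i), parity_sign(i * 0.5f) };
    return g;
}

/* Check that both forms agree for every table index. */
static void check_grad_arith(void)
{
    int i;

    for (i = 0; i < 4; i++) {
        struct vec2 a = grad_lookup(i);
        struct vec2 b = grad_arith((float) i);

        assert(a.x == b.x && a.y == b.y);
    }
}
```

Calling check_grad_arith() from a small main() is enough to confirm that the
two forms agree for indices 0 through 3.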

Build with:

        perl stringify.pl < inoise.glsl > inoise.glsl.h
        gcc -O glslnoise.c -lglut -lGL

It should run with either software Mesa or recent versions (perhaps
git master only?) of the i965 driver.
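
For anyone without Perl handy, the per-line conversion can be sketched in C
roughly as follows.  This is only an approximation of what stringify.pl
does: stringify_line and stringify_self_check are hypothetical names, and
the sketch deliberately leaves out the script's doubling of '%', on the
assumption that the generated header is only ever handed to
glShaderSourceARB and never used as a printf format string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of stringify.pl): turn one line of shader
 * source into a quoted C string literal.  Backslashes, double quotes and
 * newlines are escaped.  Returns the number of characters written, not
 * counting the terminating NUL, or 0 if the output buffer is too small.
 */
static size_t stringify_line(const char *in, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size < 3)           /* room for two quotes and the NUL */
        return 0;

    out[n++] = '"';
    for (; *in != '\0'; in++) {
        const char *esc = NULL;

        switch (*in) {
        case '\\': esc = "\\\\"; break;
        case '"':  esc = "\\\""; break;
        case '\n': esc = "\\n";  break;
        }

        /* Worst case: two characters now, then closing quote and NUL. */
        if (n + 4 > out_size)
            return 0;

        if (esc != NULL) {
            out[n++] = esc[0];
            out[n++] = esc[1];
        } else {
            out[n++] = *in;
        }
    }
    out[n++] = '"';
    out[n] = '\0';
    return n;
}

/* Self-check of the escaping rules on a line with all three special cases. */
static void stringify_self_check(void)
{
    char buf[64];

    stringify_line("a \"b\" \\ c\n", buf, sizeof(buf));
    assert(strcmp(buf, "\"a \\\"b\\\" \\\\ c\\n\"") == 0);
}
```

Feeding each input line through stringify_line and printing the result
followed by a newline gives a header that can be #included in the middle of
a string initializer, which is how glslnoise.c uses inoise.glsl.h.
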
#!/usr/bin/env perl

while (<STDIN>) {
      s/\\/\\\\/g;
      s/\"/\\\"/g;
      s/%/%%/g;
      s/\n/\\n/g;
      print "\"$_\"\n";
}
/*
 * (C) Copyright Ian D. Romanick Corporation 2008
 * All Rights Reserved.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * on the rights to use, copy, modify, merge, publish, distribute, sub
 * license, and/or sell copies of the Software, and to permit persons to whom
 * the Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice (including the next
 * paragraph) shall be included in all copies or substantial portions of the
 * Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.  IN NO EVENT SHALL
 * AUTHORS, COPYRIGHT HOLDERS, AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
 * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
 * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
 * USE OR OTHER DEALINGS IN THE SOFTWARE.
 */

/**
 * \file
 * Implementation of Marc Olano's modified noise function.
 *
 * Fairly straight-forward implementation of the modified noise algorithm
 * described in:
 *
 *     Olano, Marc. "Modified Noise for Evaluation on Graphics Hardware".
 *         Proceedings of Graphics Hardware 2005, Eurographics/ACM SIGGRAPH,
 *         July 2005.
 *
 * The implementations of inoise[234] could be vastly improved.  Vectorizing
 * the calculations instead of making multiple independent inoise1 calls
 * should give a healthy speed-up in that code.  I did not implement this
 * because I believe that inoise[234] are called much less frequently than
 * inoise1.  Patches and evidence to the contrary are always welcome.
 *
 * \author Ian Romanick <[email protected]>
 */


/**
 * Smooth transition with a 5th degree polynomial
 *
 * This function is much like the existing \c smoothstep function.  Instead
 * of using a 3rd degree polynomial, it uses the following 5th degree
 * polynomial:
 *
 *        s = 6t^5 - 15t^4 + 10t^3
 *
 * This is the same smooth step function used by Perlin's improved noise
 * function.  See:
 * 
 *     Perlin, K. 2002. Improving noise. In Proceedings of the 29th Annual
 *         Conference on Computer Graphics and interactive Techniques
 *         (San Antonio, Texas, July 23 - 26, 2002). SIGGRAPH '02. ACM,
 *         New York, NY, 681-682.  http://mrl.nyu.edu/~perlin/paper445.pdf
 * 
 * \bug
 * For unknown reasons, ATI's GLSL compiler for R300 mis-compiles this
 * version of this function.  See the use of \c ATI_DRIVER_WORKAROUND below.
 * This was observed on a Mobile FireGL T2 (Radeon 9800-ish) in an IBM
 * Thinkpad T41p.
 */
vec2 smoothstep5(vec2 t)
{
        return t * t * t * (t * (t * 6.0 - 15.0) + 10.0);
}


float smoothstep5(float t)
{
        return t * t * t * (t * (t * 6.0 - 15.0) + 10.0);
}


vec2 perm2(float x)
{
        vec2 v = vec2(x, x + 1.0);
        v = (3.0 * v) + 3.0;
        return mod(v * v, 61.0);
}


vec4 perm4(vec2 x)
{
        vec4 v = vec4(x.x, x.x + 1.0, x.y, x.y + 1.0);
        v = (3.0 * v) + 3.0;
        return mod(v * v, 61.0);
}


/**
 * Vectorized 1-dimensional gradient function
 *
 * \return
 * The output X component is \c p.x if \c x.x is even or \c -p.x if \c x.x
 * is odd.  The output Y component is set similarly but uses \c p.y and
 * \c x.y.
 */
vec2 grad(vec2 x, vec2 p)
{
        /* floor(x) is...       even   odd
         * previous % 2.0       0.0    1.0
         * 2.0 * previous       0.0    2.0
         * 1.0 - previous       1.0   -1.0
         */
        return p * (1.0 - (2.0 * mod(floor(x), 2.0)));
}


/**
 * Vectorized 2-dimensional gradient function
 *
 * Similar to the 1-dimensional gradient function except Y components are
 * based on the parity of \c x.y / 2.0.
 */
vec2 grad(vec2 x, vec2 p1, vec2 p2)
{
        vec4 x2 = x.xxyy * vec4(1.0, 0.5, 1.0, 0.5);
        vec4 grad_val = (1.0 - (2.0 * mod(floor(x2), 2.0)));

        return vec2(dot(grad_val.xy, p1), dot(grad_val.zw, p2));
}


float inoise1_2d(vec2 p)
{
        const vec2 bias = vec2(0.0, -1.0);
        vec2 intP = floor(p);
        vec2 frcP = fract(p);


#ifdef ATI_DRIVER_WORKAROUND
        vec2 f = vec2(smoothstep5(frcP.x), smoothstep5(frcP.y));
#else
        vec2 f = smoothstep5(frcP);
#endif
        vec2 A = perm2(intP.x) + intP.y;
        vec4 AA = perm4(A);

        vec4 g = vec4(grad(AA.xz, frcP          , frcP + bias.yx),
                      grad(AA.yw, frcP + bias.xy, frcP + bias.yy));
        vec4 fv = vec2(f.x, 1.0 - f.x).yxyx * vec2(f.y, 1.0 - f.y).yyxx;

        return dot(g, fv);
}


float inoise1_basis(float p, vec2 v)
{
        float f = smoothstep5(fract(p));
        vec2 A = perm2(floor(p));
        vec2 g = grad(A, v);

        return mix(g.x, g.y, f);
}


vec2 inoise1_basis(float p, vec4 v)
{
        float f = smoothstep5(fract(p));
        vec2 A = perm2(floor(p));
        vec4 g = vec4(grad(A, v.xy), grad(A, v.zw));

        return mix(g.xz, g.yw, f);
}


float inoise1_1d(float p)
{
        float frcP = fract(p);

        /* Offsets from the left and right lattice points, matching the
         * -1.0 bias used in inoise1_2d.
         */
        return inoise1_basis(p, vec2(frcP, frcP - 1.0));
}


float inoise1_3d(vec3 p)
{
        vec2 pp = perm2(floor(p.z)) + p.y;

        return inoise1_basis(p.z,
                             vec2(inoise1_2d(vec2(p.x, pp.x)),
                                  inoise1_2d(vec2(p.x, pp.y))));
}


/* The shorter version runs slower on real hardware.  However, the longer
 * version takes a really long time for Mesa's GLSL compiler to process.
 */
#if 0
float inoise1_4d(vec4 p)
{
        float intW = floor(p.w);
        vec2 pp = perm2(intW);

        return inoise1_basis(p.w,
                             vec2(inoise1_3d(vec3(p.xy, p.z + pp.x)),
                                  inoise1_3d(vec3(p.xy, p.z + pp.y))));
}
#else
float inoise1_4d(vec4 p)
{
        float intW = floor(p.w);
        vec2 pp = perm2(intW);

        vec2 q = pp + p.zz;
        vec2 intZ = floor(q);
        vec4 qq = perm4(intZ) + p.yyyy;

        qq.x = inoise1_2d(vec2(p.x, qq.x));
        qq.y = inoise1_2d(vec2(p.x, qq.y));
        qq.z = inoise1_2d(vec2(p.x, qq.z));
        qq.w = inoise1_2d(vec2(p.x, qq.w));

        return inoise1_basis(p.w, inoise1_basis(q.x, qq));
}
#endif


/* This indirection is a work-around for a bug in Nvidia's GLSL compiler.
 * When that compiler tries to compile a function that calls a function with
 * the same name, even if the parameter signatures are different, it
 * segfaults.
 */
float inoise1(float p) { return inoise1_1d(p); }
float inoise1(vec2 p) { return inoise1_2d(p); }
float inoise1(vec3 p) { return inoise1_3d(p); }
float inoise1(vec4 p) { return inoise1_4d(p); }


const vec4 noise2_bias = vec4(601., 313., 29., 277.);
const vec4 noise3_bias = vec4(1559., 113., 1861., 797.);

vec2 inoise2(vec4 p)
{ return vec2(inoise1(p), inoise1(p + noise2_bias)); }
vec3 inoise3(vec4 p)
{ return vec3(inoise2(p), inoise1(p + noise3_bias)); }
vec4 inoise4(vec4 p)
{ return vec4(inoise2(p), inoise2(p + noise3_bias)); }


vec2 inoise2(vec3 p)
{ return vec2(inoise1(p), inoise1(p + noise2_bias.xyz)); }
vec3 inoise3(vec3 p)
{ return vec3(inoise2(p), inoise1(p + noise3_bias.xyz)); }
vec4 inoise4(vec3 p)
{ return vec4(inoise2(p), inoise2(p + noise3_bias.xyz)); }


vec2 inoise2(vec2 p)
{ return vec2(inoise1(p), inoise1(p + noise2_bias.xy)); }
vec3 inoise3(vec2 p)
{ return vec3(inoise2(p), inoise1(p + noise3_bias.xy)); }
vec4 inoise4(vec2 p)
{ return vec4(inoise2(p), inoise2(p + noise3_bias.xy)); }


vec2 inoise2(float p)
{ return vec2(inoise1(p), inoise1(p + noise2_bias.x)); }
vec3 inoise3(float p)
{ return vec3(inoise2(p), inoise1(p + noise3_bias.x)); }
vec4 inoise4(float p)
{ return vec4(inoise2(p), inoise2(p + noise3_bias.x)); }
/*
 * GLSL noise demo.
 *
 * Michal Krol
 * 20 February 2006
 *
 * Based on the original demo by:
 * Stefan Gustavson ([email protected]) 2004, 2005
 */

#ifdef WIN32
#include <windows.h>
#endif

#include <stdio.h>
#include <stdlib.h>
#include <GL/gl.h>
#include <GL/glut.h>
#include <GL/glext.h>

#ifdef WIN32
#define GETPROCADDRESS(F) wglGetProcAddress(F)
#else
#define GETPROCADDRESS(F) glutGetProcAddress(F)
#endif

static GLhandleARB fragShader;
static GLhandleARB vertShader;
static GLhandleARB program;

static GLint uTime;

static GLint t0 = 0;
static GLint frames = 0;

static GLfloat u_time = 0.0f;

static PFNGLCREATESHADEROBJECTARBPROC glCreateShaderObjectARB = NULL;
static PFNGLSHADERSOURCEARBPROC glShaderSourceARB = NULL;
static PFNGLCOMPILESHADERARBPROC glCompileShaderARB = NULL;
static PFNGLCREATEPROGRAMOBJECTARBPROC glCreateProgramObjectARB = NULL;
static PFNGLATTACHOBJECTARBPROC glAttachObjectARB = NULL;
static PFNGLLINKPROGRAMARBPROC glLinkProgramARB = NULL;
static PFNGLUSEPROGRAMOBJECTARBPROC glUseProgramObjectARB = NULL;
static PFNGLGETUNIFORMLOCATIONARBPROC glGetUniformLocationARB = NULL;
static PFNGLUNIFORM1FARBPROC glUniform1fARB = NULL;
static PFNGLGETINFOLOGARBPROC gl_GetInfoLogARB = NULL;

static void Redisplay (void)
{
   GLint t;

        glClear (GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

        glUniform1fARB (uTime, 0.5f * u_time);

        glPushMatrix ();
        glutSolidSphere (2.0, 20, 10);
        glPopMatrix ();

        glutSwapBuffers();
   frames++;

   t = glutGet (GLUT_ELAPSED_TIME);
   if (t - t0 >= 5000) {
      GLfloat seconds = (GLfloat) (t - t0) / 1000.0f;
      GLfloat fps = frames / seconds;
      printf ("%d frames in %6.3f seconds = %6.3f FPS\n", frames, seconds, fps);
      t0 = t;
      frames = 0;
   }
}

static void Idle (void)
{
        u_time += 0.1f;
        glutPostRedisplay ();
}

static void Reshape (int width, int height)
{
        glViewport (0, 0, width, height);
        glMatrixMode (GL_PROJECTION);
        glLoadIdentity ();
        glFrustum (-1.0, 1.0, -1.0, 1.0, 5.0, 25.0);
        glMatrixMode (GL_MODELVIEW);
        glLoadIdentity ();
        glTranslatef (0.0f, 0.0f, -15.0f);
}

static void Key (unsigned char key, int x, int y)
{
        (void) x;
        (void) y;

        switch (key)
        {
        case 27:
                exit(0);
                break;
        }
        glutPostRedisplay ();
}


static void dump_info_log(const char *name, GLhandleARB handle)
{
        static GLcharARB info_log[1024];
        GLsizei size;


        gl_GetInfoLogARB(handle, sizeof(info_log), &size, info_log);

        if (size > 0) {
                printf("%s log:\n", name);
                printf("%s\n", info_log);
        } else {
                printf("%s log: empty\n", name);
        }
}


static void Init (void)
{
   static const char *fragShaderText =
#include "inoise.glsl.h"
      "uniform float time;\n"
      "varying vec3 position;\n"
      "void main () {\n"
      "   gl_FragColor = vec4 (vec3 (0.5 + 0.5 * inoise1(vec4(position, time))), 1.0);\n"
      "}\n"
   ;
   static const char *vertShaderText =
      "varying vec3 position;\n"
      "void main () {\n"
      "   gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;\n"
      "   position = 4.0 * gl_Vertex.xyz;\n"
      "}\n"
   ;

        if (!glutExtensionSupported ("GL_ARB_fragment_shader"))
        {
                printf ("Sorry, this demo requires GL_ARB_fragment_shader\n");
                exit(1);
        }
        if (!glutExtensionSupported ("GL_ARB_shader_objects"))
        {
                printf ("Sorry, this demo requires GL_ARB_shader_objects\n");
                exit(1);
        }
        if (!glutExtensionSupported ("GL_ARB_shading_language_100"))
        {
                printf ("Sorry, this demo requires GL_ARB_shading_language_100\n");
                exit(1);
        }
        if (!glutExtensionSupported ("GL_ARB_vertex_shader"))
        {
                printf ("Sorry, this demo requires GL_ARB_vertex_shader\n");
                exit(1);
        }

        glCreateShaderObjectARB = (PFNGLCREATESHADEROBJECTARBPROC)
                GETPROCADDRESS("glCreateShaderObjectARB");
        glShaderSourceARB = (PFNGLSHADERSOURCEARBPROC)
                GETPROCADDRESS("glShaderSourceARB");
        glCompileShaderARB = (PFNGLCOMPILESHADERARBPROC)
                GETPROCADDRESS("glCompileShaderARB");
        glCreateProgramObjectARB = (PFNGLCREATEPROGRAMOBJECTARBPROC)
                GETPROCADDRESS("glCreateProgramObjectARB");
        glAttachObjectARB = (PFNGLATTACHOBJECTARBPROC)
                GETPROCADDRESS("glAttachObjectARB");
        glLinkProgramARB = (PFNGLLINKPROGRAMARBPROC)
                GETPROCADDRESS ("glLinkProgramARB");
        glUseProgramObjectARB = (PFNGLUSEPROGRAMOBJECTARBPROC)
                GETPROCADDRESS("glUseProgramObjectARB");          

        glGetUniformLocationARB = (PFNGLGETUNIFORMLOCATIONARBPROC)
                GETPROCADDRESS("glGetUniformLocationARB");
        glUniform1fARB = (PFNGLUNIFORM1FARBPROC)
                GETPROCADDRESS("glUniform1fARB");

        gl_GetInfoLogARB = (PFNGLGETINFOLOGARBPROC)
                GETPROCADDRESS("glGetInfoLogARB");

        fragShader = glCreateShaderObjectARB (GL_FRAGMENT_SHADER_ARB);
        glShaderSourceARB (fragShader, 1, &fragShaderText, NULL);
        glCompileShaderARB (fragShader);

        dump_info_log("frag shader", fragShader);

        vertShader = glCreateShaderObjectARB (GL_VERTEX_SHADER_ARB);
        glShaderSourceARB (vertShader, 1, &vertShaderText, NULL);
        glCompileShaderARB (vertShader);

        dump_info_log("vert shader", vertShader);

        program = glCreateProgramObjectARB ();
        glAttachObjectARB (program, fragShader);
        glAttachObjectARB (program, vertShader);
        glLinkProgramARB (program);

        dump_info_log("program", program);

        glUseProgramObjectARB (program);

        uTime = glGetUniformLocationARB (program, "time");

        glClearColor (0.0f, 0.1f, 0.3f, 1.0f);
        glEnable (GL_CULL_FACE);
        glEnable (GL_DEPTH_TEST);

        printf ("GL_RENDERER = %s\n", (const char *) glGetString (GL_RENDERER));
}

int main (int argc, char *argv[])
{
        glutInit (&argc, argv);
        glutInitWindowPosition ( 0, 0);
        glutInitWindowSize (1300, 600);
        glutInitDisplayMode (GLUT_RGB | GLUT_DOUBLE | GLUT_DEPTH);
        glutCreateWindow (argv[0]);
        glutReshapeFunc (Reshape);
        glutKeyboardFunc (Key);
        glutDisplayFunc (Redisplay);
        glutIdleFunc (Idle);
        Init ();
        glutMainLoop ();
        return 0;
}
