Forwarded with permission.
----- Forwarded message from John Klassa <[EMAIL PROTECTED]> -----
From: John Klassa <[EMAIL PROTECTED]>
To: Kragen Javier Sitaker <[EMAIL PROTECTED]>
Subject: Re: wave mechanics in C with SDL
Interesting... I'd never installed SDL on my MacOS box, so I gave it
a whirl.
I downloaded SDL-1.2.13.dmg and SDL-devel-1.2.13-extras.dmg, thinking
I'd need the latter in order to build anything. Turns out, all you
need (at least for your example) is SDL proper.
So, from SDL-1.2.13, I copied SDL.framework into /Library/Frameworks,
then copied the (apparently required) SDLMain.m to /usr/local/lib (for
lack of a better place).
To build it, I did:
gcc -O5 -fomit-frame-pointer -g \
-I /Library/Frameworks/SDL.framework/Headers \
kragen.c /usr/local/lib/SDLMain.m \
-framework SDL -framework Cocoa -Wall -o kragen
and then ran it. It worked! Nice wave effect...
Thanks for sharing!
JK
--
John Klassa | [EMAIL PROTECTED]
On Jan 5, 2008, at 3:37 AM, Kragen Javier Sitaker wrote:
>I wanted to see how much faster I could get in C than in Python. The
>answer so far is only about five times.
>
>// A simple little wave-mechanics display thing in SDL, by Kragen
>// Javier Sitaker, 2007-12-07 and 08.
>
>// Originally I wrote it in Python with PyGame (49 lines of code), but
>// I was disappointed with how slowly it ran, about 17fps on my
>// PIII-Coppermine-700 at 320x240. So I thought I'd try writing it in
>// C to see how much faster it was, and much to my surprise, it was
>// only something like 50% faster at 26fps --- although the size
>// penalty was not as bad as I thought, as my first C version is only
>// about 70 lines of code. And it was fast to write! It took under
>// an hour. Let's hear it for SDL!
>
>// But it seemed like the sqrt and sin functions were taking up a lot
>// of the run time, and any floating-point math seemed to be kind of
>// costly, so I wrote this "all-integer" version. In places, the
>// integers are scaled by 256 so as to be able to represent fractional
>// values, and there are tables that are initialized (to integers)
>// using floating-point math. (I thought floating-point was supposed
>// to be fast now, but I guess my 1999 CPU hasn't gotten the news
>// yet.) This quadrupled the speed to 105fps.
>
>// This version has a couple of interfering waves, one of which has a
>// moving center, and so it only gets 61fps instead of 105fps at
>// 320x240 on my laptop. But that's still faster than my screen
>// refreshes.
>
>// Unfortunately the moving center doesn't produce a Doppler effect as
>// it ought to. That will require a different approach.
>
>// I compiled with gcc -Wall -O5 -fomit-frame-pointer -g -lSDL on gcc
>// 4.1.2.
>
>#include <SDL/SDL.h>
>#include <stdio.h>
>#include <sys/time.h>
>#include <time.h>
>#include <assert.h>
>#include <math.h>
>
>// returns current time in 256ths of a second
>int getnow_scaled() {
> struct timeval now;
> gettimeofday(&now, 0);
> return (now.tv_sec << 8) + now.tv_usec * 256 / 1000000;
>}
>
>int make_grayscale_component(Uint32 mask, float level) {
> return (int)(mask * level) & mask;
>}
>
>#define max_grayscale_value 256
>short grayscale_palette[max_grayscale_value];
>void init_grayscale_palette(struct SDL_PixelFormat *format) {
> int ii;
> assert(sizeof(grayscale_palette[0]) == format->BytesPerPixel);
> for (ii = 0; ii < max_grayscale_value; ii++) {
> float level = ii / 256.0;
> grayscale_palette[ii] =
> make_grayscale_component(format->Rmask, level) +
> make_grayscale_component(format->Gmask, level) +
> make_grayscale_component(format->Bmask, level);
> }
>}
>
>// The idea here is that sqrtx256[x/256] == 256 * sqrt(x), which is
>// equivalent to saying that sqrtx256[x] == 256 * 16 * sqrt(x), so we
>// can do exact lookups for small numbers and linear interpolation for
>// large ones.
>
>// The max we'll currently encounter is 640*640 + 480*480 = 640 000,
>// which is 256 * 2500.
>#define max_sqrtx256 2500
>int sqrtx256[max_sqrtx256];
>void init_sqrtx256() {
> int ii;
> for (ii = 0; ii < max_sqrtx256; ii++)
> sqrtx256[ii] = (int)(0.5 + 256 * 16 * sqrt(ii));
>}
>
>// returns an approximation of 256 * sqrt(sqr)
>int inline interp_sqrt(int sqr) {
> // This line eats 10% of performance:
> if (sqr < max_sqrtx256) return sqrtx256[sqr] >> 4;
> // the rest of this routine compiles inline to 10 instructions: mov
> // and sar mov mov sub imul sar lea jmp.
> int high = sqr >> 8; // we hope high < max_sqrtx256-1
> int below = sqrtx256[high];
> int above = sqrtx256[high+1];
> int spread = above - below;
> int correction = ((sqr & 0xff) * spread) >> 8;
> return correction + below;
>}
>
>// 4 * pi * 256, accurate to six places; I'd take things modulo
>// 2 * pi * 256 but rounding that to an integer introduces 100x as
>// much error
>#define fourpi 3217
>
>// rtable: the idea is that you index into rtable with a value that
>// represents an angle, in radians, scaled by 256, which in this case
>// comes from the distance from the center of some wave minus the
>// current time; and you get back a value scaled to the range [0, 256)
>// representing (1 + sin(theta))/2.
>unsigned char rtable[fourpi];
>void init_rtable() {
> int ii;
> for (ii = 0; ii < fourpi; ii++) {
> int sin_val = 128 + 128 * sin(ii / 256.0) + 0.5;
> if (sin_val > 255) sin_val = 255;
> if (sin_val < 0) sin_val = 0;
> rtable[ii] = sin_val;
> }
>}
>
>int xorigin=0, yorigin=100;
>void redraw_world(SDL_Surface *screen) {
> int xx, yy, w = screen->w, h = screen->h;
> short *pix;
> int now_scaled = getnow_scaled();
> assert(sizeof(*pix) == screen->format->BytesPerPixel);
> SDL_LockSurface(screen);
> pix = screen->pixels;
> for (yy = 0; yy < h; yy++) {
> for (xx = 0; xx < w; xx++) {
> int rx = xx - w/2, ry = yy - h/2;
> unsigned rscaled = interp_sqrt(rx*rx + ry*ry)/(w/64) -
>now_scaled;
> int dx = xx - xorigin, dy = yy - yorigin;
> unsigned rscaled2= interp_sqrt(dx*dx + dy*dy)/(w/32) -
>now_scaled * 4;
> *pix++ = grayscale_palette[(rtable[rscaled % fourpi] +
> rtable[rscaled2% fourpi])/2];
> }
> }
> SDL_UnlockSurface(screen);
>}
>
>int lastframe; // global so main() can
>initialize it
>#define movement_scaling 16
>void update_moving_center(SDL_Surface *screen) {
> int now = getnow_scaled();
> int dt = now - lastframe; // delta time since last frame
> int dx, dy;
> static int dx_err = 0, dy_err = 0; // keep track of fractional
>pixels
> static int xdir = 1, ydir = 1;
> lastframe = now; // we only wanted lastframe to
>get dt
> dx = xdir * dt + dx_err; // now calculate desired pixel
>movements
> dy = ydir * dt + dy_err;
>
> xorigin += dx / movement_scaling; // apply them
> yorigin += dy / movement_scaling;
> dx_err = dx % movement_scaling; // save up fractional pixels
>for later
> dy_err = dy % movement_scaling;
>
> if (xorigin > screen->w) xdir = -1; // bounce if need be; it's OK
>to be
> if (xorigin < 0) xdir = 1; // a little bit off the screen
> if (yorigin > screen->h) ydir = -1;
> if (yorigin < 0) ydir = 1;
>}
>
>int main(int argc, char **argv) {
> int frames = 0;
> int start, end;
> SDL_Surface *screen;
> SDL_Init(SDL_INIT_EVERYTHING);
> screen = SDL_SetVideoMode(320, 240, 16, SDL_FULLSCREEN);
> //screen = SDL_SetVideoMode(640, 480, 16, SDL_FULLSCREEN);
> if (!screen) abort();
> init_grayscale_palette(screen->format);
> init_sqrtx256();
> //memcpy(sqrtx256, init_sqrtx256, 1024); just for fun
> init_rtable();
> lastframe = start = getnow_scaled();
> for (;;) {
> SDL_Event ev;
> if (SDL_PollEvent(&ev)) {
> if (ev.type == SDL_MOUSEBUTTONDOWN || ev.type == SDL_QUIT) break;
> } else {
> redraw_world(screen);
> SDL_Flip(screen);
> frames++;
> update_moving_center(screen);
> }
> }
> end = getnow_scaled();
> SDL_Quit();
> printf("%.2f seconds, %.2f fps\n", (end - start)/256.0,
> 256.0 * frames / (end - start));
> return 0;
>}
>
>
----- End forwarded message -----