On 10/22/06, Marvin Humphrey <[EMAIL PROTECTED]> wrote:
Greets,
Since C doesn't support OO directly, we'll have to roll our own scheme.
Keeping things as simple and as C-like as possible is desirable. The
more that Lucy looks like other C apps, the easier it will be for
people to contribute.
Here are two full-featured OO architectures which are very
interesting and from which we might draw some ideas, but which would
be overkill for our purposes:
http://ldeniau.home.cern.ch/ldeniau/html/oopc/oopc.html
http://www.planetpdf.com/developer/article.asp?ContentID=6635
Both Ferret and KinoSearch implement OO via structs which include
methods as function pointers within them. They diverge in how
inheritance is handled, though the differences are somewhat
superficial. Here's an illustration of how they work using "Animal"
as a superclass and "Dog" as a subclass.
typedef struct Animal Animal;
typedef struct Dog Dog;
Ferret creates a base class, then includes the base class struct as
the first member in a subclass struct:
struct Animal {
char *name;
char* (*speak)(Animal *self);
};
struct Dog {
Animal super;
void (*chase_cat)(Dog *self);
};
KinoSearch uses macros to manage inheritance.
#define Animal_Base_Struct \
char *name; \
char* (*speak)(Animal *self);
struct Animal {
Animal_Base_Struct
};
struct Dog {
Animal_Base_Struct
void (*chase_cat)(Dog *self);
};
These schemes differ in how casts get applied, but they're pretty
close to the same thing.
/* Ferret */
#define ANIMAL(x) ((Animal*)(x))
static void
chase_cat(Dog *self)
{
ANIMAL(self)->speak(ANIMAL(self));
scamper();
}
/* KinoSearch */
static void
chase_cat(Dog *self)
{
self->speak((Animal*)self);
scamper();
}
Obviously, things have been working out well for both KS and Ferret
with this architecture. However, including all methods as function
pointers directly within the primary object has a big drawback: it
wastes memory, especially if we try to establish a base class Obj
with a bunch of handy methods like to_string(), compare(), clone(),
and so on.
Ferret's InStream class points the way forward, implementing
something akin to a poor man's virtual table. InStream objects
include a struct member instream->m, which is a pointer to one
of two InStreamMethods structs, depending on whether the InStream is
using a ram file or not. The InStreamMethods struct has four
members: read_i, seek_i, length_i, and close_i. The InStream object
thus gets four methods for the price of one.
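A minimal sketch of that arrangement, for illustration only. The member names read_i, seek_i, length_i, close_i and the m pointer come from the description above; the signatures, the other struct fields, and the ram_instream/fs_instream constructors are assumptions made up for this example and are not taken from Ferret's source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct InStream InStream;

/* Four method pointers, stored once per kind of stream rather than once
 * per object.  The signatures are guesses made for this sketch. */
typedef struct InStreamMethods {
    size_t (*read_i)(InStream *self, char *dest, size_t len);
    void   (*seek_i)(InStream *self, long pos);
    long   (*length_i)(InStream *self);
    void   (*close_i)(InStream *self);
} InStreamMethods;

struct InStream {
    const InStreamMethods *m;  /* one pointer buys all four methods */
    const char *ram;           /* hypothetical: buffer for RAM streams */
    long        ram_len;
    FILE       *fh;            /* hypothetical: handle for file streams */
    long        pos;
};

/* RAM-backed implementations. */
static size_t ram_read_i(InStream *self, char *dest, size_t len) {
    long   left  = self->ram_len - self->pos;
    size_t avail = left > 0 ? (size_t)left : 0;
    if (len > avail) len = avail;
    memcpy(dest, self->ram + self->pos, len);
    self->pos += (long)len;
    return len;
}
static void ram_seek_i(InStream *self, long pos) { self->pos = pos; }
static long ram_length_i(InStream *self) { return self->ram_len; }
static void ram_close_i(InStream *self) { self->ram = NULL; self->ram_len = 0; }

/* File-backed implementations. */
static size_t fs_read_i(InStream *self, char *dest, size_t len) {
    size_t got = fread(dest, 1, len, self->fh);
    self->pos += (long)got;
    return got;
}
static void fs_seek_i(InStream *self, long pos) {
    fseek(self->fh, pos, SEEK_SET);
    self->pos = pos;
}
static long fs_length_i(InStream *self) {
    long len;
    fseek(self->fh, 0, SEEK_END);
    len = ftell(self->fh);
    fseek(self->fh, self->pos, SEEK_SET);  /* restore the read position */
    return len;
}
static void fs_close_i(InStream *self) { fclose(self->fh); self->fh = NULL; }

/* The two shared tables; every InStream points at one of them. */
static const InStreamMethods RAM_METHODS = {
    ram_read_i, ram_seek_i, ram_length_i, ram_close_i
};
static const InStreamMethods FS_METHODS = {
    fs_read_i, fs_seek_i, fs_length_i, fs_close_i
};

InStream ram_instream(const char *data, long len) {
    InStream in;
    in.m = &RAM_METHODS; in.ram = data; in.ram_len = len;
    in.fh = NULL; in.pos = 0;
    return in;
}

InStream fs_instream(FILE *fh) {
    InStream in;
    assert(fh != NULL);
    in.m = &FS_METHODS; in.ram = NULL; in.ram_len = 0;
    in.fh = fh; in.pos = 0;
    return in;
}
```

Given InStream in = ram_instream("hello", 5), the call in.m->length_i(&in) dispatches through RAM_METHODS and returns 5. The caller never needs to know which table it got.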
The cost of a double-dereference to dispatch a method is
insignificant. However, the syntax is godawful...
len = instream->m->length_i(instream);
... which is why Dave has macro-fied it:
#define is_length(mis) mis->m->length_i(mis)
How about if we implement every method call in Lucy that way?
Sounds good to me. I would have implemented all of Ferret's classes
this way except that I was naively worrying about the performance
detriment of the extra pointer dereference. I thought the extra speed
may be worth the cost in memory. Anyway, I changed the Streams to test
the difference and I mean to refactor the rest of the classes like
this when I have time.
Each class gets its own virtual table, where all method pointers are
ensconced. Class data, e.g. class name, can also go there. However,
we don't have to go whole hog and implement all the features of C++;
this gets us basic inheritance and polymorphism, which is enough to
get by.
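One way the per-class tables might look, as a sketch under assumptions rather than a settled design. The layout below (a subclass table that begins with its parent's table, a class_name field, and the animal_new/dog_new/dog_chase_cat helpers) is hypothetical; only Animal, Dog, speak, chase_cat and the m pointer come from the discussion above.

```c
#include <assert.h>
#include <string.h>

typedef struct Animal Animal;
typedef struct Dog Dog;

/* Per-class table: class data plus method pointers. */
typedef struct AnimalVTable {
    const char *class_name;
    const char *(*speak)(Animal *self);
} AnimalVTable;

/* A subclass table starts with the parent's table, then adds its own
 * methods, so a DogVTable can be read as an AnimalVTable. */
typedef struct DogVTable {
    AnimalVTable super;
    int (*chase_cat)(Dog *self);
} DogVTable;

struct Animal {
    const AnimalVTable *m;  /* the only per-object method overhead */
    const char *name;
};

struct Dog {
    Animal super;           /* first member, so a Dog* is also an Animal* */
    int cats_chased;
};

static const char *animal_speak(Animal *self) { (void)self; return "..."; }
static const char *dog_speak(Animal *self)    { (void)self; return "Woof"; }
static int dog_chase_cat_impl(Dog *self)      { return ++self->cats_chased; }

/* One table per class, shared by every instance of that class. */
static const AnimalVTable ANIMAL = { "Animal", animal_speak };
static const DogVTable    DOG    = { { "Dog", dog_speak }, dog_chase_cat_impl };

Animal animal_new(const char *name) {
    Animal a;
    a.m = &ANIMAL;
    a.name = name;
    return a;
}

Dog dog_new(const char *name) {
    Dog d;
    d.super.m = &DOG.super;  /* Dog overrides speak via its own table */
    d.super.name = name;
    d.cats_chased = 0;
    return d;
}

/* Subclass-only method: recover the full Dog table from the m pointer. */
int dog_chase_cat(Dog *self) {
    assert(self->super.m == &DOG.super);
    return ((const DogVTable *)self->super.m)->chase_cat(self);
}
```

With Dog d = dog_new("Rex") and Animal *a = (Animal *)&d, the call a->m->speak(a) returns "Woof" and a->m->class_name is "Dog". The same call on a plain Animal goes to animal_speak, which is all the polymorphism this scheme tries to provide.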
One thing that's a little funky about Dave's is_length macro is that
it looks like a function call. I think it's important to distinguish
visually between function calls and methods, so if we go this route,
I suggest we take the capitalization scheme that the Lucy style guide
currently reserves for non-constant macros and use it for method calls
instead.
#define Animal_Speak(self) ((self)->m->speak((Animal*)(self)))
static void
chase_cat(Dog *self)
{
Animal_Speak(self);
scamper();
}
Animal_Speak(self) is easier to parse visually than either of these:
self->speak((Animal*)self);
ANIMAL(self)->speak(self);
This looks great. A lot clearer than what Ferret has. The other nice
thing about this approach is that you've virtually implemented access
levels. In ferret I append "_i" for private methods which are not
supposed to be called directly or from outside of the class. For
example, the Store class has the exists method which you can call from
anywhere. It also has the close_i method which you should never call
directly. Instead you should use store_deref which will
dereference the store and close it if necessary.
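To make the public/private split concrete, here is a rough sketch of the pattern being described. It is not Ferret's actual code: exists, close_i and store_deref are the names mentioned above, while the ref_cnt and is_open fields, the single-file store, and the store_open/store_ref helpers are hypothetical stand-ins.

```c
#include <assert.h>
#include <string.h>

typedef struct Store Store;

typedef struct StoreMethods {
    int  (*exists)(Store *self, const char *filename);  /* public */
    void (*close_i)(Store *self);                       /* private */
} StoreMethods;

struct Store {
    const StoreMethods *m;
    int ref_cnt;            /* hypothetical reference count */
    int is_open;
    const char *only_file;  /* hypothetical: a store holding one file */
};

static int store_exists(Store *self, const char *filename) {
    return self->is_open && strcmp(self->only_file, filename) == 0;
}

static void store_close_i(Store *self) { self->is_open = 0; }

static const StoreMethods STORE_METHODS = { store_exists, store_close_i };

Store store_open(const char *only_file) {
    Store s;
    s.m = &STORE_METHODS;
    s.ref_cnt = 1;          /* the opener holds the first reference */
    s.is_open = 1;
    s.only_file = only_file;
    return s;
}

/* Take another reference to a shared store. */
Store *store_ref(Store *self) {
    self->ref_cnt++;
    return self;
}

/* Drop a reference; the private close_i runs only when the last
 * holder lets go, which is why callers must not call it directly. */
void store_deref(Store *self) {
    assert(self->ref_cnt > 0);
    if (--self->ref_cnt == 0) {
        self->m->close_i(self);
    }
}
```

If two holders share a store, the first store_deref leaves it open and the second one closes it. Calling close_i directly would pull the store out from under the other holder.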
By using macros to access all class methods in this way we basically
designate those class methods as public. I really wish I'd thought of
this months ago. Another really nice result of this approach is that
we can easily apply filters to any method call, simply by swapping out
the macro with a function. So for example, if we want our animal to
breathe in before it speaks (best example I could think of in this case
:P):
#define Animal_Speak(self) Animal_speak((Animal *)(self))
void Animal_speak(Animal *self)
{
self->m->breath_in(self); // call Animal's private breath_in method
self->m->speak(self);
}
So in addition to making it possible to add methods to any class for
free, this scheme also cleans up the code visually where it matters
most: in the meat of the library.
Agreed
There are costs to this approach. It requires a fair amount of
boilerplate code. (We might want to consider using a symbol
generator.) I'm also pretty sure we'll want to write some
bootstrapping code to prep the virtual tables at startup, because it
will be difficult if not impossible to resolve all aspects of
inheritance at compile-time.
Could you give me an example of where this is difficult? I really
don't like the idea of bootstrapping, simply because it makes it
impossible to clean up the memory allocated during this process when
the application exits which in turn makes it difficult to use valgrind
to track down memory leaks. It also makes using the library from C a
bit uglier IMHO (that's not necessarily one of Lucy's goals, but it'd
still be nice). I'm happy to give up my beloved valgrind for this
project if it is really necessary.
--
Dave Balmain
http://www.davebalmain.com/