Yes, objects need to be reference types (ref object) for methods to work. This
is because only reference types can point to variable-length memory regions.
Take the code below:

    type
      Animal = ref object of RootObj
        name: string
      Dog = ref object of Animal
        breed: string

    # {.base.} marks the root of the method hierarchy
    method makeNoise(this: Animal) {.base.} =
      echo "Hi, I'm ", this.name

    method makeNoise(this: Dog) =
      echo "*Bark!* [said ", this.name, "]"

These type and method definitions translate roughly to the following
equivalent structures:

    # TypeInfo is an object containing type information
    # makeTypeInfo creates a TypeInfo object holding a type's information
    type
      AnimalObjBase = object of RootObj
        typeInfo: ptr TypeInfo
      AnimalBase = ptr AnimalObjBase
      AnimalObj = object of RootObj
        typeInfo: ptr TypeInfo
        name: pointer
      Animal = ptr AnimalObj
      DogObj = object of RootObj
        typeInfo: ptr TypeInfo
        name: pointer
        breed: pointer
      Dog = ptr DogObj

    var
      animalTypeInfo: TypeInfo = makeTypeInfo(AnimalObj)
      dogTypeInfo: TypeInfo = makeTypeInfo(DogObj)

    proc makeNoise_Animal(this: Animal) =
      echo "Hi, I'm ", this.name

    proc makeNoise_Dog(this: Dog) =
      echo "*Bark!* [said ", this.name, "]"

    proc makeNoise(this: AnimalBase) =
      # Dispatch by comparing the stored tag against each known type's info
      if this.typeInfo == addr animalTypeInfo:
        makeNoise_Animal(cast[Animal](this))
      elif this.typeInfo == addr dogTypeInfo:
        makeNoise_Dog(cast[Dog](this))

(This isn't exactly valid code, nor is it precisely how methods are
implemented.)
Note that 'AnimalObjBase', 'AnimalObj', and 'DogObj' all share common fields,
'typeInfo' for all three, and 'name' for the latter two. This means that, given
a region of memory holding data from one of these three types, we will always
be able to access the 'typeInfo' field, and given a region of memory holding
data from AnimalObj or DogObj, we can access the 'name' field (this
field-sharing is the basis for subtyping).

    +---------------+    +---------------+    +---------------+
    | AnimalObjBase |    |   AnimalObj   |    |    DogObj     |
    +---------------+    +---------------+    +---------------+
    |   typeInfo    |    |   typeInfo    |    |   typeInfo    |
    +---------------+    +---------------+    +---------------+
                         |     name      |    |     name      |
                         +---------------+    +---------------+
                                              |     breed     |
                                              +---------------+

The typeInfo field is used to mark these regions of memory. As long as every
AnimalObj's 'typeInfo' member points to 'animalTypeInfo' and every DogObj's
'typeInfo' member points to 'dogTypeInfo', we can reinterpret (cast) these
regions of memory to their appropriate types, and pass them into their
corresponding procedures/methods.
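
The tagging idea can be sketched in runnable Nim by using a hand-written tag in place of the compiler's type information. This is only an illustration, not how Nim implements methods: the 'Kind' enum, the 'kind' field, and 'describe' are made-up names, and the sketch assumes both objects lay out their shared leading fields identically.

```nim
type
  Kind = enum
    kAnimal, kDog
  AnimalObj = object
    kind: Kind      # hand-written stand-in for 'typeInfo' (assumption)
    name: string
  DogObj = object
    kind: Kind      # same leading fields as AnimalObj
    name: string
    breed: string

proc describe(p: pointer): string =
  # Every tagged region starts with 'kind' and 'name', so reading them
  # through the smaller layout works whichever object 'p' points to.
  let base = cast[ptr AnimalObj](p)
  case base.kind
  of kAnimal:
    result = "Hi, I'm " & base.name
  of kDog:
    # The tag says the region is really a DogObj, so reinterpret it.
    let d = cast[ptr DogObj](p)
    result = "*Bark!* [said " & d.name & ", the " & d.breed & "]"

var animal = AnimalObj(kind: kAnimal, name: "Unknown")
var dog = DogObj(kind: kDog, name: "Spot", breed: "Poodle")
echo describe(addr animal)
echo describe(addr dog)
```

If the tag were ever wrong (say, an AnimalObj marked as kDog), the second cast would read past the end of the object, which is why the tag has to be maintained consistently.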
Now let's look at how objects are stored in memory. In contrast to references,
which are pointers that always point to heap-allocated memory, object data may
be located either in the heap _or_ on the stack. It's this latter case that
reveals why methods won't work on object types.
Say we create Animal and Dog variables in a main procedure, then pass those
variables into a procedure which calls the 'makeNoise' method:

    method makeNoise(this: AnimalBase)

    proc makeLotsOfNoise(someAnimal: Animal) =
      makeNoise(someAnimal)
      makeNoise(someAnimal)
      makeNoise(someAnimal)

    proc main =
      var animal = Animal(name: "Unknown")
      var dog = Dog(name: "Spot", breed: "Poodle")
      makeLotsOfNoise(animal)
      makeLotsOfNoise(dog)

    main()

When 'main' is called, after the variables are created, the stack holds two
references that point to regions of heap memory:

    main():
      animal: 8 byte pointer -> 16 byte heap memory region
      dog: 8 byte pointer -> 24 byte heap memory region

And when makeLotsOfNoise is called, the stack layout looks something like this:

    main():
      animal: 8 byte pointer -> 16 byte heap memory region
      dog: 8 byte pointer -> 24 byte heap memory region
      makeLotsOfNoise(someAnimal = animal):
        someAnimal: 8 byte pointer -> 16 byte heap memory region
        makeNoise(this = someAnimal):
          this: 8 byte pointer -> 16 byte heap memory region
          ...
      makeLotsOfNoise(someAnimal = dog):
        someAnimal: 8 byte pointer -> 24 byte heap memory region
        makeNoise(this = someAnimal):
          this: 8 byte pointer -> 24 byte heap memory region
          ...

Make note of the size of the parameter passed into 'makeLotsOfNoise' - it's
always an 8 byte pointer. This is a constraint of how procedure calls work, as
the size of the parameters usually needs to be known ahead of time.
Furthermore, the semantics of procedure calls must allow for the possibility
(even if optimization decides otherwise) that parameter data is copied from
the previous procedure frame to the current procedure frame.
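
A small sketch can make the size argument concrete. It assumes the hierarchy is written with named object types so that both the references and the underlying objects can be measured; 'noise' and 'relay' are hypothetical stand-ins for 'makeNoise' and 'makeLotsOfNoise' that return strings instead of echoing. Exact object sizes depend on the platform and memory management mode, so only relative sizes are checked.

```nim
type
  AnimalObj = object of RootObj
    name: string
  DogObj = object of AnimalObj
    breed: string
  Animal = ref AnimalObj
  Dog = ref DogObj

method noise(this: Animal): string {.base.} =
  "Hi, I'm " & this.name

method noise(this: Dog): string =
  "*Bark!* [said " & this.name & "]"

proc relay(someAnimal: Animal): string =
  # Only a pointer-sized reference is copied into this frame; the
  # object it points to stays intact on the heap.
  someAnimal.noise()

# References have one size, whatever they point to...
doAssert sizeof(Animal) == sizeof(pointer)
doAssert sizeof(Dog) == sizeof(pointer)
# ...while the objects themselves differ in size.
doAssert sizeof(DogObj) > sizeof(AnimalObj)

echo relay(Animal(name: "Unknown"))
echo relay(Dog(name: "Spot", breed: "Poodle"))
```

Because 'relay' receives a fixed-size reference, the Dog reaches 'noise' with all of its fields still in place, and dispatch picks the Dog method.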
Now observe what happens if we were allowed to use objects instead. Our code
becomes:

    method makeNoise(this: AnimalObjBase)

    proc makeLotsOfNoise(someAnimal: AnimalObj) =
      makeNoise(someAnimal)
      makeNoise(someAnimal)
      makeNoise(someAnimal)

    proc main =
      var animal = AnimalObj(name: "Unknown")
      var dog = DogObj(name: "Spot", breed: "Poodle")
      makeLotsOfNoise(animal)
      makeLotsOfNoise(dog)

    main()

And our stack looks like this:

    main():
      animal: 16 byte stack memory region
      dog: 24 byte stack memory region
      makeLotsOfNoise(someAnimal = animal):
        someAnimal: 16 byte memory region
        makeNoise(this = someAnimal):
          this: 8 byte memory region
          ...
      makeLotsOfNoise(someAnimal = dog):
        someAnimal: 16 byte memory region
        makeNoise(this = someAnimal):
          this: 8 byte memory region
          ...

Notice that, because parameter data is copied from frame to frame, the region
containing the 'Dog' data was truncated from 24 bytes to 16, and then to 8!
This would obviously lead to problems - what happens when makeNoise dispatches
to the Animal and Dog methods, and the name/breed fields are accessed? We would
get garbage, as the program tries to read from the wrong areas of the stack.
While there are workarounds for this (the one that comes to my mind is passing
a pointer to the stack data*, instead of copying it around), they all come with
additional costs/caveats, or make parameter passing semantics even more complex
than they already are.
Disclaimers:
* *This is actually already done, except if certain pragmas are used (which
the semantics still have to accommodate)
* Yes, I know about alignment and how the stack would actually be laid out.
The above stack diagrams are meant to illustrate the point, not the reality.
* All the above implementation details are subject to change. For all I know
type information could be passed as a hidden parameter in the future (or maybe
it already is).