Re: [vos-d] s5 design overview

2007-03-30 Thread Reed Hedges
Ken Taylor wrote:
 Peter Amstutz wrote:
 
 1.  Memory footprint

 The current s4 design has a lot of per-vobject overhead, leading to a
 significant memory footprint.  The development version improves on this
 a bit, but the honest truth is that the implementation was not written
 with memory efficiency in mind.  The s5 design will attempt to address
 this in various ways, including attempting to directly minimize the byte
 count of crucial classes and data structures, and by using the
 flyweight pattern to collapse identical classes into a single shared
 instance.

 In particular, symbols (method names, message fields, types, etc) will
 be stored in a single global symbol table.  This will ensure that such
 strings (which are expected to be used a lot in the system) are not
 duplicated, and has the convenient effect of also making string comparison
 a constant-time pointer comparison instead of a linear-time comparison.
 
 This seems like a very good optimization, given how string-heavy the system
 is. Could string tables also be used to compress COD files more? How about
 network optimization by having two sites share a string table (or have a
 static standard string table of commonly used strings in the core
 protocol?). Also ... are properties stored as strings at the moment or as
 native bytes (I actually don't know this)? If they're stored as strings,
 compressing them to native data structures would probably be a big win too
 (same thing goes for properties with storing them in COD and sending them
 over the network).
 
 [I also had another idea related to network optimization -- which is
 unrelated to memory footprint. The idea is particularly for large worlds
 with lots of objects to listen to: instead of each listen update message
 carrying the whole message string and object identification, etc, have an
 optimized listen protocol, where a chosen-at-runtime tag is used to
 identify a certain kind of update message coming from a specific object, and
 that's all you need to have in the network traffic (other than the actual
 updated values, of course). This could even be done transparently at the
 network interface level]
 
 Another method for making the most out of the memory will be the ability
 to load vobjects from persistent storage on demand, and swap out rarely
 used vobjects.  This will allow a site to contain many more vobjects
 than would otherwise fit in RAM.  This
 
 Could this also be extended to support persistent transparent caching of
 remote sites that have been visited? Perhaps paired with a new method to
 query certain information about vobjects, such as is-cacheable and
 last-updated/version?
 
 To reduce the footprint of vobjects themselves, I am introducing a concept
 called embedded children.  These are objects that appear in protocol
 terms to be independently addressable vobjects that can send and receive
 messages, but are actually stored as a field of another vobject.  This
 emerged from the observation that while it was extremely powerful from a
 design pattern standpoint for properties to be first-class vobjects, in
 s4 this often led to situations where it took hundreds of bytes of overhead
 to store even a single four-byte integer (for example).  As an embedded
 child, the child vobject is simply a normal data field in the parent
 class, and it falls upon the parent to send and receive messages on
 behalf of the child.
 
 Sounds reasonable... What reasons (other than aesthetics/symmetry) are there
 for properties to be first-class? Will a property object ever have children?
 Will one ever be at site root with no other parent? Will one ever be a
 meta-object other than property? I guess you could have two parents
 sharing the same property if it's a big one ...

Heh, well actually we tried that :)

In s2 properties were indelible components of their parent objects. But
it made everything more complicated and made the properties themselves
much less useful.

Being able to share properties is a critical feature of VOS that has
emerged. And all the usefulness that comes from metaobjects/vobjects
too-- Pete mentioned FileProperty and ExtrapolatedProperty (which no
longer exist), but the new vostoolbox library I've been working on
recently has several property subclasses; since they're metaobjects you
can instantiate them remotely (e.g. in mesh, create foo
property:property.foo

With s5's new embedded children, you will still be able to optionally
override an embedded child with a full Vobject.
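
A minimal sketch of how an embedded child with an optional override might
look, written in Python for brevity.  Everything here is an assumption made
for illustration: the Parent and FullVobject classes, the "count" child, and
the "read"/"write" method names are hypothetical and are not the s5 API.

```python
class FullVobject:
    """Stand-in for a full, independently stored vobject holding one value."""

    def __init__(self, value):
        self.value = value

    def handle(self, method, arg=None):
        if method == "read":
            return self.value
        if method == "write":
            self.value = arg
            return arg
        raise ValueError(f"unknown method: {method}")


class Parent:
    """Parent vobject with one embedded child, 'count', stored as a plain int."""

    def __init__(self):
        self.count = 0        # embedded child: just a field of the parent
        self._overrides = {}  # child name -> FullVobject, when overridden

    def override_child(self, name, vobject):
        # Swap the embedded child for a full vobject, e.g. so it can be shared.
        self._overrides[name] = vobject

    def send_to_child(self, name, method, arg=None):
        # An overriding full vobject handles its own messages.
        if name in self._overrides:
            return self._overrides[name].handle(method, arg)
        if name != "count":
            raise KeyError(name)
        # Otherwise the parent answers on behalf of the embedded child.
        if method == "read":
            return self.count
        if method == "write":
            self.count = arg
            return arg
        raise ValueError(f"unknown method: {method}")
```

To a caller of send_to_child() the child looks the same either way; only the
overridden form can be attached to two parents at once, which is the sharing
case described above.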


 Yes. I've run into this ls at site root problem and it's a bit annoying :)
 And orphaned vobjects just seem like a lose.


Related to this, I added a new setting and command to mesh, cache. The
setting does an initial request for all objects when you enter a new
object to cache them.   This doesn't actually solve the problem (since
you still have to wait for the initial cache) but it makes the actual
ls command operate at a reasonable speed.  Set settings in mesh with
the set command (and you can 

Re: [vos-d] s5 design overview

2007-03-30 Thread Lalo Martins
On Thu, 29 Mar 2007 07:29:35 -0700, Ken Taylor wrote:
 Sounds reasonable... What reasons (other than aesthetics/symmetry) are there
 for properties to be first-class? Will a property object ever have children?

Two use cases we encountered in the past for putting children in a property:

- Translations.  Depending on the data, the translation can be a further
property inside the property, or it can be a sibling, with each language
version having a property containing the language code.

- And misc:metadata, of course; you can surely imagine lots of situations
where you'd want to store a date, author information, etc for a property,
right?

best,
   Lalo Martins
--
  So many of our dreams at first seem impossible,
   then they seem improbable, and then, when we
   summon the will, they soon become inevitable.
   -
GNU: never give up freedom http://www.gnu.org/


___
vos-d mailing list
vos-d@interreality.org
http://www.interreality.org/cgi-bin/mailman/listinfo/vos-d


Re: [vos-d] s5 design overview

2007-03-29 Thread Ken Taylor
Peter Amstutz wrote:

 1.  Memory footprint

 The current s4 design has a lot of per-vobject overhead, leading to a
 significant memory footprint.  The development version improves on this
 a bit, but the honest truth is that the implementation was not written
 with memory efficiency in mind.  The s5 design will attempt to address
 this in various ways, including attempting to directly minimize the byte
 count of crucial classes and data structures, and by using the
 flyweight pattern to collapse identical classes into a single shared
 instance.

 In particular, symbols (method names, message fields, types, etc) will
 be stored in a single global symbol table.  This will ensure that such
 strings (which are expected to be used a lot in the system) are not
 duplicated, and has the convenient effect of also making string comparison
 a constant-time pointer comparison instead of a linear-time comparison.
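
A minimal sketch of the symbol table idea quoted above, in Python for
brevity.  The Symbol and SymbolTable names and the example strings are
assumptions for illustration only, not the s5 design itself.

```python
class Symbol:
    """An interned name; two Symbols match only if they are the same object."""

    __slots__ = ("name",)

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"Symbol({self.name!r})"


class SymbolTable:
    """Hypothetical global table holding exactly one Symbol per distinct string."""

    def __init__(self):
        self._symbols = {}

    def intern(self, name):
        # Return the existing Symbol for this text, creating it on first use.
        sym = self._symbols.get(name)
        if sym is None:
            sym = Symbol(name)
            self._symbols[name] = sym
        return sym

    def __len__(self):
        return len(self._symbols)


SYMBOLS = SymbolTable()

# Hypothetical message names; the second is built separately from the first.
a = SYMBOLS.intern("example:update")
b = SYMBOLS.intern("example:" + "update")
c = SYMBOLS.intern("example:read")

same = a is b       # identity check, no character-by-character comparison
different = a is c
```

The string is hashed and compared once, when it is interned; after that every
comparison between symbols is an identity check, and each distinct string is
stored only once no matter how many vobjects refer to it.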

This seems like a very good optimization, given how string-heavy the system
is. Could string tables also be used to compress COD files more? How about
network optimization by having two sites share a string table (or have a
static standard string table of commonly used strings in the core
protocol?). Also ... are properties stored as strings at the moment or as
native bytes (I actually don't know this)? If they're stored as strings,
compressing them to native data structures would probably be a big win too
(same thing goes for properties with storing them in COD and sending them
over the network).

[I also had another idea related to network optimization -- which is
unrelated to memory footprint. The idea is particularly for large worlds
with lots of objects to listen to: instead of each listen update message
carrying the whole message string and object identification, etc, have an
optimized listen protocol, where a chosen-at-runtime tag is used to
identify a certain kind of update message coming from a specific object, and
that's all you need to have in the network traffic (other than the actual
updated values, of course). This could even be done transparently at the
network interface level]


 Another method for making the most out of the memory will be the ability
 to load vobjects from persistent storage on demand, and swap out rarely
 used vobjects.  This will allow a site to contain many more vobjects
 than would otherwise fit in RAM.  This

Could this also be extended to support persistent transparent caching of
remote sites that have been visited? Perhaps paired with a new method to
query certain information about vobjects, such as is-cacheable and
last-updated/version?


 To reduce the footprint of vobjects themselves, I am introducing a concept
 called embedded children.  These are objects that appear in protocol
 terms to be independently addressable vobjects that can send and receive
 messages, but are actually stored as a field of another vobject.  This
 emerged from the observation that while it was extremely powerful from a
 design pattern standpoint for properties to be first-class vobjects, in
 s4 this often led to situations where it took hundreds of bytes of overhead
 to store even a single four-byte integer (for example).  As an embedded
 child, the child vobject is simply a normal data field in the parent
 class, and it falls upon the parent to send and receive messages on
 behalf of the child.

Sounds reasonable... What reasons (other than aesthetics/symmetry) are there
for properties to be first-class? Will a property object ever have children?
Will one ever be at site root with no other parent? Will one ever be a
meta-object other than property? I guess you could have two parents
sharing the same property if it's a big one ...


 A final goal is to simplify memory management overall.  One significant
 change will be the introduction of a single root to the vobject tree.
 In s4, the site contains a list of every vobject on the site, and the
 site URL + vobject site name acts as the unique identifier for a
 vobject.  Unfortunately, there are two problems with this.  The first is
 that typing ls in mesh in the site root on a large site results in it
 printing out 1000s of vobjects to your screen, which is rarely useful.
 The second, more subtle problem is that vobjects are easily leaked by
 orphaning them, so that no other vobject points to them, but failing to
 call excise() on them.  In s5 I propose to change this system to a design
 inspired by the design of the Unix filesystem.  Vobjects will have
 unique, opaque identifiers (which correspond to inodes) that are not
 generally visible to the user, and vobjects will be required to be
 accessible from the site root via some path.  Vobjects which are
 orphaned can then be garbage collected.
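
A minimal sketch of collecting orphaned vobjects by reachability from a
single root, in Python for brevity.  The dict-of-child-lists representation
and the collect_orphans() name are assumptions for illustration; the proposal
above does not say how the collector would be implemented.

```python
def collect_orphans(objects, root_id):
    """Delete every object not reachable from root_id; return the removed ids.

    objects maps an opaque id to the list of child ids it points to.
    """
    reachable = set()
    stack = [root_id]
    while stack:
        oid = stack.pop()
        if oid in reachable or oid not in objects:
            continue  # already visited, or a dangling child id
        reachable.add(oid)
        stack.extend(objects[oid])

    orphans = set(objects) - reachable
    for oid in orphans:
        del objects[oid]
    return orphans


site = {
    "root": ["a", "b"],
    "a": ["c"],
    "b": [],
    "c": ["a"],   # cycle, but still reachable from the root
    "x": ["b"],   # orphan that points at a live object
    "y": ["z"],   # orphans that only point at each other
    "z": ["y"],
}
removed = collect_orphans(site, "root")
```

Tracing from the root also catches orphans that reference each other, a case
that per-object reference counts alone would not free.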

Yes. I've run into this ls at site root problem and it's a bit annoying :)
And orphaned vobjects just seem like a lose.


 Other strategies to make memory management easier will include a greater
 emphasis on copying semantics when the amount of memory involved is
 small. 

Re: [vos-d] s5 design overview

2007-03-29 Thread Peter Amstutz
On Thu, Mar 29, 2007 at 07:29:35AM -0700, Ken Taylor wrote:

 This seems like a very good optimization, given how string-heavy the system
 is. Could string tables also be used to compress COD files more? How about
 network optimization by having two sites share a string table (or have a
 static standard string table of commonly used strings in the core
 protocol?). Also ... are properties stored as strings at the moment or as
 native bytes (I actually don't know this)? If they're stored as strings,
 compressing them to native data structures would probably be a big win too
 (same thing goes for properties with storing them in COD and sending them
 over the network).

Presently there is support for having COD files gzipped by default, 
which helps a bit.  The COD file format will need to be reworked to take 
advantage of s5 features anyhow.

Properties are presently stored as strings, but in s5 will be stored in 
the native binary encoding.  This is a major change since it requires 
that we define the set of primitive types that VOS will support.  
Indeed, this ties in directly with a general overhaul of the type system 
that I'm going to propose when I write about scripting.

 [I also had another idea related to network optimization -- which is
 unrelated to memory footprint. The idea is particularly for large worlds
 with lots of objects to listen to: instead of each listen update message
 carrying the whole message string and object identification, etc, have an
 optimized listen protocol, where a chosen-at-runtime tag is used to
 identify a certain kind of update message coming from a specific object, and
 that's all you need to have in the network traffic (other than the actual
 updated values, of course). This could even be done transparently at the
 network interface level]

I think what you're describing is essentially packet compression: 
creating a dictionary during the session that maps short identifiers to 
common packet headers.
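
A minimal sketch of such a per-session dictionary, in Python for brevity.
The class names, the tuple packet layout, and the example header are
assumptions for illustration and are not part of any VOS protocol; the sketch
also assumes packets arrive reliably and in order.

```python
class HeaderCompressor:
    """Sender side: replaces a repeated header with a small integer tag."""

    def __init__(self):
        self._tags = {}  # header -> tag, built up during the session

    def encode(self, header, payload):
        tag = self._tags.get(header)
        if tag is None:
            # First use: send the full header once, along with its new tag.
            tag = len(self._tags)
            self._tags[header] = tag
            return ("define", tag, header, payload)
        # Later uses: the tag alone identifies the object and message type.
        return ("short", tag, payload)


class HeaderDecompressor:
    """Receiver side: rebuilds the full header from the tag."""

    def __init__(self):
        self._headers = {}  # tag -> header

    def decode(self, packet):
        if packet[0] == "define":
            _, tag, header, payload = packet
            self._headers[tag] = header
            return header, payload
        _, tag, payload = packet
        return self._headers[tag], payload


# Hypothetical header: (object path, message name).
header = ("/world/avatar/position", "example:update")
sender, receiver = HeaderCompressor(), HeaderDecompressor()
first = sender.encode(header, (1.0, 2.0, 3.0))
second = sender.encode(header, (1.0, 2.0, 3.5))
```

After the first update, each packet carries only the tag and the changed
values, which is the saving Ken describes for worlds with many listened-to
objects.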

A separate problem with updates at the moment comes from the fact that 
the client presently has to subscribe to each vobject individually, 
which is one source of memory overhead since this is often redundant.  
I've been mulling over ways of subscribing to changes in a subtree of 
vobjects without having to communicate with each one separately.

 Could this also be extended to support persistent transparent caching of
 remote sites that have been visited? Perhaps paired with a new method to
 query certain information about vobjects, such as is-cacheable and
 last-updated/version?

Yes!  I'm glad you mentioned caching, because that is another thing that 
is sorely lacking in s4 and needs to be designed for in s5.
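
A minimal sketch of how the is-cacheable and version queries Ken suggests
might drive a client-side cache, in Python for brevity.  RemoteSite,
CachingClient, and query_meta() are hypothetical stand-ins, and the cache here
lives in memory; a persistent one would store the same entries on disk.

```python
class RemoteSite:
    """Stand-in for a remote site; counts full fetches to show the saving."""

    def __init__(self):
        self._objects = {}  # path -> (version, cacheable, data)
        self.full_fetches = 0

    def put(self, path, data, cacheable=True):
        # Every change to an object bumps its version number.
        old = self._objects.get(path)
        version = old[0] + 1 if old else 1
        self._objects[path] = (version, cacheable, data)

    def query_meta(self, path):
        # Cheap query: returns (version, is_cacheable) without the data.
        version, cacheable, _ = self._objects[path]
        return version, cacheable

    def fetch(self, path):
        self.full_fetches += 1
        return self._objects[path][2]


class CachingClient:
    def __init__(self, site):
        self.site = site
        self._cache = {}  # path -> (version, data)

    def get(self, path):
        version, cacheable = self.site.query_meta(path)
        cached = self._cache.get(path)
        if cacheable and cached is not None and cached[0] == version:
            return cached[1]  # still current, no transfer needed
        data = self.site.fetch(path)
        if cacheable:
            self._cache[path] = (version, data)
        else:
            self._cache.pop(path, None)
        return data
```

A revisit to an unchanged site then costs one small metadata query per
vobject rather than a full transfer.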

 Sounds reasonable... What reasons (other than aesthetics/symmetry) are there
 for properties to be first-class? Will a property object ever have children?
 Will one ever be at site root with no other parent? Will one ever be a
 meta-object other than property? I guess you could have two parents
 sharing the same property if it's a big one ...

Well, so you can share properties, and originally (in s3/s4) so we could 
have special properties like the file property (its value was bound to 
the contents of a particular file on disk) and extrapolated property 
(the property included velocity/acceleration, used for client-side 
movement prediction).  It is a bit like the notion of boxed objects 
in some object oriented languages like Java, C# and Ruby, where a 
primitive value like an integer is treated like a first class object 
which allows you to assign it to System.Object type references.
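
To make the extrapolated property concrete, here is a minimal sketch of
client-side movement prediction from position, velocity and acceleration, in
Python for brevity.  It uses the standard constant-acceleration formula
p + v*dt + a*dt^2/2; that the s3/s4 property used exactly this formula is an
assumption, and the extrapolate() name is hypothetical.

```python
def extrapolate(position, velocity, acceleration, dt):
    """Predict a position dt seconds after the last update.

    Each argument except dt is a tuple with one entry per axis.
    """
    return tuple(
        p + v * dt + 0.5 * a * dt * dt
        for p, v, a in zip(position, velocity, acceleration)
    )


# Last update from the server: at the origin, moving along x, falling along z.
predicted = extrapolate((0, 0, 0), (1, 0, 0), (0, 0, -2), 2.0)
```

The client can keep drawing predicted positions between updates and correct
itself when the next real value arrives.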

  Other strategies to make memory management easier will include a greater
  emphasis on copying semantics when the amount of memory involved is
  small.  This avoids having to maintain reference counts, and we can keep
  simple data structures on the stack (and in L1 cache?) to avoid
  potentially expensive calls to the heap allocator.
 
 I don't think I understand this one

Memory management is exceedingly difficult in C and C++, since if a data 
structure has multiple pointers going to it, it can be difficult to 
determine when it is appropriate to free() it.  Concurrency makes this 
even worse, since you could have a thread still accessing that structure 
while it is being free()ed.  Smart pointers and reference counts can 
help a lot, but they can be incredibly difficult to debug.  For s5 I 
want to minimize the amount of shared state (which I'll talk about in the 
next email about concurrency), and one way to do that is, when handing 
data to someone else, to pass a copy rather than a reference.
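
A minimal sketch of the difference between passing a reference and passing a
copy, in Python for brevity.  The Receiver class and the message layout are
assumptions for illustration only.

```python
import copy


class Receiver:
    """A recipient that modifies whatever message it is handed."""

    def __init__(self):
        self.seen = None

    def receive(self, message):
        message["fields"].append("handled")
        self.seen = message


def send_by_reference(receiver, message):
    # Sender and receiver now share one object: shared state.
    receiver.receive(message)


def send_by_copy(receiver, message):
    # The receiver gets its own private copy; the sender's data is untouched.
    receiver.receive(copy.deepcopy(message))


original = {"method": "update", "fields": ["x"]}
send_by_copy(Receiver(), original)
after_copy = list(original["fields"])

send_by_reference(Receiver(), original)
after_reference = list(original["fields"])
```

With the copy, nobody has to decide who owns the message or when it may be
released, which is why this suits small messages; for large data the cost of
copying would have to be weighed against that simplicity.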

Another related technique will be the use of opaque identifiers rather 
than pointers where possible, so that if a vobject is freed in the 
background (or alternately, the object is swapped out) we don't 
dereference an invalid pointer.
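
A minimal sketch of looking vobjects up through opaque identifiers instead of
holding pointers, in Python for brevity.  VobjectTable and its methods are
hypothetical names, and JSON strings stand in for whatever persistent storage
s5 would actually use.

```python
import json


class VobjectTable:
    """Maps opaque ids to vobject state; callers never hold direct references."""

    def __init__(self):
        self._live = {}     # id -> state currently in memory
        self._swapped = {}  # id -> serialized state (stand-in for disk)
        self._next_id = 1   # ids are never reused

    def create(self, state):
        oid = self._next_id
        self._next_id += 1
        self._live[oid] = state
        return oid

    def free(self, oid):
        self._live.pop(oid, None)
        self._swapped.pop(oid, None)

    def swap_out(self, oid):
        if oid in self._live:
            self._swapped[oid] = json.dumps(self._live.pop(oid))

    def is_resident(self, oid):
        return oid in self._live

    def lookup(self, oid):
        if oid in self._live:
            return self._live[oid]
        if oid in self._swapped:
            # Swapped out: load it back in transparently.
            state = json.loads(self._swapped.pop(oid))
            self._live[oid] = state
            return state
        return None  # freed or never existed: a safe failure, not a crash
```

A stale identifier then yields a detectable "not found" result, and a
swapped-out vobject is reloaded on first use, so the holder of the identifier
does not need to know which happened in the background.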

 Scalability is going to be hugely important if vos is ever going to support
 a world-wide metaverse with large, interactive worlds. I'm glad you