On Fri, Aug 30, 2013 at 6:35 PM, Scott Purdy <[email protected]> wrote:
> My intuition is that something like Protocol Buffers/Thrift/Cap'n Proto
> makes backwards-compatibility much easier (relative to something like
> MessagePack or manually writing out checkpoints). But those methods make
> it a little more difficult to tie the logic to the different parts in an
> object-oriented way. They are more suited to functional programming, which
> isn't bad but different from what we currently have.

IMO, for something like this, where you're going to have millions or billions of small objects, some representing a bit or two, a somewhat functional style is unavoidable. The alternative is adding a minimum of 8 bytes of overhead to every object allocation, which, if you're representing a couple of bits, is insane.

Also, for maximum performance, you're going to want to minimize cache misses, which means laying the data out contiguously in memory - something no language that abstracts memory management is guaranteed to do for you.

You can do that and still offer users a nice OOP API. The difference is just that your objects are flyweights: each one consists of an offset into an array of the actual data, and reads and writes all of its state there. Most of the time there will only be one such object instance at a time - you make one, pass it to the caller so they get an object-oriented view of the data, dispose of it once the caller is done with it, and move on to the next.

Once you've got that, the simple pattern to use is for clients to write "visitors" - functions (or one-off classes, depending on the language) which get passed the objects one by one. The result is similar to an API for a compiler's ASTs.

If your target audience is really used to array-like collection objects, you can write something collection-like that creates objects on the fly (assuming a garbage-collected language or a known lifecycle for the objects - the visitor pattern makes it easier to scope the lifecycle of flyweight objects).
That collection-like approach is one you probably don't take unless your target audience is really struggling with visitors, because it will be harder to get right, harder to maintain, and harder to keep client code from leaking memory by leaking objects.

This is one of those cases where "premature optimization is the root of all evil" needs to go out the window - thinking carefully about the memory model is a necessity.

-Tim

--
http://timboudreau.com
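
A minimal Python sketch of the flyweight-plus-visitor approach described in the reply above. Everything named here (CellStore, CellView, visit, and the "active"/"predicted" flags) is hypothetical and chosen only for illustration; none of it is NuPIC's API. It assumes one byte of state per item, packed into a single contiguous bytearray, with one reusable view object handed to each visitor call.

```python
from typing import Callable, List

ACTIVE = 0x01      # hypothetical flag bit
PREDICTED = 0x02   # hypothetical flag bit


class CellView:
    """Flyweight: owns no state, only a reference to the shared buffer and an offset."""

    __slots__ = ("_data", "_index")

    def __init__(self, data: bytearray) -> None:
        self._data = data
        self._index = 0

    @property
    def index(self) -> int:
        return self._index

    def _get(self, mask: int) -> bool:
        return bool(self._data[self._index] & mask)

    def _set(self, mask: int, value: bool) -> None:
        if value:
            self._data[self._index] |= mask
        else:
            self._data[self._index] &= ~mask & 0xFF

    @property
    def active(self) -> bool:
        return self._get(ACTIVE)

    @active.setter
    def active(self, value: bool) -> None:
        self._set(ACTIVE, value)

    @property
    def predicted(self) -> bool:
        return self._get(PREDICTED)

    @predicted.setter
    def predicted(self, value: bool) -> None:
        self._set(PREDICTED, value)


class CellStore:
    """Holds all item state contiguously, one byte per item."""

    def __init__(self, count: int) -> None:
        self._data = bytearray(count)

    def __len__(self) -> int:
        return len(self._data)

    def visit(self, visitor: Callable[[CellView], None]) -> None:
        # One view instance is reused for every item; visitors must not
        # keep a reference to it after they return.
        view = CellView(self._data)
        for i in range(len(self._data)):
            view._index = i
            visitor(view)


store = CellStore(8)


def activate_even(cell: CellView) -> None:
    if cell.index % 2 == 0:
        cell.active = True


store.visit(activate_even)

active: List[int] = []


def collect_active(cell: CellView) -> None:
    if cell.active:
        active.append(cell.index)


store.visit(collect_active)
```

Because the visitor only sees the view for the duration of one call, the store controls the view's lifecycle, which is the scoping benefit the reply attributes to visitors. A collection-like wrapper would instead have to create a new view per access and trust callers not to hold on to them.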
_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
