On Fri, 2007-03-23 at 22:48 -0400, Ben Goertzel wrote:
> Samantha Atknis wrote:
> > Ben Goertzel wrote:
> >>
> >> Regarding languages, I personally am a big fan of both Ruby and
> >> Haskell. But, for Novamente we use C++ for reasons of scalability.
> > I am curious as to how C++ helps scalability. What sorts of
> > "scalability"? Along what dimensions? There are ways that C++ does
> > not scale so well like across large project sizes or in terms of
> > maintainability. It also doesn't scale that well in terms of space
> > requirements if the class hierarchy gets too deep or uses much
> > multiple inheritance of non-mixin classes. It also doesn't scale
> > well in large team development. So I am curious what you mean.
> >
>
> I mean that Novamente needs to manage large amounts of data in heap
> memory, which needs to be very frequently garbage collected according to
> complex patterns.
>
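
As a rough illustration of what hand-managed heap memory can look like in C++, here is a minimal sketch of a fixed-size object pool: one large allocation subdivided into slots, with freed slots recycled through a free list. The Pool class and its create/destroy/live methods are hypothetical names chosen for this example and are not taken from Novamente; a real system with "complex patterns" of collection would need considerably more than this.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical fixed-size object pool (illustration only, not Novamente code).
// All slots live in one contiguous allocation; unused slots are chained
// together in a singly linked free list.
template <typename T>
class Pool {
public:
    explicit Pool(std::size_t capacity)
        : slots_(capacity), free_head_(capacity ? &slots_[0] : nullptr) {
        // Thread every slot onto the free list, in order.
        for (std::size_t i = 0; i + 1 < capacity; ++i) {
            slots_[i].next = &slots_[i + 1];
        }
        if (capacity) slots_[capacity - 1].next = nullptr;
    }

    Pool(const Pool&) = delete;
    Pool& operator=(const Pool&) = delete;

    // Construct a T in a free slot; returns nullptr if the pool is exhausted.
    template <typename... Args>
    T* create(Args&&... args) {
        if (!free_head_) return nullptr;
        Slot* s = free_head_;
        free_head_ = s->next;
        ++live_;
        return new (s->storage) T(std::forward<Args>(args)...);
    }

    // Destroy an object obtained from create() and recycle its slot.
    void destroy(T* p) {
        assert(p != nullptr && live_ > 0);
        p->~T();
        Slot* s = reinterpret_cast<Slot*>(p);
        s->next = free_head_;
        free_head_ = s;
        --live_;
    }

    // Number of objects currently alive in the pool.
    std::size_t live() const { return live_; }

private:
    // A slot holds either a free-list link or the raw bytes of one T.
    union Slot {
        Slot* next;
        alignas(T) unsigned char storage[sizeof(T)];
    };

    std::vector<Slot> slots_;
    Slot* free_head_;
    std::size_t live_ = 0;
};
```

With something like this, `Pool<int> pool(1024); int* p = pool.create(5); pool.destroy(p);` never touches the general-purpose allocator after construction, which is the sort of control a generic collector does not give you.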

So all collection is being done by hand, since C/C++ have no
facilities for it. But complex heap management could be done in most
any language where you can get at the bits. Heap management could
logically be handled separately from whatever is using the structures
being managed, as long as there is good enough binding between the
languages. But I see here that having a language relatively "close to
the metal" was useful.

> We are doing probabilistic logical inference IN REAL TIME, for real time
> embodied agent control. This is pretty intense. A generic GC like
> exists in LISP or Java won't do.
>

Lisp can, though, handle allocating large arrays in memory that are
then subdivided. Exactly what you need can be modeled in Lisp, and then
you can tweak the code generation to make it as efficient as needed.
It would be a bit unusual but doable.

> Aside from C++, another option might have been to use LISP and write our
> own custom garbage collector for LISP. Or, to go with C# and then use
> unsafe code blocks for the stuff requiring intensive memory management.
>

Yes. Similar path. It could almost be done in Java at the byte code
level, but that would arguably be more unfriendly than C++.

> Additionally, we need real-time, very fast coordinated usage of multiple
> processors in an SMP environment. Java, for one example, is really slow
> at context switching between different threads.
>

Depending on the threading model I can see that. Clever hacking can
get around some needs for context switching, but then you start
stepping beyond the things Java is good for. Did you have much
opportunity to form a judgment about Erlang?

> Finally, we need rapid distributed processing, meaning that we need to
> rapidly get data out of complex data structures in memory and into
> serialized bit streams (and then back into complex data structures at
> the other end). This means we can't use languages in which object
> serialization is a slow process with limited
> customizability-for-efficiently.
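
To make the serialization requirement above concrete, here is a minimal C++ sketch of hand-written, per-type serialization into a flat byte buffer and back. The Node struct, its fields, and the serialize/deserialize/put/get helpers are all hypothetical, invented for this example rather than taken from Novamente. It also assumes both ends share the same endianness and float representation, which a real distributed system would have to check or normalize.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical record with variable-length parts (illustration only).
struct Node {
    std::uint32_t id;
    float strength;
    std::string name;
    std::vector<std::uint32_t> links;
};

using Bytes = std::vector<unsigned char>;

// Append the raw bytes of a trivially copyable value (native byte order).
template <typename T>
void put(Bytes& out, const T& v) {
    static_assert(std::is_trivially_copyable<T>::value, "raw copy only");
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&v);
    out.insert(out.end(), p, p + sizeof(T));
}

// Read a trivially copyable value at pos and advance pos past it.
template <typename T>
T get(const Bytes& in, std::size_t& pos) {
    static_assert(std::is_trivially_copyable<T>::value, "raw copy only");
    assert(pos + sizeof(T) <= in.size());
    T v;
    std::memcpy(&v, in.data() + pos, sizeof(T));
    pos += sizeof(T);
    return v;
}

// Layout: id, strength, name length, name bytes, link count, links.
Bytes serialize(const Node& n) {
    Bytes out;
    put(out, n.id);
    put(out, n.strength);
    put(out, static_cast<std::uint32_t>(n.name.size()));
    out.insert(out.end(), n.name.begin(), n.name.end());
    put(out, static_cast<std::uint32_t>(n.links.size()));
    for (std::uint32_t l : n.links) put(out, l);
    return out;
}

// Inverse of serialize(); assumes the buffer came from serialize().
Node deserialize(const Bytes& in) {
    std::size_t pos = 0;
    Node n;
    n.id = get<std::uint32_t>(in, pos);
    n.strength = get<float>(in, pos);
    const std::uint32_t len = get<std::uint32_t>(in, pos);
    assert(pos + len <= in.size());
    n.name.assign(reinterpret_cast<const char*>(in.data() + pos), len);
    pos += len;
    const std::uint32_t count = get<std::uint32_t>(in, pos);
    n.links.reserve(count);
    for (std::uint32_t i = 0; i < count; ++i) {
        n.links.push_back(get<std::uint32_t>(in, pos));
    }
    return n;
}
```

Only the fields that matter go on the wire, in a layout the author fully controls. That is the kind of customization being asked for, and the same per-object idea carries over to custom readers and writers in other languages.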
>

Lisp could excel at streaming data. Java data streaming isn't that bad
either, as it can be customized to stream only what you want streamed
for your specific needs, with custom readers and writers per object.
It is relatively easy to do this custom approach in Java. You aren't
stuck with the defaults.

> When you start trying to do complex learning in real time in a
> distributed multiprocessor context, you quickly realize that
> C-derivative languages are the only viable option. Being mostly a
> Linux shop we didn't really consider C# (plus back when we started, .Net
> was a lot less far along, and Mono totally sucked).
>
> C++ with heavy use of STL and Boost is a different language than the C++
> we old-timers got used to back in the 90's. It's still a large and
> cumbersome language but it's quite possible to use it elegantly and
> intelligently. I am not such a whiz myself, but fortunately some of our
> team members are.
>

Hehehe. That was the late 80's for me, with another C++ stint from
96-99. STL and Boost give it some of the power of Lisp but in a much
more difficult to understand and extend manner. :-)

I can see the choice for tight management of memory for sure. I have
some thoughts on using C#, at least for optimizing a general cache I
devised myself. I know less about how well C and C++ utilize newer
multi-core processors. It is my understanding that most of the
compilers still suck at taking advantage of such things. Lisp or Java
should do fine, with some tweaking, at interprocess streaming of
arbitrarily complex data.

Thanks for the interesting and informative answer.

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?list_id=303