Re: [nupic] Steps toward a distributed NuPic

Tim Boudreau Fri, 06 Feb 2015 12:07:09 -0800

I've been looking for an excuse to learn Erlang, so that option is kind of
appealing.


That said, you *can* do functional-style programming in Java - both
traditional Java [1] [2], and with JDK 8's lambdas.  Given that
mutability is a requirement (you're not going to copy a big blob of memory
to set one bit), something that is pure functional might introduce some
non-intuitive hoops to jump through.  So perhaps, with a carefully designed
API and a *lot* of expectation-setting so that Java developers unlearn some
bad habits your typical Java framework teaches, Java would be viable for
something like this.

For a serialization format, I wouldn't expect any off-the-shelf framework
to be a fabulous fit - for example, an SDR is most intuitively
represented as a vast array of bits, most of which are off, but for a
wire format, a list of the on-bits will be much more compact; a set of
synapses could be compactly represented as a set of directions for walking
an imaginary topology of connected cells.  I think the main thing for a
wire format is that it should not just shovel some language's
in-memory representation of the data over the wire, a la Java serialization.
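
To make the on-bits idea concrete, here is a minimal sketch in Python
(picked only for brevity). The layout is an assumption for illustration,
not a proposed spec: a big-endian uint32 width, a uint32 count, then the
sorted on-bit indices as uint32 each. The names encode_sdr and decode_sdr
are hypothetical.

```python
import struct

def encode_sdr(on_bits, width):
    """Pack an SDR as: width, count, then sorted on-bit indices.

    Assumed layout (illustrative only): big-endian uint32 fields.
    """
    indices = sorted(set(on_bits))
    if indices and (indices[0] < 0 or indices[-1] >= width):
        raise ValueError("on-bit index out of range")
    # ">" means big-endian with no padding, so the size is 8 + 4 * count.
    return struct.pack(">II%dI" % len(indices), width, len(indices), *indices)

def decode_sdr(payload):
    """Inverse of encode_sdr: returns (width, list of on-bit indices)."""
    width, count = struct.unpack_from(">II", payload, 0)
    indices = struct.unpack_from(">%dI" % count, payload, 8)
    return width, list(indices)
```

With this layout a 2048-bit SDR with 40 on-bits takes 8 + 40 * 4 = 168
bytes, against 256 bytes for a packed bitmap; delta-encoding the sorted
indices would likely shrink it further.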

So I'd advise not being scared to just spec out what the bits should look
like, and write readers and writers for each language that take a "visitor"
which gets passed each element of the data and can turn it into whatever
in-memory structure makes sense for that language (take a look at javac's
Tree API for inspiration).
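
A rough sketch of what such a visitor-driven reader could look like, in
Python for brevity. Everything here is hypothetical: the wire layout
(big-endian uint32 width, uint32 count, then count uint32 on-bit indices)
and the names SdrVisitor, read_sdr, IndexSetBuilder and DenseBuilder are
assumptions made up for the example.

```python
import struct

class SdrVisitor:
    """Callbacks the reader invokes; subclasses build whatever they like."""
    def begin_sdr(self, width, count): pass
    def on_bit(self, index): pass
    def end_sdr(self): pass

def read_sdr(payload, visitor):
    """Walk the assumed wire layout and hand each element to the visitor."""
    width, count = struct.unpack_from(">II", payload, 0)
    visitor.begin_sdr(width, count)
    for i in range(count):
        (index,) = struct.unpack_from(">I", payload, 8 + 4 * i)
        visitor.on_bit(index)
    visitor.end_sdr()

class IndexSetBuilder(SdrVisitor):
    """Keeps the sparse form: a set of on-bit indices."""
    def begin_sdr(self, width, count):
        self.width = width
        self.indices = set()
    def on_bit(self, index):
        self.indices.add(index)

class DenseBuilder(SdrVisitor):
    """Expands to one byte per bit, for code that wants a flat array."""
    def begin_sdr(self, width, count):
        self.bits = bytearray(width)
    def on_bit(self, index):
        self.bits[index] = 1
```

The reader knows only the wire layout; each visitor decides the
in-memory shape, so the same bytes can become a set in one program and a
dense array in another.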

-Tim

1. https://timboudreau.com/blog/Acteur/read
2. https://timboudreau.com/blog/ActeurPattern/read


On Fri, Feb 6, 2015 at 11:59 AM, Kevin Archie <[email protected]> wrote:

> Can’t believe I’m stumbling into a language war (carrying a flamethrower).
> I apologize and will attempt to bring this around to NuPIC in a clearly
> marked section below.
>
> Erlang/BEAM is an amazing environment for building distributed systems.
> Monitoring, debugging, distribution, interoperability, are all equal to or
> years ahead of JVM*. It’s this crazy hidden gem. I like Erlang the language
> a lot (less so Elixir, for not very good reasons) and BEAM the runtime even
> more and if I were building a startup where distribution and fault
> tolerance were key, I would absolutely write the system in Erlang** (and
> some C for performance-critical bits) with full confidence that I could
> build something amazing quickly by myself, and that if the business took
> off and I needed to hire 10 or 12 expert Erlangers to build it out, I could
> find them.
>
> That’s a much different problem from: I’m building an open-source
> system+community and I’m trying to attract people interested in the domain
> (biologically inspired intelligence) to come work on it in their free time.
> The intersection of {Erlang gurus} and {cortical computation nerds} is
> pretty darn small. (Back to the hypothetical startup: when we make it huge
> and Google buys us out, they’re going to rewrite everything in Java because
> they need to deploy and scale and maintain with an army of drones, and
> Costco carries economy packs of Java programmers. Me, I cash out before
> that point.)
>
> -- DIRECT APPLICABILITY TO NUPIC FOLLOWS --
>
> Which I guess is me coming down on the side of writing everything
> performance-critical in C++ (ugh) and wrapping it up with Python (sigh)
> because you need very low entry barriers for recruiting. (What, that’s
> done? Hurrah!) By all means, port to Java or Elixir or whatever if you’re
> doing it for personal education, but if you want to change the world, focus
> on what's at hand rather than borrowing trouble from the future. Use NuPIC
> to solve cool problems, and those successes will feed and grow the
> community.
>
>   - Kevin
>
> * JVM probably wins on cloud deployment at enterprise scale once your
> application is bulletproof. Good luck getting it there.
>
> ** okay, I’d be awfully tempted by Cloud Haskell, but I’d wind up doing it
> in Erlang because the tools are much more mature.
>
> On Feb 6, 2015, at 9:42 AM, cogmission1 . <[email protected]>
> wrote:
>
> Your intuitions are correct about both the scalability and simplicity of
>> using Elixir/Erlang to do HTM. In my initial tests, I was able to spring up
>> 250k neuron processes in 20-25ms in Elixir on a laptop.
>
>
> How does Elixir/Erlang speak to the network issue? What about state
> monitoring across network nodes? What about debugging nodes on a network?
> What about ease of setup and distribution? What about interoperation on
> heterogeneous architectures? We're talking about ease and maintainability
> here... The JVM's appropriateness for this task is as apparent as
> gravity... I don't see how this can be disputed? We want to take advantage
> of the wealth of development talent, and time-to-market ease of developing
> in Java - not introduce another level of obscurity?
>
> Other than that issue, the identification of what comprises a discrete
> computational unit I believe is agreed on... and Michael just raised a more
> poignant question as far as I'm concerned. Parallelism is an interesting
> topic. The nature of cortical inputs seems to involve parallel sensor input
> from multiple senses concurrently; however does a single sense (such as
> vision), have opportunities for parallelism? Do individual senses in
> general? Otherwise parallelism would have to be introduced algorithmically
> to break up processing of a single stream of sequential input into parallel
> tasks.
>
> Interesting discussion...
>
>
>
> On Fri, Feb 6, 2015 at 9:03 AM, Michael Klachko <[email protected]>
> wrote:
>
>> I think a better question is how can HTM computation be parallelized? For
>> example, can we map a whole region to a single GPU card? What functions in
>> a region can be isolated as kernels? What could be running as a (CUDA)
>> thread? How can these threads be partitioned into blocks and grids?
>>
>> On Fri, Feb 6, 2015 at 6:22 AM, Fergal Byrne <[email protected]
>> > wrote:
>>
>>> Hi Rich,
>>>
>>> Thanks for restarting this discussion.
>>>
>>> I started a project to reimplement HTM in Elixir in December 2013, but
>>> then switched to Clojure for mainly non-technical reasons. One of the
>>> "thought leaders" in the Elixir community is interested in being the
>>> lynchpin of the project to bring HTM to Elixir, so I'll be making some
>>> announcements on that in the near future (I have some code archaeology to
>>> perform first!).
>>>
>>> Your intuitions are correct about both the scalability and simplicity of
>>> using Elixir/Erlang to do HTM. In my initial tests, I was able to spring up
>>> 250k neuron processes in 20-25ms in Elixir on a laptop.
>>>
>>> On the general point of distributed HTM, Michael is correct to identify
>>> the granularity at which things can be split up, and Tim is on the money
>>> about the kernel of the issue - state (synapses, in particular). In a
>>> typical NuPIC-sized region, we have 2048 cols x 32 cells = 64k total cells,
>>> with in the neighbourhood of 1-300m synapses. The number of "messages"
>>> passed between these neurons (and their "state") is thus very large
>>> compared with the number of input and out messages between regions. It
>>> makes sense to have the processing for a contiguous "patch" of neurons such
>>> as this all contained within a single (OS level) process, and to have
>>> patches communicate using SDRs.
>>>
>>> Matt is correct when he describes the importance of this in the context
>>> of Temporal Pooling and hierarchy. With a multi-layer architecture for a
>>> single region, and a hierarchy of regions, we will very quickly hit the
>>> skids if we continue with a single-threaded, monolithic design for HTM. On
>>> the other hand, Matt is also correct that, once solved, we can take
>>> advantage of distributed processing to build HTM systems as large and
>>> powerful as we like.
>>>
>>> Within a patch, I think the jury is very much out on the performance of
>>> message-passing versus sparse vectors (as used in NuPIC). Due to sparseness
>>> both in space and time in real world data, it's not clear that
>>> message-passing (or some equivalent, functional reactive scheme) would not
>>> outperform the use of big sparse arrays.
>>>
>>> Regards,
>>>
>>> Fergal Byrne
>>>
>>> On Thu, Feb 5, 2015 at 7:07 PM, Rich Morin <[email protected]> wrote:
>>>
>>>> On Feb 5, 2015, at 05:18, Kevin Archie <[email protected]> wrote:
>>>> > https://github.com/nupic-community/comportex
>>>> >
>>>> > (I have no connection to the project, I’m just aware of it.)
>>>>
>>>> The Clojure ports are certainly worth a look, if only to see how they
>>>> decompose the problem.  Although scalability is a motivation, my real
>>>> interest has to do with seeing how Elixir (including Erlang and OTP)
>>>> can be used to simplify the model.  That is, can I model things like
>>>> neurons, columns, and regions using lightweight processes, leaving
>>>> the communication and management to OTP.
>>>>
>>>> -r
>>>>
>>>>  --
>>>> http://www.cfcl.com/rdm           Rich Morin           [email protected]
>>>> http://www.cfcl.com/rdm/resume    San Bruno, CA, USA   +1 650-873-7841
>>>>
>>>> Software system design, development, and documentation
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Fergal Byrne, Brenter IT
>>>
>>> http://inbits.com - Better Living through Thoughtful Technology
>>> http://ie.linkedin.com/in/fergbyrne/ - https://github.com/fergalbyrne
>>>
>>> Founder of Clortex: HTM in Clojure -
>>> https://github.com/nupic-community/clortex
>>>
>>> Author, Real Machine Intelligence with Clortex and NuPIC
>>> Read for free or buy the book at https://leanpub.com/realsmartmachines
>>>
>>> Speaking on Clortex and HTM/CLA at euroClojure Krakow, June 2014:
>>> http://euroclojure.com/2014/
>>> and at LambdaJam Chicago, July 2014: http://www.lambdajam.com
>>>
>>> e:[email protected] t:+353 83 4214179
>>> Join the quest for Machine Intelligence at http://numenta.org
>>> Formerly of Adnet [email protected] http://www.adnet.ie
>>>
>>
>>
>
>
> --
> *We find it hard to hear what another is saying because of how loudly "who
> one is", speaks...*
>
>
>


-- 
http://timboudreau.com
