Hello,
I'm afraid this isn't a full answer but your email reminded me of something.
I remember reading a dissertation a while ago about parallelizing the HTM
algorithms. I found it again here
<http://pdxscholar.library.pdx.edu/cgi/viewcontent.cgi?article=1201&context=open_access_etds>.
(If the link doesn't work, try googling "Price CLA HTM Parallel".)

The interesting thing I remember from it was that 90-98% of the execution
time was due to just two subroutines in the temporal pooler.
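I can't remember the exact subroutine names, but if you want to check whether the same hotspots show up in your own build, Python's cProfile makes that straightforward. A minimal sketch (the pooler functions here are just stand-ins for illustration, not the real CLA code):

```python
import cProfile
import io
import pstats

# Stand-in phases; swap in calls into your own temporal pooler.
def phase_one(cells):
    return [c * 2 for c in cells]

def phase_two(cells):
    return sum(cells)

def temporal_pooler_step(cells):
    return phase_two(phase_one(cells))

# Profile many iterations so the dominant subroutines stand out.
profiler = cProfile.Profile()
profiler.enable()
for _ in range(1000):
    temporal_pooler_step(list(range(500)))
profiler.disable()

# Print the few functions that dominate cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

If two functions really do account for 90-98% of the time, they'll sit right at the top of that listing, which tells you where parallelization effort would pay off.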

It was before NuPIC was open sourced, so the author built himself a version
of the CLA from the white paper. I think the NuPIC code has had various
optimisations so I'm not sure how relevant it will be now. But it might
have useful things in it.

Ruaridh


On Sun, Jun 22, 2014 at 9:15 PM, Alex Davos <[email protected]> wrote:

> Hello,
>
>
> We are a team of astrophysicists, interested in applying HTM/CLA to
> pattern recognition problems in our datasets, mostly time series of high
> energy physics and other related phenomena. These datasets possess
> characteristics that seem to fit HTM's capabilities well, so we thought
> we should give it a shot.
>
>
> However, because of the size and complexity of the datasets, the current
> size of HTM/CLA is not adequate to capture the structure available. We
> intend to try parallelizing the code in order to increase the
> capabilities of the model as much as possible, so we have a couple of
> questions before we attempt to do so:
>
>
> First of all, most references to the size of HTM/CLA up to this point
> have been limited to 2048 columns. Jeff Hawkins has mentioned in a
> presentation that he could probably do "3X" that or so. However we haven't
> been able to locate the exact specification of the hardware used in either
> of these cases.
>
>
> To put it differently, what exactly is the hardware on which the 3X
> figure was determined? Is it a run-of-the-mill i7 with 32 GB of RAM or
> something a bit more powerful?
>
>
> Is this a strictly serial implementation?
>
>
> What are the main bottlenecks at the moment? Our experience with CLA so
> far points to a CPU-RAM bus (memory bandwidth) bottleneck rather than
> compute or RAM-size limits; do your observations match ours?
>
>
>
> For example, we have available to us in the lab about 10 nodes (each has
> an 8-socket motherboard with a 10-core Xeon per socket and 512 GB of RAM
> per CPU, i.e. 4 TB of RAM per node) connected to each other with
> InfiniBand and high-speed switches. Can you give us a ballpark on what
> is achievable with that hardware?
>
>
> Second, which is preferable, at least from a theoretical standpoint, as
> a parallelization strategy: column-level or hierarchical? To clarify, is
> it better that we build a few big regions, each containing many columns,
> or should we instead create many smaller regions and connect them
> hierarchically?
>
>
> What are the pros/cons of each approach, and which datasets would
> benefit more from bigger regions rather than many smaller ones, and vice
> versa?
>
>
>
> Thank you in advance for taking the time to answer our questions.
>
>
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>
>