Hello,
We are a team of astrophysicists interested in applying HTM/CLA to pattern recognition problems in our datasets, mostly time series of high-energy physics and other related phenomena. These datasets possess characteristics that seem to fit HTM's capabilities well, so we thought we should give it a shot. However, because of the size and complexity of the datasets, the current size of HTM/CLA is not adequate to capture the structure available. We intend to parallelize the code in order to increase the capabilities of the model as much as possible, and we have a couple of questions before we do so.

First, most references to the size of HTM/CLA so far have been limited to 2048 columns. Jeff Hawkins mentioned in a presentation that he could probably do "3X" that or so. However, we haven't been able to locate the exact specification of the hardware used in either case. To put it differently, on exactly what hardware was the "3X" figure determined? Is it a run-of-the-mill i7 with 32 GB of RAM, or something a bit more powerful? Is this a strictly serial implementation? What are the main bottlenecks at the moment? Our experience with CLA so far points to a CPU-RAM bus bottleneck rather than computational or RAM size limits; do your observations match ours? For example, we have about 10 nodes available in the lab (each with an 8-socket motherboard, a 10-core Xeon per socket, and 512 GB of RAM per CPU, i.e. 4 TB of RAM per node), connected to each other with InfiniBand and high-speed switches. Can you give us a ballpark of what is achievable with that hardware?

Second, which parallelization strategy is preferable, at least from a theoretical standpoint: a column-level one or a hierarchical one? To clarify, is it better to build a few big regions containing many columns each, or should we instead create many smaller regions and connect them hierarchically? What are the pros and cons of each approach, and which datasets would benefit more from bigger regions rather than many smaller ones, and vice versa?

Thank you in advance for taking the time to answer our questions.
_______________________________________________
nupic mailing list
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
