I'm new to this list and to nupic - it's a fascinating project that confirms a few intuitions I've had, and might let me try out a few ideas I've had in the back of my mind for a couple of decades, so I'm very excited to play with it.
I find that to understand a problem deeply enough to *really* get what's going on, I have to solve it at least once myself. So I've been poking around the nupic source code, but I've also read through the white paper, taken copious notes, and from that have been attempting to implement some of the data structures and algorithms in Java (yes, I found the GitHub project where someone else did that). It helps me get my head around the realistic memory requirements of such a system (plus my C skills are 18 years atrophied :-/), see what's really going on under the hood, and understand its limitations.

So if you don't mind, can I ask a few questions about data structures, to resolve a couple of ambiguities and figure out if I'm modelling it right? Specifically:

1. In some places, the white paper implies that input bits are wired directly to columns; elsewhere it suggests that there is a network of synapses between input bits and columns in the layer that directly receives input. Which is it?

2. It's implied several times that layers feed *back* as well as forward, but no mechanism is really made explicit. Is that correct or am I misunderstanding it?

3. It's mentioned several times that "strongly activated" columns inhibit nearby columns, to achieve sparseness. But it's also implied that the input to a cell is a single bit, which has no concept of strong or not-so-strong. Is "strongly activated" the count of cells in a column which are active due to feed-forward input, or is the input actually scalar and I just misunderstood that?

4. I may be misunderstanding the nature of distal dendrite segments; I'm also trying to model the data efficiently. Is it sane to model a distal dendrite segment as

   - A column coordinate in a 2D topology (x,y coordinates)
   - A list of path-traversal instructions (e.g. up-left, down, down, right) in which no coordinate is traversed more than once
   - Permanence values mapped to instructions, to model synapses

   or am I misunderstanding something, and the set of potential synapses should be *all* adjacent and near-adjacent cells? Or is randomly generating some paths sufficient?

5. It's not clear how distal dendrites relate to columns. Right now I'm treating all cells within a column that a dendrite's "path" intersects as potential synapses. It's not clear if that's the correct approach, or if the "instructions" should include a spec for which cell in the intersected column should be a potential synapse.

6. Boost factor - this seems like something that could be modelled as a single byte, scaled to 0.0-1.0. Or is that likely to be insufficient resolution?

7. It's not clear how the initial state of the system (before any input) is established, i.e. you have to have some distal dendrite segments and potential synapses to start with.

8. A friend of mine I mentioned the project to said that he couldn't find a crisp definition of "sparse distributed representation" anywhere. I'd figured it was one of those things I'd pick up intuitively with enough reading, and did to a degree. Can I check my understanding with you? Specifically, it seems like the high-order feature is that you have a lot of bits, but only a small percentage are active.

9. A number of things, such as column inhibition, are affected by what is adjacent. This brings up a problem that also exists in image processing: edges and corners are special, and you either

   a. simulate a continuous topology by wrapping around,
   b. interpolate values outside the bounds,
   c. ignore values from some inner perimeter, or
   d. just live with it.

   So, for example, a column in a corner should be less inhibited by neighboring columns because it has fewer neighbors (it also has more limited choices for potential synapses). Is this a problem, and if so, how do you deal with it?
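To make question 4 concrete, the path-based segment could be sketched in Java roughly as follows. This is only an illustration of the model described in the question, not nupic's actual data structure. The names (`PathSegment`, `Step`, `synapseColumns`) are made up for the example, storing each permanence as one byte scaled to 0.0-1.0 is an assumption borrowed from the boost-factor idea in question 6, and grid bounds (question 9) and the choice of cell within a column (question 5) are deliberately left out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of one distal dendrite segment as a path over the column grid. */
final class PathSegment {

    /** One move between neighboring columns; eight directions would fit in 3 bits. */
    public enum Step {
        UP(0, -1), DOWN(0, 1), LEFT(-1, 0), RIGHT(1, 0),
        UP_LEFT(-1, -1), UP_RIGHT(1, -1), DOWN_LEFT(-1, 1), DOWN_RIGHT(1, 1);

        final int dx, dy;

        Step(int dx, int dy) { this.dx = dx; this.dy = dy; }
    }

    private final int originX, originY;   // column that owns the segment
    private final Step[] steps;           // path-traversal instructions
    private final byte[] permanences;     // one per step, 0..255 scaled to 0.0..1.0 (assumption)

    public PathSegment(int originX, int originY, Step[] steps, byte[] permanences) {
        if (steps.length != permanences.length) {
            throw new IllegalArgumentException("need one permanence per step");
        }
        this.originX = originX;
        this.originY = originY;
        this.steps = steps.clone();
        this.permanences = permanences.clone();
        // Enforce "no coordinate is traversed more than once", origin included.
        Set<Long> seen = new HashSet<>();
        seen.add(key(originX, originY));
        for (int[] c : synapseColumns()) {
            if (!seen.add(key(c[0], c[1]))) {
                throw new IllegalArgumentException("path revisits column " + c[0] + "," + c[1]);
            }
        }
    }

    /** Pack an (x, y) pair into one long so it can be used as a set key. */
    private static long key(int x, int y) {
        return ((long) x << 32) | (y & 0xFFFFFFFFL);
    }

    /** Columns the path passes through, in order; each one hosts a potential synapse. */
    public List<int[]> synapseColumns() {
        List<int[]> result = new ArrayList<>(steps.length);
        int x = originX, y = originY;
        for (Step s : steps) {
            x += s.dx;
            y += s.dy;
            result.add(new int[] {x, y});
        }
        return result;
    }

    /** Permanence of the synapse created by step i, as a value in 0.0..1.0. */
    public double permanence(int i) {
        return (permanences[i] & 0xFF) / 255.0;
    }
}
```

With an origin of (5,5) and the example path up-left, down, down, right, this touches columns (4,4), (4,5), (4,6) and (5,6). The appeal of the encoding is size: a direction plus a byte of permanence per synapse, instead of a full coordinate pair.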
I realize that's a lot of questions. Any suggestions appreciated, or feel free to tell me to just go read the code :-)

Thanks,
Tim Boudreau
--
http://timboudreau.com
