Yes, now that you mention it, using topology in the SP might be essential for your word-SDRs. Cept's word-SDRs are pretty big and have a 2D topology, so a basic SP without topology will be limited.

The way I think of this with regard to the retina is that the chance of finding a correlation between two bits from the retina is high if the two bits are close to each other and quickly becomes low as the two bits get further apart. This property is true across the entire retina. In the case of Cept word-SDRs there might be discontinuities in the 2D map, but the SP should handle this automatically and not form coincidence detectors that span two regions of the map if there are minimal correlations between the two regions.

It is even cooler. Say one region of the 2D map was all noise, no patterns to detect. What will happen is that the coincidence detectors that would normally look at the noise region will migrate their connections to the adjacent region where patterns can be found. The SP will learn to ignore the noise region and instead form better representations of the region with patterns. When the SP has topology, all the coincidence detectors are constantly competing and jostling to find good patterns to represent.

I don't know if topology is still in the NuPIC code base. I think it is, but I am not certain. What we did was assign potential synapses to each coincidence detector. The potential synapses were in a circular region of the input space. We assigned initial permanence values with a Gaussian. Inputs closest to the detector in the 2D map had the highest permanence values. We set the circular region of potential synapses fairly large.

In the real brain, dendrites and axons are always growing from where they have made good connections. The potential synapse pool is dynamic and constantly trying to expand in the area where good connections have already been found. We didn't model this, but it is a more efficient approach than just keeping a large pool of potential synapses.

Jeff

From: nupic [mailto:[email protected]] On Behalf Of Francisco Webber
Sent: Tuesday, October 22, 2013 3:19 PM
To: NuPIC general mailing list.
Subject: Re: [nupic-dev] Looking for help in understanding part of the HTM white paper

Jeff,

the SP with dynamic locality sounds exactly like the kind of configuration I was hoping to feed our word SDRs into. A position within a 2D word-SDR (described by two coordinates) corresponds to a specific semantic environment. Let's say the upper left area stands for nature, animals, plants, etc., and the lower right quadrant stands for sports and TV events. This topology of the semantic space is captured within our CEPT Retina.

If we now feed text about lions through the cept-retina into such a dynamic SP, and if the column inputs are organized in the same 2D way as the retina, the columns corresponding to the upper left area will become subdivided into smaller and smaller input fields while "reading" through the material. The same would be true in the lower right quadrant while reading the football news. The CLA layer will end up reflecting, with its input densities, the "semantic topology" that the retina has captured.

A topological CLA layer can easily be extended to become a "natural" or implicit hierarchy, if one just symmetrically reduces the number of available columns in each higher layer while maintaining the given projection topology.

Francisco

On 22.10.2013, at 22:22, "Jeff Hawkins" <[email protected]> wrote:

We don't have any definitive numbers on this. In general the SP is tolerant to a large range of sparsity in the input, but the actual numbers depend on several things. On the sparser end it is important that there are enough active input bits for the SP to recognize patterns. The more input bits you have, the sparser the patterns can be and still have a sufficient number of active input bits. On the denser side of the scale I would expect the SP to start breaking down by 50% active input bits, maybe earlier.

We have a method of determining if a trained SP is working well.
Recall that the individual bits of the spatial pooler are trying to learn common spatial patterns in the input. We often refer to them as coincidence detectors. After training the SP you can look at how many valid synapses each SP coincidence detector has. The number of valid synapses tells you how rich a spatial pattern this coincidence detector has learned. For example, 5 or fewer valid synapses is not much of a spatial pattern, and the SP output will not be very stable. This represents a system that has too few SP coincidence detectors or a system where the input doesn't contain many repeating patterns. For example, if you feed random patterns into the SP it won't find anything to learn and the coincidence detectors will have few synapses. The more valid synapses you find in the trained SP coincidence detectors, the better the job the SP is doing.

If there are not enough total active input bits, the SP won't find rich patterns. If the input is too dense, then the input patterns will likely overlap a lot and the SP will have trouble separating them. Again, in practice we found the SP to be tolerant; I am just talking about the extremes.

Advanced topic: Imagine the million bits coming from the retina, and assume they have some reasonable sparse activity, say 5%. If we feed this into a plain vanilla spatial pooler it won't work well at all. Even if we have 1M columns in the SP, the number of patterns coming from the retina is so huge that each column in the SP will be overwhelmed. There are WAY more than 1M patterns coming from the retina; it will look like noise. However, we can fix this problem by using topology. When we implement the SP with topology, the individual coincidence detectors will limit the area of the input they look at until they find rich spatial patterns. The size of the area they look at varies. If the input patterns become less varied, the input area of a coincidence detector will expand. If the input patterns become more varied, the input area will contract.
This is the basis of plasticity. We tested this and it worked beautifully.

Jeff

From: nupic [mailto:[email protected]] On Behalf Of Pedro Tabacof
Sent: Tuesday, October 22, 2013 4:30 AM
To: NuPIC general mailing list.
Subject: Re: [nupic-dev] Looking for help in understanding part of the HTM white paper

Hello,

Is there a recommended level for input sparsity? What are the minimum and maximum sparsity levels it can function with?

Thanks,
Pedro.

On Mon, Oct 21, 2013 at 6:28 PM, Jeff Hawkins <[email protected]> wrote:

Perhaps this wasn't written as well as it should have been.

The spatial pooler converts one sparse representation into another sparse representation. The output of a spatial pooler has a fixed number of bits (equal to the number of columns) and has a relatively fixed sparsity, say 2%. The spatial pooler works just fine with a range in the number of input bits and a range in sparsity. In some ways the goal of the SP is to handle any amount of input and convert it to a fixed size and sparsity output. The other thing it does is learn the common spatial patterns in the input and make sure to represent those well.

The output sparsity of the SP needs to be relatively fixed for the temporal pooler (sequence memory) to work. The number of output bits, equal to the number of columns, also has to be fixed for the TP to work.

Why is it important that the input can vary? In a real brain the hierarchy of the neocortex is complicated and messy. Multiple regions converge onto destination regions as you ascend the hierarchy. By allowing the number of input bits to vary over a wide range, evolution could wire up the hierarchy in lots of different ways and the cortex would continue to work OK. If we took an existing brain and then added a connection between two regions that previously were not connected, the SP in the destination region wouldn't break.
For example, in normal humans the size of primary visual cortex varies by a factor of 3, but the size of the output of the retina is always about 1M fibers. The SP in V1 can handle a broad range in the ratio of the number of input bits to the number of output bits.

The sparsity level of the input can vary for multiple reasons: lack of sensory input, a change in attention (which effectively turns off input bits), and temporal pooling itself.

So it is important that the spatial pooler take whatever it is given and convert it into a relatively fixed output. This is why the SP does what it does and why it is important. Do you need help understanding how the SP does this?

Jeff

From: nupic [mailto:[email protected]] On Behalf Of Jeff Fohl
Sent: Sunday, October 20, 2013 6:41 PM
To: [email protected]
Subject: [nupic-dev] Looking for help in understanding part of the HTM white paper

Hello -

I hope this is not being posted to the wrong list. This is my first post here. Please let me know if there is a more appropriate place for this question.

In preparation for learning NuPIC, I have read "On Intelligence", and I am now reading the HTM white paper put out by Numenta. Making my way through the white paper, I got stuck on one passage, which I can't really make sense of. Wondering if anyone can help me through this part.

The passage in question is on pages 11-12 of the white paper PDF - specifically the second paragraph included below.

HTM regions also use sparse distributed representations. In fact, the memory mechanisms within an HTM region are dependent on using sparse distributed representations, and wouldn't work otherwise. The input to an HTM region is always a distributed representation, but it may not be sparse, so the first thing an HTM region does is to convert its input into a sparse distributed representation.

For example, a region might receive 20,000 input bits.
The percentage of input bits that are 1 and 0 might vary significantly over time. One time there might be 5,000 1 bits and another time there might be 9,000 1 bits. The HTM region could convert this input into an internal representation of 10,000 bits of which 2%, or 200, are active at once, regardless of how many of the input bits are 1. As the input to the HTM region varies over time, the internal representation also will change, but there always will be about 200 bits out of 10,000 active.

So, what exactly is going on here? How does a fluctuating input flow of 20,000 bits get converted into 200 bits? Obviously there is something important going on here, but I don't understand what it is.

Any help illuminating this would be greatly appreciated!

Many thanks,

Jeff

_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org

--
Pedro Tabacof,
Unicamp - Eng. de Computação 08.
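
Editor's sketch: the conversion asked about in the original question (a varying number of 1 bits in, a fixed number of active columns out) can be illustrated with a simple "top-k overlap" competition. This is a minimal sketch under stated assumptions, not NuPIC's implementation: it has no learning, boosting, or topology, the sizes are scaled down tenfold from the white paper's 20,000 / 10,000 / 200, and the names `make_columns`, `spatial_pool`, and `random_input` are hypothetical.

```python
import random

def make_columns(n_inputs, n_columns, pool_size, rng):
    """Give each column a fixed random set of connected input indices.

    Assumption for illustration: connections are random and never learned.
    """
    return [rng.sample(range(n_inputs), pool_size) for _ in range(n_columns)]

def random_input(n_inputs, n_on, rng):
    """Build a binary input vector with exactly n_on bits set to 1."""
    bits = [0] * n_inputs
    for i in rng.sample(range(n_inputs), n_on):
        bits[i] = 1
    return bits

def spatial_pool(input_bits, columns, n_active):
    """Return the indices of the n_active columns with the highest overlap.

    Overlap = number of a column's connected inputs that are currently 1.
    Only the top n_active columns win, so output sparsity stays fixed.
    """
    overlaps = [sum(input_bits[i] for i in pool) for pool in columns]
    ranked = sorted(range(len(columns)), key=lambda c: overlaps[c], reverse=True)
    return set(ranked[:n_active])

rng = random.Random(0)
N_INPUTS, N_COLUMNS, N_ACTIVE = 2000, 1000, 20  # 2% of columns active
columns = make_columns(N_INPUTS, N_COLUMNS, pool_size=40, rng=rng)

sparse_in = random_input(N_INPUTS, 500, rng)  # 25% of input bits on
dense_in = random_input(N_INPUTS, 900, rng)   # 45% of input bits on

active_sparse = spatial_pool(sparse_in, columns, N_ACTIVE)
active_dense = spatial_pool(dense_in, columns, N_ACTIVE)
print(len(active_sparse), len(active_dense))  # prints "20 20"
```

The point of the sketch is only that the number of winners is set by the competition among columns, not by how many input bits happen to be on.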
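
Editor's sketch: Jeff Hawkins's description earlier in the thread (potential synapses drawn from a circular region of the input space, with initial permanences assigned by a Gaussian so that nearby inputs start highest) might look roughly like this. It is an illustration only; the function name `init_potential_pool` and the `peak` and `sigma` values are assumptions, not NuPIC parameters.

```python
import math

def init_potential_pool(center, radius, sigma, width, height, peak=0.5):
    """Map each input (x, y) within `radius` of `center` to an initial permanence.

    The permanence falls off as a Gaussian of the distance from the center,
    so inputs nearest the detector start with the highest values.
    `peak` and `sigma` are illustrative, hypothetical settings.
    """
    cx, cy = center
    pool = {}
    # Scan the bounding box of the circle, clipped to the input map.
    for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
            dist = math.hypot(x - cx, y - cy)
            if dist <= radius:  # keep only inputs inside the circular region
                pool[(x, y)] = peak * math.exp(-dist ** 2 / (2 * sigma ** 2))
    return pool

pool = init_potential_pool(center=(16, 16), radius=8, sigma=4.0,
                           width=32, height=32)
print(len(pool), pool[(16, 16)], pool[(16, 24)])
```

Making `radius` large corresponds to the "fairly large" potential region mentioned in the thread; the dynamic growth of the pool that Jeff describes for real dendrites is not modeled here either.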
