I'm trying to figure out how to write a digit recognizer using a spatial
pooler and have a few questions.

The data would be similar to MNIST, but I would start with a much simpler
data set.  The input would be a 1-D vector of 784 elements representing
the 28x28 pixels of the 2-D image.  As a simplification, each element of
the vector would be 0 or 1 (as opposed to the greyscale values used in
MNIST).
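
To make the input format concrete, here is a rough sketch of how I'd
build one of these vectors with numpy.  The helper name to_input_vector,
the threshold of 128, and the row-major flattening are just my own
assumptions, not anything taken from nupic.

```python
import numpy as np

def to_input_vector(image, threshold=128):
    """Binarize a 28x28 greyscale image and flatten it to 784 elements.

    Hypothetical helper: pixels >= threshold become 1, the rest 0.
    Flattening is row-major, so pixel (row, col) lands at row * 28 + col.
    """
    image = np.asarray(image)
    if image.shape != (28, 28):
        raise ValueError("expected a 28x28 image")
    return (image >= threshold).astype(np.uint32).ravel()

# A crude hand-drawn "1": a two-pixel-wide vertical stroke.
image = np.zeros((28, 28), dtype=np.uint8)
image[4:24, 13:15] = 255
vector = to_input_vector(image)
```

With this layout the 2-D structure is only implicit in the index
arithmetic, which is part of why I ask about 2-D receptive fields below.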

The data sets I was planning to use:

* For training (online learning turned on), I would create ideal versions
of the digits 0-9

* For testing (i.e., with online learning turned off), I would create
noisy versions of those same digits
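
For the noisy test set, I was thinking of something like the sketch
below, which flips a fixed fraction of randomly chosen bits.  The
function name add_noise, the 5% flip fraction, and the placeholder
"ideal" vector are all made up for illustration.

```python
import numpy as np

def add_noise(vector, flip_fraction, rng):
    """Return a copy of a 0/1 vector with a fraction of its bits flipped."""
    noisy = np.array(vector, copy=True)
    n_flip = int(round(flip_fraction * noisy.size))
    # Pick distinct positions so exactly n_flip bits change.
    idx = rng.choice(noisy.size, size=n_flip, replace=False)
    noisy[idx] = 1 - noisy[idx]
    return noisy

rng = np.random.RandomState(42)           # fixed seed for repeatable tests
ideal = np.zeros(784, dtype=np.uint32)    # placeholder, not a real digit
ideal[100:140] = 1
noisy = add_noise(ideal, 0.05, rng)
```

Varying flip_fraction would give me the "a little noise" versus "a lot
of noise" cases I ask about further down.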


The approach I was planning to take:

1) Create a spatial pooler instance

2) Turn on online learning

3) Repeatedly present ideal input vectors of 0-9 to the spatial pooler

4) Record the final SDR for each input vector

5) Turn off online learning

6) Present noisy input vectors of 0-9 to the spatial pooler

7) Find the "closest" SDR recorded in step 4 and treat it as the SDR the
spatial pooler inferred.  I would compare the inferred SDR with the known
expected SDR for that digit and use the mismatches to calculate the
overall error.
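
Steps 4 and 7 could look roughly like the sketch below.  It assumes each
SDR is a dense 0/1 numpy array and that "closest" means "shares the most
active bits"; the names closest_label and error_rate and the tiny 8-bit
SDRs are invented for the example, and none of this is nupic code.

```python
import numpy as np

def closest_label(sdr, recorded):
    """Return the label whose recorded SDR shares the most active bits."""
    return max(recorded, key=lambda label: int(np.dot(recorded[label], sdr)))

def error_rate(test_sdrs, expected_labels, recorded):
    """Fraction of test SDRs whose closest recorded SDR has the wrong label."""
    wrong = sum(closest_label(sdr, recorded) != expected
                for sdr, expected in zip(test_sdrs, expected_labels))
    return wrong / float(len(expected_labels))

# Step 4: SDRs recorded after training (tiny made-up examples).
recorded = {
    "0": np.array([1, 1, 0, 0, 1, 0, 0, 0]),
    "1": np.array([0, 0, 1, 1, 0, 0, 1, 0]),
}

# Step 7: an SDR produced from a noisy "0".
noisy_sdr = np.array([1, 0, 0, 0, 1, 0, 1, 0])
guess = closest_label(noisy_sdr, recorded)
overall_error = error_rate([noisy_sdr, recorded["1"]], ["0", "1"], recorded)
```

Ties go to whichever label max() happens to see first, which is one of
the reasons this feels a bit ad hoc to me.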


Does that sound like a reasonable approach?  If not, what's a recommended
approach to go about building a simple digit recognizer using the spatial
pooler?

I also have some specific questions about how to use a spatial pooler:

* Are there any examples that directly use a spatial pooler that I can look
at?  The closest I could find was
https://github.com/allanino/nupic-classifier-mnist.git, but that uses OPF
and I want to go straight to the lower level code and use the spatial
pooler.

* After looking at the code for the spatial pooler implementations, the
"py" and "cpp" implementations don't seem to return any values.  Only the
"oldpy" implementation seems to return a list of the active columns.
Here's the method signature of the cpp spatial pooler I'm looking at in
spatial_pooler.cpp:

    void SpatialPooler::compute(UInt inputArray[], bool learn,
                                UInt activeArray[]) { }

If they don't return any values, how are they supposed to be used?

* Is there anything special I need to do to recognize 2-D image patterns?
Or will the columns naturally form 2-D receptive fields over the input
vector?

* Since I basically want to classify digits, should I be looking at the
CLAClassifierRegion for guidance?  The idea I came up with for comparing
the SDRs feels a bit janky, and it seems like there should be a cleaner
way, e.g., something that takes care of the classification work but is
still lower level than the OPF.

* I am expecting to see exact matches between SDRs when presenting input
vectors that have only a little noise, and "close" matches for input
vectors that have a lot of noise.  Is this a correct assumption?  Is there
any built-in machinery for measuring how close two SDRs are to each other?
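
On the compute() question above, my best guess is that the last argument
is an output buffer that the caller allocates and compute() overwrites in
place.  The sketch below shows that calling pattern with a toy stand-in;
toy_compute, its top-k rule, and the num_active parameter are entirely
made up and are not the real spatial pooler logic.

```python
import numpy as np

def toy_compute(input_vector, learn, active_array, num_active=2):
    """Hypothetical stand-in for compute(): fills active_array in place.

    Each "column" sums an equal slice of the input, and the num_active
    columns with the highest sums are marked active.  The learn flag is
    accepted only to mirror the signature and is ignored here.
    """
    chunks = np.array_split(np.asarray(input_vector), active_array.size)
    scores = np.array([int(chunk.sum()) for chunk in chunks])
    winners = np.argsort(-scores, kind="stable")[:num_active]
    active_array[:] = 0          # overwrite the caller's buffer
    active_array[winners] = 1

input_vector = np.zeros(16, dtype=np.uint32)
input_vector[[0, 1, 6]] = 1

active = np.zeros(8, dtype=np.uint32)    # caller allocates the output
toy_compute(input_vector, False, active)

# Dense 0/1 buffer -> list of active column indices.
active_columns = np.nonzero(active)[0].tolist()
```

If that guess is right, the list of active columns that "oldpy" returns
would just be the nonzero indices of the buffer.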


Thanks in advance for any help, I'm really excited to get started building
this.  I'm planning to contribute it back in a pull request once it's
working.