Hello Jim,

Thank you for your work and report; we need more investigations like yours. A few suggestions:
1. Since you're using a KNN classifier, it'd be nice to use it directly on the pixels as a baseline. It's an important benchmark to show that NuPIC is indeed doing the heavy work.
2. Have you tried a more balanced division between training and testing sets? Using 100% or 1% of the data to train seems a bit too extreme to me.
3. Did you look at the MNIST dataset? It's probably the most widely used benchmark for computer vision. It's gonna be computationally demanding (50-60K images), but we will have results that can be compared to other machine learning approaches.
4. Did you use swarming or grid search to find the best meta-parameters?

A long time ago I used the previous NuPIC implementation for static classification (just the spatial pooler) and it was competitive with SVMs.

Pedro.

On Tue, Aug 19, 2014 at 12:24 AM, Jim Bridgewater <[email protected]> wrote:
> Hi everyone,
>
> I've written up a summary of the work I did this summer as part of
> Season of NuPIC that includes the most recent results. This summary
> is attached along with a separate file that contains 8,928 images from
> 144 fonts. These images were used to test the spatial pooler. The
> gist of it is that the SP does very well (>97% accuracy) when you
> train it on all of the images you test it on which is good, but very
> time consuming and doesn't require any ability to generalize. When I
> trained the SP on a much smaller data set of 186 images containing
> normal, bold, and italic characters not included in the larger data
> set the accuracy fell to about 32%. There are several ways to improve
> this. One is reducing the potential radius so columns learn features
> rather than entire characters. I tried this, but there appears to be
> a bug in the SP's potential mapping that currently prevents this
> technique from helping. Another way is to try different potential
> mappings, like lines with different orientations, again in an effort
> to get the SP's columns to learn features rather than entire
> characters. I've written a mapping for this but have not tried it.
> And yet another way to improve these results would be to add
> additional SP regions in an effort to get more generalization.
>
> I look forward to hearing your comments!
>
> --
> James Bridgewater, PhD
> Arizona State University
> 480-227-9592
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>

--
Pedro Tabacof
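
P.S. To make suggestions 1 and 2 concrete, here is a minimal sketch of what a raw-pixel KNN baseline with an adjustable train/test split could look like. It is plain Python, not NuPIC code. The toy 5x5 glyphs, the helper names (make_toy_glyphs, knn_predict, split_and_score), and the choice of squared Euclidean distance with k=1 are all illustrative assumptions and are not taken from the report; the real font images would replace the toy data.

```python
import random
from collections import Counter


def make_toy_glyphs(per_class=20, seed=0):
    """Hypothetical stand-in for the font images: noisy 5x5 bars."""
    rng = random.Random(seed)
    vertical = [1 if i % 5 == 2 else 0 for i in range(25)]     # column 2 lit
    horizontal = [1 if i // 5 == 2 else 0 for i in range(25)]  # row 2 lit
    images, labels = [], []
    for label, prototype in (("I", vertical), ("-", horizontal)):
        for _ in range(per_class):
            noisy = list(prototype)
            noisy[rng.randrange(25)] ^= 1  # flip one random pixel as noise
            images.append(noisy)
            labels.append(label)
    return images, labels


def knn_predict(train_pixels, train_labels, query, k=1):
    """Label `query` by majority vote among its k nearest training images."""
    # Squared Euclidean distance on raw pixel values, no feature extraction.
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(pixels, query)), label)
        for pixels, label in zip(train_pixels, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def split_and_score(images, labels, train_fraction=0.7, k=1, seed=0):
    """Shuffle, train on `train_fraction` of the data, return test accuracy."""
    order = list(range(len(images)))
    random.Random(seed).shuffle(order)
    cut = int(len(order) * train_fraction)
    train_idx, test_idx = order[:cut], order[cut:]
    train_x = [images[i] for i in train_idx]
    train_y = [labels[i] for i in train_idx]
    correct = sum(
        knn_predict(train_x, train_y, images[i], k) == labels[i]
        for i in test_idx
    )
    return correct / len(test_idx)


# Sweep a few intermediate split ratios instead of only the two extremes.
images, labels = make_toy_glyphs()
for fraction in (0.5, 0.7, 0.9):
    accuracy = split_and_score(images, labels, train_fraction=fraction)
    print(f"train fraction {fraction:.1f}: accuracy {accuracy:.2f}")
```

Running the same sweep once on the raw pixels and once on the SP output would show how much of the accuracy is due to the spatial pooler rather than to the KNN classifier itself.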
