Hi Dennis, welcome to the list, and nice to meet you too.
I'll preface this by saying I'm an expert on precisely none of what I'm about to talk about, but I'd welcome input from people who are. The whole area of machine learning in vision is both huge and balkanised.

Numenta began by looking at things like vision but quickly retreated, as NuPIC was nowhere near ready to take it on. Jeff has concentrated instead on building a model of a single layer of a cortical region and seeing how far that would take us, keeping the design as close as possible to the neuroscience. The result is that NuPIC has some visual capabilities, but for what you're describing they are severely limited by NuPIC's current lack of hierarchy.

Other schemes (Geoff Hinton's springs to mind) have successfully implemented all three parts of your requirements, as follows:

1. Train a single-layer Restricted Boltzmann Machine (RBM) unsupervised to create a layer of feature detectors.
2. Stack a few more layers on top, again training each layer unsupervised on the input from the layer below.
3. Connect a "label" layer to a joint top associative layer to learn categories (or objects), then use simple backpropagation to fine-tune the RBM stack.

Hinton reports excellent performance with this approach, and says that using tools like OpenCV actually impedes feature detection compared with his unsupervised learning in the first layer. Hand-programmed feature extractors suffer badly when images are cropped or rotated (as you've noticed), and when objects are occluded or conjoined. Learned feature detectors (such as HTMs and RBMs), on the other hand, perform much better in real-world conditions. They also self-categorise and generalise on their own, reducing the need for huge corpora of labelled data. To be honest, I don't see the logic in putting NuPIC at the top of your pipeline.
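To make step 1 concrete, here's a minimal sketch of training one RBM with a single step of contrastive divergence (CD-1). This is not Hinton's actual code; the class, hyperparameters, and toy data are all illustrative assumptions.

```python
# Minimal RBM trained with one-step contrastive divergence (CD-1).
# Illustrative sketch only -- sizes, learning rate, and data are made up.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = rng.normal(0, 0.01, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible biases
        self.b_h = np.zeros(n_hidden)    # hidden biases
        self.lr = lr

    def hidden_probs(self, v):
        # P(h=1 | v): each hidden unit is a learned feature detector
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        # P(v=1 | h): reconstruction of the input from hidden features
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0):
        # Positive phase: hidden activations driven by the data
        h0 = self.hidden_probs(v0)
        h0_sample = (rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one Gibbs step back to a "reconstruction"
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        # CD-1 update: <v h>_data - <v h>_reconstruction
        batch = v0.shape[0]
        self.W += self.lr * (v0.T @ h0 - v1.T @ h1) / batch
        self.b_v += self.lr * (v0 - v1).mean(axis=0)
        self.b_h += self.lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))  # reconstruction error

# Toy usage: learn features on random binary "images" (64 samples, 16 pixels)
data = (rng.random((64, 16)) < 0.5).astype(float)
rbm = RBM(n_visible=16, n_hidden=8)
for _ in range(50):
    err = rbm.cd1_step(data)
```

Stacking (step 2) then just means treating `rbm.hidden_probs(data)` as the input to the next RBM, and step 3 adds a labelled layer on top and backpropagates through the whole stack.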
You're already worried about the quality of the feature detection coming from OpenCV, so I'd suggest considering either RBMs or HTMs at the feature detection and categorisation level first.

Regards,
Fergal Byrne

On Thu, Nov 14, 2013 at 7:23 PM, Dennis Stark <[email protected]> wrote:

> Hello everyone,
>
> This is my first time writing to the mailing list, so nice to meet you.
>
> I'm trying to solve a problem of content analysis in the following way:
>
> 1) Break the image into objects
> 2) Ask the user to categorize the objects
> 3) Learn categories for those objects and use this knowledge for future
> inference of new visual input.
>
> The biggest problem I'm facing at the moment is making sure that an object
> is remembered in an invariant state. Once I get my image through OpenCV, I can
> fix the size of the object, but not the rotation, so in my case I need HTM
> to remember a rotation-invariant representation of the object. HTM should
> also know that this is the same object.
>
> So my problem at the moment is that I don't think I quite understand what
> the structure of HTM should be to allow for this. Scale and rotation
> invariance would be even better. If someone has any thoughts on that, I'd
> really appreciate it.
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org

--
Fergal Byrne, Brenter IT
http://www.examsupport.ie
http://inbits.com - Better Living through Thoughtful Technology
e: [email protected] t: +353 83 4214179
Formerly of Adnet [email protected] http://www.adnet.ie
