Hi Dennis,

Welcome to the list, nice to meet you too.

I'll preface this by saying I'm an expert on precisely none of what I'm
about to talk about, but I'd welcome any input from people who are.

The whole area of machine learning in vision is both huge and balkanised.
Numenta began by looking at things like vision but quickly retreated as
NuPIC was nowhere near ready to take it on. Jeff has concentrated instead
on building a model of a single layer of a cortical region and seeing how
far that would take us, keeping as close as possible to the neuroscience
for the design.

The result is that NuPIC has some visual capabilities, but for the kind of
task you're describing they are very limited by the current lack of
hierarchy in NuPIC.

Other schemes (Geoff Hinton's spring to mind) have successfully implemented
all three steps of your pipeline, as follows:

1. Train a single Restricted Boltzmann Machine, unsupervised, to create a
layer of feature detectors.
2. Stack a few more layers on top; each is again trained unsupervised on
the output of the layer below.
3. Connect a "label" layer to a joint top associative layer to learn
categories (or objects), then use simple backpropagation to fine-tune the
RBM stack.
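To make steps 1 and 2 concrete, here's a minimal numpy sketch of greedy,
layer-wise RBM training with one-step contrastive divergence (CD-1). All
the names, sizes, and the toy data are illustrative assumptions on my part,
not NuPIC or Hinton's code:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=5, lr=0.1):
    """Train one binary RBM on `data` (n_samples x n_visible) with CD-1."""
    n_visible = data.shape[1]
    W = rng.normal(0, 0.01, size=(n_visible, n_hidden))
    b_v = np.zeros(n_visible)   # visible biases
    b_h = np.zeros(n_hidden)    # hidden biases
    for _ in range(epochs):
        for v0 in data:
            # Positive phase: sample hidden units driven by the data.
            p_h0 = sigmoid(v0 @ W + b_h)
            h0 = (rng.random(n_hidden) < p_h0).astype(float)
            # Negative phase: one Gibbs step back down and up again.
            p_v1 = sigmoid(h0 @ W.T + b_v)
            p_h1 = sigmoid(p_v1 @ W + b_h)
            # CD-1 update: <v h>_data - <v h>_reconstruction.
            W += lr * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))
            b_v += lr * (v0 - p_v1)
            b_h += lr * (p_h0 - p_h1)
    return W, b_h

def features(data, W, b_h):
    """Propagate data up through a trained layer (its feature detectors)."""
    return sigmoid(data @ W + b_h)

# Toy binary data standing in for image patches.
data = (rng.random((200, 64)) < 0.3).astype(float)

# Step 1: train the first layer; step 2: train the next layer on its
# output (using the hidden probabilities directly is a common shortcut).
W1, bh1 = train_rbm(data, 32)
W2, bh2 = train_rbm(features(data, W1, bh1), 16)
```

Each layer only ever sees the output of the one below it, which is what
makes the stacking greedy and purely unsupervised.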

Hinton reports excellent performance doing this, and says that
hand-engineered pipelines like OpenCV's actually impede feature detection
compared with his unsupervised learning in the first layer. Programmed
feature extractors suffer badly when the images are cropped or rotated (as
you've noticed), and when the objects are occluded or conjoined. Learned
feature detectors (such as HTMs and RBMs), on the other hand, perform much
better in real-world conditions. They also self-categorise and generalise
on their own, reducing the need for huge corpora of labelled data.
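Step 3 above can also be sketched in outline: attach a "label" layer on top
of the feature stack and train it with gradient descent. For brevity this
trains only the top softmax layer on fixed features; full fine-tuning would
backpropagate the error into the RBM weights as well. The features and
labels here are toy stand-ins of my own, not anyone's real data:

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def train_label_layer(feats, labels, n_classes, epochs=500, lr=0.5):
    """Train a linear softmax "label" layer on fixed features.

    feats: n_samples x n_features, labels: integer class ids.
    """
    n, d = feats.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        p = softmax(feats @ W + b)
        grad = p - onehot          # cross-entropy gradient
        W -= lr * feats.T @ grad / n
        b -= lr * grad.mean(axis=0)
    return W, b

# Toy features and labels (stand-ins for the RBM stack's top-layer output).
feats = rng.random((100, 16))
labels = (feats[:, 0] > 0.5).astype(int)  # a trivially learnable rule

W, b = train_label_layer(feats, labels, 2)
pred = softmax(feats @ W + b).argmax(axis=1)
accuracy = (pred == labels).mean()
```

The point is that the labels only enter at this final stage; everything
below the label layer was already learned without them.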

I don't see the logic in placing NuPIC only at the top of your pipeline, to
be honest. You're already worried about the quality of the feature
detection coming in from OpenCV, so I'd suggest considering either RBMs or
HTMs at the feature detection and categorisation level first.

Regards,

Fergal Byrne



On Thu, Nov 14, 2013 at 7:23 PM, Dennis Stark <[email protected]> wrote:

> Hello everyone,
>
> This is my first time writing to the mailing list, so nice to meet you
>
> I'm trying to solve a problem of content analysis the following way:
>
> 1) Break image into object
> 2) Ask user to categorize objects
> 3) Learn categories for those objects and use this knowledge for future
> inference of new visual input.
>
> The biggest problem I'm facing at the moment is to make sure that object
> is remembered in invariant state. Once I got my image through OpenCV, I can
> fix the size of the object, but not the rotation, so in my case I need HTM
> to remember rotation invariant representation of the object. HTM should
> also know that this is the same object.
>
> So my problem at the moment is that I don't think I quite understand what
> the structure of HTM should be to allow for this. Scale and rotation would
> be even better. If someone has any thoughts on that - I'd really appreciate
> it.
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>



-- 

Fergal Byrne, Brenter IT

http://inbits.com - Better Living through
Thoughtful Technology

e:[email protected] t:+353 83 4214179
Formerly of Adnet [email protected] http://www.adnet.ie