The control problem always reminds me of this classic motor learning
experiment:

Imagine you're asked to move a lever in a straight line from one position
to another, but the lever keeps trying to drift sideways, either in one
direction or first left, then right. Neuroscientists do this with monkeys
to watch them learn motor tasks in a new, dynamic environment.

See this short paper for a good example:
http://www.pnas.org/content/97/5/2259.full.pdf

So basically you have a representation of a goal that doesn't change, but
how you coordinate your muscles during the movement has to be relearned to
get to that goal. Some cells will always be in charge of dynamically
reacting to the world, and others will learn about these new world dynamics
and remember them for next time.

To properly implement this in CLA I think there needs to be a goal SDR, a
series of predictions given the current input and the goal, and a mechanism
that compares the goal and the predictions, moment to moment, with the
input from muscles and the world. Moment-to-moment corrections would
probably be random at first; then gross movements that reduce the error
between input and prediction would be reinforced, and finally refined.
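
To make that last step concrete, here is a toy sketch in plain Python. It
is not CLA or NuPIC code and uses no SDRs; every name in it (run_trial,
learn_correction, drift, noise) is made up for illustration. The goal is
fixed (zero sideways deviation), the world adds a constant sideways drift,
and corrections start out random, are kept only when they reduce the
error, and get finer as the search noise shrinks.

```python
import random


def run_trial(correction, drift=0.5, steps=20):
    """Simulate one lever movement; return the summed squared sideways error.

    Assumption: the world is a constant sideways drift per step, and the
    learner applies the same constant correction at every step.
    """
    sideways = 0.0
    error = 0.0
    for _ in range(steps):
        # The world pushes the lever sideways; the correction pushes back.
        sideways += drift + correction
        error += sideways ** 2
    return error


def learn_correction(trials=200, seed=0):
    """Random search: keep a trial correction only if it reduces the error."""
    rng = random.Random(seed)
    best = 0.0                      # no correction before any learning
    best_error = run_trial(best)
    errors = [best_error]
    noise = 1.0                     # large at first: gross, random movements
    for _ in range(trials):
        candidate = best + rng.gauss(0.0, noise)
        candidate_error = run_trial(candidate)
        if candidate_error < best_error:
            # "Reinforce" the correction that reduced the error.
            best, best_error = candidate, candidate_error
        # Shrink the search noise so later changes are refinements.
        noise = max(0.01, noise * 0.98)
        errors.append(best_error)
    return best, errors


if __name__ == "__main__":
    correction, errors = learn_correction()
    print("learned correction:", correction)
    print("first error:", errors[0], "last error:", errors[-1])
```

Under these assumptions the learned correction should drift toward the
negative of the disturbance. What the sketch leaves out is exactly the
hard part: the goal here is hard-coded rather than represented, and
nothing is predicted or remembered between different disturbances.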

One aspect I *really* don't have a good answer for, though, is how goal
+ world state == action. Is it the intensity of the goal representation? Is
it a goal plus some global trigger that says "go"?

Anyway, getting off topic a bit. I think what you're trying to do will
eventually work, but it needs to be tightly coupled to the way CLA learns
to be effective.

Ian


On Mon, Sep 2, 2013 at 3:38 PM, Pedro Tabacof <[email protected]> wrote:

> Ian,
>
> Your 10 points are all spot on, great job on understanding my mess!
>
> I'm not sure if feeding the CLA open-loop (no kind of control on) data
> would be useful in practice, because in this case you're probably better
> off with standard MPC, but this is probably the right way to start to
> tackle this problem. It's actually not that hard to simulate a complicated
> dynamic system with noise and disturbances and gather "experimental" data.
> If I have time I will look into this and share my findings here.
>
> It'd be cool to see how the CLA would respond to things such as large time
> constants (slow dynamic response) and/or considerable deadtime (time-delay)
> before trying to actually control a system with it. The main difference
> between this and typical CLA applications is that the system inputs are
> independent and thus their prediction is meaningless. Would this make the
> prediction of the system outputs (which depend on their past values and on
> current and past inputs) harder or just the same?
>
> From your explanation it seems the optimization time is not an issue,
> especially considering you could turn it off after a while because the CLA
> would probably have already learned the correct control patterns. It could
> be turned on only when needed to improve the control and perhaps be done
> offline.
>
> If I recall correctly, Jeff wrote in one email that multiple step
> prediction is actually made by an external classifier, so it is not
> actually inherent to the CLA. Can someone clarify this point? Multiple step
> prediction is essential to MPC so I'd like to understand it better.
>
> Anyway, I've been pondering my MPC idea and more and more I tend to
> believe that it is just too convoluted to work - I always favor simple
> solutions over complex ones. If we had a motor control CLA I think this
> could be a great target for application, but it seems we are nowhere near
> that at present.
>
> Perhaps training NuPIC on data from a classical controller such as PID or
> even manual control and then using a simple reinforcement learning
> procedure to train NuPIC's predictions in order to improve the control
> scheme (squared error and smoothness as you put it) would be a better
> solution, but I'm not clear on how this could be done.
>
> []'s
> Pedro.
>
>
> On Mon, Sep 2, 2013 at 3:32 PM, Matthew Taylor <[email protected]> wrote:
>
>> On Sep 2, 2013, at 11:05 AM, Ian Danforth <[email protected]>
>> wrote:
>>
>> >  I'm going to be stupid in public...
>>
>> If only everyone were so fearless. :)
>>
>> Matt
>>
>> _______________________________________________
>> nupic mailing list
>> [email protected]
>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>
>
>
>
> --
> Pedro Tabacof,
> Unicamp - Eng. de Computação 08.
>
>
>
