On 6/24/07, Bob Mottram <[EMAIL PROTECTED]> wrote:
I have one of Richard Sutton's books, and RL methods are useful but I
also have some reservations about them.  Often in this sort of
approach a strict behaviorist position is adopted where the system is
simply trying to find an appropriate function mapping inputs to
outputs.  The internals of the system are usually treated as a black
box with a homogeneous structure, and it's this zero-architecture or
trivial-architecture approach which can make the learning problem
exceptionally hard.

But they don't need to be treated that way; there is always room to
accommodate knowledge. You can use structured value function
approximators, use off-policy methods for supervised learning, and so on.
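For concreteness, here is a minimal sketch (my own, in Python, not from
Sutton's book) of what I mean by a structured value function approximator:
the hand-crafted features encode domain knowledge, and only the weights are
learned. The feature names are hypothetical, just to illustrate the idea.

import numpy as np

def features(state, action):
    # Hypothetical domain-specific features; in a real system these would
    # encode prior knowledge about the task (distances, resources, etc.).
    return np.array([1.0, state[0], state[1], float(action == 0)])

class LinearQ:
    def __init__(self, n_features=4, alpha=0.1):
        self.w = np.zeros(n_features)   # learned weights
        self.alpha = alpha              # step size

    def value(self, state, action):
        return float(self.w @ features(state, action))

    def update(self, state, action, target):
        # Semi-gradient TD-style update of the weights toward the target.
        phi = features(state, action)
        self.w += self.alpha * (target - self.value(state, action)) * phi

The architecture (which features exist) is where the knowledge lives; the
learning problem is reduced to fitting a handful of weights.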

BTW, has anyone tried value estimates with an uncertainty dimension,
with a prior that favors more certain estimates but degrades smoothly
for less certain ones, used for both action selection and value backup,
or at least for action selection? This kind of action selection seems
smarter than epsilon-greedy strategies.
(A more certain value estimate is less optimistic, i.e. smaller, but is
based on more experience.)
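One possible reading of this, again just a rough Python sketch of my own
(the class and parameter names are made up): track each action's value as a
running mean plus a visit count, and shrink uncertain estimates toward a
conservative prior, so the penalty for uncertainty fades smoothly as
experience accumulates.

import numpy as np

class CertaintyWeightedSelector:
    def __init__(self, n_actions, prior_value=0.0, prior_strength=5.0):
        self.counts = np.zeros(n_actions)     # experience per action
        self.means = np.zeros(n_actions)      # sample-mean value estimates
        self.prior_value = prior_value        # conservative prior value
        self.prior_strength = prior_strength  # pseudo-count weight of prior

    def shrunk_values(self):
        # Bayesian-style shrinkage: (n*mean + k*prior) / (n + k).
        # Low-count (uncertain) actions stay near the prior; well-sampled
        # actions approach their empirical mean.
        n, k = self.counts, self.prior_strength
        return (n * self.means + k * self.prior_value) / (n + k)

    def select(self):
        return int(np.argmax(self.shrunk_values()))

    def update(self, action, reward):
        self.counts[action] += 1
        self.means[action] += (reward - self.means[action]) / self.counts[action]

The same shrunk values could also serve as the backup target instead of the
raw sample means, which is the "value backup" half of the question above.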

