On 6/24/07, Bob Mottram <[EMAIL PROTECTED]> wrote:
I have one of Richard Sutton's books, and RL methods are useful, but I also have some reservations about them. Often in this sort of approach a strict behaviorist position is adopted, where the system is simply trying to find an appropriate function mapping inputs to outputs. The internals of the system are usually treated as a black box with a homogeneous structure, and it's this zero-architecture or trivial-architecture approach which can make the learning problem exceptionally hard.
But they don't need to be treated that way; there is always room to accommodate knowledge. You can use structured value function approximators, combine off-policy methods with supervised learning, etc.

BTW, has anyone tried attaching an uncertainty dimension to value estimates, with a prior that favors more certain estimates but degrades smoothly as estimates become less certain, used for both action selection and value backup, or at least for action selection? This kind of action selection seems smarter than e-greedy strategies. (A more certain value estimate is less optimistic, i.e. smaller, but is based on more experience.)
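One way to read this question is as precision-weighted shrinkage toward a conservative prior: each Q(s,a) estimate is pulled toward a pessimistic prior mean in proportion to how little experience backs it, and the shrunk value is used both to pick actions and to form backup targets. Below is a minimal Python sketch under that reading; the class, parameter names (prior_mean, prior_count, etc.), and the shrinkage rule are my own illustration, not anything from Sutton's book or the original post.

from collections import defaultdict

class ShrunkQ:
    """Tabular Q-learning where each Q(s, a) estimate is shrunk toward a
    conservative prior in proportion to how little experience backs it.
    Hypothetical sketch, not a standard named algorithm."""

    def __init__(self, actions, prior_mean=0.0, prior_count=5.0,
                 alpha=0.1, gamma=0.95):
        self.actions = actions
        self.mu0 = prior_mean    # pessimistic prior value
        self.n0 = prior_count    # prior pseudo-visits: strength of shrinkage
        self.alpha = alpha       # learning rate
        self.gamma = gamma       # discount factor
        self.q = defaultdict(float)  # raw empirical estimate per (s, a)
        self.n = defaultdict(int)    # visit count per (s, a)

    def effective_q(self, s, a):
        # Precision-weighted blend: barely-visited estimates sit near the
        # prior; well-visited estimates converge to the empirical value,
        # so the prior's influence degrades smoothly with experience.
        n = self.n[(s, a)]
        return (self.n0 * self.mu0 + n * self.q[(s, a)]) / (self.n0 + n)

    def select_action(self, s):
        # Greedy on the shrunk values: the smooth prior already discounts
        # noisy, barely-visited estimates, which is the alternative to
        # e-greedy randomness the post asks about.
        return max(self.actions, key=lambda a: self.effective_q(s, a))

    def update(self, s, a, reward, s_next):
        # Back up using shrunk values too, so uncertain successor states
        # do not inject over-optimistic targets into the bootstrap.
        target = reward + self.gamma * max(
            self.effective_q(s_next, b) for b in self.actions)
        self.n[(s, a)] += 1
        self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])

The shrinkage weight n / (n0 + n) grows continuously with visits, which is one concrete sense in which the prior "degrades smoothly" rather than switching off. Tracking a full posterior and sampling from it (Thompson sampling) or adding an explicit uncertainty bonus (UCB-style) would be related ways to use the same mean-plus-uncertainty bookkeeping.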