> That doesn't seem to directly support deriving information from random
> trials. For computer go tuning, would you play multiple games with each
> parameter set in order to get a meaningful figure? That seems likely to
> be less efficient than treating it as a bandit problem.

you'd decide how many experiments you wanted to run, take a stab at
what you thought the interactions between parameters were (e.g.
independent between #2 and #3, and no worse than quadratic between #1
and #4), generate an optimal design, run enough experiments at each
parameter setting (as specified by the design) to keep the error low,
then fit the specified model to the resulting data.
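a minimal sketch of that pipeline, assuming four parameters on [0, 1]
and the interaction structure above -- the names and the greedy
selection heuristic are mine, not from any particular DOE package:

```python
import numpy as np

def model_matrix(points):
    # hypothetical 4-parameter model matching the assumptions above:
    # #2 and #3 enter independently (linear terms only), while #1 and #4
    # get a full quadratic (squares plus a cross term)
    p1, p2, p3, p4 = points.T
    return np.column_stack([
        np.ones(len(points)),
        p1, p2, p3, p4,
        p1 * p4, p1 ** 2, p4 ** 2,
    ])

def d_optimal(candidates, n_runs, ridge=1e-6):
    # greedy D-optimal selection: repeatedly add the candidate point that
    # most increases log det(X'X); the small ridge keeps the determinant
    # defined before the design reaches full rank
    X = model_matrix(candidates)
    p = X.shape[1]
    chosen = []
    for _ in range(n_runs):
        best_i, best_logdet = None, -np.inf
        for i in range(len(candidates)):
            if i in chosen:
                continue
            rows = X[chosen + [i]]
            _, logdet = np.linalg.slogdet(rows.T @ rows + ridge * np.eye(p))
            if logdet > best_logdet:
                best_logdet, best_i = logdet, i
        chosen.append(best_i)
    return candidates[chosen]
```

real DOE software uses exchange algorithms (Fedorov and friends) rather
than this one-pass greedy, but the objective is the same.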

the way to think about the model is that you want to model winrate
against some opponent (self-play, on kgs, whatever you want), and you
have some idea about the interactions between parameters (i think this
cutoff is only appropriate between these two ranges, and i have reason
to believe it has only a very weak interaction with this other
parameter, whereas these 12 parameters might all be related in some
horrible quadratic way), but you don't want to blindly run thousands
of random tests.  you'd rather run thousands of tests that were
specifically designed to maximize the amount of information you can
get about the model you're trying to fit.
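for concreteness, here's a rough sketch of the fitting step with a
hand-picked interaction structure -- the three-parameter setup and all
the names are illustrative, and real winrate data would probably want a
logistic link and per-point game counts rather than plain least squares:

```python
import numpy as np

def basis(points):
    # hypothetical 3-parameter example: cutoff c and parameter k assumed
    # (nearly) independent -> linear terms only; c and e are allowed a
    # quadratic interaction
    c, k, e = points.T
    return np.column_stack([np.ones(len(points)), c, k, e, c * e, c ** 2, e ** 2])

def fit_winrate(points, wins, games):
    # least-squares fit of the structured model to observed winrates;
    # weighting by games played is omitted since every design point
    # gets the same number of games here
    y = wins / games
    coef, *_ = np.linalg.lstsq(basis(points), y, rcond=None)
    return coef

def predict(points, coef):
    return basis(points) @ coef
```

with ~1000 games per design point the binomial noise on each winrate is
around 1.5%, which is what "run enough experiments to keep error low"
buys you.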

alternatively, it does sphere packing over the direct product of open
or closed (but bounded) intervals and discrete sets, so you can get a
set of points that is slightly better than a random set of experiments
(i.e. guaranteed to cover the space well).
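a crude stand-in for that idea (whatever tool is being referred to
presumably does something smarter) is greedy maximin selection over an
assumed product space -- two unit intervals and one discrete set:

```python
import numpy as np

def maximin_design(n_points, n_candidates=2000, seed=0):
    # greedy maximin ("sphere packing") over a toy space: the product of
    # two unit intervals and the discrete set {0, 1, 2}; the discrete
    # coordinate is rescaled so distances are comparable across dimensions
    rng = np.random.default_rng(seed)
    cont = rng.uniform(0.0, 1.0, size=(n_candidates, 2))
    disc = rng.integers(0, 3, size=(n_candidates, 1)) / 2.0
    cands = np.hstack([cont, disc])
    chosen = [0]  # arbitrary starting point
    for _ in range(n_points - 1):
        # distance from each candidate to its nearest already-chosen point
        d = np.min(np.linalg.norm(cands[:, None, :] - cands[chosen], axis=2), axis=1)
        d[chosen] = -1.0  # never re-pick a chosen point
        chosen.append(int(np.argmax(d)))
    return cands[chosen]
```

the point is just that every new point is pushed as far as possible
from the ones already placed, which is what gives you the coverage
guarantee a purely random set lacks.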

s.
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/
