On Thu, 24 Aug 2017, tchow wrote:
> On 2017-08-24 16:28, Philippe Michel wrote:
>> With your proposal used for the usual 2-ply rollouts the first few
>> steps would be 20 times slower, the following ones unchanged. The
>> total number of steps would depend on the position but the final cost
>> may be in the x1.5 to x2 range, similar to the gain in accuracy when
>> *all* steps use 2-ply variance reduction.
>
> Thanks for taking a look. I thought of a related idea, which is to
> increase the ply-level for the VR occasionally, but not necessarily for
> the first play; instead, one increases the ply-level for the VR "as
> needed." I'm not sure exactly what "as needed" should mean, but one
> possibility is that if the roll chosen in the rollout trial is extremely
> lucky or extremely unlucky, then we invest some extra effort to make
> sure that the luck estimate is accurate for that roll. If the threshold
> for "extremely lucky or unlucky" is chosen so that the 20x slowdown is
> invoked only 1/20 of the time, then the overall time penalty should be
> in the 2x range. Of course it's not clear whether my intuition is
> correct that the extremely lucky/unlucky rolls are the ones most in need
> of accurate VR.
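The "2x range" figure in the quoted message can be checked with a short calculation. This is only a sketch: it assumes every rollout step costs one unit with the ordinary VR and 20 units with the deeper VR, and the function name and parameters are made up for illustration.

```python
def expected_slowdown(deep_fraction, deep_cost=20.0, base_cost=1.0):
    """Average cost per step, relative to always using the cheap VR.

    deep_fraction: share of steps that get the deeper-ply VR.
    deep_cost:     assumed cost of such a step (20x, per the thread).
    """
    return (1.0 - deep_fraction) * base_cost + deep_fraction * deep_cost


# Deeper VR invoked on 1 step in 20: 0.95 * 1 + 0.05 * 20
print(f"{expected_slowdown(1 / 20):.2f}x")  # 1.95x
```

So a 1-in-20 trigger rate does land just under a 2x time penalty, under those cost assumptions.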
I now think it couldn't work, no matter how accurately you are able to
select the rolls that would get a deeper-ply VR. First few, most volatile,
whatever.
Let's assume a 2-ply rollout and that, as suggested in my previous sample,
doing a 2-ply instead of a 1-ply VR decreases the SD in the same
proportion as doing twice as many trials would, but at the cost of being
20 times slower.
Correcting the 1-ply vs. 2-ply inaccuracy of the VR for all moves does
this, and your idea amounts to hoping that by fixing the VR for a small
fraction (less than 5%) of the moves you could get most of the benefit.
That could only work if the 1-ply vs. 2-ply inaccuracy were concentrated
in a few spikes, and this is simply not the case.
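A rough way to put numbers on this argument is sketched below. It assumes a deliberately crude model that is not taken from gnubg: fixing the VR on every move halves the variance (the "twice as many trials" equivalence), a deep-VR move costs 20 times a normal one, and the moves selected for deeper VR carry some share of the removable variance. The function and its parameters are hypothetical.

```python
def relative_efficiency(fixed_fraction, captured_share,
                        full_gain=2.0, deep_cost=20.0):
    """Equivalent-trials gain divided by time cost, vs. plain 1-ply VR.

    fixed_fraction: fraction of moves given the deeper VR.
    captured_share: share of the removable variance carried by those
                    moves (a modelling assumption, not a measurement).
    A result below 1.0 means simply running more trials is better.
    """
    removable = 1.0 - 1.0 / full_gain             # variance removed if all moves are fixed
    variance = 1.0 - removable * captured_share   # remaining variance, baseline = 1
    cost = (1.0 - fixed_fraction) + fixed_fraction * deep_cost
    return (1.0 / variance) / cost


scenarios = {
    "all moves fixed": (1.0, 1.0),
    "5% of moves, inaccuracy spread evenly": (0.05, 0.05),
    "5% of moves, all inaccuracy in those spikes": (0.05, 1.0),
}
for label, (fraction, share) in scenarios.items():
    print(f"{label}: {relative_efficiency(fraction, share):.3f}")
```

In this model the evenly spread case comes out near 0.5, and even the ideal case, where the selected 5% of moves hold all of the inaccuracy, only just exceeds break-even.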
The only possibly worthwhile case could be to do this on the first one or
two rolls because, by recording the results, you could do the VR part only
21 or 441 times (once per distinct roll, or per distinct pair of rolls),
whatever the number of rollout trials. That would entail some significant
coding effort for, at best, a very limited gain.
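The 21 and 441 figures are the number of distinct dice rolls and of distinct two-roll sequences. The sketch below enumerates them and shows the kind of result recording described above; `deep_vr` stands in for the expensive deeper-ply VR computation and is purely hypothetical, not a gnubg function.

```python
from itertools import combinations_with_replacement, product

# The 21 distinct rolls of two dice (order does not matter).
ROLLS = list(combinations_with_replacement(range(1, 7), 2))

# Every distinct sequence of the first two rolls: 21 * 21 = 441.
TWO_ROLL_SEQUENCES = list(product(ROLLS, repeat=2))


def make_cached_vr(deep_vr):
    """Wrap a (hypothetical) expensive VR function so that each opening
    roll sequence is evaluated at most once across all trials."""
    cache = {}

    def lookup(sequence):
        if sequence not in cache:
            cache[sequence] = deep_vr(sequence)
        return cache[sequence]

    return lookup


print(len(ROLLS), len(TWO_ROLL_SEQUENCES))  # 21 441
```

With such a wrapper the deep computation runs at most 441 times however many trials reuse it, which is where the bounded cost comes from.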
_______________________________________________
Bug-gnubg mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/bug-gnubg