I am pretty sure it is an MCTS problem, and I suspect not something that
could be easily solved with a policy network (could be wrong here). My
opinion is that a DCNN is not
a miracle worker (as somebody already mentioned here), and it is going to
fail at resolving tactics. I would be more than happy
valky...@phmp.se: <19f31e7e5cdf310b9afa91f577997...@phmp.se>:
>I think you misunderstood what I wrote,
>if perfect play on 9x9 is 6000 Elo, then if the value function is 3000
>Elo and MC eval is 2000 Elo with 1 second thinking time then it might
>be that the combination of a value function and
I did a quick test with my MCTS chess engine with two different
implementations:
a standard MCTS with averaging, and an MCTS with alpha-beta rollouts. The
result is about a 600 Elo difference:
Finished game 44 (scorpio-pmcts vs scorpio-mcts): 1/2-1/2 {Draw by 3-fold
repetition}
Score of scorpio-mcts vs
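For scale, under the standard logistic Elo model a 600-point gap means the stronger side is expected to score roughly 97%. A quick sketch (`expected_score` is just the textbook formula, not anything from Scorpio):

```python
def expected_score(elo_diff: float) -> float:
    # Logistic Elo model: expected score of the side holding a rating
    # advantage of elo_diff points.
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

print(round(expected_score(600), 3))   # 0.969 -- roughly 97% expected score
```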
Hi Remi, hi friends,
> For the moment, my main objective is shogi. I
> will participate in the World Computer Shogi
> Championship in May.
Good luck! Please keep us informed when
the tournament is running.
> So I am developing a game-independent AlphaZero framework.
I am hoping several
Alpha-beta rollouts are like MCTS without playouts (as in AlphaZero), but
in a form that can also do alpha-beta pruning.
With standard MCTS, the tree converges to a minimax tree, not an alpha-beta
tree, so as you know there is a huge branching-factor difference there.
For MCTS to become competitive
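The distinction between the two backup rules can be sketched in a toy form. This is purely illustrative (the `Node` class and both functions are invented for this sketch, not Scorpio's code):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    value: float                 # value from the side to move at this node
    visits: int = 1
    children: List["Node"] = field(default_factory=list)

def average_backup(node: Node) -> float:
    # Standard MCTS: visit-weighted mean of child values (negated, since
    # each child is scored from the opponent's point of view). A refuted
    # line keeps diluting the estimate.
    total = sum(c.visits * -c.value for c in node.children)
    return total / sum(c.visits for c in node.children)

def minimax_backup(node: Node) -> float:
    # Rollout-style backup: take only the best reply (negamax), so the
    # tree converges to a minimax value and bounds admit alpha-beta cutoffs.
    return max(-c.value for c in node.children)

# One bad move among good ones drags the average down, while the
# minimax backup ignores it:
root = Node(0.0, children=[Node(-0.8, visits=10), Node(0.9, visits=10)])
print(average_backup(root))   # close to -0.05
print(minimax_backup(root))   # 0.8
```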
Summarizing the objections to my (non-evidence-based, but hand-wavy
observationally-based) assertion that 9x9 is going down anytime someone
really wants it to go down, I get the following:
* value networks can't hack it (okay, maybe? does this make it less likely?
-- we shouldn't expect to
Sorry, I haven't been paying enough attention lately to know what
"alpha-beta rollouts" means precisely. Can you either describe them or give
me a reference?
Thanks,
Álvaro.
On Tue, Mar 6, 2018 at 1:49 PM, Dan wrote:
> I did a quick test with my MCTS chess engine wth two
Well, AlphaZero did fine at chess tactics, and the papers are clear on the
details. There must be an error in your deductions somewhere.
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Dan
Sent: Tuesday, March 6, 2018 1:46 PM
To: computer-go@computer-go.org
Training on Stockfish games is guaranteed to produce a blunder-fest, because
there are no blunders in the training set and therefore the policy network
never learns how to refute blunders.
This is not a flaw in MCTS, but rather in the policy network. MCTS will
eventually search every move
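A hedged sketch of why that holds for AlphaZero-style (PUCT) selection: softmax priors are strictly positive, and the exploration bonus grows with the square root of the parent's visit count, so a tiny prior only delays a move's exploration rather than preventing it. All names and the constant below are illustrative, not taken from any engine's code:

```python
import math

C_PUCT = 1.5  # illustrative exploration constant

def select(priors, q_values, visits):
    # PUCT rule: pick the child maximizing Q + c * P * sqrt(N_parent) / (1 + n).
    n_parent = sum(visits) + 1
    def score(i):
        u = C_PUCT * priors[i] * math.sqrt(n_parent) / (1 + visits[i])
        return q_values[i] + u
    return max(range(len(priors)), key=score)

priors = [0.98, 0.02]   # policy nearly ignores move 1 (say, a blunder refutation)
q_values = [0.1, 0.1]   # but suppose both moves are actually equally good
visits = [0, 0]
for _ in range(500):
    visits[select(priors, q_values, visits)] += 1
print(visits)           # move 1 still accumulates visits, just far fewer
```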
I think you misunderstood what I wrote,
if perfect play on 9x9 is 6000 Elo, then if the value function is 3000
Elo and MC eval is 2000 Elo with 1 second thinking time then it might
be that the combination of a value function and MC eval ends up being
2700 Elo. It could also be that it ends up