Re: [Computer-go] 9x9 is last frontier?

2018-03-07 Thread Brian Sheppard via Computer-go
The algorithms presented there continue to the end of the game tree, no? (I am 
looking specifically at Algorithm 12, which seems to be the paper’s recommended 
algorithm.)

 

I am trying to get my head around what happens in practical settings, where we 
decide to limit search effort. E.g., if the search is limited by depth, then I 
think the UCT bounds that are present as tiebreakers would seldom be used 
(because moves at the root would not have "infinities"). That is, the approach 
would simplify to MT-SSS*.

 

It is my impression that iterative deepening with aspiration windows is the 
winner among the alpha-beta family of algorithms, so I am curious to learn how 
Algorithm 12 compares. The problem is that iterative deepening and aspiration 
windows are hard to analyze; SSS* is more amenable to proofs, so a comparison 
to SSS* alone may not be all that relevant in practice.
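For concreteness, the iterative-deepening/aspiration-window scheme can be sketched roughly as follows. This is a toy negamax over nested lists, with a deliberately crude stand-in "evaluation" at the depth horizon; it is an illustration of the control flow, not any particular engine's implementation.

```python
import math

def alphabeta(node, depth, alpha, beta):
    """Fail-soft negamax alpha-beta over a toy tree: a node is either an
    int (exact score for the side to move) or a list of child nodes."""
    if isinstance(node, int):
        return node
    if depth == 0:
        return static_eval(node)
    best = -math.inf
    for child in node:
        best = max(best, -alphabeta(child, depth - 1, -beta, -alpha))
        alpha = max(alpha, best)
        if alpha >= beta:
            break                      # cutoff: opponent avoids this line
    return best

def static_eval(node):
    # Crude stand-in evaluation at the depth horizon: descend to the
    # leftmost leaf (a real engine would call a static evaluator here).
    while not isinstance(node, int):
        node = node[0]
    return node

def search(root, max_depth, margin=10):
    """Iterative deepening with an aspiration window centered on the
    previous iteration's score; on fail-low/fail-high, re-search with
    full-width bounds."""
    score = 0
    for depth in range(1, max_depth + 1):
        alpha, beta = score - margin, score + margin
        score = alphabeta(root, depth, alpha, beta)
        if score <= alpha or score >= beta:
            score = alphabeta(root, depth, -math.inf, math.inf)
    return score
```

The point of the narrow window is that most iterations confirm the previous score cheaply; only when the score moves by more than the margin does the costly full-width re-search happen.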

 

Another aspect that I am trying to understand is whether the associative memory 
is really an improvement over a transposition table. Specifically: Algorithm 12 
says "traverse the move that has the highest upper bound", where a chess 
program would instead have stored a best move in the transposition table.

 

All in all, if I wanted to use alpha-beta in a Go program, then I would first 
train a value/policy network as described in the AGZ/AZ papers, then use it in 
an alpha-beta program and start researching game-specific search techniques: 
singular extensions, late-move reductions, null-move, …

 

From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Dan
Sent: Tuesday, March 6, 2018 5:55 PM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] 9x9 is last frontier?

 

Alpha-beta rollouts are like MCTS without playouts (as in AlphaZero), but with 
the ability to also do alpha-beta pruning.

 

With standard MCTS, the tree converges to a minimax tree, not an alpha-beta 
tree, so as you know there is a huge branching-factor difference there.

 

For MCTS to become competitive with alpha-beta, one has to either use 
alpha-beta rollouts OR, as in AlphaZero's case, accurately pick, say, the best 
3 moves with its PUCT bias, and then search those with an MCTS that converges 
to minimax (not alpha-beta).

 

I went for the former: recasting alpha-beta in rollout form, which can be 
combined with standard MCTS as well as with a standard recursive alpha-beta 
implementation.

 

The paper is here 
https://www.microsoft.com/en-us/research/wp-content/uploads/2014/11/huang_rollout.pdf
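The rollout idea can be sketched as follows. This is my own toy simplification of the bounds-based scheme, not the paper's Algorithm 12 verbatim: every node carries proven value bounds [lo, hi]; each episode walks one root-to-leaf path, always descending into the child with the best optimistic bound, solves one leaf, and tightens the bounds on the way back. The search stops when the root's bounds meet, at which point the minimax value is proven and some leaves have been pruned away without ever being evaluated.

```python
import math

class Node:
    """Game-tree node carrying proven value bounds [lo, hi]."""
    def __init__(self, tree):
        self.tree = tree                   # int leaf or list of subtrees
        self.kids = None                   # children, expanded lazily
        self.lo, self.hi = -math.inf, math.inf

def episode(node, is_max):
    """One rollout: descend along the most promising child (highest upper
    bound at max nodes, lowest lower bound at min nodes), solve one leaf,
    and tighten the bounds along the path while returning."""
    if isinstance(node.tree, int):
        node.lo = node.hi = node.tree      # leaf: exact value, now solved
        return
    if node.kids is None:
        node.kids = [Node(t) for t in node.tree]
    live = [k for k in node.kids if k.lo < k.hi]   # unsolved children
    pick = (max(live, key=lambda k: k.hi) if is_max
            else min(live, key=lambda k: k.lo))
    episode(pick, not is_max)
    agg = max if is_max else min
    node.lo = agg(k.lo for k in node.kids)
    node.hi = agg(k.hi for k in node.kids)

def solve(tree):
    """Run episodes until the root's bounds meet; returns (value, episodes)."""
    root, n = Node(tree), 0
    while root.lo < root.hi:
        episode(root, True)
        n += 1
    return root.lo, n
```

On a 3x3 minimax example tree, `solve` proves the root value without visiting every leaf: the episode count comes out below the leaf count, which is the pruning Dan refers to.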

 

If it were up to me, I would have used standard alpha-beta search with 
policy/value networks to improve move ordering and evaluation, respectively.

 

It seems AlphaZero's MCTS approach hinged on the policy network accurately 
picking the best n moves (while still using minimax), AND of course not making 
blunders. I think "shallow traps" are bad-looking moves that turn out to be 
tactically brilliant later. How is the policy network able to identify those 
moves? A full-width search like alpha-beta does not miss them.

 

Daniel

 

 

 

On Tue, Mar 6, 2018 at 1:23 PM, Álvaro Begué <alvaro.be...@gmail.com 
<mailto:alvaro.be...@gmail.com> > wrote:

Sorry, I haven't been paying enough attention lately to know what "alpha-beta 
rollouts" means precisely. Can you either describe them or give me a reference?

Thanks,
Álvaro.



 

On Tue, Mar 6, 2018 at 1:49 PM, Dan <dsha...@gmail.com 
<mailto:dsha...@gmail.com> > wrote:

I did a quick test with my MCTS chess engine with two different 
implementations: a standard MCTS with averaging, and MCTS with alpha-beta 
rollouts. The result is about a 600 Elo difference.

 

Finished game 44 (scorpio-pmcts vs scorpio-mcts): 1/2-1/2 {Draw by 3-fold 
repetition}

Score of scorpio-mcts vs scorpio-pmcts: 41 - 1 - 2  [0.955] 44

Elo difference: 528.89 +/- nan

 

scorpio-mcts uses alpha-beta rollouts

scorpio-pmcts is "pure" MCTS with averaging and the UCB formula.
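As a sanity check, the reported 528.89 follows directly from the match score via the standard logistic Elo model (this is a generic formula, not anything from Scorpio itself):

```python
import math

def elo_diff(wins, losses, draws):
    """Elo difference implied by a match score under the logistic model:
    expected score E = 1 / (1 + 10 ** (-diff / 400)), inverted to
    diff = -400 * log10(1 / E - 1)."""
    score = (wins + 0.5 * draws) / (wins + losses + draws)
    return -400 * math.log10(1 / score - 1)

# Scorpio's reported result: 41 wins, 1 loss, 2 draws over 44 games.
print(round(elo_diff(41, 1, 2), 2))   # prints 528.89
```

The "+/- nan" in the log is just the error-bar estimate degenerating at such a lopsided score; with only one loss in 44 games, the confidence interval is very wide anyway.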

 

Daniel

 

On Tue, Mar 6, 2018 at 11:46 AM, Dan <dsha...@gmail.com 
<mailto:dsha...@gmail.com> > wrote:

I am pretty sure it is an MCTS problem, and I suspect it is not something that 
could be easily solved with a policy network (I could be wrong here). My 
opinion is that a DCNN is not a miracle worker (as somebody already mentioned 
here), and it is going to fail at resolving tactics. I would be more than 
happy if it had the same power as a qsearch, to be honest.

 

Search traps are the major problem with games like chess, and what makes 
transitioning the success of DCNNs from Go to chess non-trivial.

The following paper discusses shallow traps that are prevalent in chess. ( 
https://www.aaai.org/ocs/index.php/ICAPS/ICAPS10/paper/download/1458/1571 )

They mention that traps make MCTS very inefficient. Even if the MCTS is given 
50x more time than is needed by an exhaustive minimax tree, it could fail to 
find a level-5 or level-7 trap.

It will spend, for instance, 95% of its time searching an asymmetric tree of 
depth > 7 when a shallow trap of depth 7 exists, thus missing the level-7 trap.

Re: [Computer-go] 9x9 is last frontier?

2018-03-07 Thread Dan
Hi Ingo,

There is actually no randomness in the algorithm, just like AlphaZero's.
It is the same algorithm as a recursive alpha-beta searcher; the only
difference is that the rollouts version examines one leaf per episode (one
path from root to leaf). This opens the door to mixing alpha-beta with MCTS;
some have already used an MCTS-AB algorithm that does exactly this. At the
leaves, I call a qsearch() for evaluation, whereas AlphaZero uses a value
network (no playouts).
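The qsearch() at the leaves is a quiescence search: stand pat on the static evaluation, then try only capture moves until the position is quiet. A toy sketch over a hypothetical position encoding (not Scorpio's actual code):

```python
def qsearch(pos, alpha, beta):
    """Toy quiescence search (negamax, fail-soft). A position is encoded
    as (stand_pat, captures): stand_pat is the static eval for the side
    to move, captures the positions reachable by capture moves only."""
    stand_pat, captures = pos
    if stand_pat >= beta:
        return stand_pat               # fail high: already good enough
    alpha = max(alpha, stand_pat)      # the side to move may stand pat
    best = stand_pat
    for child in captures:
        score = -qsearch(child, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break                      # beta cutoff
    return best

# Standing pat: a capture that loses material is simply declined.
print(qsearch((0, [(3, [])]), float("-inf"), float("inf")))   # prints 0
```

The stand-pat option is what makes this a tactics filter rather than a full search: the side to move never has to enter a losing capture sequence, so the returned score is quiet.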

Daniel

On Wed, Mar 7, 2018 at 2:20 AM, "Ingo Althöfer" <3-hirn-ver...@gmx.de>
wrote:

> Hi Dan,
>
> I find your definition of "Alpha-Beta rollouts" somewhat puzzling.
>
> > Alpha-beta rollouts is like MCTS without playouts (as in
> > AlphaZero), and something that can also do alpha-beta pruning.
>
> I would instead define "Alpha-Beta rollout" in the following way:
> You have a fast alpha-beta program (with some randomness) for
> that game, and a rollout means a quick selfplay game of this
> program.
>
> Ingo.
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-03-07 Thread Ingo Althöfer
Hi Dan,

I find your definition of "Alpha-Beta rollouts" somewhat puzzling.

> Alpha-beta rollouts is like MCTS without playouts (as in 
> AlphaZero), and something that can also do alpha-beta pruning.
 
I would instead define "Alpha-Beta rollout" in the following way:
You have a fast alpha-beta program (with some randomness) for 
that game, and a rollout means a quick selfplay game of this
program.

Ingo.
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-03-06 Thread Dan
>>> So AlphaZero used none of the above improvements, and yet it seems to be
>>> tactically strong. Leela-Zero suffered from tactical falls left and right
>>> too, as I expected.
>>>
>>> So the only explanation left is that the policy network is able to avoid
>>> traps, which I find hard to believe, since I doubt it can identify more
>>> than qsearch-level tactics.
>>>
>>> All I am saying is that my experience (as well as many others') with MCTS
>>> for tactics-dominated games is bad, and there must be some breakthrough in
>>> that regard in AlphaZero for it to be able to compete with Stockfish on a
>>> tactical level.
>>>
>>> I am curious how Remi's attempt at Shogi using AlphaZero's method will
>>> turn out.
>>>
>>> regards,
>>> Daniel
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Mar 6, 2018 at 9:41 AM, Brian Sheppard via Computer-go <
>>> computer-go@computer-go.org> wrote:
>>>
>>>> Training on Stockfish games is guaranteed to produce a blunder-fest,
>>>> because there are no blunders in the training set and therefore the policy
>>>> network never learns how to refute blunders.
>>>>
>>>>
>>>>
>>>> This is not a flaw in MCTS, but rather in the policy network. MCTS will
>>>> eventually search every move infinitely often, producing asymptotically
>>>> optimal play. But if the policy network does not provide the guidance
>>>> necessary to rapidly refute the blunders that occur in the search, then
>>>> convergence of MCTS to optimal play will be very slow.
>>>>
>>>>
>>>>
>>>> It is necessary for the network to train on self-play games using MCTS.
>>>> For instance, the AGZ approach samples next states during training games by
>>>> sampling from the distribution of visits in the search. Specifically: not
>>>> by choosing the most-visited play!
>>>>
>>>>
>>>>
>>>> You see how this policy trains both search and evaluation to be
>>>> internally consistent? The policy head is trained to refute the bad moves
>>>> that will come up in search, and the value head is trained to the value
>>>> observed by the full tree.
>>>>
>>>>
>>>>
>>>> *From:* Computer-go [mailto:computer-go-boun...@computer-go.org] *On
>>>> Behalf Of *Dan
>>>> *Sent:* Monday, March 5, 2018 4:55 AM
>>>> *To:* computer-go@computer-go.org
>>>> *Subject:* Re: [Computer-go] 9x9 is last frontier?
>>>>
>>>>
>>>>
>>>> Actually, prior to this it was trained with hundreds of thousands of
>>>> Stockfish games and didn't do well on tactics (the games were actually a
>>>> blunder fest). I believe this is a problem of the MCTS used, and not due
>>>> to lack of training.
>>>>
>>>> Go is a strategic game, so it is different from chess, which is full of
>>>> traps.
>>>>
>>>> I am not surprised Leela Zero did well in Go.
>>>>
>>>>
>>>>
>>>> On Mon, Mar 5, 2018 at 2:16 AM Gian-Carlo Pascutto <g...@sjeng.org>
>>>> wrote:
>>>>
>>>> On 02-03-18 17:07, Dan wrote:
>>>> > Leela-chess is not performing well enough
>>>>
>>>> I don't understand how one can say that given that they started with the
>>>> random network last week only and a few clients. Of course it's bad!
>>>> That doesn't say anything about the approach.
>>>>
>>>> Leela Zero has gotten strong but it has been learning for *months* with
>>>> ~400 people. It also took a while to get to 30 kyu.
>>>>
>>>> --
>>>> GCP
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-03-06 Thread uurtamo .
Summarizing the objections to my (non-evidence-based, but hand-wavy
observationally-based) assertion that 9x9 is going down anytime someone
really wants it to go down, I get the following:

* value networks can't hack it (okay, maybe? does this make it less likely?
-- we shouldn't expect to cut-and-paste.)
* double ko is some kind of super special problem on 9x9
* the margins are way narrower

the second issue (ko in general, but multi-ko) is exactly why Go is
PSPACE-complete (I made a rough argument that under a slight relaxation it
isn't, but didn't flesh it out or publish it).

I am not seeing strong arguments for the overall picture (i.e., that it is
much harder than 19x19 to beat humans at) other than that the margins are
narrower.

Does anyone else have a different synopsis?

Thanks,

Steve




On Tue, Mar 6, 2018 at 12:17 PM, Brian Sheppard via Computer-go <
computer-go@computer-go.org> wrote:

> Well, AlphaZero did fine at chess tactics, and the papers are clear on the
> details. There must be an error in your deductions somewhere.
>
>
>
> *From:* Computer-go [mailto:computer-go-boun...@computer-go.org] *On
> Behalf Of *Dan
> *Sent:* Tuesday, March 6, 2018 1:46 PM
>
> *To:* computer-go@computer-go.org
> *Subject:* Re: [Computer-go] 9x9 is last frontier?
>
>
>
> I am pretty sure it is an MCTS problem, and I suspect it is not something
> that could be easily solved with a policy network (I could be wrong here).
> My opinion is that a DCNN is not a miracle worker (as somebody already
> mentioned here), and it is going to fail at resolving tactics. I would be
> more than happy if it had the same power as a qsearch, to be honest.
>
> Search traps are the major problem with games like chess, and what makes
> transitioning the success of DCNNs from Go to chess non-trivial.
> The following paper discusses shallow traps that are prevalent in chess. (
> https://www.aaai.org/ocs/index.php/ICAPS/ICAPS10/paper/download/1458/1571
> )
> They mention that traps make MCTS very inefficient. Even if the MCTS is
> given 50x more time than is needed by an exhaustive minimax tree, it could
> fail to find a level-5 or level-7 trap.
> It will spend, for instance, 95% of its time searching an asymmetric tree
> of depth > 7 when a shallow trap of depth 7 exists, thus missing the
> level-7 trap.
> This is very hard to solve even if you have unlimited power.
>
> The plain MCTS as used by AlphaZero is the most ill-suited MCTS version in
> my opinion, and I have a hard time seeing how it can be competitive with
> Stockfish tactically.
>
> My MCTS chess engine with AlphaZero-like MCTS with averaging was missing a
> lot of tactics. I don't use policy or eval networks, but a qsearch() for
> eval, and the policy is basically choosing whichever move leads to a
> higher eval.
>
> a) My first improvement to the MCTS is to use minimax backups instead of
> averaging. This was an improvement, but not something that would solve the
> traps.
>
> b) My second improvement is to use alpha-beta rollouts. This is a rollouts
> version that can do null-move, LMR, etc. This is a huge improvement, and
> none of the MCTS versions can match it. More on alpha-beta rollouts here (
> https://www.microsoft.com/en-us/research/wp-content/uploads/2014/11/huang_rollout.pdf
> )
>
> So AlphaZero used none of the above improvements, and yet it seems to be
> tactically strong. Leela-Zero suffered from tactical falls left and right
> too, as I expected.
>
> So the only explanation left is that the policy network is able to avoid
> traps, which I find hard to believe, since I doubt it can identify more
> than qsearch-level tactics.
>
> All I am saying is that my experience (as well as many others') with MCTS
> for tactics-dominated games is bad, and there must be some breakthrough in
> that regard in AlphaZero for it to be able to compete with Stockfish on a
> tactical level.
>
> I am curious how Remi's attempt at Shogi using AlphaZero's method will
> turn out.
>
> regards,
>
> Daniel
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Tue, Mar 6, 2018 at 9:41 AM, Brian Sheppard via Computer-go <
> computer-go@computer-go.org> wrote:
>
> Training on Stockfish games is guaranteed to produce a blunder-fest,
> because there are no blunders in the training set and therefore the policy
> network never learns how to refute blunders.
>
>
>
> This is not a flaw in MCTS, but rather in the policy network. MCTS will
> eventually search every move infinitely often, producing asymptotically
> optimal play. But if the policy network does not provide the guidance
> necessary to rapidly refute the blunders that occur in the search, then
> convergence of MCTS to optimal play will be very slow.

Re: [Computer-go] 9x9 is last frontier?

2018-03-06 Thread Álvaro Begué
instance, the AGZ approach samples next states during training games by
>>> sampling from the distribution of visits in the search. Specifically: not
>>> by choosing the most-visited play!
>>>
>>>
>>>
>>> You see how this policy trains both search and evaluation to be
>>> internally consistent? The policy head is trained to refute the bad moves
>>> that will come up in search, and the value head is trained to the value
>>> observed by the full tree.
>>>
>>>
>>>
>>> *From:* Computer-go [mailto:computer-go-boun...@computer-go.org] *On
>>> Behalf Of *Dan
>>> *Sent:* Monday, March 5, 2018 4:55 AM
>>> *To:* computer-go@computer-go.org
>>> *Subject:* Re: [Computer-go] 9x9 is last frontier?
>>>
>>>
>>>
>>> Actually, prior to this it was trained with hundreds of thousands of
>>> Stockfish games and didn't do well on tactics (the games were actually a
>>> blunder fest). I believe this is a problem of the MCTS used, and not due
>>> to lack of training.
>>>
>>> Go is a strategic game, so it is different from chess, which is full of
>>> traps.
>>>
>>> I am not surprised Leela Zero did well in Go.
>>>
>>>
>>>
>>> On Mon, Mar 5, 2018 at 2:16 AM Gian-Carlo Pascutto <g...@sjeng.org>
>>> wrote:
>>>
>>> On 02-03-18 17:07, Dan wrote:
>>> > Leela-chess is not performing well enough
>>>
>>> I don't understand how one can say that given that they started with the
>>> random network last week only and a few clients. Of course it's bad!
>>> That doesn't say anything about the approach.
>>>
>>> Leela Zero has gotten strong but it has been learning for *months* with
>>> ~400 people. It also took a while to get to 30 kyu.
>>>
>>> --
>>> GCP
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-03-06 Thread Brian Sheppard via Computer-go
Well, AlphaZero did fine at chess tactics, and the papers are clear on the 
details. There must be an error in your deductions somewhere.

 

From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Dan
Sent: Tuesday, March 6, 2018 1:46 PM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] 9x9 is last frontier?

 

I am pretty sure it is an MCTS problem, and I suspect it is not something that 
could be easily solved with a policy network (I could be wrong here). My 
opinion is that a DCNN is not a miracle worker (as somebody already mentioned 
here), and it is going to fail at resolving tactics. I would be more than 
happy if it had the same power as a qsearch, to be honest.

Search traps are the major problem with games like chess, and what makes 
transitioning the success of DCNNs from Go to chess non-trivial.

The following paper discusses shallow traps that are prevalent in chess. ( 
https://www.aaai.org/ocs/index.php/ICAPS/ICAPS10/paper/download/1458/1571 )

They mention that traps make MCTS very inefficient. Even if the MCTS is given 
50x more time than is needed by an exhaustive minimax tree, it could fail to 
find a level-5 or level-7 trap.

It will spend, for instance, 95% of its time searching an asymmetric tree of 
depth > 7 when a shallow trap of depth 7 exists, thus missing the level-7 trap.

This is very hard to solve even if you have unlimited power.

The plain MCTS as used by AlphaZero is the most ill-suited MCTS version in my 
opinion, and I have a hard time seeing how it can be competitive with 
Stockfish tactically.

My MCTS chess engine with AlphaZero-like MCTS with averaging was missing a lot 
of tactics. I don't use policy or eval networks, but a qsearch() for eval, and 
the policy is basically choosing whichever move leads to a higher eval.

a) My first improvement to the MCTS is to use minimax backups instead of 
averaging. This was an improvement, but not something that would solve the 
traps.

b) My second improvement is to use alpha-beta rollouts. This is a rollouts 
version that can do null-move, LMR, etc. This is a huge improvement, and none 
of the MCTS versions can match it. More on alpha-beta rollouts here ( 
https://www.microsoft.com/en-us/research/wp-content/uploads/2014/11/huang_rollout.pdf
 )

So AlphaZero used none of the above improvements, and yet it seems to be 
tactically strong. Leela-Zero suffered from tactical falls left and right too, 
as I expected.

So the only explanation left is that the policy network is able to avoid 
traps, which I find hard to believe, since I doubt it can identify more than 
qsearch-level tactics.

All I am saying is that my experience (as well as many others') with MCTS for 
tactics-dominated games is bad, and there must be some breakthrough in that 
regard in AlphaZero for it to be able to compete with Stockfish on a tactical 
level.

I am curious how Remi's attempt at Shogi using AlphaZero's method will turn 
out.

regards,

Daniel

 

 

 

 

 

 

 

 

On Tue, Mar 6, 2018 at 9:41 AM, Brian Sheppard via Computer-go 
<computer-go@computer-go.org <mailto:computer-go@computer-go.org> > wrote:

Training on Stockfish games is guaranteed to produce a blunder-fest, because 
there are no blunders in the training set and therefore the policy network 
never learns how to refute blunders.

 

This is not a flaw in MCTS, but rather in the policy network. MCTS will 
eventually search every move infinitely often, producing asymptotically optimal 
play. But if the policy network does not provide the guidance necessary to 
rapidly refute the blunders that occur in the search, then convergence of MCTS 
to optimal play will be very slow.

 

It is necessary for the network to train on self-play games using MCTS. For 
instance, the AGZ approach samples next states during training games by 
sampling from the distribution of visits in the search. Specifically: not by 
choosing the most-visited play!

 

You see how this policy trains both search and evaluation to be internally 
consistent? The policy head is trained to refute the bad moves that will come 
up in search, and the value head is trained to the value observed by the full 
tree. 

 

From: Computer-go [mailto:computer-go-boun...@computer-go.org 
<mailto:computer-go-boun...@computer-go.org> ] On Behalf Of Dan
Sent: Monday, March 5, 2018 4:55 AM
To: computer-go@computer-go.org <mailto:computer-go@computer-go.org> 
Subject: Re: [Computer-go] 9x9 is last frontier?

 

Actually, prior to this it was trained with hundreds of thousands of Stockfish 
games and didn't do well on tactics (the games were actually a blunder fest). 
I believe this is a problem of the MCTS used, and not due to lack of training.

Go is a strategic game, so it is different from chess, which is full of traps.

I am not surprised Leela Zero did well in Go.

 

On Mon, Mar 5, 2018 at 2:16 AM Gian-Carlo Pascutto <g...@sjeng.org 
<mailto:g...@sjeng.org> > wrote:


Re: [Computer-go] 9x9 is last frontier?

2018-03-06 Thread Dan
I did a quick test with my MCTS chess engine with two different
implementations: a standard MCTS with averaging, and MCTS with alpha-beta
rollouts. The result is about a 600 Elo difference.

Finished game 44 (scorpio-pmcts vs scorpio-mcts): 1/2-1/2 {Draw by 3-fold
repetition}
Score of scorpio-mcts vs scorpio-pmcts: 41 - 1 - 2  [0.955] 44
Elo difference: 528.89 +/- nan

scorpio-mcts uses alpha-beta rollouts
scorpio-pmcts is "pure" MCTS with averaging and the UCB formula.
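The UCB formula referred to here is presumably UCB1. A minimal selection rule looks like the following (my sketch of the standard formula, not Scorpio's code; `c` is the usual exploration constant):

```python
import math

def ucb1_select(child_stats, c=1.4):
    """Pick the index of the child maximizing mean value + exploration
    bonus. child_stats is a list of (total_value, visit_count) pairs;
    unvisited children score +inf so they are tried first."""
    parent_n = sum(n for _, n in child_stats)
    def score(stat):
        w, n = stat
        if n == 0:
            return math.inf                  # force at least one visit
        return w / n + c * math.sqrt(math.log(parent_n) / n)
    return max(range(len(child_stats)), key=lambda i: score(child_stats[i]))
```

With equal visit counts the bonus cancels and the rule is greedy on the mean; the log/sqrt bonus is what forces the "pure" MCTS variant to keep revisiting rarely-tried moves.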

Daniel

On Tue, Mar 6, 2018 at 11:46 AM, Dan <dsha...@gmail.com> wrote:

> I am pretty sure it is an MCTS problem, and I suspect it is not something
> that could be easily solved with a policy network (I could be wrong here).
> My opinion is that a DCNN is not a miracle worker (as somebody already
> mentioned here), and it is going to fail at resolving tactics. I would be
> more than happy if it had the same power as a qsearch, to be honest.
>
> Search traps are the major problem with games like chess, and what makes
> transitioning the success of DCNNs from Go to chess non-trivial.
> The following paper discusses shallow traps that are prevalent in chess. (
> https://www.aaai.org/ocs/index.php/ICAPS/ICAPS10/paper/download/1458/1571
> )
> They mention that traps make MCTS very inefficient. Even if the MCTS is
> given 50x more time than is needed by an exhaustive minimax tree, it could
> fail to find a level-5 or level-7 trap.
> It will spend, for instance, 95% of its time searching an asymmetric tree
> of depth > 7 when a shallow trap of depth 7 exists, thus missing the
> level-7 trap.
> This is very hard to solve even if you have unlimited power.
>
> The plain MCTS as used by AlphaZero is the most ill-suited MCTS version in
> my opinion, and I have a hard time seeing how it can be competitive with
> Stockfish tactically.
>
> My MCTS chess engine with AlphaZero-like MCTS with averaging was missing a
> lot of tactics. I don't use policy or eval networks, but a qsearch() for
> eval, and the policy is basically choosing whichever move leads to a
> higher eval.
>
> a) My first improvement to the MCTS is to use minimax backups instead of
> averaging. This was an improvement, but not something that would solve the
> traps.
>
> b) My second improvement is to use alpha-beta rollouts. This is a rollouts
> version that can do null-move, LMR, etc. This is a huge improvement, and
> none of the MCTS versions can match it. More on alpha-beta rollouts here (
> https://www.microsoft.com/en-us/research/wp-content/uploads/2014/11/huang_rollout.pdf
> )
>
> So AlphaZero used none of the above improvements, and yet it seems to be
> tactically strong. Leela-Zero suffered from tactical falls left and right
> too, as I expected.
>
> So the only explanation left is that the policy network is able to avoid
> traps, which I find hard to believe, since I doubt it can identify more
> than qsearch-level tactics.
>
> All I am saying is that my experience (as well as many others') with MCTS
> for tactics-dominated games is bad, and there must be some breakthrough in
> that regard in AlphaZero for it to be able to compete with Stockfish on a
> tactical level.
>
> I am curious how Remi's attempt at Shogi using AlphaZero's method will
> turn out.
>
> regards,
> Daniel
>
>
>
>
>
>
>
>
> On Tue, Mar 6, 2018 at 9:41 AM, Brian Sheppard via Computer-go <
> computer-go@computer-go.org> wrote:
>
>> Training on Stockfish games is guaranteed to produce a blunder-fest,
>> because there are no blunders in the training set and therefore the policy
>> network never learns how to refute blunders.
>>
>>
>>
>> This is not a flaw in MCTS, but rather in the policy network. MCTS will
>> eventually search every move infinitely often, producing asymptotically
>> optimal play. But if the policy network does not provide the guidance
>> necessary to rapidly refute the blunders that occur in the search, then
>> convergence of MCTS to optimal play will be very slow.
>>
>>
>>
>> It is necessary for the network to train on self-play games using MCTS.
>> For instance, the AGZ approach samples next states during training games by
>> sampling from the distribution of visits in the search. Specifically: not
>> by choosing the most-visited play!
>>
>>
>>
>> You see how this policy trains both search and evaluation to be
>> internally consistent? The policy head is trained to refute the bad moves
>> that will come up in search, and the value head is trained to the value
>> observed by the full tree.
>>
>>
>>
>> *From:* Computer-go [mailto:computer-go-boun...@computer-go.org] *On
>> Behalf Of *Dan
>> *Sent:* Monday, March 5, 2018 4:55 AM

Re: [Computer-go] 9x9 is last frontier?

2018-03-06 Thread Dan
I am pretty sure it is an MCTS problem, and I suspect it is not something that
could be easily solved with a policy network (I could be wrong here). My
opinion is that a DCNN is not a miracle worker (as somebody already mentioned
here), and it is going to fail at resolving tactics. I would be more than
happy if it had the same power as a qsearch, to be honest.

Search traps are the major problem with games like chess, and what makes
transitioning the success of DCNNs from Go to chess non-trivial.
The following paper discusses shallow traps that are prevalent in chess. (
https://www.aaai.org/ocs/index.php/ICAPS/ICAPS10/paper/download/1458/1571 )
They mention that traps make MCTS very inefficient. Even if the MCTS is given
50x more time than is needed by an exhaustive minimax tree, it could fail to
find a level-5 or level-7 trap.
It will spend, for instance, 95% of its time searching an asymmetric tree of
depth > 7 when a shallow trap of depth 7 exists, thus missing the level-7
trap.
This is very hard to solve even if you have unlimited power.

The plain MCTS as used by AlphaZero is the most ill-suited MCTS version in my
opinion, and I have a hard time seeing how it can be competitive with
Stockfish tactically.

My MCTS chess engine with AlphaZero-like MCTS with averaging was missing a lot
of tactics. I don't use policy or eval networks, but a qsearch() for eval, and
the policy is basically choosing whichever move leads to a higher eval.

a) My first improvement to the MCTS is to use minimax backups instead of
averaging. This was an improvement, but not something that would solve the
traps.

b) My second improvement is to use alpha-beta rollouts. This is a rollouts
version that can do null-move, LMR, etc. This is a huge improvement, and none
of the MCTS versions can match it. More on alpha-beta rollouts here (
https://www.microsoft.com/en-us/research/wp-content/uploads/2014/11/huang_rollout.pdf
)
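The difference between averaging and minimax backups in (a) is easy to see on a single trap position. This is a toy illustration, not Scorpio's code: averaging dilutes the one refutation among the good-looking replies, while a minimax backup propagates it to the parent.

```python
def backup(child_values, rule):
    """Combine child scores (from the parent's point of view) into a
    parent value: 'average' is the classic MCTS mean, 'minimax' takes
    the best child, as an exact negamax backup would."""
    if rule == "average":
        return sum(child_values) / len(child_values)
    return max(child_values)

# A trap move: three replies look winning for us, one reply refutes us.
# Scores below are from the opponent's point of view at the reply node.
opponent_view = [-0.9, -0.9, -0.9, 1.0]     # opponent's best reply: +1.0
value_minimax = -backup(opponent_view, "minimax")   # -1.0: trap detected
value_average = -backup(opponent_view, "average")   # 0.425: trap masked
```

Under averaging, the trap move still looks clearly good (+0.425) until enough simulations pile onto the refuting reply; under a minimax backup, a single visit to the refutation flips the move's value immediately.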

So AlphaZero used none of the above improvements, and yet it seems to be
tactically strong. Leela-Zero suffered from tactical falls left and right too,
as I expected.

So the only explanation left is that the policy network is able to avoid
traps, which I find hard to believe, since I doubt it can identify more than
qsearch-level tactics.

All I am saying is that my experience (as well as many others') with MCTS
for tactics-dominated games is bad, and there must be some breakthrough in
that regard in AlphaZero for it to be able to compete with Stockfish on a
tactical level.

I am curious how Remi's attempt at Shogi using AlphaZero's method will turn
out.

regards,
Daniel








On Tue, Mar 6, 2018 at 9:41 AM, Brian Sheppard via Computer-go <
computer-go@computer-go.org> wrote:

> Training on Stockfish games is guaranteed to produce a blunder-fest,
> because there are no blunders in the training set and therefore the policy
> network never learns how to refute blunders.
>
>
>
> This is not a flaw in MCTS, but rather in the policy network. MCTS will
> eventually search every move infinitely often, producing asymptotically
> optimal play. But if the policy network does not provide the guidance
> necessary to rapidly refute the blunders that occur in the search, then
> convergence of MCTS to optimal play will be very slow.
>
>
>
> It is necessary for the network to train on self-play games using MCTS.
> For instance, the AGZ approach samples next states during training games by
> sampling from the distribution of visits in the search. Specifically: not
> by choosing the most-visited play!
>
>
>
> You see how this policy trains both search and evaluation to be internally
> consistent? The policy head is trained to refute the bad moves that will
> come up in search, and the value head is trained to the value observed by
> the full tree.
>
>
>
> *From:* Computer-go [mailto:computer-go-boun...@computer-go.org] *On
> Behalf Of *Dan
> *Sent:* Monday, March 5, 2018 4:55 AM
> *To:* computer-go@computer-go.org
> *Subject:* Re: [Computer-go] 9x9 is last frontier?
>
>
>
> Actually, prior to this it was trained with hundreds of thousands of
> Stockfish games and didn't do well on tactics (the games were actually a
> blunder fest). I believe this is a problem of the MCTS used, and not due
> to lack of training.
>
> Go is a strategic game, so it is different from chess, which is full of
> traps.
>
> I am not surprised Leela Zero did well in Go.
>
>
>
> On Mon, Mar 5, 2018 at 2:16 AM Gian-Carlo Pascutto <g...@sjeng.org> wrote:
>
> On 02-03-18 17:07, Dan wrote:
> > Leela-chess is not performing well enough
>
> I don't understand how one can say that given that they started with the
> random network last week only and a few clients. Of course it's bad!
> That doesn't say anything about the approach.
>
> Leela Zero has gotten strong but it has been learning for *months* with
> ~40

Re: [Computer-go] 9x9 is last frontier?

2018-03-06 Thread Brian Sheppard via Computer-go
Training on Stockfish games is guaranteed to produce a blunder-fest, because 
there are no blunders in the training set and therefore the policy network 
never learns how to refute blunders.

 

This is not a flaw in MCTS, but rather in the policy network. MCTS will 
eventually search every move infinitely often, producing asymptotically optimal 
play. But if the policy network does not provide the guidance necessary to 
rapidly refute the blunders that occur in the search, then convergence of MCTS 
to optimal play will be very slow.
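The role the policy network plays in steering the search can be sketched with the PUCT selection rule from the AGZ paper. This is a minimal illustration, not any engine's actual implementation: a move with a near-zero prior receives almost no exploration bonus, which is why an untrained-against blunder gets refuted only very slowly.

```python
import math

def puct_select(children, c_puct=1.5):
    """Pick the child maximizing Q + U, where the exploration term U is
    scaled by the policy prior.  Each child is a dict with 'prior'
    (policy-network probability), 'visits', and 'value_sum'."""
    total = sum(ch["visits"] for ch in children)

    def score(ch):
        q = ch["value_sum"] / ch["visits"] if ch["visits"] else 0.0
        u = c_puct * ch["prior"] * math.sqrt(total + 1) / (1 + ch["visits"])
        return q + u

    return max(children, key=score)
```

With equal priors an unvisited child is explored immediately; with a tiny prior, the same unvisited child can be starved of visits for a long time.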

 

It is necessary for the network to train on self-play games using MCTS. For 
instance, the AGZ approach samples next states during training games by 
sampling from the distribution of visits in the search. Specifically: not by 
choosing the most-visited play!
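The sampling step described above can be written down directly. A rough sketch (the temperature schedule is an assumption, not from this thread): during training games, the next move is drawn from the visit-count distribution raised to 1/temperature, rather than taken as the argmax.

```python
import random

def sample_move(visit_counts, temperature=1.0, rng=random):
    """Sample a move index in proportion to visits**(1/temperature).

    temperature=0 degenerates to the most-visited move; temperature=1
    samples proportionally to visits, so refutations of slightly
    inferior moves also appear in the training data."""
    if temperature == 0:
        return max(range(len(visit_counts)), key=visit_counts.__getitem__)
    weights = [v ** (1.0 / temperature) for v in visit_counts]
    return rng.choices(range(len(visit_counts)), weights=weights, k=1)[0]
```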

 

You see how this policy trains both search and evaluation to be internally 
consistent? The policy head is trained to refute the bad moves that will come 
up in search, and the value head is trained to the value observed by the full 
tree. 
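The two training targets can be summarized in one loss, roughly as in the AGZ paper (the regularization term and exact weighting are omitted here): the value head is regressed toward the observed outcome, and the policy head is pushed toward the search's visit distribution.

```python
import math

def agz_loss(z, v, pi, p, eps=1e-12):
    """AGZ-style per-position training loss (without the L2 term):
    value MSE plus cross-entropy of the policy head p against the
    search visit distribution pi.  z is the game outcome."""
    value_loss = (z - v) ** 2
    policy_loss = -sum(t * math.log(q + eps) for t, q in zip(pi, p))
    return value_loss + policy_loss
```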

 

From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Dan
Sent: Monday, March 5, 2018 4:55 AM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] 9x9 is last frontier?

 

Actually, prior to this it was trained on hundreds of thousands of Stockfish 
games and didn’t do well on tactics (the games were actually a blunder fest). I 
believe this is a problem of the MCTS used and not due to lack of training. 

 

Go is a strategic game, so it is different from chess, which is full of traps.

I'm not surprised Leela Zero did well in Go.

 

On Mon, Mar 5, 2018 at 2:16 AM Gian-Carlo Pascutto <g...@sjeng.org 
<mailto:g...@sjeng.org> > wrote:

On 02-03-18 17:07, Dan wrote:
> Leela-chess is not performing well enough

I don't understand how one can say that given that they started with the
random network last week only and a few clients. Of course it's bad!
That doesn't say anything about the approach.

Leela Zero has gotten strong but it has been learning for *months* with
~400 people. It also took a while to get to 30 kyu.

--
GCP
___
Computer-go mailing list
Computer-go@computer-go.org <mailto:Computer-go@computer-go.org> 
http://computer-go.org/mailman/listinfo/computer-go

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-03-05 Thread Gian-Carlo Pascutto
On 5/03/2018 10:54, Dan wrote:
> I believe this is a problem of the MCTS used and not due
> to lack of training. 
> 
> Go is a strategic game so that is different from chess that is full of
> traps.     

Does the Alpha Zero result not indicate the opposite, i.e. that MCTS is
workable?

-- 
GCP
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-03-05 Thread Dan
Actually, prior to this it was trained on hundreds of thousands of
Stockfish games and didn’t do well on tactics (the games were actually a
blunder fest). I believe this is a problem of the MCTS used and not due to
lack of training.

Go is a strategic game, so it is different from chess, which is full of
traps.
I'm not surprised Leela Zero did well in Go.

On Mon, Mar 5, 2018 at 2:16 AM Gian-Carlo Pascutto  wrote:

> On 02-03-18 17:07, Dan wrote:
> > Leela-chess is not performing well enough
>
> I don't understand how one can say that given that they started with the
> random network last week only and a few clients. Of course it's bad!
> That doesn't say anything about the approach.
>
> Leela Zero has gotten strong but it has been learning for *months* with
> ~400 people. It also took a while to get to 30 kyu.
>
> --
> GCP
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-03-05 Thread Gian-Carlo Pascutto
On 02-03-18 17:07, Dan wrote:
> Leela-chess is not performing well enough 

I don't understand how one can say that given that they started with the
random network last week only and a few clients. Of course it's bad!
That doesn't say anything about the approach.

Leela Zero has gotten strong but it has been learning for *months* with
~400 people. It also took a while to get to 30 kyu.

-- 
GCP
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-03-03 Thread Aja Huang
2018-03-02 16:07 GMT+00:00 Dan :

> Hello Aja,
>
> Could you enlighten me on how AlphaZero handles tactics in chess ?
>
> It seems the mcts approach as described in the paper does not perform well
> enough.
>
> Leela-chess is not performing well enough even though leela-go seems to be
> doing well.
>
> Daniel
>
>
Hi, I've moved to another project so I can't really answer your question.
Remi's latest version of Crazy Stone seems to be much stronger than the
current version of Leela Zero. You might want to ask him for some insights
on your question.

Regards,
Aja


>
>
>
> On Fri, Mar 2, 2018 at 4:52 AM, Aja Huang  wrote:
>
>>
>>
>> 2018-03-02 6:50 GMT+00:00 "Ingo Althöfer" <3-hirn-ver...@gmx.de>:
>>
>>> Von: "David Doshay" 
>>> > Go is hard.
>>> > Programming is hard.
>>> >
>>> > Programming Go is hard squared.
>>> > ;^)
>>>
>>> And that on square boards.
>>> Mama mia!
>>>
>>
>> Go is hard for humans, but in my own opinion I think Go seems to be too
>> easy for deep learning. So is programming Go now. :)
>>
>> Aja
>>
>>
>>>
>>> ;-) Ingo.
>>> ___
>>> Computer-go mailing list
>>> Computer-go@computer-go.org
>>> http://computer-go.org/mailman/listinfo/computer-go
>>>
>>
>>
>> ___
>> Computer-go mailing list
>> Computer-go@computer-go.org
>> http://computer-go.org/mailman/listinfo/computer-go
>>
>
>
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-03-02 Thread Dan
Leela Chess is here: https://github.com/glinscott/leela-chess

It uses the exact MCTS algorithm described in AlphaZero, with value and
policy networks, but performs really badly in tactics (often missing 2-3
ply shallow tactics).

To get a somewhat strong MCTS chess engine, I had to use alpha-beta
rollouts instead, as described in
https://www.microsoft.com/en-us/research/wp-content/uploads/2014/11/huang_rollout.pdf

I think this could be suitable for 9x9 Go too.
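For reference, the classical recursion that the alpha-beta rollouts paper reformulates as best-first, bounded rollouts is plain alpha-beta. The sketch below is only that baseline on a toy tree, not the rollout formulation itself: unlike mean-backup MCTS, it backs up exact minimax values with cutoffs.

```python
def alphabeta(node, alpha, beta, color):
    """Fail-soft negamax alpha-beta over a toy tree.

    A node is either a number (leaf value from the first player's point
    of view) or a list of child nodes.  `color` is +1/-1."""
    if not isinstance(node, list):
        return color * node
    best = float("-inf")
    for child in node:
        best = max(best, -alphabeta(child, -beta, -alpha, -color))
        alpha = max(alpha, best)
        if alpha >= beta:
            break  # cutoff: the opponent will avoid this line anyway
    return best
```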

It has been a mystery to me how AlphaZero avoided falling into shallow
traps in chess.

Daniel

On Fri, Mar 2, 2018 at 10:22 AM, Xavier Combelle 
wrote:

> Where is Leela Chess? How many games has it been trained on?
>
> Le 2 mars 2018 18:20, "Dan"  a écrit :
>
>> Hello Aja,
>>
>> Could you enlighten me on how AlphaZero handles tactics in chess ?
>>
>> It seems the mcts approach as described in the paper does not perform
>> well enough.
>>
>> Leela-chess is not performing well enough even though leela-go seems to
>> be doing well.
>>
>> Daniel
>>
>>
>>
>>
>>
>> On Fri, Mar 2, 2018 at 4:52 AM, Aja Huang  wrote:
>>
>>>
>>>
>>> 2018-03-02 6:50 GMT+00:00 "Ingo Althöfer" <3-hirn-ver...@gmx.de>:
>>>
 Von: "David Doshay" 
 > Go is hard.
 > Programming is hard.
 >
 > Programming Go is hard squared.
 > ;^)

 And that on square boards.
 Mama mia!

>>>
>>> Go is hard for humans, but in my own opinion I think Go seems to be too
>>> easy for deep learning. So is programming Go now. :)
>>>
>>> Aja
>>>
>>>

 ;-) Ingo.
 ___
 Computer-go mailing list
 Computer-go@computer-go.org
 http://computer-go.org/mailman/listinfo/computer-go

>>>
>>>
>>> ___
>>> Computer-go mailing list
>>> Computer-go@computer-go.org
>>> http://computer-go.org/mailman/listinfo/computer-go
>>>
>>
>>
>> ___
>> Computer-go mailing list
>> Computer-go@computer-go.org
>> http://computer-go.org/mailman/listinfo/computer-go
>>
>
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-03-02 Thread Xavier Combelle
Where is Leela Chess? How many games has it been trained on?

Le 2 mars 2018 18:20, "Dan"  a écrit :

> Hello Aja,
>
> Could you enlighten me on how AlphaZero handles tactics in chess ?
>
> It seems the mcts approach as described in the paper does not perform well
> enough.
>
> Leela-chess is not performing well enough even though leela-go seems to be
> doing well.
>
> Daniel
>
>
>
>
>
> On Fri, Mar 2, 2018 at 4:52 AM, Aja Huang  wrote:
>
>>
>>
>> 2018-03-02 6:50 GMT+00:00 "Ingo Althöfer" <3-hirn-ver...@gmx.de>:
>>
>>> Von: "David Doshay" 
>>> > Go is hard.
>>> > Programming is hard.
>>> >
>>> > Programming Go is hard squared.
>>> > ;^)
>>>
>>> And that on square boards.
>>> Mama mia!
>>>
>>
>> Go is hard for humans, but in my own opinion I think Go seems to be too
>> easy for deep learning. So is programming Go now. :)
>>
>> Aja
>>
>>
>>>
>>> ;-) Ingo.
>>> ___
>>> Computer-go mailing list
>>> Computer-go@computer-go.org
>>> http://computer-go.org/mailman/listinfo/computer-go
>>>
>>
>>
>> ___
>> Computer-go mailing list
>> Computer-go@computer-go.org
>> http://computer-go.org/mailman/listinfo/computer-go
>>
>
>
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-03-02 Thread Hideki Kato
Do you think deep learning can understand and solve double 
ko, for example?

Hideki

Aja Huang: 

Re: [Computer-go] 9x9 is last frontier?

2018-03-02 Thread Dan
Hello Aja,

Could you enlighten me on how AlphaZero handles tactics in chess ?

It seems the mcts approach as described in the paper does not perform well
enough.

Leela-chess is not performing well enough even though leela-go seems to be
doing well.

Daniel





On Fri, Mar 2, 2018 at 4:52 AM, Aja Huang  wrote:

>
>
> 2018-03-02 6:50 GMT+00:00 "Ingo Althöfer" <3-hirn-ver...@gmx.de>:
>
>> Von: "David Doshay" 
>> > Go is hard.
>> > Programming is hard.
>> >
>> > Programming Go is hard squared.
>> > ;^)
>>
>> And that on square boards.
>> Mama mia!
>>
>
> Go is hard for humans, but in my own opinion I think Go seems to be too
> easy for deep learning. So is programming Go now. :)
>
> Aja
>
>
>>
>> ;-) Ingo.
>> ___
>> Computer-go mailing list
>> Computer-go@computer-go.org
>> http://computer-go.org/mailman/listinfo/computer-go
>>
>
>
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-03-02 Thread Aja Huang
2018-03-02 6:50 GMT+00:00 "Ingo Althöfer" <3-hirn-ver...@gmx.de>:

> Von: "David Doshay" 
> > Go is hard.
> > Programming is hard.
> >
> > Programming Go is hard squared.
> > ;^)
>
> And that on square boards.
> Mama mia!
>

Go is hard for humans, but in my own opinion I think Go seems to be too
easy for deep learning. So is programming Go now. :)

Aja


>
> ;-) Ingo.
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-03-01 Thread Ingo Althöfer
Von: "David Doshay" 
> Go is hard.
> Programming is hard.
> 
> Programming Go is hard squared. 
> ;^)

And that on square boards.
Mama mia!

;-) Ingo.
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-03-01 Thread David Doshay
Go is hard.
Programming is hard.

Programming Go is hard squared. 
;^)

Cheers,
David G Doshay

ddos...@mac.com


> On 28, Feb 2018, at 5:43 PM, Hideki Kato  wrote:
> 
> Go is still hard for both human and computers :).

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-02-28 Thread uurtamo .
Thank you for being so kind in your response. I truly appreciate it.

s.

On Feb 28, 2018 6:32 PM, "Hideki Kato"  wrote:

> uurtamo .:  mail.gmail.com>:
> >I didn't mean to suggest that I can or will solve this problem tomorrow.
> >
> >What I meant to say is that it is clearly obvious that 9x9 is not immune
> to
> >being destroyed -- it's not what people play professionally (or at least
> is
> >not what is most famous for being played professionally), so it is going
> to
> >stand alone for a little while; it hasn't been the main focus yet. I
> >understand that it technically has features such as: very tiny point
> >differences; mostly being tactical. I don't think or have reason to
> believe
> >that that makes it somehow immune.
> >
> >What concerns me is pseudo-technical explanations for why it's harder to
> >beat humans at 9x9 than at 19x19. Saying that it's harder at 9x9 seems
> like
> >an excuse to explain (or hopefully justify) how the game is still in the
> >hands of humans. This feels very strongly like a justification for how "go
> >is still really hard for computers". Which, I suppose, we can break down
> >into lots of little subcases and worry about. The tiny point difference
> >issue is interesting; it means that things need to be super tight (less
> >room for sloppy play). Checkers also has this feature.
> >
> >The reality, in my unjustified opinion, is that this will be a solved
> >problem once it has obtained enough focus.
>
> I'm suspecious.  The value network (VN) is not enough for
> 9x9 because VN can't approximate value functions at enough
> detail.  This is also a problem on 19x19 but the advantages
> VN gives at silent positions is big enough (actually a few
> points) to beat top level human players.  I believe another
> idea is necessary for 9x9.
> #One possible (?) simple solution: if the inference speed of
> the policy network gets 100 or more times faster then we can
> use PN directly in rollouts.  This may make VN useless.
>
> Go is still hard for both human and computers :).
>
> Hideki
>
> >s.
> >
> >
> >On Fri, Feb 23, 2018 at 6:12 PM, Hideki Kato 
> wrote:
> >
> >> uurtamo .:  >> 1vhk7t...@mail.gmail.com>:
> >> >Slow down there, hombre.
> >> >
> >> >There's no secret sauce to 9x9 other than that it isn't the current
> focus
> >> >of people.
> >> >
> >> >Just like 7x7 isn't immune.
> >> >
> >> >A computer program for 9x9, funded, backed by halfway serious people,
> and
> >> >focused on the task, will *destroy* human opponents at any time it
> needs
> >> to.
> >>
> >> Why do you think (or believe) so?  I'd like to say there
> >> is no evidence so far.
> >>
> >> >If you believe that there is a special reason that 9x9 is harder than
> >> >19x19, then I'm super interested to hear that. But it's not harder for
> >> >computers. It's just not what people have been focusing on.
> >>
> >> 9x9 is not harder than 19x19 as a game.  However:  (1) Value
> >> networks, the key components to beat human on 19x19, work
> >> fine only on static positions but 9x9 has almost no such
> >> positions.   (2) Humans can play much better on 9x9
> >> than 19x19.  Top level professionals can read-out at near
> >> end of the middle stage of a game in less than 30 min with
> >> one point accuracy of the score, for example.
> >>
> >> Humans are not good at global evaluation of larger boards so
> >> bots can beat top professionals on 19x19 but this does not
> >> apply 9x9.  The size of the board is important because
> >> value networks are not universal, ie, approximate the
> >> value function not so presicely, mainly due to
> >> the number of training data is limited in practice (up to
> >> 10^8 while the number of possible input positions is greater
> >> than, at least, 10^20).  One more reason, there are no
> >> algorithm to solve double ko. This is not so big problem on
> >> 19x19 but 9x9.
> >>
> >> Best, Hideki
> >>
> >> >s.
> >> >
> >> >On Feb 23, 2018 4:49 PM, "Hideki Kato"  wrote:
> >> >
> >> >> That's not the point, Petri.  9x9 has almost no "silent"
> >> >> or "static" positons which value networks superb humans.
> >> >> On 9x9 boards, Kos, especially double Kos and two step Kos
> >> >> are important but MCTS still works worse for them, for
> >> >> examples.  Human professionals are much better at life
> >> >> and complex local fights which dominate small board games
> >> >> because they can read deterministically and deeper than
> >> >> current MCTS bots in standard time settings (not blitz).
> >> >> Also it's well known that MCTS is not good at finding narrow
> >> >> and deep paths to win due to "averaging".  Ohashi 6p said
> >> >> that he couldn't lose against statiscal algorithms after the
> >> >> event in 2012.
> >> >>
> >> >> Best,
> >> >> Hideki
> >> >>
> >> >> Petri Pitkanen:  >> >> 3zrby3k9kjvmzah...@mail.gmail.com>:
> >> >> 

Re: [Computer-go] 9x9 is last frontier?

2018-02-28 Thread Hideki Kato
uurtamo .: :
>I didn't mean to suggest that I can or will solve this problem tomorrow.
>
>What I meant to say is that it is clearly obvious that 9x9 is not immune to
>being destroyed -- it's not what people play professionally (or at least is
>not what is most famous for being played professionally), so it is going to
>stand alone for a little while; it hasn't been the main focus yet. I
>understand that it technically has features such as: very tiny point
>differences; mostly being tactical. I don't think or have reason to believe
>that that makes it somehow immune.
>
>What concerns me is pseudo-technical explanations for why it's harder to
>beat humans at 9x9 than at 19x19. Saying that it's harder at 9x9 seems like
>an excuse to explain (or hopefully justify) how the game is still in the
>hands of humans. This feels very strongly like a justification for how "go
>is still really hard for computers". Which, I suppose, we can break down
>into lots of little subcases and worry about. The tiny point difference
>issue is interesting; it means that things need to be super tight (less
>room for sloppy play). Checkers also has this feature.
>
>The reality, in my unjustified opinion, is that this will be a solved
>problem once it has obtained enough focus.

I'm suspicious.  The value network (VN) is not enough for 
9x9 because the VN can't approximate the value function in 
enough detail.  This is also a problem on 19x19, but the 
advantage the VN gives at silent positions is big enough 
(actually a few points) to beat top-level human players.  I 
believe another idea is necessary for 9x9.  
#One possible (?) simple solution: if the inference speed of 
the policy network (PN) gets 100 or more times faster, then we 
can use the PN directly in rollouts.  This may make the VN useless.
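The suggestion of using the policy network directly in rollouts amounts to replacing the playout's move distribution. A generic sketch under toy assumptions (the `step_policy`, `is_terminal`, and `score` callables are hypothetical placeholders for an engine's real interfaces; the "game" below just walks a counter to 0 or 5):

```python
import random

def policy_guided_playout(start, step_policy, is_terminal, score, rng):
    """Play out a game by sampling each successor state from the
    distribution returned by step_policy(state), instead of uniformly
    at random, and return score(final_state)."""
    state = start
    while not is_terminal(state):
        moves, probs = step_policy(state)
        state = rng.choices(moves, weights=probs, k=1)[0]
    return score(state)
```

Swapping a uniform distribution for a strong policy's output makes each rollout far more informative, which is what could make the value network redundant, at the cost of a network evaluation per move.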

Go is still hard for both human and computers :).

Hideki

>s.
>
>
>On Fri, Feb 23, 2018 at 6:12 PM, Hideki Kato  wrote:
>
>> uurtamo .: > 1vhk7t...@mail.gmail.com>:
>> >Slow down there, hombre.
>> >
>> >There's no secret sauce to 9x9 other than that it isn't the current focus
>> >of people.
>> >
>> >Just like 7x7 isn't immune.
>> >
>> >A computer program for 9x9, funded, backed by halfway serious people, and
>> >focused on the task, will *destroy* human opponents at any time it needs
>> to.
>>
>> Why do you think (or believe) so?  I'd like to say there
>> is no evidence so far.
>>
>> >If you believe that there is a special reason that 9x9 is harder than
>> >19x19, then I'm super interested to hear that. But it's not harder for
>> >computers. It's just not what people have been focusing on.
>>
>> 9x9 is not harder than 19x19 as a game.  However:  (1) Value
>> networks, the key components to beat human on 19x19, work
>> fine only on static positions but 9x9 has almost no such
>> positions.   (2) Humans can play much better on 9x9
>> than 19x19.  Top level professionals can read-out at near
>> end of the middle stage of a game in less than 30 min with
>> one point accuracy of the score, for example.
>>
>> Humans are not good at global evaluation of larger boards so
>> bots can beat top professionals on 19x19 but this does not
>> apply 9x9.  The size of the board is important because
>> value networks are not universal, ie, approximate the
>> value function not so presicely, mainly due to
>> the number of training data is limited in practice (up to
>> 10^8 while the number of possible input positions is greater
>> than, at least, 10^20).  One more reason, there are no
>> algorithm to solve double ko. This is not so big problem on
>> 19x19 but 9x9.
>>
>> Best, Hideki
>>
>> >s.
>> >
>> >On Feb 23, 2018 4:49 PM, "Hideki Kato"  wrote:
>> >
>> >> That's not the point, Petri.  9x9 has almost no "silent"
>> >> or "static" positons which value networks superb humans.
>> >> On 9x9 boards, Kos, especially double Kos and two step Kos
>> >> are important but MCTS still works worse for them, for
>> >> examples.  Human professionals are much better at life
>> >> and complex local fights which dominate small board games
>> >> because they can read deterministically and deeper than
>> >> current MCTS bots in standard time settings (not blitz).
>> >> Also it's well known that MCTS is not good at finding narrow
>> >> and deep paths to win due to "averaging".  Ohashi 6p said
>> >> that he couldn't lose against statiscal algorithms after the
>> >> event in 2012.
>> >>
>> >> Best,
>> >> Hideki
>> >>
>> >> Petri Pitkanen: > >> 3zrby3k9kjvmzah...@mail.gmail.com>:
>> >> >elo-range in 9x9 smaller than 19x19. One just cannot be hugelyl better
>> >> than
>> >> >the other is such limitted game
>> >> >
>> >> >2018-02-23 21:15 GMT+02:00 Hiroshi Yamashita :
>> >> >
>> >> >> Hi,
>> >> >>
>> >> >> Top 19x19 program reaches 4200 BayesElo on CGOS. But 3100 in 9x9.
>> >> >> Maybe it is because people don't have much interest in 

Re: [Computer-go] 9x9 is last frontier?

2018-02-28 Thread uurtamo .
I didn't mean to suggest that I can or will solve this problem tomorrow.

What I meant to say is that it is clearly obvious that 9x9 is not immune to
being destroyed -- it's not what people play professionally (or at least is
not what is most famous for being played professionally), so it is going to
stand alone for a little while; it hasn't been the main focus yet. I
understand that it technically has features such as: very tiny point
differences; mostly being tactical. I don't think or have reason to believe
that that makes it somehow immune.

What concerns me is pseudo-technical explanations for why it's harder to
beat humans at 9x9 than at 19x19. Saying that it's harder at 9x9 seems like
an excuse to explain (or hopefully justify) how the game is still in the
hands of humans. This feels very strongly like a justification for how "go
is still really hard for computers". Which, I suppose, we can break down
into lots of little subcases and worry about. The tiny point difference
issue is interesting; it means that things need to be super tight (less
room for sloppy play). Checkers also has this feature.

The reality, in my unjustified opinion, is that this will be a solved
problem once it has obtained enough focus.

s.


On Fri, Feb 23, 2018 at 6:12 PM, Hideki Kato  wrote:

> uurtamo .:  1vhk7t...@mail.gmail.com>:
> >Slow down there, hombre.
> >
> >There's no secret sauce to 9x9 other than that it isn't the current focus
> >of people.
> >
> >Just like 7x7 isn't immune.
> >
> >A computer program for 9x9, funded, backed by halfway serious people, and
> >focused on the task, will *destroy* human opponents at any time it needs
> to.
>
> Why do you think (or believe) so?  I'd like to say there
> is no evidence so far.
>
> >If you believe that there is a special reason that 9x9 is harder than
> >19x19, then I'm super interested to hear that. But it's not harder for
> >computers. It's just not what people have been focusing on.
>
> 9x9 is not harder than 19x19 as a game.  However:  (1) Value
> networks, the key components to beat human on 19x19, work
> fine only on static positions but 9x9 has almost no such
> positions.   (2) Humans can play much better on 9x9
> than 19x19.  Top level professionals can read-out at near
> end of the middle stage of a game in less than 30 min with
> one point accuracy of the score, for example.
>
> Humans are not good at global evaluation of larger boards so
> bots can beat top professionals on 19x19 but this does not
> apply 9x9.  The size of the board is important because
> value networks are not universal, ie, approximate the
> value function not so presicely, mainly due to
> the number of training data is limited in practice (up to
> 10^8 while the number of possible input positions is greater
> than, at least, 10^20).  One more reason, there are no
> algorithm to solve double ko. This is not so big problem on
> 19x19 but 9x9.
>
> Best, Hideki
>
> >s.
> >
> >On Feb 23, 2018 4:49 PM, "Hideki Kato"  wrote:
> >
> >> That's not the point, Petri.  9x9 has almost no "silent"
> >> or "static" positons which value networks superb humans.
> >> On 9x9 boards, Kos, especially double Kos and two step Kos
> >> are important but MCTS still works worse for them, for
> >> examples.  Human professionals are much better at life
> >> and complex local fights which dominate small board games
> >> because they can read deterministically and deeper than
> >> current MCTS bots in standard time settings (not blitz).
> >> Also it's well known that MCTS is not good at finding narrow
> >> and deep paths to win due to "averaging".  Ohashi 6p said
> >> that he couldn't lose against statiscal algorithms after the
> >> event in 2012.
> >>
> >> Best,
> >> Hideki
> >>
> >> Petri Pitkanen:  >> 3zrby3k9kjvmzah...@mail.gmail.com>:
> >> >elo-range in 9x9 smaller than 19x19. One just cannot be hugelyl better
> >> than
> >> >the other is such limitted game
> >> >
> >> >2018-02-23 21:15 GMT+02:00 Hiroshi Yamashita :
> >> >
> >> >> Hi,
> >> >>
> >> >> Top 19x19 program reaches 4200 BayesElo on CGOS. But 3100 in 9x9.
> >> >> Maybe it is because people don't have much interest in 9x9.
> >> >> But it seems value network does not work well in 9x9.
> >> >> Weights_33_400 is maybe made by selfplay network. But it is 2946 in
> >9x9.
> >> >> Weights_31_3200 is 4069 in 19x19 though.
> >> >>
> >> >> In year 2012, Zen played 6 games against 3 Japanese Pros, and lost by
> >> 0-6.
> >> >> And it seems Zen's 9x9 strength does not change big even now.
> >> >> http://computer-go.org/pipermail/computer-go/2012-
> November/005556.html
> >> >>
> >> >> I feel there is still enough chance that human can beat best program
> in
> >> >> 9x9.
> >> >>
> >> >> Thanks,
> >> >> Hiroshi Yamashita
> >> >>
> >> >> ___
> >> >> Computer-go mailing list
> >> >> Computer-go@computer-go.org
> >> >> 

Re: [Computer-go] 9x9 is last frontier?

2018-02-25 Thread Petr Baudis
  I think Hideki-san makes a great point.  To rephrase it my way,
AlphaGo-related advancements never really solved the MCTS limitation to
properly read out precise "one way street" sequences and deal with the
horizon effect.  The value network amazingly compensates enough for this
limitation on 19x19, but it inevitably does get into play on 9x9.

On Fri, Feb 23, 2018 at 05:01:50PM -0800, uurtamo . wrote:
> Slow down there, hombre.
> 
> There's no secret sauce to 9x9 other than that it isn't the current focus
> of people.
> 
> Just like 7x7 isn't immune.
> 
> A computer program for 9x9, funded, backed by halfway serious people, and
> focused on the task, will *destroy* human opponents at any time it needs to.
> 
> If you believe that there is a special reason that 9x9 is harder than
> 19x19, then I'm super interested to hear that. But it's not harder for
> computers. It's just not what people have been focusing on.
> 
> s.
> 
> On Feb 23, 2018 4:49 PM, "Hideki Kato"  wrote:
> 
> > That's not the point, Petri.  9x9 has almost no "silent"
> > or "static" positions, where value networks surpass humans.
> > On 9x9 boards, Kos, especially double Kos and two step Kos
> > are important but MCTS still works worse for them, for
> > examples.  Human professionals are much better at life
> > and complex local fights which dominate small board games
> > because they can read deterministically and deeper than
> > current MCTS bots in standard time settings (not blitz).
> > Also it's well known that MCTS is not good at finding narrow
> > and deep paths to win due to "averaging".  Ohashi 6p said
> > that he couldn't lose against statistical algorithms after the
> > event in 2012.
> >
> > Best,
> > Hideki
> >
> > Petri Pitkanen:  > 3zrby3k9kjvmzah...@mail.gmail.com>:
> > >elo-range in 9x9 smaller than 19x19. One just cannot be hugelyl better
> > than
> > >the other is such limitted game
> > >
> > >2018-02-23 21:15 GMT+02:00 Hiroshi Yamashita :
> > >
> > >> Hi,
> > >>
> > >> Top 19x19 program reaches 4200 BayesElo on CGOS. But 3100 in 9x9.
> > >> Maybe it is because people don't have much interest in 9x9.
> > >> But it seems value network does not work well in 9x9.
> > >> Weights_33_400 is maybe made by selfplay network. But it is 2946 in 9x9.
> > >> Weights_31_3200 is 4069 in 19x19 though.
> > >>
> > >> In year 2012, Zen played 6 games against 3 Japanese Pros, and lost by
> > 0-6.
> > >> And it seems Zen's 9x9 strength does not change big even now.
> > >> http://computer-go.org/pipermail/computer-go/2012-November/005556.html
> > >>
> > >> I feel there is still enough chance that human can beat best program in
> > >> 9x9.
> > >>
> > >> Thanks,
> > >> Hiroshi Yamashita
> > >>
> > >> ___
> > >> Computer-go mailing list
> > >> Computer-go@computer-go.org
> > >> http://computer-go.org/mailman/listinfo/computer-go
> > > inline file
> > >___
> > >Computer-go mailing list
> > >Computer-go@computer-go.org
> > >http://computer-go.org/mailman/listinfo/computer-go
> > --
> > Hideki Kato 
> > ___
> > Computer-go mailing list
> > Computer-go@computer-go.org
> > http://computer-go.org/mailman/listinfo/computer-go

> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go


-- 
Petr Baudis, Rossum
Run before you walk! Fly before you crawl! Keep moving forward!
If we fail, I'd rather fail really hugely.  -- Moist von Lipwig
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-02-23 Thread Hideki Kato
uurtamo .: :
>Slow down there, hombre.
>
>There's no secret sauce to 9x9 other than that it isn't the current focus
>of people.
>
>Just like 7x7 isn't immune.
>
>A computer program for 9x9, funded, backed by halfway serious people, and
>focused on the task, will *destroy* human opponents at any time it needs to.

Why do you think (or believe) so?  I'd like to say there 
is no evidence so far.

>If you believe that there is a special reason that 9x9 is harder than
>19x19, then I'm super interested to hear that. But it's not harder for
>computers. It's just not what people have been focusing on.

9x9 is not harder than 19x19 as a game.  However:  (1) Value 
networks, the key component in beating humans on 19x19, work 
well only on static positions, and 9x9 has almost no such 
positions.   (2) Humans play much better on 9x9 than on 
19x19.  Top-level professionals can read the game out near 
the end of the middle stage in less than 30 minutes, with 
one-point accuracy in the score, for example.

Humans are not good at global evaluation on larger boards, so 
bots can beat top professionals on 19x19, but this does not 
apply to 9x9.  The size of the board matters because 
value networks are not universal, i.e., they approximate the 
value function only imprecisely, mainly because the amount of 
training data is limited in practice (up to about 10^8 
positions, while the number of possible input positions is 
at least 10^20).  One more reason: there is no algorithm to 
solve double ko.  This is not such a big problem on 19x19, 
but it is on 9x9.
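As a back-of-envelope illustration of that coverage argument (the figures are
the rough estimates quoted above, not measurements, and generalization means
raw coverage is not the whole story):

```python
# Rough coverage check using the thread's own figures: a training set of
# ~10^8 positions against a state space of at least ~10^20 positions.
training_positions = 10 ** 8
possible_positions = 10 ** 20  # loose lower bound from the message above

coverage = training_positions / possible_positions
print(f"fraction of positions seen in training: {coverage:.0e}")  # → 1e-12
```

So even under this generous lower bound, the network sees at most about one
position in a trillion and must interpolate for the rest.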

Best, Hideki

-- 
Hideki Kato 

Re: [Computer-go] 9x9 is last frontier?

2018-02-23 Thread uurtamo .
Slow down there, hombre.

There's no secret sauce to 9x9 other than that it isn't the current focus
of people.

Just like 7x7 isn't immune.

A computer program for 9x9, funded, backed by halfway serious people, and
focused on the task, will *destroy* human opponents at any time it needs to.

If you believe that there is a special reason that 9x9 is harder than
19x19, then I'm super interested to hear that. But it's not harder for
computers. It's just not what people have been focusing on.

s.


Re: [Computer-go] 9x9 is last frontier?

2018-02-23 Thread Hideki Kato
That's not the point, Petri.  9x9 has almost no "silent" 
or "static" positions, which are where value networks 
surpass humans.  On 9x9 boards, kos, especially double kos 
and two-step kos, are important, but MCTS still handles 
them poorly, for example.  Human professionals are much 
better at life and death and at the complex local fights 
that dominate small-board games, because they can read 
deterministically, and deeper than current MCTS bots, at 
standard time settings (not blitz).  
It is also well known that MCTS is not good at finding 
narrow, deep paths to a win, due to its "averaging".  
Ohashi 6p said that he couldn't lose against statistical 
algorithms after the event in 2012.
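A toy sketch (my own illustration, not from the thread) of why mean-style
backups can undervalue a narrow winning line: move A wins only through one of
its ten continuations, while move B is uniformly drawish.  A max/minimax-style
backup, assuming we can steer into the winning line, prefers A; averaging
uniform samples prefers B.

```python
# Toy "narrow path" position, values from our point of view in [0, 1]:
# move A has ten continuations, only one of which wins; move B always draws.
a_children = [1.0] + [0.0] * 9   # one deep winning line, nine losses
b_children = [0.5] * 10          # safe draw everywhere

def mean(values):
    return sum(values) / len(values)

# MCTS-style averaging over uniform playouts undervalues A ...
print("averaging prefers:", "A" if mean(a_children) > mean(b_children) else "B")  # → B
# ... while a max backup (perfect steering into the narrow line) prefers A.
print("max backup prefers:", "A" if max(a_children) > max(b_children) else "B")   # → A
```

In practice MCTS with enough simulations does concentrate on the best child
and its value shifts toward the max, which is why the problem shows up mainly
in narrow, deep tactical lines under limited time.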

Best,
Hideki

-- 
Hideki Kato 

Re: [Computer-go] 9x9 is last frontier?

2018-02-23 Thread Petri Pitkanen
The Elo range in 9x9 is smaller than in 19x19.  One player just cannot be
hugely better than the other in such a limited game.

2018-02-23 21:15 GMT+02:00 Hiroshi Yamashita :

> Hi,
>
> The top 19x19 program reaches 4200 BayesElo on CGOS, but only 3100 in 9x9.
> Maybe that is because people don't have much interest in 9x9,
> but it seems the value network does not work well in 9x9.
> Weights_33_400 is probably made by a self-play network, but it is 2946 in
> 9x9, while Weights_31_3200 is 4069 in 19x19.
>
> In 2012, Zen played 6 games against 3 Japanese pros and lost 0-6.
> And it seems Zen's 9x9 strength has not changed much even now.
> http://computer-go.org/pipermail/computer-go/2012-November/005556.html
>
> I feel there is still a fair chance that a human can beat the best program
> in 9x9.
>
> Thanks,
> Hiroshi Yamashita
>