Re: [Computer-go] Training the value network (a possibly more efficient approach)

2017-01-10 Thread Brian Sheppard
I was writing code along those lines when AlphaGo debuted. When it became clear
that AlphaGo had succeeded, I ceased work.

So I don’t know whether this strategy will succeed, but the theoretical merits 
were good enough to encourage me.

Best of luck,

Brian

From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Bo 
Peng
Sent: Tuesday, January 10, 2017 5:25 PM
To: computer-go@computer-go.org
Subject: [Computer-go] Training the value network (a possibly more efficient 
approach)

Hi everyone. It occurs to me that there might be a more efficient method to
train the value network directly (without using the policy network).

You are welcome to check my method: http://withablink.com/GoValueFunction.pdf

Let me know if there are any silly mistakes :)

Re: [Computer-go] Training the value network (a possibly more efficient approach)

2017-01-10 Thread John Tromp
Hi Bo,

> Let me know if there are any silly mistakes :)

You say "the perfect policy network can be derived from the perfect value
network (the best next move is the move that maximises the value for the
player, if the value function is perfect), but not vice versa." But a perfect
policy for both players can be used to generate a perfect playout, which
yields the perfect value...
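
To make the two directions concrete, here is a minimal sketch in Python.
The GameState API (legal_moves, play, is_terminal, outcome) is hypothetical,
and value(s) is assumed to return the game value from the point of view of
the side to move in s; none of these names come from the paper.

def policy_from_value(state, value):
    # Perfect value -> perfect policy: choose the move that leads to the
    # position with the worst value for the opponent (one-ply negamax).
    return max(state.legal_moves(),
               key=lambda m: -value(state.play(m)))

def value_from_policy(state, policy):
    # Perfect policy -> perfect value: play the game out with the policy
    # for both players and return the final outcome.
    while not state.is_terminal():
        state = state.play(policy(state))
    return state.outcome()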

regards,
-John

[Computer-go] Training the value network (a possibly more efficient approach)

2017-01-10 Thread Bo Peng
Hi everyone. It occurs to me that there might be a more efficient method to
train the value network directly (without using the policy network).

You are welcome to check my method: http://withablink.com/GoValueFunction.pdf


Let me know if there are any silly mistakes :)
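
For readers without the PDF at hand: "training the value network directly"
in the generic sense means regressing board positions onto final game
outcomes. The sketch below is NOT the method in the paper, only that generic
baseline; it assumes PyTorch, and the input plane count and network shape
are made up.

import torch
import torch.nn as nn

class ValueNet(nn.Module):
    # Position (planes x 19 x 19) -> scalar value in [-1, 1].
    def __init__(self, planes=8, filters=64):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(planes, filters, 3, padding=1), nn.ReLU(),
            nn.Conv2d(filters, filters, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(filters * 19 * 19, 256), nn.ReLU(),
            nn.Linear(256, 1), nn.Tanh(),
        )

    def forward(self, x):
        return self.head(self.trunk(x))

net = ValueNet()
opt = torch.optim.SGD(net.parameters(), lr=0.01, momentum=0.9)
loss_fn = nn.MSELoss()

def train_step(positions, outcomes):
    # positions: (N, planes, 19, 19) float tensor; outcomes: (N, 1) in {-1, +1}.
    opt.zero_grad()
    loss = loss_fn(net(positions), outcomes)
    loss.backward()
    opt.step()
    return loss.item()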

Re: [Computer-go] Golois5 is KGS 4d

2017-01-10 Thread Detlef Schmicker
Very interesting,

but let's wait a few days to get an idea of its strength. It reached 4d
through games against AyaBotD3, and now it is 3d again...


Detlef

On 10.01.2017 at 15:29, Gian-Carlo Pascutto wrote:
> On 10-01-17 15:05, Hiroshi Yamashita wrote:
>> Hi,
>>
>> Golois5 is KGS 4d.
>> I think it is the first bot to reach 4d using a DCNN without search.
> 
> I found this paper:
> 
> https://openreview.net/pdf?id=Bk67W4Yxl
> 
> They are using residual layers in the DCNN.
> 

Re: [Computer-go] Golois5 is KGS 4d

2017-01-10 Thread Gian-Carlo Pascutto
On 10-01-17 15:05, Hiroshi Yamashita wrote:
> Hi,
> 
> Golois5 is KGS 4d.
> I think it is the first bot to reach 4d using a DCNN without search.

I found this paper:

https://openreview.net/pdf?id=Bk67W4Yxl

They are using residual layers in the DCNN.
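
For reference, a residual layer of this kind looks roughly like the sketch
below in PyTorch; the filter count is illustrative and not taken from the
paper.

import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    # Two 3x3 convolutions plus a skip connection that adds the block's
    # input back before the final ReLU.
    def __init__(self, filters=64):
        super().__init__()
        self.conv1 = nn.Conv2d(filters, filters, 3, padding=1)
        self.conv2 = nn.Conv2d(filters, filters, 3, padding=1)

    def forward(self, x):
        y = F.relu(self.conv1(x))
        y = self.conv2(y)
        return F.relu(x + y)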

-- 
GCP

[Computer-go] Golois5 is KGS 4d

2017-01-10 Thread Hiroshi Yamashita

Hi,

Golois5 is KGS 4d.
I think it is the first bot to reach 4d using a DCNN without search.

Golois5 (4d) info says
"I use a policy network trained on Gogod games."

Golois4 (3d) info says
"I use a policy network trained on KGS games played by 6 dan or more."

If both networks are the same, does that mean GoGoD games are better than KGS games for training a DCNN?

Thanks,
Hiroshi Yamashita
