Re: [Computer-go] AMAF/RAVE + heavy playouts - is it save?

2015-11-05 Thread Robert Finking

  
  
You are welcome. Figure 1 in [2] is the diagram I was thinking of.

On 03-Nov-15 20:39, Tobias Pfeiffer wrote:

> This helps very much, thank you for taking the time to answer!
>
> You might be looking for "Combining Online and Offline Knowledge in
> UCT" [1] by Gelly and Silver. Silver and Tesauro reference it in
> "Monte-Carlo Simulation Balancing" [2] with "Unfortunately, a stronger
> simulation policy can actually lead to a weaker Monte-Carlo search
> (Gelly & Silver, 2007), a paradox that we explore further in this
> paper."
>
> I'll make it a priority to read both papers in detail, thank you! If
> you meant another paper, or someone else knows one, I'm happy to see
> more references.
>
> Thanks!
> Tobi
>
> [1] http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf
> [2] http://www.machinelearning.org/archive/icml2009/papers/500.pdf
  
  
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] AMAF/RAVE + heavy playouts - is it save?

2015-11-04 Thread Urban Hafner
To make matters more difficult I assume that this also depends on the exact
node evaluation you’re using. There’s UCT + RAVE, then there’s just RAVE
(as used by Michi). And then you can add other things in there as well like
criticality (like Pachi, and at least at one point CrazyStone).
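
To make the difference between these evaluation schemes concrete, here is a minimal sketch of how a RAVE/AMAF estimate can be blended with the ordinary Monte-Carlo estimate of a node. The function names, the 0.5 fallback for unvisited nodes, the square-root schedule and the constant k are all assumptions chosen for illustration; real programs tune or replace each of them.

```python
import math

def rave_beta(visits, k=1000.0):
    """Weight given to the AMAF estimate (assumed schedule, assumed k).

    Equals 1.0 with no real visits and decays towards 0 as visits grow.
    """
    return math.sqrt(k / (3.0 * visits + k))

def blended_value(mc_wins, mc_visits, amaf_wins, amaf_visits, k=1000.0):
    """Mix the Monte-Carlo win rate with the AMAF win rate of a move."""
    # Fall back to 0.5 when there is no data yet (an assumption, not a rule).
    mc = mc_wins / mc_visits if mc_visits else 0.5
    amaf = amaf_wins / amaf_visits if amaf_visits else 0.5
    beta = rave_beta(mc_visits, k)
    return beta * amaf + (1.0 - beta) * mc

if __name__ == "__main__":
    # Few real visits: the AMAF estimate dominates.
    print(blended_value(1, 2, 30, 40))
    # Many real visits: the Monte-Carlo estimate dominates.
    print(blended_value(6000, 10000, 30, 40))
```

A "just RAVE" scheme would roughly correspond to dropping any separate exploration bonus and selecting on this blended value alone, while "UCT + RAVE" adds a UCB-style exploration term on top.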

I personally saw a definite strength increase when adding RAVE to the
exploration strategy with heavy playouts, but then my bot isn’t that strong
yet, so it may be different for you. As always, there’s no replacement for
benchmarks.

Urban


-- 
Blog: http://bettong.net/
Twitter: https://twitter.com/ujh
Homepage: http://www.urbanhafner.com/

Re: [Computer-go] AMAF/RAVE + heavy playouts - is it save?

2015-11-04 Thread Stefan Kaitschick
The name "Monte Carlo" strongly suggests that randomness is at the
core of the method. And randomness does play a role.
But what really happened in the shift to MC was that bots no longer tried to
evaluate intermediate positions. Instead, all game knowledge was
put into selecting candidate moves. It turns out that, for bots, it's much
easier to suggest promising moves than to say who is ahead in an ongoing
game.
The tree of possible Go games is so vast that trying to explore it with
pure randomness fails, even with statistical feedback. It's already a minor
miracle that it works as well as it does with good move generators.

[Computer-go] AMAF/RAVE + heavy playouts - is it save?

2015-11-03 Thread Tobias Pfeiffer
Hi everyone,

I haven't yet caught up on the most recent Go papers. If what I ask is
answered in one of these, please point me there.

It seems everyone is using quite heavy playouts these days (nxn
patterns, atari escapes, opening libraries, lots of stuff that I don't
know yet, ...) - my question is how does that mix with AMAF/RAVE? I
remember from the early papers that they said it'd be dangerous to do
it with non-random playouts and that they shouldn't have too much logic.

Which, well, makes sense (to me) because the argument is that we play
random moves so they are order independent. With patterns that doesn't
hold true anymore.
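
For readers unfamiliar with the order-independence assumption, a minimal sketch of an AMAF ("all moves as first") update follows. It credits every point that the side to move was first to occupy during the playout, as if that move had been played immediately. The function name, the string coordinates and the [wins, visits] layout are hypothetical choices for illustration, not taken from any particular program.

```python
from collections import defaultdict

def amaf_update(stats, playout_moves, winner, to_move):
    """Update AMAF statistics for one finished playout.

    stats maps move -> [wins, visits].
    playout_moves is a list of (colour, move) pairs in playing order.
    """
    seen = set()
    for colour, move in playout_moves:
        if move in seen:
            continue  # only the first play on a point counts, for either colour
        seen.add(move)
        if colour == to_move:
            # Treat the move as if it had been played first: this is exactly
            # where the order-independence assumption enters.
            stats[move][1] += 1
            stats[move][0] += int(winner == to_move)

if __name__ == "__main__":
    stats = defaultdict(lambda: [0, 0])
    playout = [("B", "D4"), ("W", "Q16"), ("B", "C3"), ("W", "D4"), ("B", "Q16")]
    amaf_update(stats, playout, winner="B", to_move="B")
    print(dict(stats))
```

With uniformly random playouts, a move is about as likely to appear early as late, so lumping all occurrences together is a fairly neutral approximation. With pattern-driven playouts, a move often appears only as a response to a specific preceding move, which is why the statistics can become misleading.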

What's the experience out there? Does it just still work? Does it not
matter because you just "warm up" the tree? Or do you need to be careful
with what heuristics you apply so as not to break RAVE/AMAF?

Thank you!
Tobi

-- 
www.pragtob.info





Re: [Computer-go] AMAF/RAVE + heavy playouts - is it save?

2015-11-03 Thread David Fotland
Many Faces of Go doesn’t use Remi’s playout policy and I don’t think Zen does 
either.  I don’t think Remi’s and Mogo’s are similar either, since they were in 
some ways competing developments.  The bias issue is very real, so as you add 
knowledge to the playouts you have to be careful to add (for example) both 
attack and defense moves in a situation.

 

David

 


Re: [Computer-go] AMAF/RAVE + heavy playouts - is it save?

2015-11-03 Thread robertfinkng...@o2.co.uk

  
  
You have to be careful what heuristics you apply. This was a
  surprising result: using a playout policy which in itself is a
  stronger go player can actually make MCTS/AMAF weaker. The reason
  is that MCTS depends entirely on accurate estimations of the value
  of each position in the tree. Any playout policy which introduces
  a bias therefore weakens MCTS. It may increase precision (lower
  standard deviation) but gives a less accurate assessment of the
  value (an incorrect mean). Most playouts at the moment (at least
  published ones) are based on Remi's Mogo playout policy, which
  increases precision without sacrificing accuracy.
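
The precision-versus-accuracy point can be illustrated with a toy simulation. This is only a sketch under loud assumptions: playout results are modelled as Gaussian noise around a true value, the two "true win rates" and the bias of -0.10 are invented numbers, and nothing here plays Go. It compares a noisy but unbiased policy with a precise policy that systematically undervalues the better move.

```python
import random
import statistics

def mean_of_playouts(true_value, bias, noise_sd, n, rng):
    """Mean of n simulated playout outcomes centred on true_value + bias.

    A toy stand-in for real playouts (assumption: Gaussian outcomes).
    """
    return statistics.fmean(rng.gauss(true_value + bias, noise_sd) for _ in range(n))

def pick_rate(bias_a, noise_sd, n_playouts, trials=200, seed=1):
    """Fraction of trials in which the truly better move A beats move B."""
    rng = random.Random(seed)
    value_a, value_b = 0.55, 0.50  # hypothetical true values; A is better
    correct = 0
    for _ in range(trials):
        est_a = mean_of_playouts(value_a, bias_a, noise_sd, n_playouts, rng)
        est_b = mean_of_playouts(value_b, 0.0, noise_sd, n_playouts, rng)
        correct += est_a > est_b
    return correct / trials

if __name__ == "__main__":
    for n in (10, 100, 1000):
        light = pick_rate(bias_a=0.0, noise_sd=0.5, n_playouts=n)    # noisy, unbiased
        heavy = pick_rate(bias_a=-0.10, noise_sd=0.2, n_playouts=n)  # precise, biased
        print(f"playouts={n:5d}  unbiased={light:.2f}  biased={heavy:.2f}")
```

In this toy model the unbiased policy picks the better move more often as playouts increase, while the biased policy converges ever more confidently on the wrong move, since extra playouts reduce the noise but not the bias.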
  
  There's a really nice diagram in one of David Silver's papers
  illustrating the effect that bias can have on playouts. As soon as
  you see it you understand the problem. Unfortunately I don't have
  it to hand and have run out of time looking for it, otherwise I'd
  reference it. Hopefully somebody else can give the reference. I
  suspect David probably co-authored the paper, in which case
  apologies to the other author for not crediting them here!
  
  I hope this helps
  
  Regards
  
  Raffles


Re: [Computer-go] AMAF/RAVE + heavy playouts - is it save?

2015-11-03 Thread Tobias Pfeiffer
This helps very much, thank you for taking the time to answer!

You might be looking for "Combining Online and Offline Knowledge in
UCT" [1] by Gelly and Silver. Silver and Tesauro reference it in
"Monte-Carlo Simulation Balancing" [2] with "Unfortunately, a stronger
simulation policy can actually lead to a weaker Monte-Carlo search
(Gelly & Silver, 2007), a paradox that we explore further in this
paper."

I'll make it a priority to read both papers in detail, thank you! If you
meant another paper, or someone else knows one, I'm happy to see more
references.

Thanks!
Tobi


[1] http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf
[2] http://www.machinelearning.org/archive/icml2009/papers/500.pdf



-- 
www.pragtob.info
