Re: [computer-go] Why not forums?

2007-02-07 Thread Raymond Wold

Dmitry Kamenetsky wrote:


> I have been reading this list for nearly a year now and it is very discouraging
> to receive so much criticism for my first post.
>
> The Yahoo groups was merely an example to show how easy it is to get a forum
> started. I also agree that Yahoo appends too much spam to its forums and I am
> sure there are many far better free forums out there.
>
> The forums that I really like are the TopCoder forums
> (http://forums.topcoder.com/). I like them for these reasons:
>
> * One can post in various sections. The sections we can have here could be:
> Monte Carlo Go, Search in Go, Learning in Go, CGOS, KGS, Human Go.
> * Threads are easy to find and each thread has a post count. The post count is
> a good indication of how interesting that thread is. For example if there are
> many threads that I haven't had time to read, then I will first read the ones
> with the highest post count.
> * Different viewing options: flat (newest first), threaded or tree. These can
> be useful for various purposes.
> * Each post has a '+' and '-' associated with it. This means that if you agree
> with the post then you simply press the '+' button and the plus count goes up;
> similarly, if you disagree you press the '-' button. This serves two purposes:
> you don't have to post extra posts just to show your agreement/disagreement,
> which saves space and your time; also this is a great way to hold votes - those
> in favour press '+', those against press '-'.
> * Each post is associated with a date and time. Also it is easy to resurrect
> threads that are years old.
> * If you had a typo or a mistake in your post, you can easily edit it. This is
> extremely useful.
> * It is not necessary, but it is always nice to see who you are talking to.
> * There is a very powerful message searching engine, which incorporates:
> section type, date range and member name.
> * You can watch threads that are of interest to you.
>
> I hope I have given some good reasons for having a forum. Since so many people
> here are against losing the list, why not the following: we keep the list, but
> give members the option of using a forum? This way we can all be happy :)
 

The reason you are receiving so much criticism is a fundamental 
misunderstanding on your part. Having a mailing list does not exclude 
any of the things you mention, but rather grants exactly the freedom you 
point out in your last sentence. Most of the features you mention I 
already have in my client, and the ones I do not have, I lack mainly 
because I do not want them. If you want more features, the solution is 
not to change the back end; it is to change your personal front end. Get 
a better client. Don't enforce your preferences on everyone.


Furthermore, it is a common attitude that web forums have a much lower 
quality of discussion than mailing lists and newsgroups. Some of this 
probably comes from the ease of setting up a web forum compared to a 
mailing list or, worse, a newsgroup, so the average forum will have less 
competent admins, moderators, and in the end, users. But some of it can 
definitely be ascribed to some of the points you mention in favor of 
forums; the ability to edit posts and the ability to simply agree or 
disagree without adding any content, for instance, have an effect on 
what kind of debate is held, and I do not think it is an overall 
positive one.

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: Re[4]: [computer-go] Why not forums?

2007-02-07 Thread Urban Hafner


On Feb 7, 2007, at 5:20 , Dmitry Kamenetsky wrote:


> I hope I have given some good reasons for having a forum. Since so
> many people here are against losing the list, why not the following:
> we keep the list, but give members the option of using a forum? This
> way we can all be happy :)


Like I said, we could get both if we have a two way gateway. This way
every post from the mailing list ends up in the forum and every forum
post ends up on the mailing list.

If everyone (especially our dear list moderator) is OK with it, I'd try
to set it up.

Urban
--
http://bettong.net | Urban's Blog
http://computer-go.bettong.net | Planet Computer-Go





Re: [computer-go] Monte Carlo (MC) vs Quasi-Monte Carlo (QMC)

2007-02-07 Thread Olivier Teytaud
As I have spent a lot of time trying to guess what could be done with 
Quasi-Monte-Carlo or other standard Monte-Carlo improvements in 
computer go, I write below my (humble and pessimistic :-) ) opinion 
about that.
Let's formalize Monte-Carlo.
Consider a probability distribution P, and suppose you want to estimate 
Pf, the expectation of f under distribution P. You can get values of f 
at some points; write hP ($\hat P$ is the usual notation) for the 
empirical distribution, i.e.

 Pf  = unknown expectation of f for distribution P
 hPf = (\sum_{i=1}^n f(x_i))/n

In standard naive Monte-Carlo, the x_i are independent and identically 
distributed according to P. One can sample without replacement in the 
discrete case (the samples are then no longer independent, but 
almost :-) ).

Biasing Monte-Carlo (importance sampling) consists in replacing P by 
some P' which samples more often in the more important areas. A 
standard choice is a P' with density proportional to the product of
  (i) the density of P, and
  (ii) the local standard deviation of f (which is unknown and must be 
approximated).

This does not introduce a bias (on average, hPf remains close to Pf), 
because we change the weighting of the points; hPf is now:

 hPf = (\sum_{i=1}^n P(x_i) f(x_i) / P'(x_i))/n

and the x_i are i.i.d. according to P' in the simplest case (other, 
better schemes are possible, e.g. sampling in strata).
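As a concrete illustration (not from the post), here is a minimal Python sketch of the plain and importance-sampling estimators above; the toy distributions and the function f are my own, chosen only to show the reweighting term P(x)/P'(x):

```python
import random

def mc_estimate(f, sample_p, n):
    # plain Monte-Carlo: hPf = (1/n) * sum f(x_i), with x_i ~ P
    return sum(f(sample_p()) for _ in range(n)) / n

def is_estimate(f, p_density, q_density, sample_q, n):
    # importance sampling: hPf = (1/n) * sum P(x_i)/P'(x_i) * f(x_i), x_i ~ P'
    total = 0.0
    for _ in range(n):
        x = sample_q()
        total += p_density(x) / q_density(x) * f(x)
    return total / n

# toy example: P uniform on {0,...,9}, P' biased towards large x, f(x) = x^2
xs = list(range(10))
weights = [(x + 1) / 55 for x in xs]            # P'(x) proportional to x+1
f = lambda x: x * x
p_density = lambda x: 1 / 10
q_density = lambda x: (x + 1) / 55
sample_p = lambda: random.choice(xs)
sample_q = lambda: random.choices(xs, weights=weights)[0]
```

Both estimators converge to the same expectation (here 28.5); the reweighting is exactly what keeps hPf unbiased even though the points are drawn from P'.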

Let's consider how this can be applied to Monte-Carlo-Go.

Designing a good random player is not importance sampling as above. 
There's no reweighting. We change what is estimated; we do not want to 
know the probability of winning for a uniform random player; we just 
want a better player, so that the probabilities estimated by this 
player will lead to a better MC-based player. Improving the random 
player is very important, but it is not importance sampling.


Why not add importance sampling? Because it requires a good estimate of 
the important areas, in the sense of, e.g., areas with high standard 
deviation. This is, in my humble opinion, nearly as hard as designing a 
good Go evaluation function. Of course, I might be missing something; 
but I think that if you can quickly find areas with significantly 
higher or lower probabilities of winning, then you have improved your 
random player, and you don't want to reweight anything, and you are not 
interested in the probability of winning for the basic random player - 
you want to move to the new random player.
This is only improving the random player; you change the distribution 
because you guess that it will be a better random player (and you hope 
that your MC-based player will be better).


Quasi-Monte-Carlo methods consist in replacing hP by a sum over a 
deterministic or moderately randomized sample, so that the convergence 
of hPf to Pf is faster than the O(1/\sqrt{n}) usual in MC methods. This 
is somewhat orthogonal to importance sampling (though the two 
techniques are difficult to merge into one, by the way).
In standard QMC the distribution is not modified, but the points are 
not chosen by independent identically distributed sampling (by the way, 
this is also not the case with most importance sampling - but in QMC 
the focus is on reducing the random part).


(It is now often considered that QMC should be only partially 
derandomized, as randomized QMC (as opposed to strictly deterministic 
QMC) is better, especially in large dimension; but never mind, let's 
consider a QMC method independently of how it is designed; we can just 
pick a QMC sequence from the state of the art, without (at first) 
trying to understand how it works.)

QMC is very efficient in continuous domains, and sometimes in specific 
discrete domains as well, but I don't see how to plug QMC into tree 
search, or even just into sampling a goban at a particular node. It 
looks appealing, but you need something like a distance between moves 
that is related to the probability of winning, and even if you have 
such a distance, designing the QMC sequence is far from 
straightforward.

However, a simple form of derandomization, so-called jittered sampling, 
can perhaps be applied:

  instead of randomly sampling the goban,
  split (partition) the goban into k zones, and pick the i^{th} point 
in zone 1+(i%k) (% means modulo).
  (This is for a partition into zones of equal area, but it can be 
adapted to unequal areas.)
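A minimal sketch of this jittered sampling on a 9x9 board; the partition into nine 3x3 zones is my own toy choice, not something from the post:

```python
import random

BOARD = 9

def zone(point):
    # map a point (row, col) to one of nine 3x3 zones, numbered 0..8
    r, c = point
    return (r // 3) * 3 + (c // 3)

def jittered_sample(n):
    # pick the i-th point uniformly inside zone i % 9, so the k=9 zones
    # are visited in rotation instead of purely at random
    zones = [[] for _ in range(9)]
    for r in range(BOARD):
        for c in range(BOARD):
            zones[zone((r, c))].append((r, c))
    return [random.choice(zones[i % 9]) for i in range(n)]
```

With n = 90 samples every zone is hit exactly 10 times, whereas plain uniform sampling would leave some zones over- or under-represented.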

Perhaps it is a good idea, but I don't have enough Go knowledge to 
design such zones.

Another possible tool for improving Monte-Carlo comparisons is 
introducing correlations between the explorations of various nodes. 
This possibility, consisting in correlating the choices made in nodes B 
and C, should improve the stability of the comparison between B and C, 
and therefore the choice performed in node A if A has sons B and C. 
However, it introduces a strong variance increase at node A if A is the 
father of B and C (much

Re: [computer-go] Why not forums?

2007-02-07 Thread Stefan Nobis
Dmitry Kamenetsky [EMAIL PROTECTED] writes:

> The forums that I really like are the TopCoder forums
> (http://forums.topcoder.com/). I like them for these reasons:

And really all of these reasons are features of good e-mail clients 
(except the +/- voting). So with the right client you get all of this 
out of the box, plus much more flexibility (mail is more standardized 
than web forums, so you have many more options: converters, gateways, 
IMAP, clients, ...).

So this is all about flexibility -- nearly all web forum software is 
really a jail with no easy alternative access, but with mailing lists 
you have plenty of alternatives (including your own web server with 
your forum software, into which you can inject all these mails).

-- 
Until the next mail...,
Stefan.


Re: [computer-go] Why not forums?

2007-02-07 Thread Benjamin Teuber

Hi,

> So this is all about flexibility -- nearly all web forum software are
> really jails with no easy alternative access, but with mailing lists
> you have plenty of alternatives (including your own web server with
> your forum software into which you can inject all these mails).
I agree. Setting up a separate forum would only harm the community - so 
if people want a (quasi-)forum, it should just be an interface to the 
normal list.


Cheers,
Benjamin


Re: [computer-go] Monte Carlo (MC) vs Quasi-Monte Carlo (QMC)

2007-02-07 Thread Tapani Raiko
> I could see a case where it is possible to reduce the variance of a single
> variable even in the 0-1 case. Let us say that black has about 5% chances of
> winning. If we could (exactly) double the chances of black winning by
> changing the nonuniform sampling somehow (say, enforce bad moves by white),
> we could sample from that and divide the estimated black's winning chance in
> the end by 2. This would of course be very difficult in practice. (A binary
> random variable gives more information when the chances are closer to
> 50-50.) This could be useful in practice in handicap games, by for instance
> enforcing a black pass with 1% chance every move. Sampling would be
> distorted towards white win, which is realistic since white is assumed to be
> a stronger player, anyway.

> I don't understand this line of reasoning.

Let me try again using the handicap example. Let's say the MC player is 
given a huge handicap. In the simulations it is winning all of its 
games, so there is no information helping to select the next move. In 
information-theoretic terms, each play-out gives one bit of information 
if the chances are 50:50, but if the chances are unbalanced, the 
information content is lower (see 
http://en.wikipedia.org/wiki/Binary_entropy_function ). In the extreme 
case, there is no information at all.
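The binary entropy function referred to above is easy to compute; a small sketch of the information (in bits) carried by one play-out as a function of the winning chance p:

```python
import math

def binary_entropy(p):
    # H(p) = -p*log2(p) - (1-p)*log2(1-p); 1 bit at p = 0.5, 0 bits at p = 0 or 1
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)
```

At p = 0.5 each play-out carries a full bit; at p = 0.05 (roughly the handicap example) it carries only about 0.29 bits, and in the extreme cases nothing at all.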

Now, let us use distorted MC where we enforce black to pass with a few 
percent chance every move. White begins to win some of the simulations, so 
MC is useful again.

How is this related to reducing the variance?

Let us say that a black move leads to a white win with probability p 
very close to zero. Let us also assume that distorting the simulations 
doubles white's chances to 2p.
Using normal MC, the variance of our estimate of p from N samples is
p*(1-p)/N
and using distorted MC, the variance of the estimate of 2p is
2p*(1-2p)/N.
Estimating p as half the estimate of 2p divides the variance by 4:
p*(1-2p)/2N, which is less than p*(1-p)/N.

In practice, we cannot know that distorting would increase the chances 
exactly by doubling them, but if we use the same distortion to estimate 
all moves, we can still compare them.
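Plugging in hypothetical numbers (the values of p and N below are mine, only to check the algebra above):

```python
# hypothetical values: p = white's true winning probability, N = sample count
p, N = 0.01, 10000

var_plain = p * (1 - p) / N          # variance of the plain MC estimate of p
var_2p = 2 * p * (1 - 2 * p) / N     # variance of the estimate of 2p under distortion
var_distorted = var_2p / 4           # halving the estimate divides the variance by 4

assert abs(var_distorted - p * (1 - 2 * p) / (2 * N)) < 1e-15
assert var_distorted < var_plain     # roughly a factor-of-two reduction for small p
```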

Tapani


[computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo (QMC))

2007-02-07 Thread Heikki Levanto
On Wed, Feb 07, 2007 at 12:06:40PM +0200, Tapani Raiko wrote:
> Let me try again using the handicap example. Let's say MC player is given
> a huge handicap. In the simulations, it is winning all of its games, so
> there is no information helping to select the next move.

This situation happens in normal games too, once one player is so much
ahead that it wins almost no matter what. It leads into really
stupid-looking endgames, where live groups are allowed to die, and dead
ones are allowed to be rescued.

All this could be avoided by a simple rule: Instead of using +1 and -1
as the results, use +1000 and -1000, and add the final score to this.

The purpose of the large constant (1000) is to make sure that the 
program prefers any win to any loss (so that large_win + small_loss < 
small_win + small_win). One could even add another term to the result, 
favouring games that end early (for the winner) or drag on (for the 
loser), in the hope of allowing the opponent more chances to make 
mistakes.

As far as I can see, this ought to fit straight into any MC or UCT
program. It may not improve the winning chances, but it sure should make
the program's play look more reasonable.
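A sketch of the proposed result function (the helper name and the example margins are mine):

```python
def playout_result(winner_is_black, black_margin):
    # +1000 / -1000 for a black win / loss, plus the final score margin,
    # so that any win still outranks any loss
    return (1000 if winner_is_black else -1000) + black_margin

# two small wins outweigh one large win plus one small loss:
two_small_wins = playout_result(True, 0.5) + playout_result(True, 0.5)
large_win_and_loss = playout_result(True, 80.5) + playout_result(False, -0.5)
```

Averaged over play-outs this keeps the win/loss signal dominant, while the margin term only breaks ties between equally winning moves.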


Just my humble idea. Feel free to shoot it down (with serious 
arguments), and/or use it wherever you like. I would like to hear 
whether it makes any practical difference, if anyone tries it.

   - Heikki



-- 
Heikki Levanto   In Murphy We Turst heikki (at) lsd (dot) dk



Re: [computer-go] MC Go Effectiveness

2007-02-07 Thread Jacques Basaldúa

Matt Gokey wrote:


But what are some of the reasons MC is not even better?
1-Since MC engines don't deal with tactics directly, they're not likely
going to play tactical sequences well for low liberty strings, securing
eye space, cutting and connecting, ko fights, or ladders, etc.
2-Also because most of the play-outs are usually nonsense, they may
have trouble dealing with meaningful nuances because the positions that
will lead to these distinctions just don't arise with enough statistical
frequency in the play-outs to affect the result.  Yet when very
selective moves are used in the play-outs, too many possibilities can be
missed.
3-Finally, with 19x19 anyway, the size of the board and game tree
probably limits the practical effectiveness of the sampling and move
ordering. I don't try to address this last point any further in this
message.


Very good analysis and I would like to contribute a 4th reason:

As Luke Gustafson pointed out, we have to contemplate the simulation
as a _stochastic process_. We want to determine the conditional 
probability of a win given that a particular move is made. And that 
depends on the _length of the simulation_. Dramatically! This is a 
reason against the scalability of global-search MC/UCT. If the 
simulation is 500 moves long (Chinese rules with recaptures, etc.) the 
observed variance at an early move blurs out everything.

Just a simple stochastic process: count a dollar each time you
correctly predict a p=1/2 coin thrown n=500 times. The expected
average is (obviously) 250, but the variance of that count is 
n·p·(1-p) = 125, i.e. proportional to n.
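This coin-flip variance is quick to check numerically (the simulation parameters below are mine, not from the post):

```python
import random

n, p = 500, 0.5
mean, var = n * p, n * p * (1 - p)   # Binomial(n, p): mean n*p, variance n*p*(1-p)

# empirical check over many repetitions of the 500-flip experiment
random.seed(1)
trials = [sum(random.random() < p for _ in range(n)) for _ in range(2000)]
emp_mean = sum(trials) / len(trials)
emp_var = sum((t - emp_mean) ** 2 for t in trials) / len(trials)
```

The standard deviation, about 11 points, is the noise that a single long play-out carries back to an early move.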


Jacques.




Re: Re[4]: [computer-go] Why not forums?

2007-02-07 Thread steve uurtamo
> I have been reading this list for nearly a year now and it is very
> discouraging to receive so much criticism for my first post.

it means that you've touched upon an important issue to the readers of the
list.  this, in and of itself, means that you are getting valuable information 
from
the list readers about what's important to them.  keep in mind that any 
criticism
that you received wasn't criticism of you, just of the idea of a forum.  it 
could be
that many of the list members are a bunch of barnacles who prefer to receive 
most
of their information through a single interface.  i'd certainly count myself in 
the barnacle
crowd -- i think that the information in this list doesn't need much 'jazzing 
up', and that
an arbitrary mail client gives me excellent capacity to participate in the list.

low-ascii serves as a great carrier of information for me, and keeps the 
distraction
level down.  without the easy ability to communicate with other list members 
using
shockwave flash or the urgent need to create 5.1 HD movies of computers battling
it out mano-a-mano, i find myself more rapidly focused on the topic of finding 
good
go playing algorithms.

with that having been said, let me try to address some of the ways in which
email lists can be finessed to service many of the needs that you bring up.
it could be that with a decent mail or news client, all of your worries will be
like yesterday's mashed potatoes.

> * One can post in various sections. The sections we can have here could be:
> Monte Carlo Go, Search in Go, Learning in Go, CGOS, KGS, Human Go.

Subject:  lines work for this purpose, and additionally allow for the easy 
creation
of new sections.

> * Threads are easy to find and each thread has a post count. The post count is
> a good indication of how interesting that thread is. For example if there are
> many threads that I haven't had time to read, then I will first read the ones
> with the most post count.

You might want to use a mail client that allows you to thread messages by From:
and by Subject:.  Many usenet news readers do this automagically, but 
autosorting
your inbound mail by sender and subsorting by subject line will get you 90% of 
the
way there.

> * Different viewing options: flat (newest first), threaded or tree. These can
> be useful for various purposes.

Again, most usenet news clients and many mail readers will allow you to do this 
--
you can treat the threads as a way to diminish the importance of old topics, or
expand them and subsort in order to find exactly what you're looking for, 
quickly.

> * Each post has a '+' and '-' associated with it. This means that if you agree
> with the post then you simply press the '+' button and the plus count goes up,
> similarly if you disagree you press the '-' button. This serves two purposes:
> you don't have to post extra posts just to show your agreement/disagreement,
> which saves space and your time; also this is a great way to make votes -
> those in favour press '+', those against press '-'.

this is indeed an advantage.

> * Each post is associated with a date and time. Also it is easy to resurrect
> threads that are years old.

the 'From ' envelope line of all mail messages contains this information, and
many mail clients will allow you to see it, sort your messages using it,
selectively subsort messages inside threads by date, etc., etc.  the Date:
field is another one that many mail and news clients will let you thread,
sort and subsort with.  the world is your oyster.  an IMAP mail client is one
way to keep track of very old messages in a convenient way, and usually has
threading and sorting capabilities built-in as a necessary function for most
users.

> * If you had a typo or a mistake in your post, you can easily edit it. This is
> extremely useful.

here i might disagree.  i can see the advantage of being able to fix something 
that
you later realize was wrong, but being required to take the extra effort to get 
it
right in the first place can lead to more thoughtful posting.  it doesn't 
mandate it by
any means, but it certainly seems to encourage it.  it's a kind of 
self-moderation
imposed by the interface.

> * It is not necessary, but it is always nice to see who you are talking to.

a pasty, potato chip smeared face greets you with lips stretched into a wide,
badly socialized grin and eyes that stare blankly into dimly-lit monitor space
as unbrushed, poorly-aligned teeth wedged uncomfortably against a crimson
gumline provide the main visual contrast to the field of empty soda bottles and
cans that litter the backdrop of a room desperately in need of a good vacuuming.

that's the horrorshow you're looking forward to, my friend.

> * There is a very powerful message searching engine, which incorporates:
> section type, date range and member name.

a surprising number of mail and usenet news clients have this same
functionality.

> * You can watch threads that are of interest to you.

i'm sorry to 

Re: [computer-go] MC Go Effectiveness

2007-02-07 Thread Matt Gokey

Jacques Basaldúa wrote:

> Very good analysis and I would like to contribute a 4th reason:
>
> As Luke Gustafson pointed out, we have to contemplate the simulation
> as a _stochastic process_. We want to determine the conditional
> probability of a win given a particular move is made. And that depends
> on the _length of the simulation_. Dramatically! This is a reason
> against scalability of global search MC/UCT. If the simulation is
> 500 moves long (Chinese rules with recaptures, etc.) the observed
> variance at an early move blurs out everything.
>
> Just a simple stochastic process: Count a dollar each time you
> correctly predict a p=1/2 coin thrown n=500 times. The expected
> average is (obviously) 250, but the expected variance of that measure is
> n·p·(1-p) = 125 proportional to n.
Good point.  This leads to another thought that I have been wondering 
about.  That is, I question whether using more time to search more 
simulations in the opening is the best approach.  For the opening, 
selecting reasonably robust moves that tend to lead to more favorable 
options is probably a good objective.  The lengths of the simulations 
are perhaps too long to expect anything better.  Later, towards the 
pre-middle to middle game, it is very critical to play such that the 
position's tactical potential is exploited, securing connections and 
eye space, etc.  It would seem to me that focusing the highest 
concentration of time and number of simulations during this part of the 
game might be most advantageous.


It would be interesting for someone with a decent MC player to do an 
experiment like this, with one version concentrating the highest number 
of simulations in the opening and one concentrating in the middle game, 
but otherwise equal, and see which version wins more often.



Re: [computer-go] MC Go Effectiveness

2007-02-07 Thread Magnus Persson

Two quick comments:

Quoting Matt Gokey [EMAIL PROTECTED]:


> Jacques Basaldúa wrote:
>
> > Very good analysis and I would like to contribute a 4th reason:
> > against scalability of global search MC/UCT. If the simulation is
> > 500 moves long (Chinese rules with recaptures, etc.) the observed
> > variance at an early move blurs out everything.


This is not a problem for the scalability of MC/UCT. It just means that 
a program that is 5 kyu with 10 minutes on 9x9 might be 35 kyu with 10 
minutes on 19x19.

Yet for both 9x9 and 19x19 it holds that a bug-free implementation of 
MC/UCT will asymptotically reach perfect play as the thinking time goes 
to infinity.

It is true that going from 9x9 to 19x19 is frustrating. But this is 
because 19x19 is more complex than 9x9. An MC/UCT program is still very 
much better than random play, because a random player needs more than 
100 free handicap stones to have a chance on 19x19.


> Good point.  This leads to another thought that I have been wondering
> about.  That is I question whether using more time to search more
> simulations in the opening is the best approach.  For the opening,
> selecting reasonable robust moves that tend to lead to more favorable
> options is probably a good objective.  The lengths of the simulation
> are perhaps too long to expect anything better.  Later towards the
> pre-middle to middle game it is very critical to play such that the
> positions tactical potential is exploited such to secure connections
> and eye space, etc.  It would seem to me that focusing the highest
> concentration of time and number of simulations during this part of
> the game might be most advantageous.
>
> It would be interesting for someone with a decent MC player to do an
> experiment like this with one version concentrating highest number of
> simulations in the opening and one concentrating in the middle game,
> but otherwise equal and see which version wins more often.


My experience with Valkyria is that most time must be allocated to the 
early moves. If you make a mistake on 9x9 it is very hard to come back. 
MC/UCT programs such as MoGo are very good at defending a won position. 
Therefore it is important to get a won position as early as possible. 
The longer it thinks the better it plays. In the opening it is always 
critical to find the best move - but later on games are often already 
either won or lost, so saving time for those positions is not so 
important.

Valkyria saves time in the opening by using an opening library, but as 
soon as it leaves the library it spends a lot of time on the first 
move. Moves it spends a lot of time on are also stored in the library. 
And later on I might correct, by hand, moves that were played in lost 
games. I am actually rarely satisfied with the opening moves of 
Valkyria. It needs more time for those moves...

-Magnus


Re: [computer-go] Why not forums?

2007-02-07 Thread Joshua Shriver

I've had a computer go forum running for a while, but it has low traffic.
http://www.olympuschess.com/phpBB2

-josh

On 2/4/07, Dmitry Kamenetsky [EMAIL PROTECTED] wrote:

> Hello everyone,
>
> Why can't we use proper forums instead of this outdated list? Forums are easier
> to keep track of and search for messages. As a start we can use Yahoo groups.
> What do you think?
>
> Cheers,
> Dmitry




Re: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo (QMC))

2007-02-07 Thread dhillismail
  
 
-Original Message-
From: [EMAIL PROTECTED]
To: computer-go@computer-go.org
Sent: Wed, 7 Feb 2007 5:34 AM
Subject: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo 
(QMC))


> On Wed, Feb 07, 2007 at 12:06:40PM +0200, Tapani Raiko wrote:
> > Let me try again using the handicap example. Let's say MC player is given
> > a huge handicap. In the simulations, it is winning all of its games, so
> > there is no information helping to select the next move.
>
> This situation happens in normal games too, once one player is so much
> ahead that it wins almost no matter what. It leads into really
> stupid-looking endgames, where live groups are allowed to die, and dead
> ones are allowed to be rescued.
>
> All this could be avoided by a simple rule: Instead of using +1 and -1
> as the results, use +1000 and -1000, and add the final score to this.
>
> The purpose of the large constant (1000) is to make sure that it prefers
> any win to any loss (so that large_win + small_loss < small_win +
> small_win). One could even add another term in the result, favouring
> games that end early (for the winner) or postpone them (for the loser),
> in hope of allowing the opponent more chances to make mistakes.
>
> As far as I can see, this ought to fit straight in to any MC or UCT
> program. It may not improve the winning chances, but it sure should make
> the programs play look more reasonable.
>
> Just my humble idea. Feel free to shoot down (with serious arguments),
> and/or use where ever you like. I would like to hear if this makes any
> practical difference, if anyone tries.


 Intuitively, it seems like this should work. You only give the winning 
margin a small weight, or only use it to break ties, or only apply it after the 
game is already decided. I've tried many variations, as have others, including 
your exact algorithm above. It can make some moves look a little prettier, but 
it always causes problems and I have to take it out.
 I have my theories as to why that is, but for brevity's sake, in my 
experience, giving any consideration to winning margin is detrimental. After 
the game is decided, there are more elegant ways to bring it to a close. I came 
to this conclusion reluctantly.
 
- Dave Hillis


Re: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo (QMC))

2007-02-07 Thread dhillismail
 I should have mentioned that I have only tested on 9x9. For larger boards, 
I don't know.
- Dave Hillis
 

> Intuitively, it seems like this should work. You only give the winning
> margin a small weight, or only use it to break ties, or only apply it after the
> game is already decided. I've tried many variations, as have others, including
> your exact algorithm above. It can make some moves look a little prettier, but
> it always causes problems and I have to take it out.
> I have my theories as to why that is, but for brevity's sake, in my
> experience, giving any consideration to winning margin is detrimental. After
> the game is decided, there are more elegant ways to bring it to a close. I came
> to this conclusion reluctantly.
 
 

 


Re: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo (QMC))

2007-02-07 Thread terry mcintyre
If I recall correctly, someone spoke of constraining the opening moves to the 
3rd, 4th, and 5th lines in the absence of nearby stones, or something to that 
effect. What was the impact of this experiment? I notice the recent discussion 
of the need for a lot of thinking time to find good opening moves; would this 
thinking time be reduced by a better-than-random selection of opening moves 
during the playouts? If fuseki and joseki databases were used to bias the 
playouts, would the speed and quality of opening play improve?

One concern about using expert knowledge of this sort - what happens when one's 
opponent plays out of book? Human players often find this frustrating - we 
sense that moves which deviate from joseki are wrong, but finding the correct 
refutation under time pressure is not always easy.

To address this concern: at least one MC player uses an opening book; would it 
be profitable to automatically analyze past losses and devote a few hundred 
thousand playouts to looking for better replies to moves favored by opponents 
in past games? This would be the analogue of advice often given to human 
players: study your own lost games; look for improvements.

Unfortunately, an opening book does not generalize well; learning a better move 
for position Xi won't help with position Xj, which differs from Xi by even one 
stone. What sort of move generators would cope with the vast number of similar 
positions? A single stone (a ladder-breaker, or a key point which makes or 
denies a second eye, for instance) can make a large difference in the score, 
but one hopes that MC playouts would discover the negative consequences of 
moves which are almost, but not quite, right.

 


 


Re: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo (QMC))

2007-02-07 Thread terry mcintyre
What sort of sampling was used for the playouts? For this variable 
(incorporating some information about the score vs. only the win-loss 
variable), does it make a difference whether playouts are totally random or 
incorporate varying degrees of similarity to good play?

From: [EMAIL PROTECTED] [EMAIL PROTECTED]


 I should have mentioned that I have only tested on 9x9. For larger boards, 
I don't know.

- Dave Hillis

 Intuitively, it seems like this should work. You only give the winning 
margin a small weight, or only use it to break ties, or only apply it after the 
game is already decided. I've tried many variations, as have others, including 
your exact algorithm above. It can make some moves look a little prettier, but 
it always causes problems and I have to take it out.

 I have my theories as to why that is, but for brevity's sake, in my 
experience, giving any consideration to winning margin is detrimental. After 
the game is decided, there are more elegant ways to bring it to a close. I came 
to this conclusion reluctantly.

Re: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo (QMC))

2007-02-07 Thread Richard J. Lorentz

terry mcintyre wrote:
If I recall correctly, someone spoke of constraining the opening moves 
to the 3rd,4th,and 5th lines in the absence of nearby stones, or 
something to that effect. What was the impact of this experiment?


For what it's worth, I tried a number of experiments along these lines 
and none of them produced any improvement in play whatsoever. I'm still 
baffled by this. Sylvain says, both in his paper and in a message posted 
here, that it seems that one must try to constrain moves based on 
sequences of moves rather than individual moves in isolation. All rather 
mysterious to me.


-Richard

Re: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo (QMC))

2007-02-07 Thread dhillismail
 The tests that involved factoring in the margin of victory while the game 
was still in play used pure random playouts (or something relatively close to 
that). Later, I did some tests with aesthetics as a goal in itself, and for 
those I used what I call a heavy playout. I doubt that the playout type makes a 
difference, but I don't know that for sure. If there were significant curiosity 
on this list about a specific configuration, I could probably run a quick test.
 
- Dave Hillis
 
 
-Original Message-
From: [EMAIL PROTECTED]
To: computer-go@computer-go.org
Sent: Wed, 7 Feb 2007 12:42 PM
Subject: Re: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte 
Carlo (QMC))


What sort of sampling was used for the playouts? For this variable 
(incorporating some information about the score vs. only the win-loss 
variable), does it make a difference whether playouts are totally random or 
incorporate varying degrees of similarity to good play?


From: [EMAIL PROTECTED] [EMAIL PROTECTED]


 I should have mentioned that I have only tested on 9x9. For larger boards, 
I don't know.
- Dave Hillis
 

 Intuitively, it seems like this should work. You only give the winning 
margin a small weight, or only use it to break ties, or only apply it after the 
game is already decided. I've tried many variations, as have others, including 
your exact algorithm above. It can make some moves look a little prettier, but 
it always causes problems and I have to take it out.
 I have my theories as to why that is, but for brevity's sake, in my 
experience, giving any consideration to winning margin is detrimental. After 
the game is decided, there are more elegant ways to bring it to a close. I came 
to this conclusion reluctantly.
 
 

 







Re: [computer-go] MC approach

2007-02-07 Thread Matt Gokey

Don Dailey wrote:


On Wed, 2007-02-07 at 11:34 +0100, Heikki Levanto wrote:


All this could be avoided by a simple rule: Instead of using +1 and -1
as the results, use +1000 and -1000, and add the final score to this.



Heikki,

I've tried ideas such as this in the past and it's quite
frustrating - everything that tries to take territory
scoring into account weakens the program.  


If you just need to see prettier moves,  I think it is
good enough to prioritize the moves using some other
algorithm at the root of the tree.   If you just cover
the case where a program is easily winning or losing,
it will play nicer but not stronger.


Don, do you have any theories or information about why this is the case?

I would think either way the algorithm should always prefer higher
average win probabilities. But faced with alternatives where the win
probabilities are the same or nearly the same, while the average winning
margins are higher for one alternative, wouldn't it be better to take
the path with the better margin? I mean, it may in fact be wrong about
the win/loss classifications, so choosing the better scores would seem to
make sense, within reason, as long as it's not greedy.




Re: [computer-go] MC Go Effectiveness

2007-02-07 Thread Matt Gokey

Magnus Persson wrote:

Quoting Matt Gokey [EMAIL PROTECTED]:


(snip)


Good point.  This leads to another thought I have been wondering 
about: I question whether using more time to run more simulations in the 
opening is the best approach.  For the opening, selecting reasonably robust 
moves that tend to lead to more favorable options is probably a good 
objective; the simulations are perhaps too long to expect anything better.  
Later, towards the pre-middle to middle game, it is very critical to play such 
that the position's tactical potential is exploited, so as to secure 
connections and eye space, etc.  It would seem to me that focusing the highest 
concentration of time and number of simulations on this part of the game 
might be most advantageous.


It would be interesting for someone with a decent MC player to do an 
experiment like this with one version concentrating highest number of 
simulations in the opening and one concentrating in the middle game, 
but otherwise equal and see which version wins more often.


My experience with Valkyria is that most time must be allocated to the early 
moves. If you make a mistake on 9x9 it is very hard to come back. MC/UCT 
programs such as MoGo are very good at defending a won position, so it is 
important to get a won position as early as possible. The longer it thinks 
the better it plays. In the opening it is always critical to find the best 
move; later on, games are often already won or lost, so saving time for those 
positions is not so important.

I agree with this last statement for the early end-game / end-game
phase.  For the pre-middle to middle I'm not sure.  This is the
critical part of the game where you need to solidify the winning
positions.  As long as reasonable opening moves that provide robust
options are made and it can play into the middle game stronger than the
opponent, winning positions should result.

I'm not saying not to spend time up front, just less than when in the
middle.  On 9x9, the stage where this becomes tactically critical is
probably much earlier than on 19x19.  Even with 9x9 I predict an MC/UCT
program that spent X simulations per move for the first 8 moves, and 2X
for the next 8 would tend to beat the same player doing the reverse,
where X is some reasonably large number of simulations needed to open
decently.  I could be wrong, but it would be an interesting experiment
anyway.  However IMO 9x9 is too small to see the real complications come
out.

Valkyria saves time in the opening by using an opening library, but as soon as 
it leaves the library it spends a lot of time on the first move. Moves it 
spends a lot of time on are also stored in the library. Later on I might 
correct, by hand, moves that were played in lost games. I am actually rarely 
satisfied with the opening moves of Valkyria; it needs more time for those 
moves...

I was wondering if anyone was combining an opening library with MC/UCT
by running a large number of the simulations and storing the results.
This seems like a pretty natural extension to save simulation time in
the opening.

How strong a player are you? You are probably unfairly evaluating
Valkyria based on your strong expert play/perspective.  I'm rather
amazed that MC simulations find good moves at all given that most of the
playouts are nonsense games.  That is why I say MC/UCT is finding
reasonably robust moves with more favorable options (strategic play) not
necessarily great/best moves.  Because of mostly meaningless playouts it
misses nuances and tactics deeper into the game that would show
otherwise.  It seems the deeper the simulations the worse this effect
would be.

-Matt






Re: [computer-go] MC approach

2007-02-07 Thread Nick Apperson

If it only did one playout you would be right, but imagine the following
cases:

case 1: White wins by .5 x 100, Black wins by .5 x 100
case 2: White wins by 100.5 x 91, Black wins by .5 x 109

the method that takes into account score would prefer the second case even
though it has a lower winning percentage, which may reflect the fact
that white is making an overplay, for instance.  Obviously this is just one
example, but there are many cases like this, and I would suspect that
overplays tend to be privileged, in a sense, with this kind of algorithm.

- Nick

On 2/7/07, Matt Gokey [EMAIL PROTECTED] wrote:


Don Dailey wrote:

 On Wed, 2007-02-07 at 11:34 +0100, Heikki Levanto wrote:

All this could be avoided by a simple rule: Instead of using +1 and -1
as the results, use +1000 and -1000, and add the final score to this.


 Heikki,

 I've tried ideas such as this in the past and it's quite
 frustrating - everything that tries to take territory
 scoring into account weakens the program.

 If you just need to see prettier moves,  I think it is
 good enough to prioritize the moves using some other
 algorithm at the root of the tree.   If you just cover
 the case where a program is easily winning or losing,
 it will play nicer but not stronger.

Don, do you have any theories or information about why this is the case?

I would think either way the algorithm should always prefer higher
average win probabilities. But faced with alternatives where the win
probabilities are the same or nearly the same, while the average winning
margins are higher for one alternative, wouldn't it be better to take
the path with the better margin? I mean, it may in fact be wrong about
the win/loss classifications, so choosing the better scores would seem to
make sense, within reason, as long as it's not greedy.



Re: [computer-go] MC Go Effectiveness

2007-02-07 Thread Don Dailey
On Wed, 2007-02-07 at 14:10 -0600, Matt Gokey wrote:
 I was wondering if anyone was combining an opening library with MC/UCT
 by running a large number of the simulations and storing the results.
 This seems like a pretty natural extension to save simulation time in
 the opening. 

This is not particularly effective if you mean to just save the tree.

But here is what I do about the opening which seems to help some:

  1.  Note the very first move out of book that the program is asked
  to search.   Write the move sequence to a file.  

  2.  Have a program running off-line that processes the positions
  first encountered out of book.

  The program searches each position 8 times using several minutes
  per search.   This is to provide variety.  The moves returned
  from the search are placed into the opening book.

  3.  The moves are played in the same proportion they were chosen.
 
  For instance if e5 is chosen 7 times and d5 1 time,  the program
  will play e5 7 out of 8 times.

The book searching code of course does all the board rotations and
reflections - and when a move is chosen a random orientation is chosen
if it's appropriate.For instance after the opening move e5 if
d5 is a choice then it's equally likely to play d5, f5, e4 or e6.


I'm building the book off-line and new positions are presented faster
than new book responses can be generated.So I'm taking Lazarus
off-line once in a while so the book building procedure can catch up.

I don't know how much this really helps - it seems to not help much 
until the book gets fairly large.There are 2 ways it helps of
course - the opening moves are deep searched so presumably they might
be higher quality, and the time saved can be spent to improve the
quality of the later moves.

Lazarus spends a lot more time on early moves so this makes it possible
to save a lot of time on the first 2 or 3 moves and spend it on later
moves.  

I'm in a building phase now - the book is still quite small and has 91
positions at the moment.   Most of the moves it's generating are 3 or
4 ply (moves) into the game, but some are as deep as 10 ply into the
game.   Probably positions from opponents who have fixed playing 
algorithms or books.   

At some point I intend to do some kind of analysis on the win/loss
percentage of certain moves. One possibility is to mini-max the
book lines.  

This way of making a book has severe disadvantages.  I think it works
great on 9x9 but if the program is improved, the book is suddenly 
not very relevant and you have to start over.   Since I am not a
strong player I have no way of fixing up the book.  I can't recognize
when it chooses a weak move or when a stronger move is available 
even if it doesn't like it.

I have a hash function that creates a 64 bit canonical hash of
the position.   The same hash is produced after a rotation, for
example.   A book entry is a record with 3 fields:  a priority
value (how many times the deep search selected this position), 
the canonical hash of the parent position, and the canonical
hash of the position AFTER the move is played.    This makes
collisions very unlikely.    The canonical hash takes into
consideration simple ko,  so if a ko is possible it will hash
differently than the same position where the ko is illegal.


- Don

 

  



Re: [computer-go] MC Go Effectiveness

2007-02-07 Thread Don Dailey
On Wed, 2007-02-07 at 16:17 -0500, Don Dailey wrote:
 I have a hash function that creates a 64 bit canonical hash of
 the position.   The same hash is produced after a rotation, for
 example.   A book entry is a record with 3 fields:  a priority
 value (how many times the deep search selected this position), 
 the canonical hash of the parent position, and the canonical
 hash of the position AFTER the move is played.    This makes
 collisions very unlikely.    The canonical hash takes into
 consideration simple ko,  so if a ko is possible it will hash
 differently than the same position where the ko is illegal. 

Here is some more detail to make this clearer:

typedef struct
{
  int count;   // number of times this position/response seen (priority)
  u64 key;     // canonical position key
  u64 resp;    // resulting canonical position

} book_entry;


These book_entry records are stored in a file and I keep them 
sorted.   So the procedure to see if there is a book move is
to binary search the file on the parent position key,  
collect all of these records together (making sure there is a
legal move which leads to the canonical response key) and then
choose one of the available moves in proportion to the 
priority values (the count field).

- Don






Re: [computer-go] MC approach

2007-02-07 Thread terry mcintyre
That drives me nuts! Minimax search would eliminate bad lines of play whenever 
a refutation is found. A good opponent would not play badly, and the quantity 
of possible bad moves should not affect the evaluation of good moves - but that 
seems to be what MC does, averaging out all moves regardless of whether they 
are known to be good,  have been refuted, or are of indeterminate status.

What am I missing?
 
Terry McIntyre


- Original Message 
From: Nick Apperson [EMAIL PROTECTED]

If it only did one playout you would be right, but imagine the following cases:

case 1: White wins by .5 x 100, Black wins by .5 x 100
case 2: White wins by 100.5 x 91, Black wins by .5 x 109

the method that takes into account score would prefer the second case even 
though it has a lower winning percentage, which may reflect the fact 
that white is making an overplay, for instance.  Obviously this is just one 
example, but there are many cases like this, and I would suspect that 
overplays tend to be privileged, in a sense, with this kind of algorithm.


- Nick






 


Re: [computer-go] Paper presents results on proximity heuristic

2007-02-07 Thread Don Dailey
You should never play suicide, whether multiple or single stone
in the play-out portion of the search - ESPECIALLY when it's not
legal anyway in the rule-set you are using.

Although suicide can occasionally be the best move in some
rule-sets,  I think it weakens your program to include it,
even in the UCT tree portion of an MC search.  If the rule-set
allows suicide, I believe it's best to just let your program
accept suicide moves from the opponent.

- Don



On Wed, 2007-02-07 at 13:23 -0800, Christoph Birk wrote:
 On Wed, 7 Feb 2007, Chris Fant wrote:
  Your paper does not mention suicide.  Prohibiting single-stone suicide
  during playout gave me a nice increase in playout rate and strength.
 
 Does your program allow multiple-stone suicide during playout?
 myCtest does NOT allow it; and neither does GenericMC_ by Don.
 
 Christoph
 


Re: [computer-go] MC approach

2007-02-07 Thread Don Dailey
On Wed, 2007-02-07 at 14:08 -0600, Matt Gokey wrote:
 Don, do you have any theories or information about why this is the
 case?

Not really.  In truth the only thing that matters is to increase your
winning percentage - not your score.   There seems to be no point in
tampering with this. 

- Don



 I would think either way the algorithm should always prefer higher
 average win probabilities, but faced with alternatives where the win
 probabilities are the same or nearly the same but the average winning
 margins are higher for one alternative, wouldn't it be better to take
 the path with the better margin? I mean, it may in fact be wrong about
 the win/loss classifications, so choosing the better scores would seem to
 make sense, within reason, as long as it's not greedy.
 
 



Re: [computer-go] Monte-Carlo Go Misnomer?

2007-02-07 Thread Ephrim Khong
Hello,

David Doshay wrote:
 On 3, Feb 2007, at 2:51 AM, Sylvain Gelly wrote:

 the speed of the best simulation policy in MoGo is 0.6 * the speed
 of the uniform random one.

 I think that this is very good. You give up less than a factor of 2
 from uniform random and you probably get better than a factor of 2 in
 the % of relevant moves.

Is there any known (by theory or tests) function of how much an increase
in the strength of the simulation policy increases the strength of the
MC/UCT program as a whole?

Imagine you're using an almost-random playout that has a strength, if
playing alone, of say 10 ELO (just a number, I don't know how strong it
really is). Using some thousands of MC simulations with that 10 ELO playout
and some other clever tricks like UCT, you get a program that plays
at 1500 ELO.

Now what if you change the playout to one that has a strength, if playing
alone, of 20 ELO, 50 ELO or 100 ELO? How would that affect the strength
of your overall program, if you run the same number of MC simulations?
Any known numbers on that?

[I understand that a non-random playout would decrease the speed, so it
couldn't run as many simulations as before in the same time, but let's
imagine we get it for free.]

eph



Re: [computer-go] Paper presents results on proximity heuristic

2007-02-07 Thread Chris Fant

Okay, thanks for the feedback.  I mentioned that I was allowing
multi-stone suicide a couple of days ago, but no one said anything.  It
seems a little more complicated to check for than single-stone suicide
when only tracking pseudo-liberties.  But I will get it in there and
see what kind of improvement happens and how much it affects the
speed.


On 2/7/07, Don Dailey [EMAIL PROTECTED] wrote:

You should never play suicide, whether multiple or single stone
in the play-out portion of the search - ESPECIALLY when it's not
legal anyway in the rule-set you are using.

Although suicide can occasionally be the best move in some
rule-sets,  I think it weakens your program to include it,
even in the UCT tree portion of an MC search.  If the rule-set
allows suicide, I believe it's best to just let your program
accept suicide moves from the opponent.

- Don



On Wed, 2007-02-07 at 13:23 -0800, Christoph Birk wrote:
 On Wed, 7 Feb 2007, Chris Fant wrote:
  Your paper does not mention suicide.  Prohibiting single-stone suicide
  during playout gave me a nice increase in playout rate and strength.

 Does your program allow multiple-stone suicide during playout?
 myCtest does NOT allow it; and neither does GenericMC_ by Don.

 Christoph



Re: [computer-go] MC approach

2007-02-07 Thread dhillismail
 It's a *weighted* average of all moves. UCT tree search doesn't explore 
bad moves as often as good ones, so they don't contribute as much to the 
estimation of the worth of a node.
 
- Dave Hillis
 
 
-Original Message-
From: [EMAIL PROTECTED]
To: computer-go@computer-go.org
Sent: Wed, 7 Feb 2007 4:31 PM
Subject: Re: [computer-go] MC approach


That drives me nuts! Minimax search would eliminate bad lines of play whenever 
a refutation is found. A good opponent would not play badly, and the quantity 
of possible bad moves should not affect the evaluation of good moves - but that 
seems to be what MC does, averaging out all moves regardless of whether they 
are known to be good,  have been refuted, or are of indeterminate status.

What am I missing?

 
Terry McIntyre




- Original Message 
From: Nick Apperson [EMAIL PROTECTED]

If it only did one playout you would be right, but imagine the following cases:

case 1: White wins by .5 x 100, Black wins by .5 x 100
case 2: White wins by 100.5 x 91, Black wins by .5 x 109

the method that takes into account score would prefer the second case even 
though it has a lower winning percentage, which may reflect the fact 
that white is making an overplay, for instance.  Obviously this is just one 
example, but there are many cases like this, and I would suspect that 
overplays tend to be privileged, in a sense, with this kind of algorithm. 

- Nick








Re: [computer-go] MC approach

2007-02-07 Thread Chris Fant

The only thing I can suggest which perhaps hasn't been tried is to
only consider the score in the evaluation if NONE of the playouts
resulted in a loss.


On 2/7/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:


 It's a *weighted* average of all moves. UCT tree search doesn't explore
bad moves as often as good ones, so they don't contribute as much to the
estimation of the worth of a node.

- Dave Hillis


 -Original Message-
 From: [EMAIL PROTECTED]
 To: computer-go@computer-go.org
 Sent: Wed, 7 Feb 2007 4:31 PM
 Subject: Re: [computer-go] MC approach



That drives me nuts! Minimax search would eliminate bad lines of play
whenever a refutation is found. A good opponent would not play badly, and
the quantity of possible bad moves should not affect the evaluation of good
moves - but that seems to be what MC does, averaging out all moves
regardless of whether they are known to be good,  have been refuted, or are
of indeterminate status.

 What am I missing?

  Terry McIntyre




- Original Message 
 From: Nick Apperson [EMAIL PROTECTED]

 If it only did one playout you would be right, but imagine the following
cases:

 case 1: White wins by .5 x 100, Black wins by .5 x 100
 case 2: White wins by 100.5 x 91, Black wins by .5 x 109

 the method that takes into account score would prefer the second case even
though it has a lower winning percentage, which may reflect the fact
that white is making an overplay, for instance.  Obviously this is just one
example, but there are many cases like this, and I would suspect that
overplays tend to be privileged, in a sense, with this kind of algorithm.

 - Nick


 


Re: [computer-go] Paper presents results on proximity heuristic

2007-02-07 Thread Peter Drake
By the way, the paper was rejected on first submission, largely  
because we were just testing Orego against itself. We're now testing  
Orego against GNU Go and have a revised version:


https://webdisk.lclark.edu/xythoswfs/webui/_xy-2352013_1-t_Gct7yJ5s%22

(Markus, could you change the link and title? This was DPSV07.)

Peter Drake
Assistant Professor of Computer Science
Lewis & Clark College
http://www.lclark.edu/~drake/




On Feb 7, 2007, at 12:36 PM, Chris Fant wrote:


In this paper, you say that you limit the number of moves to
BoardArea*2 during the playouts.  For me, this barely increases the
playout rate and slightly reduces the strength (perhaps not
statistically significant).

Your paper does not mention suicide.  Prohibiting single-stone suicide
during playout gave me a nice increase in playout rate and strength.


On 11/28/06, Peter Drake [EMAIL PROTECTED] wrote:

Here it is:

https://webdisk.lclark.edu/xythoswfs/webui/_xy-2115826_1-t_OX34gnaB

Peter Drake
Assistant Professor of Computer Science
Lewis & Clark College
http://www.lclark.edu/~drake/





Re: [computer-go] MC Go Effectiveness

2007-02-07 Thread Magnus Persson

Quoting Matt Gokey [EMAIL PROTECTED]:


Magnus Persson wrote:

Quoting Matt Gokey [EMAIL PROTECTED]:



I was wondering if anyone was combining an opening library with MC/UCT
by running a large number of the simulations and storing the results.
This seems like a pretty natural extension to save simulation time in
the opening.


I actually first tried to save the entire search tree for every position. Then 
it would load the position and search deeper every time it was encountered. 
The problem was simply disk storage, and there are always positions where the 
program itself cannot, in a reasonable time, find strong moves by searching.


How strong a player are you? You are probably unfairly evaluating
Valkyria based on your strong expert play/perspective.  I'm rather
amazed that MC simulations find good moves at all given that most of the
playouts are nonsense games.  That is why I say MC/UCT is finding
reasonably robust moves with more favorable options (strategic play) not
necessarily great/best moves.  Because of mostly meaningless playouts it
misses nuances and tactics deeper into the game that would show
otherwise.  It seems the deeper the simulations the worse this effect
would be.


I am a European 2-dan. I agree that MC/UCT finds robust moves; in go it is
more important to avoid mistakes than to find the best move. There is often
little difference between alternative best-looking moves, but a blunder can
be huge. I still disagree a little, though, about "misses nuances and
tactics". I think there is always a signal from the simulations that MC can
pick up to some degree. The signals that get lost or cannot be distinguished
are discovered by searching deeper with UCT.

-Magnus


Re: [computer-go] Monte-Carlo Go Misnomer?

2007-02-07 Thread Weston Markham

Unfortunately, it's just not that simple, because it depends far more
on _how_ the playout is improved, rather than, say, the Elo numbers
that measure that improvement.  For example, programs exist that are
good (in part) because they entirely disregard some lines of play.
They may be correct to disregard these lines in almost every case,
which generally makes the playout program stronger.  However, for the
few cases where this heuristic pruning is not correct, the calling
program will suffer greatly, because these lines of play become
completely invisible to the random playouts, no matter how many
playouts are performed.  As an extreme example, consider programs that
play completely deterministic strategies.  These are obviously useless
as random players, yet it is in principle possible to construct ones
that play arbitrarily well.

Weston

On 2/7/07, Ephrim Khong [EMAIL PROTECTED] wrote:

Is there any known (by theory or tests) function of how much an increase
in the strength of the simulation policy increases the strength of the
MC/UCT program as a whole?



Re: [computer-go] Effective Go Library v0.101

2007-02-07 Thread Eric Boesch

On 2/7/07, steve uurtamo [EMAIL PROTECTED] wrote:


 And of course don't forget about this no_variable_in_for thing...

i'll have to read up on what you're describing.



The following pseudocode block

for (int i = 0; i < 10; ++i) {
... code ...
}
// i's lifetime ends after the for loop does

is valid in just about any version of C++ and in the 1999 ISO C standard
(also known as C99), but it is not valid in most older C standards. (Some
older versions of gcc would accept this code but assign the wrong lifetime
(according to the standard) to variable i. If you want to test this code
with gcc, then use the -std=c99 flag, which, as of quite recently at least,
was not enabled by default.) There are at least a couple other C++-isms --
offhand, the // single-line comment form and variable declarations in the
middle of code blocks come to mind -- that are also valid in C99 but invalid
in at least some of the older C standards.

I'm not trying to run off on a language standards tangent! I just feared
that we might be headed towards one anyway, and I wanted to make sure the
information in it was not 8 years out of date.

Sorry, now back to go (I hope)...

Re: [computer-go] MC approach

2007-02-07 Thread Heikki Levanto
On Wed, Feb 07, 2007 at 04:42:01PM -0500, Don Dailey wrote:
 In truth the only thing that matters is to increase your winning
 percentage - not your score.   There seems to be no point in tampering
 with this. 

I guess I must accept the wisdom of those who have tried these things.
Still, it hurts my intuition that it could be better for a program to
choose a line where it seems to win by 2 points, when another line seems
to end in a 100 point win. What if the opponent can improve his play
from what I expected, and gain an extra 3 points somewhere?

Maybe all this shows that we have not (yet?) understood all the
complications of the MC evaluation, and that more research is needed?

-H



-- 
Heikki Levanto   In Murphy We Turst heikki (at) lsd (dot) dk



Re: [computer-go] MC Go Effectiveness

2007-02-07 Thread Unknown
On Wed, 2007-02-07 at 16:28 -0500, Don Dailey wrote:
 On Wed, 2007-02-07 at 16:17 -0500, Don Dailey wrote:
  I have a hash function that creates a 64-bit canonical hash of
  the position.   The same hash is produced after a rotation for

Most people do incremental hashing, which is cheaper even if 8(16)
symmetric copies are to be maintained. YMMV.
 
  example.   A book entry is a record with 3 fields:  a priority
  value (how many times the deep search selected this position), 
  the canonical hash of the parent position and the canonical
  hash of the position AFTER the move is played.    This makes
  collision very unlikely.    The canonical hash takes into
  consideration simple ko,  so if a ko is possible it will hash
  differently than the same position where the ko is illegal. 
 
 Here is some more detail to make this clearer:
 

Since you seem to be replying to yourself, I'll add some comments.

 typedef struct

typedef is useless here, IMO.
(but this is a matter of style, I agree)

 {
   int count;   // number of times this position/response seen (priority)

Warning: alignment will cause this struct to be 3 * sizeof(u64), wasting
32 bits.
Warning 2: if the count is never negative, unsigned would be more
appropriate.

   u64 key;    // canonical position key
   u64 resp;   // resulting canonical position

Warning: you are wasting (64 - ~9) bits here, since the response stems
from a set of at most 361+1 choices.
(this would be different if you intend to search backwards in the
tree/dag, with 'resp' as search key)


 } book_entry;

That could reduce your struct to:

struct booklet {
u64 key;
u32 count;
u16 move;
/* you *could* store the de-canonicalisation info here: */
u16 spare;
};

, which will take only 2*64 bits.


 
 These book_entry records are stored in a file and I keep them 
 sorted.   So the procedure to see if there is a book move is
Sorted on key? ("Keep them sorted" == resort periodically)
In that case, some kind of hashing would seem more appropriate, IMHO

 to binary search the file on the parent position key,  
 collect all of these records together (making sure there is a
 legal move which leads to the canonical response key) and then

You are not clear here.
Is there only a one-move-step between key and resp ?
Why not store the move in the first place ? (instead of the resulting
hash)

 choose one of the available moves in proportion to the 
 priority values (the count field.)


HTH,
AvK



Re: [computer-go] MC approach

2007-02-07 Thread Don Dailey
On Thu, 2007-02-08 at 00:46 +0100, Heikki Levanto wrote:
 On Wed, Feb 07, 2007 at 04:42:01PM -0500, Don Dailey wrote:
  In truth the only thing that matters is to increase your winning
  percentage - not your score.   There seems to be no point in tampering
  with this. 
 
 I guess I must accept the wisdom of those who have tried these things.
 Still, it hurts my intuition that it could be better for a program to
 choose a line where it seems to win by 2 points, when another line seems
 to end in a 100 point win. What if the opponent can improve his play
 from what I expected, and gain an extra 3 points somewhere?
 
 Maybe all this shows that we have not (yet?) understood all the
 complications of the MC evaluation, and that more research is needed?
 
 -H


I agree, more research is needed.

My intuition is also the same as yours - one would think 100 point
win is better.   It's not that it's better or it's a better move
objectively,  but it should be better against a fallible opponent
who could easily get distracted by the little battles and forget
the real point of the game.   At least it's a way to fight when
you are losing.

- Don





Re: [computer-go] MC Go Effectiveness

2007-02-07 Thread Weston Markham

On 2/7/07, Unknown [EMAIL PROTECTED] wrote:

 to binary search the file on the parent position key,
 collect all of these records together (making sure there is a
 legal move which leads to the canonical response key) and then

You are not clear here.
Is there only a one-move-step between key and resp ?
Why not store the move in the first place ? (instead of the resulting
hash)


128 bits of hash vs. 64 bits.  From his first post: "This makes
collision very unlikely."

Weston


Re: [computer-go] MC Go Effectiveness

2007-02-07 Thread Don Dailey
On Thu, 2007-02-08 at 01:00 +0100, Unknown wrote:
 On Wed, 2007-02-07 at 16:28 -0500, Don Dailey wrote:
  On Wed, 2007-02-07 at 16:17 -0500, Don Dailey wrote:
   I have a hash function that creates a 64-bit canonical hash of
   the position.   The same hash is produced after a rotation for
 
 Most people do incremental hashing, which is cheaper even if 8(16)
 symmetric copies are to be maintained. YMMV.

I'm an expert in Zobrist incremental hashing and have been doing
it since the early years of computer chess.   However, I have 
some other considerations here:

No hashing is still faster than incremental hashing - which is why
I don't do any hashing during the play-out phase.

Since I am only hashing for the purpose of building or looking up
a book position,  this is the best approach.   If I wanted to get
clever, I could pass a flag to the move maker to tell it whether
to do hashing or not and then do it incrementally,  but for all
this trouble I would not get a speedup I would ever notice since
I only hash to find a book move.   

 
   example.   A book entry is a record with 3 fields:  a priority
   value (how many times the deep search selected this position), 
   the canonical hash of the parent position and the canonical
   hash of the position AFTER the move is played.    This makes
   collision very unlikely.    The canonical hash takes into
   consideration simple ko,  so if a ko is possible it will hash
   differently than the same position where the ko is illegal. 
  
  Here is some more detail to make this clearer:
  
 
 Since you seem to be replying to yourself, I'll add some comments.
 
  typedef struct
 
 typedef is useless here, IMO.
 (but this is a matter of style, I agree)

There are some advantages to NOT doing it my way,  but I prefer
the concise way - I don't like having to define it as  
struct book_entry all over my code.   It's a matter of style.


  {
    int count;   // number of times this position/response seen (priority)
 
 Warning: alignment will cause this struct to be 3 * sizeof(u64), wasting
 32 bits.
 Warning 2: if the count is never negative, unsigned would be more
 appropriate.

I'm not concerned about space usage in the slightest because I have
about 100 book positions currently,  and in my dreams I might get
to a few thousand.

But you are right, I could save a few bits if I didn't worry about
structure alignment.


    u64 key;    // canonical position key
    u64 resp;   // resulting canonical position
 
 Warning: you are wasting (64 - ~9) bits here, since the response stems
 from a set of at most 361+1 choices.
 (this would be different if you intend to search backwards in the
 tree/dag, with 'resp' as search key)

But if I use the compact approach I lose a lot of safety.   These
are hash keys and if I get a collision in key,  then it's extremely
unlikely that I will get a collision in any of the child hash keys.

But if I use your scheme,  I have very little extra safety (although
I still have a little bit since a move might prove to be legal in
one and not in the other,  but this is probably only worth an extra
bit or two.)

To be honest, 64 bits alone is probably safe enough for a few hundred
moves; a collision is very unlikely.


 
  } book_entry;
 
 That could reduce your struct to:
 
 struct booklet {
   u64 key;
   u32 count;
   u16 move;
   /* you *could* store the de-canonicalisation info here: */
   u16 spare;
   };
 
 , which will take only 2*64 bits.
 

Based on the considerations I have presented, a better layout is
something like this:

   u64  key;
   u32  child_key:32;
   u32  count;

I'm extremely unlikely to get a collision with a 64 bit parent
and a 32 bit child,  so I would use something like this to save
space.

If I wanted to use bit fields I could do something like this if
I want to sacrifice a few more bits of safety:

  u64   key;
  uint  child_key:28;
  uint  count:4;

I would be happy with just a few bits in child_key because 
64 bits is reasonably safe by itself.   



  These book_entry records are stored in a file and I keep them 
  sorted.   So the procedure to see if there is a book move is
 Sorted on key? (Keep them sorted == resort periodically)
 In that case, some kind of hashing would seem more appropriate, IMHO

I don't need a complicated scheme.  The book is small and always
will be small, and when I insert a record I just use insertion
sort on the fly, which is faster than quicksort on a file that
is already sorted except for one record.   This is trivially easy
stuff and is bug-free.   I dump the book to a file using fwrite
and it's trivial.

If I use a hashing scheme, I have to worry about resizes and
other complexities and how to put them on the disk.

If I really get to the point where I think I can generate a
huge book,  my plan would be to use the sqlite library and
I would store them using this very easy to use embedded database.


  to binary search the file on the parent position key, 

Re: [computer-go] MC approach

2007-02-07 Thread Weston Markham

But of course, it's not the size of the win that counts, it is rather
the confidence that it really is a win.  In random playouts that
continue from a position from a close game, the ones that result in a
large victory are generally only ones where the opponent made a severe
blunder.  (Put another way, the score of the game is affected more by
how bad the bad moves are, rather than how good the good ones are, or
even how good most of the moves are.  Others have commented on this
effect in this list, in other contexts.)  Since you can't count on
that happening in the real game, these simulations have a lower value
in the context of ensuring a win.

Even when the program is losing (say by a little bit) it is more
important to play moves that it thinks are the most likely to convert
the game to a win by confusing the opponent, rather than to play moves
that will make the losing outcome more apparent.  These tend to be
different moves, and Monte Carlo methods are good at uncovering these
differences.  I agree that this is a bit surprising, but I find it
much less so when I think about it in these terms.

Given that people have reported such a strong effect, I am actually
wondering if these simulations (those that result in a large score
difference) should be _penalized_, for not being properly
representative of the likely outcome of the game.  In other words:

value = 1000 * win - score

Weston

On 2/7/07, Don Dailey [EMAIL PROTECTED] wrote:

My intuition is also the same as yours - one would think 100 point
win is better.   It's not that it's better or it's a better move
objectively,  but it should be better against a fallible opponent
who could easily get distracted by the little battles and forget
the real point of the game.   At least it's a way to fight when
you are losing.



Re: [computer-go] MC Go Effectiveness

2007-02-07 Thread Don Dailey
After looking at your message I'm not sure you understand how
I set this up.


On Thu, 2007-02-08 at 01:00 +0100, Unknown wrote:
  to binary search the file on the parent position key,  
  collect all of these records together (making sure there is a
  legal move which leads to the canonical response key) and then
 
 You are not clear here.
 Is there only a one-move-step between key and resp ?
 Why not store the move in the first place ? (instead of the resulting
 hash)

The book has 1 record per playable move.  If there are 4 responses
to some position in the book, there will be 4 records, one for each
possible response.   Those 4 records will have the same key field,
which is a canonical key of the position, but they will have
separate child keys (resp, for resulting position) which are also
canonical.   From some actual position it's trivial to convert
the canonical child keys to actual moves using the canonical hash. 

The count is just a priority system which right now happens
to correspond to how many times the book searcher wanted to play
the move in question (and since I set an arbitrary limit of 8
searches,  all the moves for a given position would total 8 if
all the lookups have been completed.) 

Doing 8 searches is time-consuming, but I really prefer a book
that has a LOT of variety.  

If I decide that the book stuff is really useful,  I will 
probably convert it to use a sqlite3 database which is quite
nice and easy to use and manage,  I might even place search
data such as the score and number of nodes searched in such
a database.

- Don




Re: [computer-go] MC approach

2007-02-07 Thread Hideki Kato
Matt Gokey: [EMAIL PROTECTED]:
Weston Markham wrote:

 But of course, it's not the size of the win that counts, it is rather
 the confidence that it really is a win.
Yes, and my reasoning was that a larger average win implied a higher 
confidence since there is more room for error.  That intuition may not 
hold though.
  In random playouts that
 continue from a position from a close game, the ones that result in a
 large victory are generally only ones where the opponent made a severe
 blunder.  (Put another way, the score of the game is affected more by
 how bad the bad moves are, rather than how good the good ones are, or
 even how good most of the moves are.  Others have commented on this
 effect in this list, in other contexts.)  Since you can't count on
 that happening in the real game, these simulations have a lower value
 in the context of ensuring a win.
That is the first reasonable argument I've heard that makes some sense 
as to why this effect may be true.  The opposite of course may be true 
as well and close games may really not be close due to the same blunder 
effect.  Perhaps it is just another symptom of the fact that most 
playouts are nonsense games.

We can test this effect by using, for example,
v = 0.5 * (1.0 + tanh (k * score)); // v is in [0...1].
at a small cost in simulation speed. As k grows larger, this
function approaches the commonly used threshold function, and vice versa.

I guess the best value of k depends on how meaningful the games are, 
i.e., on whether current heuristics for pruning moves are effective 
enough that a larger k is best.

- gg
--
[EMAIL PROTECTED] (Kato)