Hi Sylvain, David,
Figure 3 in your UCT paper shows the accuracy of different simulation policies.
Could you repeat these experiments for accuracy of win/loss determination only?
So for each test position, you determine if it's won or lost under perfect play,
and then see how close each policy
Hello John,
Thank you for your interest.
Figure 3 in your UCT paper shows the accuracy of different simulation policies.
Could you repeat these experiments for accuracy of win/loss determination only?
Actually the labelled positions are rather endgame positions, and are
labelled as 0/1 (loss/win). So we already are in
Yamato wrote:
Rémi,
May I ask you some more questions?
(1) You define Dj as Dj = Mij*ci + Bij. Shouldn't the Bij here be Aij?
What does this mean?
Yes, it is! Thanks for pointing that mistake out.
(2) You have relatively few shape patterns. How large is each
pattern? 5x5, 7x7, or more?
I
Hi,
I have just updated my web page with the final version of my paper:
http://remi.coulom.free.fr/Amsterdam2007/
I have tried to improve it based on all your comments and questions, and
those of the workshop reviewer. I thank you all very much for your
interesting remarks.
I have not
On 5/18/07, Rémi Coulom [EMAIL PROTECTED] wrote:
My idea was very similar to what you describe. The program built a
collection of rules of the kind "if condition then move". The condition
could be anything from a tree-search rule of the kind "in this
particular position, play x", or a general rule such
Rémi,
May I ask you some more questions?
(1) You define Dj as Dj = Mij*ci + Bij. Shouldn't the Bij here be Aij?
What does this mean?
(2) You have relatively few shape patterns. How large is each
pattern? 5x5, 7x7, or more?
(3) You say the nth move is added when 40*1.4^(n-2) simulations
have
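The schedule quoted in question (3) can be sketched directly. This is a minimal illustration of a progressive-widening threshold of the stated form 40*1.4^(n-2); the function names are illustrative assumptions, not from the paper.

```python
# Sketch of the progressive-widening schedule quoted above: the n-th
# candidate move (n >= 2) is added once 40 * 1.4**(n - 2) simulations
# have been run at a node. Names here are illustrative, not from the paper.

def widening_threshold(n):
    """Simulations required before the n-th move is considered (n >= 2)."""
    return 40 * 1.4 ** (n - 2)

def moves_considered(simulations):
    """How many candidate moves are searched after a given simulation count."""
    n = 1  # the first move is always considered
    while simulations >= widening_threshold(n + 1):
        n += 1
    return n
```

For example, the second move is added at 40 simulations, the third at 56, the fourth at about 78, so the candidate set grows geometrically slowly.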
I also use an online learning algorithm in RLGO to adjust feature
weights during the game. I use around a million features (all
possible patterns from 1x1 up to 3x3 at all locations on the board)
and update the weights online from simulated games using temporal
difference learning. I also
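The online weight-update scheme described above can be sketched as a TD(0) step over binary features with a linear value function. This is only an illustrative sketch under those assumptions; the names, learning rate, and structure are hypothetical, not RLGO's actual implementation.

```python
# Minimal sketch of TD(0) weight updates for a linear value function over
# binary features, in the spirit of learning pattern weights online from
# simulated games. All names and parameters here are illustrative.

def value(weights, active_features):
    """Linear value estimate: sum of the weights of the active features."""
    return sum(weights[f] for f in active_features)

def td0_update(weights, features_t, features_t1, reward, alpha=0.1, gamma=1.0):
    """One TD(0) step: move V(s_t) toward reward + gamma * V(s_{t+1})."""
    delta = reward + gamma * value(weights, features_t1) - value(weights, features_t)
    for f in features_t:
        # gradient of a linear value function is 1 for each active feature
        weights[f] += alpha * delta
    return delta
```

With a million sparse binary features, each update only touches the features active in the current position, which is what makes this cheap enough to run during simulations.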
Thanks for the great paper. And thanks for sharing it before it's
published.
Now I know what directions to take my engine in next.
Time for Team MoGo to share some more secrets :)
We are publishing MoGo's secrets at ICML 2007, in just over a month.
So not long to wait now!
-Dave
It seems that e-mail at my university does not work any more. I have
received none of the replies to my message of yesterday, but I could
read them on the web archives of the list. So I have registered from
another address, and will answer the questions I have read on the web.
In section
Álvaro Begué wrote:
There are many things in the paper that we had never thought of, like
considering the distance to the penultimate move.
That feature improved the effectiveness of progressive widening a lot.
When I had only the distance to the previous move, and the opponent made
a
Yes, now I understand. I think what made it hard for me conceptually was that
P(Rj) can be rewritten in n different ways, one for each feature ci, 1 <= i <= n. I had
this problem with your example too. I first thought that the lines with the
factors were arbitrary, but then I realized that each line goes
Hi Rémi,
2007/5/17, Rémi Coulom [EMAIL PROTECTED]:
to Sylvain: Here are tests of Crazy Stone at 90s/game 1CPU against GNU
3.6 level 10, measured over about 200 games
[...]
Thank you for your answer. These numbers are interesting.
The improvement in the tree search is really huge. It is what I
Rémi Coulom wrote:
to Magnus: If you consider the example of section 2.2: 1,2,3 wins
against 4,2 and 1,5,6,7. The probability is
P = c1c2c3 / (c1c2c3 + c4c2 + c1c5c6c7). For this example:
N1j = c2c3, B1j = 0, M1j = c2c3 + c5c6c7, A1j = c4c2
N2j = c1c3, B2j = 0
N3j = c1c2, B3j = 0
N4j = 0, B4j = c1c2c3
I will add this example to
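The decomposition in this example can be checked numerically: with P = c1c2c3 / (c1c2c3 + c4c2 + c1c5c6c7), the numerator is Nij*ci + Bij and the denominator is Mij*ci + Aij for every feature i (using Aij in the denominator, per the correction discussed earlier in the thread). This sketch only verifies the algebra for this one example; the M and A coefficients for features 2-4, not quoted in the mail, are derived here by the same factoring.

```python
# Numerical check of the decomposition above: both the numerator and the
# denominator of P are linear in each individual ci:
#   numerator   = Nij * ci + Bij
#   denominator = Mij * ci + Aij
# Coefficients for features 2-4's M and A are derived by the same algebra
# as feature 1's (they are not all quoted in the mail).

def decompose(c):
    """Return {i: (Nij, Bij, Mij, Aij)} for this specific example."""
    c1, c2, c3, c4, c5, c6, c7 = c
    return {
        1: (c2 * c3, 0.0, c2 * c3 + c5 * c6 * c7, c4 * c2),
        2: (c1 * c3, 0.0, c1 * c3 + c4, c1 * c5 * c6 * c7),
        3: (c1 * c2, 0.0, c1 * c2, c4 * c2 + c1 * c5 * c6 * c7),
        4: (0.0, c1 * c2 * c3, c2, c1 * c2 * c3 + c1 * c5 * c6 * c7),
    }

def check(c):
    """Assert every per-feature decomposition reproduces P, then return P."""
    c1, c2, c3, c4, c5, c6, c7 = c
    num = c1 * c2 * c3
    den = c1 * c2 * c3 + c4 * c2 + c1 * c5 * c6 * c7
    for i, (N, B, M, A) in decompose(c).items():
        ci = c[i - 1]
        assert abs(N * ci + B - num) < 1e-9
        assert abs(M * ci + A - den) < 1e-9
    return num / den
```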