Re: [computer-go] Elo move prediction

2009-11-20 Thread dhillismail

Just a wild guess here. One way to get catastrophically slow performance is to 
have superfluous nested loops. For instance, one could iterate over each board 
space, calling a routine to check for legality. If the legality routine also 
iterates over the whole board (perhaps by updating all liberty counts), then 
things will get pretty slow. This kind of bug can be elusive because it still
gives correct results.
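
A hypothetical sketch (not Orego code) of the accidental-quadratic shape described here: a legality routine that rescans the whole board, called once per board point, costs O(N^2) probes per generated move.

```java
// Hypothetical sketch of the superfluous-nested-loop bug; all names
// are illustrative, not from any real engine.
final class NaiveBoard {
    static final int N = 19 * 19;   // points on a 19x19 board
    static long probes = 0;         // counts board reads, as a cost proxy

    // Stand-in for a legality check that recounts liberties from scratch:
    // it touches every point, so each call is O(N).
    static boolean isLegalSlow(int point) {
        for (int p = 0; p < N; p++) probes++;
        return true;
    }

    // Testing every vacant point multiplies that by N again: O(N^2).
    static void generateMove() {
        for (int p = 0; p < N; p++) isLegalSlow(p);
    }
}
```

One call to `generateMove()` performs 361 x 361 = 130,321 probes; an incremental engine touches a few dozen cells per move, which is where the orders-of-magnitude speed gap comes from.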
- Dave Hillis


On Thu, Nov 19, 2009 at 12:35:09AM -0800, Seth Pellegrino wrote:
> I'm working on an implementation of Rémi Coulom's Elo-based move
> prediction algorithm[1], and I seem to be running into efficiency
> issues. With my current implementation, I'm running at around 8 or 10
> playouts per second. I can't imagine that the playouts would be so
> accurate that I would be able to accurately evaluate my next move at
> that speed. It seems that simply visiting each vacant point on the
> board and checking whether that move would be a legal move for the
> current player (without even computing which gammas are present at a
> point) takes me down to about 60-75 playouts per second.

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Elo move prediction

2009-11-20 Thread Petr Baudis
On Fri, Nov 20, 2009 at 05:19:01AM -0800, Peter Drake wrote:
> On Nov 20, 2009, at 12:08 AM, Petr Baudis wrote:
> 
> >On Thu, Nov 19, 2009 at 07:03:56PM -0800, Seth Pellegrino wrote:
> >>* I've already got a stable base that I'm working from (specifically,
> >>Peter Drake's Orego[1]).
> >
> > (You could also choose a better base to work from,
> >e.g.  libego, Pachi or Fuego, but maybe you are expected to use Orego
> >for your project. ;-)
> 
> In my defense (and Orego's), the speeds Seth has reported are not
> representative. With light playouts, running 4 threads on a 4-core,
> 3 GHz machine, Orego gets about 50 kpps on 9x9, about 16 on 19x19.
> (Of course, it's somewhat slower with heavy playouts, but not THAT
> bad!)
> 
> If I only run playouts (no tree), I can get those up to 35 and 8
> kpps with a single thread.

Thank you, that's good to know then. :) Sorry that I assumed this was
Orego's general speed.

P.S.: So this also gives a good idea of the speeds possible in a Java
implementation (i.e. quite reasonable speeds).

Petr "Pasky" Baudis


Re: [computer-go] Elo move prediction

2009-11-20 Thread Peter Drake

On Nov 20, 2009, at 12:08 AM, Petr Baudis wrote:


On Thu, Nov 19, 2009 at 07:03:56PM -0800, Seth Pellegrino wrote:

* I've already got a stable base that I'm working from (specifically,
Peter Drake's Orego[1]).


 (You could also choose a better base to work from,
e.g.  libego, Pachi or Fuego, but maybe you are expected to use Orego
for your project. ;-)


In my defense (and Orego's), the speeds Seth has reported are not  
representative. With light playouts, running 4 threads on a 4-core, 3  
GHz machine, Orego gets about 50 kpps on 9x9, about 16 on 19x19. (Of  
course, it's somewhat slower with heavy playouts, but not THAT bad!)


If I only run playouts (no tree), I can get those up to 35 and 8 kpps  
with a single thread.


Peter Drake
http://www.lclark.edu/~drake/



Re: [computer-go] Elo move prediction

2009-11-20 Thread Thomas Lavergne
On Fri, Nov 20, 2009 at 10:31:21AM +0100, Petr Baudis wrote:
> For the liberties, of course that applies, since they are used very often
> and take little memory (if you are sensible about it). But always keep
> in mind to avoid universal truths - not all features pay off when
> maintained incrementally. E.g. for my heavy playouts, I tried to
> incrementally update features (mainly is-a-self-atari), but it turned
> out that the cache cost of all the stored information _and_ the work
> involved in finding out which parts of the board I actually needed to
> recompute noticeably outweighed the benefit. Maybe if I optimized it
> further I would reduce the costs, but hardly much below the current
> cost of re-computing the features around the play candidate on each
> move.

For sure, there is no universal truth, but my point here is that
recomputing some features carries a hard additional cost.
To draw plays at random from weighted points you have to keep those
weights in an additional data structure; in my case, an array with the
sum of each line's weights plus the total sum of all weights.
If you have to recompute some features each time you draw a play, you
must update these data each time and then restore them afterwards, and
this can also be very costly.
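
The structure described above (per-point weights, per-line sums, a grand total) might look like this; the names are hypothetical, not the actual bot's code. The draw is a two-level linear scan, at worst 19 + 19 steps on a 19x19 board instead of 361:

```java
import java.util.Random;

// Hypothetical sketch of weighted move drawing with per-row partial sums.
final class WeightedBoard {
    final int size;
    final double[][] weight;  // weight[row][col]: gamma of playing there
    final double[] rowSum;    // sum of each row's weights
    double total;             // grand total of all weights
    final Random rng = new Random();

    WeightedBoard(int size) {
        this.size = size;
        weight = new double[size][size];
        rowSum = new double[size];
    }

    // Incremental update: adjusting one point fixes both sums in O(1).
    void setWeight(int row, int col, double w) {
        double delta = w - weight[row][col];
        weight[row][col] = w;
        rowSum[row] += delta;
        total += delta;
    }

    // Draw a point with probability proportional to its weight:
    // find the row by its partial sum, then the column within the row.
    int[] draw() {
        double r = rng.nextDouble() * total;
        int row = 0;
        while (row < size - 1 && r >= rowSum[row]) r -= rowSum[row++];
        int col = 0;
        while (col < size - 1 && r >= weight[row][col]) r -= weight[row][col++];
        return new int[] { row, col };
    }
}
```

The point of the row sums is the restore step mentioned above: temporarily perturbing a few weights and putting them back only touches the affected rows' sums, never the whole table.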

In effect, there are some features for which this cost is still lower
than the cost of maintaining them incrementally, but for those I prefer
to fall back to approximations that can easily be computed incrementally.

All of this really depends on the design of the engine, I think, and I
have to ponder a bit what I have said. It's probably something more like:
reduce to a minimum what you recompute each time, in favor of computing
it incrementally, because that is generally cheaper - but run some
tests ;-)

On the cache point, the problem is that it's very hardware specific. My
goban, even with all the fancy stuff removed, will never fit in a few L1
cache lines, so in any case it will sit in L2, and there is enough room
there for me even on my old computer. According to profiling, the
prefetching done by the processor seems to work well in my case, and for
the moment there are no big problems here. But my big fear is that a
small change in the engine could make this fall apart by introducing an
access pattern the processor handles badly... and another processor with
small differences could handle the current situation badly, so I treat
all cache measurements with care.

Tom


-- 
Thomas Lavergne"Entia non sunt multiplicanda praeter
 necessitatem." (Guillaume d'Ockham)
thomas.laver...@reveurs.orghttp://oniros.org


Re: [computer-go] Elo move prediction

2009-11-20 Thread Petr Baudis
On Fri, Nov 20, 2009 at 10:21:25AM +0100, Thomas Lavergne wrote:
> On Thu, Nov 19, 2009 at 08:25:53PM -0800, Seth Pellegrino wrote:
> > Apologies all for replying to myself so many times, but Darren Cook
> > has been kind enough to help me clear the matter up. Essentially, I
> > had misread his paper. I was trying to compute the expensive features
> > Rémi uses for progressive widening as a move selector for playouts
> > rather than the stripped down, speedy features he actually uses.
> > 
> > Thank you all for your help in increasing my understanding.
> > 
> > >> Also, Tom, I was wondering if you could speak to adding patterns
> > >> larger than 3x3. A nice feature of 3x3 patterns is that they can be
> > >> jammed into two bytes (4 possibilities [black, white, vacant, off
> > >> board] for 8 positions), which facilitates the creation of a lookup
> > >> table. I've yet to figure out as fast a method for 4x4 patterns (or
> > >> larger), so I was wondering both how you've done it and how much it
> > >> helped.
> 
> A thing always to keep in mind is to compute incrementally: for each
> feature, every time you play a stone you have to check whether this
> play changes the feature somewhere on the board.

For the liberties, of course that applies, since they are used very often
and take little memory (if you are sensible about it). But always keep
in mind to avoid universal truths - not all features pay off when
maintained incrementally. E.g. for my heavy playouts, I tried to
incrementally update features (mainly is-a-self-atari), but it turned
out that the cache cost of all the stored information _and_ the work
involved in finding out which parts of the board I actually needed to
recompute noticeably outweighed the benefit. Maybe if I optimized it
further I would reduce the costs, but hardly much below the current
cost of re-computing the features around the play candidate on each
move.

-- 
Petr "Pasky" Baudis
A lot of people have my books on their bookshelves.
That's the problem, they need to read them. -- Don Knuth


Re: [computer-go] Elo move prediction

2009-11-20 Thread Thomas Lavergne
On Thu, Nov 19, 2009 at 08:25:53PM -0800, Seth Pellegrino wrote:
> Apologies all for replying to myself so many times, but Darren Cook
> has been kind enough to help me clear the matter up. Essentially, I
> had misread his paper. I was trying to compute the expensive features
> Rémi uses for progressive widening as a move selector for playouts
> rather than the stripped down, speedy features he actually uses.
> 
> Thank you all for your help in increasing my understanding.
> 
> >> Also, Tom, I was wondering if you could speak to adding patterns
> >> larger than 3x3. A nice feature of 3x3 patterns is that they can be
> >> jammed into two bytes (4 possibilities [black, white, vacant, off
> >> board] for 8 positions), which facilitates the creation of a lookup
> >> table. I've yet to figure out as fast a method for 4x4 patterns (or
> >> larger), so I was wondering both how you've done it and how much it
> >> helped.

A thing always to keep in mind is to compute incrementally: for each
feature, every time you play a stone you have to check whether this play
changes the feature somewhere on the board. For example, for the atari
feature, when you play a stone, just check whether its group or one of
the neighboring groups drops to one liberty; if so, update the weights
and flag the group as "in atari". When you add a stone, you also check
whether its group is flagged as "in atari"; if it is and it now has more
than one liberty, update the weight and the flag.
To make this work you must also add small updates when connecting and
removing groups, but all of this can be done incrementally.
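
A minimal sketch of this atari bookkeeping (hypothetical API; a real engine would call this from its chain-merging and capture code, where liberty counts actually change):

```java
// Hypothetical sketch: flag groups entering/leaving atari incrementally.
final class Group {
    int liberties;
    boolean inAtari;
}

final class AtariTracker {
    // Called whenever a group's liberty count changes by `delta`
    // (stone played, groups merged, stones captured).
    static void onLibertyChange(Group g, int delta) {
        g.liberties += delta;
        if (!g.inAtari && g.liberties == 1) {
            g.inAtari = true;   // entered atari: bump feature weights here
        } else if (g.inAtari && g.liberties > 1) {
            g.inAtari = false;  // left atari: restore feature weights here
        }
    }
}
```

Only the groups whose liberty counts actually changed are ever visited, which is the whole point of the incremental approach.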

For the patterns, I keep two things:
- For all points on the board, I keep a 16-bit value containing the 3x3
pattern of the surrounding cells, like you said. This is easy to do and
to update on each cell change: if you change the content of a cell, just
update the neighboring patterns. This is very fast and requires no
branching.
- For empty points only, I keep a list of bigger patterns. These cost a
lot more, so I update them only for empty cells. They are stored as
Zobrist hash keys, so for each cell I have an array with a hash value
for each pattern from size 2 to 9.
When the content of one cell changes, I have to look up the empty cells
in its size-9 neighborhood and update their patterns. The pattern update
is done by first computing the old weights of the modified patterns,
then updating the patterns, and finally adjusting the weight by the
difference between the old and new weights.
When a stone is captured, its cell becomes empty and you have to
recompute each pattern for this cell, because it was not updated
incrementally; but in practice this costs less than keeping all
non-empty cells up to date.
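
The incremental large-pattern hashing might be sketched with Zobrist keys as below (illustrative names; the real engine keeps one key per pattern size, 2 through 9, while here a single key stands in for all of them). When one cell in an empty point's neighborhood changes colour, two XORs replace its contribution:

```java
import java.util.Random;

// Hypothetical sketch of an incrementally maintained Zobrist key for
// the large pattern centred on one empty point.
final class PatternHash {
    static final int COLORS = 3;      // black, white, empty
    static final int NEIGHBORS = 81;  // offsets within the neighborhood
    static final long[][] ZOBRIST = new long[NEIGHBORS][COLORS];
    static {
        Random r = new Random(42);    // fixed seed: reproducible table
        for (long[] row : ZOBRIST)
            for (int c = 0; c < COLORS; c++) row[c] = r.nextLong();
    }

    long key;  // hash of the pattern around this empty point

    // The cell at neighborhood offset `off` changed colour: XOR out the
    // old contribution, XOR in the new one. O(1) per changed cell.
    void update(int off, int oldColor, int newColor) {
        key ^= ZOBRIST[off][oldColor];
        key ^= ZOBRIST[off][newColor];
    }
}
```

Because XOR is its own inverse, undoing a change (e.g. when a playout copy is discarded) is the same two XORs again; the full-rescan fallback is only needed for freshly emptied capture points, as described above.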

Surely this costs a lot more than simple 3x3 patterns, but implemented
carefully it's tractable, and I use them even in playouts. In playouts
I don't go as far as size 9, just up to size 4. Without them I do 41kpps,
and with them I drop to 26kpps, as stated in my previous mail. This is
one of the most costly additions to my engine, but it was a very big
improvement in playout quality.
A pure playout bot, i.e. no tree search, can play nicely on 9x9. It's
not very strong and I can beat it easily, but its playing style is a lot
better.

For sure, on 19x19 the speeds are a lot lower, but the impact of patterns
is just different. On the one hand, on 9x9 each time you change a cell
you almost have to update every cell on the board, as opposed to 19x19,
where you update only a subset of them, generally between 10% and 20% of
the cells. On the other hand, among the cells to update, on 19x19 there
are more empty ones than on 9x9. On a small board it's hard to build big
territories, so the board finishes almost full of stones. On 19x19 you
build big territories, and each time a stone is played near the border
of a territory you have to update the patterns of all the empty cells
inside it.

So even if a smaller percentage of the board is covered by a change, a
bigger portion of that set is affected by updates. I focus for the
moment on 9x9, so I have no benchmarks for 19x19 boards, but I will try
to make them. Last time I looked, if I remember well, I had a performance
drop similar to the one on 9x9, but this has to be checked more carefully.

The main thing you don't want to do in playouts is complex analysis that
can take a long time, like ladder reading or life-and-death reading. For
everything that you can cheaply compute incrementally, it's good to have
the possibility of enabling it during playouts as well, and to test
whether the improvement in the playouts is worth the slowdown. You can
never guess it without testing.

This is, in my opinion, one of the most difficult things to do. Choosing
exactly what to enable and what not requires testing a lot of
combinations, and the improvement from each of them is small, so you
have to run a lot of games to see whether there is an improvement. It's
my big frustration that I don't currently have a computer that can be up
and connec

Re: [computer-go] Elo move prediction

2009-11-20 Thread Petr Baudis
  Hi!

On Thu, Nov 19, 2009 at 07:03:56PM -0800, Seth Pellegrino wrote:
> Thank you both for your speedy replies. I do apologize, though, as I
> seem to have left some important information out:
> 
> * I know that 70pps is useless, hence my dismay when that's the speed
> I was getting.

> * I am running these playouts on a 19x19 board, which does slow things
> down a little bit
> * I've already got a stable base that I'm working from (specifically,
> Peter Drake's Orego[1]).

  But you said the 70pps figure is from even _before_ you turn on the
ELO prediction - so you are effectively saying that your base speed is
useless; then no matter how fast you make the ELO part, your final
speed will of course remain useless.

  I think it is obvious you should completely forget about gamma
computation and feature checks for now, and focus on bringing your base
engine up to a reasonable speed - of course, you should at the same
time consider which features you will need the engine to provide (e.g.
real liberties) and move them there.

  (You could also choose a better base to work from,
e.g.  libego, Pachi or Fuego, but maybe you are expected to use Orego
for your project. ;-)

  P.S.: I wonder what top speed Java-based engines can currently
achieve? Does anyone have any data? How fast is refbot?

-- 
Petr "Pasky" Baudis
A lot of people have my books on their bookshelves.
That's the problem, they need to read them. -- Don Knuth


Re: [computer-go] Elo move prediction

2009-11-19 Thread Seth Pellegrino
Apologies all for replying to myself so many times, but Darren Cook
has been kind enough to help me clear the matter up. Essentially, I
had misread his paper. I was trying to compute the expensive features
Rémi uses for progressive widening as a move selector for playouts
rather than the stripped down, speedy features he actually uses.

Thank you all for your help in increasing my understanding.

Seth Pellegrino

On Thu, Nov 19, 2009 at 7:07 PM, Seth Pellegrino  wrote:
> Ah, curse the accidental send. A few clarifications below:
>
> On Thu, Nov 19, 2009 at 7:03 PM, Seth Pellegrino  wrote:
>> Hello Tom and Petr!
>>
>> Thank you both for your speedy replies. I do apologize, though, as I
>> seem to have left some important information out:
>>
>> * I know that 70pps is useless, hence my dismay when that's the speed
>> I was getting.
>> * I am running these playouts on a 19x19 board, which does slow things
>> down a little bit
>> * I've already got a stable base that I'm working from (specifically,
>> Peter Drake's Orego[1]).
>>
>> I tried as you suggested, Tom, and didn’t check for the legality of
>> each move before computing the associated gamma, but that still
>> doesn't get me to a feasible play level. It seems that, the way mode
>> code is written, iterating over a collection of vacant points and
>> computing which features are present is too expensive.
>
> That should be "my" code, not "mode" code.
>
>> At present, Orego doesn't store real liberties, so I'm paying a huge
>> price trying to look those up every time. However, after running the
>> code through a profiler, it seems that only 50% of my time or so is
>> being spent looking up those liberties. Granted, eliminating that
>> bottleneck should double the speed of my code, but jumping from 80pps
>> to 160pps still leaves me a bit underwater.
>>
>> Unfortunately, I'm a bit at a loss of how to compute the features
>> faster. I am considering creating a method that would compute whether
>> the last move put any groups into atari (and then store that answer
>> until the board's state changed), but I can't help but feel that I'm
>> doing something very wrong. For instance, I'm presently doing
>
> To complete this thought: I'm presently computing each level of each
> feature for each point. Would it make more sense to compute each level
> of each point for each feature (i.e. nesting the point loop inside the
> feature loop rather than vice versa), or (as I suspect) does it not
> make that much a difference?
>
>> Also, Tom, I was wondering if you could speak to adding patterns
>> larger than 3x3. A nice feature of 3x3 patterns is that they can be
>> jammed into two bytes (4 possibilities [black, white, vacant, off
>> board] for 8 positions), which facilitates the creation of a lookup
>> table. I've yet to figure out as fast a method for 4x4 patterns (or
>> larger), so I was wondering both how you've done it and how much it
>> helped.
>>
>> Thank you much,
>>
>> Seth
>>
>> [1] http://legacy.lclark.edu/~drake/Orego.html
>>
>> On Thu, Nov 19, 2009 at 9:27 AM, Thomas Lavergne
>>  wrote:
>>> On Thu, Nov 19, 2009 at 09:43:41AM +0100, Petr Baudis wrote:
   Hi!

 On Thu, Nov 19, 2009 at 12:35:09AM -0800, Seth Pellegrino wrote:
 > I'm working on an implementation of Rémi Coulom's Elo-based move
 > prediction algorithm[1], and I seem to be running into efficiency
 > issues. With my current implementation, I'm running at around 8 or 10
 > playouts per second. I can't imagine that the playouts would be so
 > accurate that I would be able to accurately evaluate my next move at
 > that speed. It seems that simply visiting each vacant point on the
 > board and checking whether that move would be a legal move for the
 > current player (without even computing which gammas are present at a
 > point) takes me down to about 60-75 playouts per second.

   What language are you using for your implementation? 60-75 playouts
 per second is still *extremely* slow, it seems you are in for much
 optimizing yet. Common speed of heavy playouts is at least _thousands_
 of playouts per second (per core of modern CPU), in case of light
 playouts it should be getting into tens of thousands.

   I'm not sure how much slowdown in general you get by implementing ELO
 patterns, the improvement has not been implemented commonly yet in
 public codebases (though it does seem very promising, no doubt); I plan
 to implement it in Pachi later, but I want to try out few different
 things first yet.

>>>
>>> This is indeed extremly slow. As a reference, some number from my own
>>> bot who is still in full rewrite. On my rather old Pentium4 at 1.6Ghz I
>>> get the following on 9x9 board:
>>>        v1 : ~54000pps
>>>        v2 : ~49000pps
>>>        v3 : ~41000pps
>>>        v4 : ~26000pps
>>> v1 do pure random playouts by selecting play randomly in the empty point
>>>   list and testing for validity.
>

Re: [computer-go] Elo move prediction

2009-11-19 Thread terry mcintyre
It may be helpful to incrementally compute information on a move-by-move basis, 
rather than recompute for all points on the board. 

For example: if you choose to store the liberties of groups, then when a
single stone is placed on the board, it affects only those strings which
have thereby lost a liberty. It may lead to combining two or more strings.
It may also lead to a capture, which creates liberties -- but captures are
much less frequent than non-capturing moves, so it pays to optimize the
non-capturing case heavily.
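
This can be sketched as follows (hypothetical structures, not any particular engine's code): a placed stone touches at most four strings, and each of them just deletes that one point from its liberty set, so nothing else on the board is visited.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of per-string liberty sets with O(1) updates.
final class GoString {
    final Set<Integer> liberties = new HashSet<>();
}

final class LibertyTracker {
    // Called when a stone lands on `point`, with the adjacent strings
    // found by inspecting only the four neighbouring cells.
    static void onStonePlaced(int point, GoString[] adjacentStrings) {
        for (GoString s : adjacentStrings) s.liberties.remove(point);
    }
}
```

Merging strings unions two liberty sets, and a capture re-adds the freed points to its neighbours' sets; both are still proportional to the stones involved, not to the whole board.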


Similar approaches may work for whatever else is being measured. If one stone 
is played at a time, what does this change? Recomputing anything 361 times for 
each move of each playout is likely to be sluggish.

Terry McIntyre 


Anarchism is founded on the observation that since few men are wise enough to 
rule themselves, even fewer are wise enough to rule others. - Edward Abbey



Re: [computer-go] Elo move prediction

2009-11-19 Thread Seth Pellegrino
Ah, curse the accidental send. A few clarifications below:

On Thu, Nov 19, 2009 at 7:03 PM, Seth Pellegrino  wrote:
> Hello Tom and Petr!
>
> Thank you both for your speedy replies. I do apologize, though, as I
> seem to have left some important information out:
>
> * I know that 70pps is useless, hence my dismay when that's the speed
> I was getting.
> * I am running these playouts on a 19x19 board, which does slow things
> down a little bit
> * I've already got a stable base that I'm working from (specifically,
> Peter Drake's Orego[1]).
>
> I tried as you suggested, Tom, and didn’t check for the legality of
> each move before computing the associated gamma, but that still
> doesn't get me to a feasible play level. It seems that, the way mode
> code is written, iterating over a collection of vacant points and
> computing which features are present is too expensive.

That should be "my" code, not "mode" code.

> At present, Orego doesn't store real liberties, so I'm paying a huge
> price trying to look those up every time. However, after running the
> code through a profiler, it seems that only 50% of my time or so is
> being spent looking up those liberties. Granted, eliminating that
> bottleneck should double the speed of my code, but jumping from 80pps
> to 160pps still leaves me a bit underwater.
>
> Unfortunately, I'm a bit at a loss of how to compute the features
> faster. I am considering creating a method that would compute whether
> the last move put any groups into atari (and then store that answer
> until the board's state changed), but I can't help but feel that I'm
> doing something very wrong. For instance, I'm presently doing

To complete this thought: I'm presently computing each level of each
feature for each point. Would it make more sense to compute each level
of each point for each feature (i.e. nesting the point loop inside the
feature loop rather than vice versa), or (as I suspect) does it not
make that much a difference?

> Also, Tom, I was wondering if you could speak to adding patterns
> larger than 3x3. A nice feature of 3x3 patterns is that they can be
> jammed into two bytes (4 possibilities [black, white, vacant, off
> board] for 8 positions), which facilitates the creation of a lookup
> table. I've yet to figure out as fast a method for 4x4 patterns (or
> larger), so I was wondering both how you've done it and how much it
> helped.
>
> Thank you much,
>
> Seth
>
> [1] http://legacy.lclark.edu/~drake/Orego.html
>
> On Thu, Nov 19, 2009 at 9:27 AM, Thomas Lavergne
>  wrote:
>> On Thu, Nov 19, 2009 at 09:43:41AM +0100, Petr Baudis wrote:
>>>   Hi!
>>>
>>> On Thu, Nov 19, 2009 at 12:35:09AM -0800, Seth Pellegrino wrote:
>>> > I'm working on an implementation of Rémi Coulom's Elo-based move
>>> > prediction algorithm[1], and I seem to be running into efficiency
>>> > issues. With my current implementation, I'm running at around 8 or 10
>>> > playouts per second. I can't imagine that the playouts would be so
>>> > accurate that I would be able to accurately evaluate my next move at
>>> > that speed. It seems that simply visiting each vacant point on the
>>> > board and checking whether that move would be a legal move for the
>>> > current player (without even computing which gammas are present at a
>>> > point) takes me down to about 60-75 playouts per second.
>>>
>>>   What language are you using for your implementation? 60-75 playouts
>>> per second is still *extremely* slow, it seems you are in for much
>>> optimizing yet. Common speed of heavy playouts is at least _thousands_
>>> of playouts per second (per core of modern CPU), in case of light
>>> playouts it should be getting into tens of thousands.
>>>
>>>   I'm not sure how much slowdown in general you get by implementing ELO
>>> patterns, the improvement has not been implemented commonly yet in
>>> public codebases (though it does seem very promising, no doubt); I plan
>>> to implement it in Pachi later, but I want to try out few different
>>> things first yet.
>>>
>>
>> This is indeed extremly slow. As a reference, some number from my own
>> bot who is still in full rewrite. On my rather old Pentium4 at 1.6Ghz I
>> get the following on 9x9 board:
>>        v1 : ~54000pps
>>        v2 : ~49000pps
>>        v3 : ~41000pps
>>        v4 : ~26000pps
>> v1 do pure random playouts by selecting play randomly in the empty point
>>   list and testing for validity.
>> v2 do pure random playouts but select play from the empty list according
>>   to their weight who is set to 1.0 for all plays. (this is for testing
>>   the impact of drawing from weights)
>> v3 do weighted playouts using elo rating to weights plays with only 3x3
>>   patterns.
>> v4 same as v3 but use big patterns: up to size 9 manathan paterns.
>>
>> This is probably not the speedest implementation you can have, but I
>> think its good performance for a programme fully written in ansi/C
>> without assembly.
>> You can also use libego to check speed on your machine f

Re: [computer-go] Elo move prediction

2009-11-19 Thread Seth Pellegrino
Hello Tom and Petr!

Thank you both for your speedy replies. I do apologize, though, as I
seem to have left some important information out:

* I know that 70pps is useless, hence my dismay when that's the speed
I was getting.
* I am running these playouts on a 19x19 board, which does slow things
down a little bit
* I've already got a stable base that I'm working from (specifically,
Peter Drake's Orego[1]).

I tried as you suggested, Tom, and didn’t check for the legality of
each move before computing the associated gamma, but that still
doesn't get me to a feasible play level. It seems that, the way mode
code is written, iterating over a collection of vacant points and
computing which features are present is too expensive.

At present, Orego doesn't store real liberties, so I'm paying a huge
price trying to look those up every time. However, after running the
code through a profiler, it seems that only 50% of my time or so is
being spent looking up those liberties. Granted, eliminating that
bottleneck should double the speed of my code, but jumping from 80pps
to 160pps still leaves me a bit underwater.

Unfortunately, I'm a bit at a loss of how to compute the features
faster. I am considering creating a method that would compute whether
the last move put any groups into atari (and then store that answer
until the board's state changed), but I can't help but feel that I'm
doing something very wrong. For instance, I'm presently doing

Also, Tom, I was wondering if you could speak to adding patterns
larger than 3x3. A nice feature of 3x3 patterns is that they can be
jammed into two bytes (4 possibilities [black, white, vacant, off
board] for 8 positions), which facilitates the creation of a lookup
table. I've yet to figure out as fast a method for 4x4 patterns (or
larger), so I was wondering both how you've done it and how much it
helped.
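
The packing described above might look like this in Java (a sketch only; Orego's actual representation may differ): two bits per neighbour, eight neighbours, so the packed value directly indexes a 65,536-entry gamma lookup table.

```java
// Hypothetical sketch of packing a 3x3 neighbourhood into 16 bits.
final class Pattern3x3 {
    static final int BLACK = 0, WHITE = 1, VACANT = 2, OFF_BOARD = 3;

    // neighbours[i] is the colour of the i-th of the 8 surrounding
    // points; each colour takes 2 bits of the result.
    static int pack(int[] neighbours) {
        int pattern = 0;
        for (int i = 0; i < 8; i++)
            pattern |= neighbours[i] << (2 * i);
        return pattern;  // always fits in 16 bits
    }

    // Recover one neighbour's colour from a packed pattern.
    static int colourAt(int pattern, int i) {
        return (pattern >> (2 * i)) & 3;
    }
}
```

For 4x4 and larger shapes the index space explodes (24 positions x 2 bits is already 2^48), which is why larger patterns are typically hashed (e.g. with Zobrist keys) rather than used as direct table indices.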

Thank you much,

Seth

[1] http://legacy.lclark.edu/~drake/Orego.html

On Thu, Nov 19, 2009 at 9:27 AM, Thomas Lavergne
 wrote:
> On Thu, Nov 19, 2009 at 09:43:41AM +0100, Petr Baudis wrote:
>>   Hi!
>>
>> On Thu, Nov 19, 2009 at 12:35:09AM -0800, Seth Pellegrino wrote:
>> > I'm working on an implementation of Rémi Coulom's Elo-based move
>> > prediction algorithm[1], and I seem to be running into efficiency
>> > issues. With my current implementation, I'm running at around 8 or 10
>> > playouts per second. I can't imagine that the playouts would be so
>> > accurate that I would be able to accurately evaluate my next move at
>> > that speed. It seems that simply visiting each vacant point on the
>> > board and checking whether that move would be a legal move for the
>> > current player (without even computing which gammas are present at a
>> > point) takes me down to about 60-75 playouts per second.
>>
>>   What language are you using for your implementation? 60-75 playouts
>> per second is still *extremely* slow, it seems you are in for much
>> optimizing yet. Common speed of heavy playouts is at least _thousands_
>> of playouts per second (per core of modern CPU), in case of light
>> playouts it should be getting into tens of thousands.
>>
>>   I'm not sure how much slowdown in general you get by implementing ELO
>> patterns, the improvement has not been implemented commonly yet in
>> public codebases (though it does seem very promising, no doubt); I plan
>> to implement it in Pachi later, but I want to try out few different
>> things first yet.
>>
>
> This is indeed extremely slow. As a reference, here are some numbers
> from my own bot, which is still in full rewrite. On my rather old
> Pentium 4 at 1.6 GHz I get the following on a 9x9 board:
>        v1 : ~54000pps
>        v2 : ~49000pps
>        v3 : ~41000pps
>        v4 : ~26000pps
> v1 does pure random playouts by selecting a play randomly from the
>   empty-point list and testing it for validity.
> v2 does pure random playouts but selects the play from the empty list
>   according to point weights, all set to 1.0. (This is for testing the
>   impact of drawing from weights.)
> v3 does weighted playouts, using Elo ratings to weight plays with only
>   3x3 patterns.
> v4 is the same as v3 but uses big patterns: Manhattan patterns up to
>   size 9.
>
> This is probably not the fastest implementation you can have, but I
> think it's good performance for a program fully written in ANSI C
> without assembly.
> You can also use libego to check the speed of pure random playouts on
> your machine.
>
>
> Some tricks from my experience:
> - A lot of things can be computed incrementally and will be a lot
> slower if you recompute them each time: the list of empty points, the
> patterns, the cell weights...
> - Don't check the validity of every play in the empty list before
> drawing one; just keep picking until you find a legal one. This is not
> theoretically exact, but in practice there are no problems and you can
> gain a lot of speed here.
> - Don't try to undo all moves at the end of a playout before the next
> one; just make a copy before starting the playout and do 
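
The draw-then-test legality trick from the quoted list can be sketched as below (hypothetical names; `isLegal` stands in for the engine's real check). It is rejection sampling: since almost all empty points are legal in practice, the expected number of retries is close to one.

```java
import java.util.List;
import java.util.Random;
import java.util.function.IntPredicate;

// Hypothetical sketch: pick a random empty point and retry on illegality,
// instead of filtering the whole empty list for legal moves first.
final class MovePicker {
    static final Random RNG = new Random();

    static int pickLegal(List<Integer> emptyPoints, IntPredicate isLegal) {
        while (true) {
            int p = emptyPoints.get(RNG.nextInt(emptyPoints.size()));
            if (isLegal.test(p)) return p;  // usually succeeds in O(1) tries
        }
    }
}
```

As the quote notes, this slightly biases the distribution relative to filtering first (illegal points are simply re-drawn), but in practice the bias is negligible and the speed gain is large.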

Re: [computer-go] Elo move prediction

2009-11-19 Thread Alain Baeckeroot
On 19/11/2009 at 09:35, Seth Pellegrino wrote:
> 
> Hello all,
> 
> I'm working on an implementation of Rémi Coulom's Elo-based move
> prediction algorithm[1], and I seem to be running into efficiency
> issues.

Very simple questions to start:
What is the programming language?
Do you use a 1-D representation of the board (a flat list) or 2-D?
1-D is much faster, and I guess all efficient bots use it.

You can have a look at the "brown" random player (from G. Farneback):
http://www.lysator.liu.se/~gunnar/gtp/brown-1.0.tar.gz
It was built to provide a GTP implementation and legal random moves.

my 2 cents.
Alain
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] Elo move prediction

2009-11-19 Thread Thomas Lavergne
On Thu, Nov 19, 2009 at 09:43:41AM +0100, Petr Baudis wrote:
>   Hi!
> 
> On Thu, Nov 19, 2009 at 12:35:09AM -0800, Seth Pellegrino wrote:
> > I'm working on an implementation of Rémi Coulom's Elo-based move
> > prediction algorithm[1], and I seem to be running into efficiency
> > issues. With my current implementation, I'm running at around 8 or 10
> > playouts per second. I can't imagine that the playouts would be so
> > accurate that I would be able to accurately evaluate my next move at
> > that speed. It seems that simply visiting each vacant point on the
> > board and checking whether that move would be a legal move for the
> > current player (without even computing which gammas are present at a
> > point) takes me down to about 60-75 playouts per second.
> 
>   What language are you using for your implementation? 60-75 playouts
> per second is still *extremely* slow; it seems you have a lot of
> optimizing ahead of you. A common speed for heavy playouts is at least
> _thousands_ of playouts per second (per core of a modern CPU); for
> light playouts it should reach tens of thousands.
> 
>   I'm not sure how much slowdown you get in general from implementing
> Elo patterns; the improvement has not been widely implemented in public
> codebases yet (though it does seem very promising, no doubt). I plan to
> implement it in Pachi later, but I want to try out a few different
> things first.
> 

This is indeed extremely slow. As a reference, here are some numbers
from my own bot, which is still being fully rewritten. On my rather old
Pentium 4 at 1.6 GHz I get the following on a 9x9 board:
v1 : ~54000 pps
v2 : ~49000 pps
v3 : ~41000 pps
v4 : ~26000 pps
v1 does pure random playouts by selecting a play randomly from the
   empty point list and testing it for validity.
v2 does pure random playouts but selects plays from the empty list
   according to their weights, which are all set to 1.0. (This is for
   testing the impact of drawing from weights.)
v3 does weighted playouts, using Elo ratings to weight plays with 3x3
   patterns only.
v4 is the same as v3 but uses big patterns: Manhattan patterns up to
   size 9.

This is probably not the fastest implementation you can have, but I
think it's good performance for a program written entirely in ANSI C
without assembly.
You can also use libego to check the speed of pure random playouts on
your machine.


Some tricks from my experience:
- A lot of things can be computed incrementally and will be much slower
if you recompute them each time: the list of empty points, the
patterns, the cell weights...
- Don't check the validity of every play in the empty list before
drawing one; just keep picking until you find a legal one. This is not
theoretically exact, but in practice there are no problems and you can
gain a lot of speed here.
- Don't try to undo all the moves at the end of a playout before
starting the next one; just make a copy of the board before the playout
and play on the copy.


As a plan for building a bot, I think you must first build a basic bot
with pure random playouts only and get it fast enough. You don't have
to reach 50 kpps, but I think you should get to at least 10 kpps.

When this works and is fully tested, start adding weighted drawing and
small fixed-size patterns. These are easy to implement and maintain
incrementally, and will let you debug your base further. At this point
you start to get strong playouts.

Only once you have this should you add bigger patterns.


The Elo rating system from Rémi is very powerful, but it's also very
tricky to implement efficiently in your bot, so start with a good base
before trying it.


When I wrote the first version of goober, I tried to put in all my
ideas and all the ideas of others that I found nice at the same time,
and after a while it started to become unmanageable and very difficult
to debug. Now I'm rewriting it fully from scratch, without making the
same mistake. The numbers above show its progress:
- I started by building a bot without tree search, with only pure
  random playouts;
- Next I made it work with weighted plays;
- Next I added small fixed-size patterns with Elo ratings;
- Adding big patterns was the trickiest part and the most difficult to
  get efficient.
Now I have a reliable playout engine with weighted plays. Weights are
computed from small or big patterns and a lot of other features. I'm
currently working on the tree-search part of the bot.
I already have a basic UCT without RAVE, but with a lockless
transposition table (the only part not in ANSI C), and I'm testing it
a lot before adding anything fancy.


Another advantage of this approach is that you can see the progress of
your bot and check the effectiveness of each addition.

Writing a bot is a lot of work, but it's also a lot of pleasure, so
good luck with yours.

Tom

-- 
Thomas Lavergne
"Entia non sunt multiplicanda praeter necessitatem." (Guillaume d'Ockham)
thomas.laver...@reveurs.org

Re: [computer-go] Elo move prediction

2009-11-19 Thread Petr Baudis
  Hi!

On Thu, Nov 19, 2009 at 12:35:09AM -0800, Seth Pellegrino wrote:
> I'm working on an implementation of Rémi Coulom's Elo-based move
> prediction algorithm[1], and I seem to be running into efficiency
> issues. With my current implementation, I'm running at around 8 or 10
> playouts per second. I can't imagine that the playouts would be so
> accurate that I would be able to accurately evaluate my next move at
> that speed. It seems that simply visiting each vacant point on the
> board and checking whether that move would be a legal move for the
> current player (without even computing which gammas are present at a
> point) takes me down to about 60-75 playouts per second.

  What language are you using for your implementation? 60-75 playouts
per second is still *extremely* slow; it seems you have a lot of
optimizing ahead of you. A common speed for heavy playouts is at least
_thousands_ of playouts per second (per core of a modern CPU); for
light playouts it should reach tens of thousands.

  I'm not sure how much slowdown you get in general from implementing
Elo patterns; the improvement has not been widely implemented in public
codebases yet (though it does seem very promising, no doubt). I plan to
implement it in Pachi later, but I want to try out a few different
things first.

-- 
Petr "Pasky" Baudis
A lot of people have my books on their bookshelves.
That's the problem, they need to read them. -- Don Knuth


[computer-go] Elo move prediction

2009-11-19 Thread Seth Pellegrino
Hello all,

I'm working on an implementation of Rémi Coulom's Elo-based move
prediction algorithm[1], and I seem to be running into efficiency
issues. With my current implementation, I'm running at around 8 or 10
playouts per second. I can't imagine that the playouts would be so
accurate that I would be able to accurately evaluate my next move at
that speed. It seems that simply visiting each vacant point on the
board and checking whether that move would be a legal move for the
current player (without even computing which gammas are present at a
point) takes me down to about 60-75 playouts per second.

My question to the list is this: How has this problem been solved in
the past? I feel compelled to use the Elo system because it generates a
nice probability distribution across the board, but it currently looks
infeasible. One thought I've come up with is that I might be abusing
the cache (as there's no guarantee of order in the vacant-point
collection I iterate over), but my understanding is that the points
tend to be in stride-one order even if they aren't guaranteed to be.

Thank you all for your time, I would appreciate any help you can give
on this issue.

Seth Pellegrino
Student, Lewis & Clark College


[1] http://remi.coulom.free.fr/Amsterdam2007/