[Scid-users] Fwd: Opening Training

Facil Sempre Sat, 27 Feb 2016 07:28:59 -0800

 I have been using the SCID opening training extensively and while it is a
useful utility, it can be improved. It seems that when it is the computer
move, it chooses a random move without taking care of anything else. That
is an easy way to get it running, but it turns out that it repeats the same
positions a lot while others are totally ignored (just because is a random
walk).


I do not know if there is a _right_ solution to this problem, but let me
explain what I think could be best than what it is actually coded now.

The relevant code in opening.tcl is:


################################################################################
  # play one of the candidate moves

################################################################################
  proc play { cm } {
    # addStatsPrev -good 0 -dubious 0 -absent 0 -total 1
    set r [expr int(rand()*[llength $cm]/2) ]
    set m [ lindex $cm [ expr $r * 2 ] ]

    if {[sc_pos moveNumber] == 1 && [sc_pos side] == "white"} {
      ::game::Clear
    }

    if {![catch {sc_move addSan [::untrans $m] }]} {
    }
    updateBoard -pgn -animate
  }

I won't pretend I understand exactly what it does, but it seems pretty
clear that it chooses a random move from the whole list.

I propose two options, one easy and one difficult. Both I think are far
beyond my capability as a programmer, but I want to hear your opinion
before even trying to do anything (hoping, I'm not going to lie to you)
that someone picks the gauntlet and does it for me.

*Easy option*:

Each move has a counter of numbers of times played, how many times there
have been a correct move and how many times there have been a bad move. So
for each candidate move (cm) we will add the numbers of its children nodes
(as it is a computer node, it cannot have bad moves per definition, so if
we want to take that into account, and I think it is important, we need to
do this trick), and that would be the computer node's *'times reached*', '*good
moves*' and '*bad moves*'.

For each node, we would do this weight:

cm.weight = 1/[(times reached+0.1 /* to avoid dividing by zero */) * (1-bad
moves/times reached-0.001 /* to avoid dividing by zero */)]

That will make moves that has been played less often more frequent, and
moves that have a higher percentage of mistake to be played more. However,
it will not take into account that there are branches bigger than others,
hence favoring simple lines over harder ones.

*Difficult option*:

We want to train the whole tree, and it is possible that a branch (1. e4
e5) is way larger than other branch (say 1.e4 Nc6). Hence the larger branch
should be drawn a lot more than the small one.

So let's say cm is the list of candidate moves. I would add to each move in
cm a number, which would be the number of nodes in the tree after that
branch is played. Moreover, I would keep count of how many times those
nodes have been reached and the number of bad moves played on those nodes,
to have something like this after 1.e4:

* 1...e5: nodes after that move = 1000; number of times those nodes have
been reached = 750; number of bad moves in those nodes = 150.
* 1...Nc6: nodes after that move = 100; number of times those nodes have
been reached = 80; number of bad moves in those nodes = 5.

Now we have to weight both moves on account of this data. What I propose is
to do this. We calculate this number for each move:

1/[(number of times nodes reached / total nodes) * (1-number of bad
moves/number of times nodes reached)]

1...e5 -> 1/[750/1000*(1-150/750)]=1/0.6=1.666
1...Nc6 -> 1/[80/100*(1-5/80)]=1/0.75=1.333

So those are the weight of the moves. It will play moves that has less
proportion of time reached vs total nodes (so we will get less seen
positions more often) and it will play more times those nodes where we have
higher percentage of mistakes.

Of course this is a formula I got but there should be plenty better than
that.

We have to add to that a little hiccup: when there exists transpositions
this will not work properly. Imagine our repertoire holds:

1.e4 e5 2.Nf3 Nc6 - 1000 positions.
1.e4 e5 2.Nf3 Nf6 - 100 positions
1.e4 e6 -1000 positions.
1.e4 Nc6 2.Nf3 something different than 2...e5 100 positions.

When choosing between 1...Nc6, 1...e5 and 1...e6, the first option actually
has 1100 nodes, because of the transposition to 1...e5, the second has also
1100 options, and the third has 1000. Hence we would end up going for 1.e4
e5 2.Nf3 Nc6 twice as much as we need. A good solution would be to count
each node only as a fraction

------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140

_______________________________________________
Scid-users mailing list
Scid-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scid-users

[Scid-users] Fwd: Opening Training

Reply via email to