There is no reason to believe the lines will cross because they seem to
be moving away from each other starting at the origin.

Also, the lines are not straight: there appears to be a gentle curve to
both of them. A decade or two ago we talked of these lines as being
straight or "linear" because we could only see a portion of what we can
see now. It was not feasible to run thousands of games at high depth
levels. I'm doing more testing in one day than we could do in 2 or 3
years.

It's possible there would be less curvature if I plotted by time, or if
I used a full-width version of both programs. I can only guess about
this. I believe a good selective search improves faster with time, but
slower with depth.

The metric that matters most for practical engineers is time, because
you want a program to play as well as possible in the least amount of
time. I would have preferred to do this study with time, perhaps
setting up levels where each used 2X more time than the previous one.
However, it's difficult to do this accurately on a computer that is
being heavily used for other things too.
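For what it's worth, the doubling scheme is simple to set up; here is a
hypothetical sketch (the base time and number of levels are made-up
values for illustration, not the settings from this study):

```python
# Hypothetical time-control levels, each using 2X the time of the
# previous one. Base time and level count are illustrative assumptions.
base_seconds = 0.25   # assumed thinking time per move at the lowest level
num_levels = 8        # assumed number of levels

levels = [base_seconds * 2 ** i for i in range(num_levels)]
# e.g. levels[0] is 0.25 s and levels[-1] is 32.0 s
```

The catch, as noted above, is that wall-clock time is only meaningful if
the machine isn't loaded with other work.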

It would have been possible to do this based on total nodes searched,
but that would have required some re-engineering of the autotester (to
make it abort searching when the specified number of nodes has been
exceeded, and to ignore moves that might have been selected after
this). My own chess programs always had a way to specify levels based
on total nodes searched, because that was very useful for consistent
testing.
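As a rough sketch of what such a node-based level can look like inside
an iterative-deepening driver (this is a hypothetical toy, not the
autotester's or any real engine's code), the key point is that a move
picked after the budget is blown gets discarded:

```python
class NodeLimitedSearch:
    """Toy iterative-deepening driver with a node budget (illustrative)."""

    def __init__(self, node_limit):
        self.node_limit = node_limit
        self.nodes = 0

    def search_root(self, moves, depth):
        # Stand-in for a real alpha-beta search: charge an exponentially
        # growing node count per iteration and pick some move.
        self.nodes += 3 ** depth
        return moves[depth % len(moves)]

    def search(self, moves, max_depth=64):
        best_move = None
        for depth in range(1, max_depth + 1):
            move = self.search_root(moves, depth)
            if self.nodes > self.node_limit:
                break            # budget exceeded: ignore this iteration's move
            best_move = move     # only fully completed iterations count
        return best_move
```

Because the node count is deterministic for a given position, two runs
at the same node level search identically, which is what makes this so
useful for consistent testing.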

If you look at the chart, you see that in a sense the versions do
"cross" each other, at least at the moment and based on the current
data, which is still subject to a lot of statistical noise. Consider
what you see now if you assume the numbers are accurate:

For the first 3 depths, the program doing the deeper search is the
strongest regardless of evaluation function. For instance, Weak-3 is
stronger than Strong-2, just as you would expect. After that, however,
it appears that the weak evaluation function needs 2 extra ply to be
stronger. At the high end of the range, it appears to require an
additional 3 plies of depth to be stronger, because a 9 ply search with
the strong evaluation function is beating the weak 11 ply search, so it
would take at least a 12 ply search with the weak evaluation function
to beat a measly 9 ply search.

If those numbers hold up, the implication is that the deeper you
search, the more important your evaluation function becomes. I'm sure
it's possible to put other spins on this, and I'm sure people will, but
this seems like the most reasonable working premise.

I think the numbers will hold up even though there is a lot of noise in
the data. Although each data point is noisy, the general trend is not
ambiguous.

- Don

Ivan Dubois wrote:
> I have a question :
> If the lines in the graph are straight lines and they don't have the
> same increase rate, then isn't there a point where they should cross?
> Do they all cross at the same point?
> I guess this point (if it exists) would indicate some kind of starting
> point: it would correspond to the weakest possible strength.
>
> Any thoughts ?
>
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>