Olivier Teytaud wrote:

> whereas I am looking for
> (2) input = a database D
>     output = frequently matched patterns
This is what I do: I have thousands of separate SGF files and a program that merges them all into a single binary file. This program filters out: handicap games, players of insufficient rank, fast time settings, duplicate games, wrong board sizes, games with fewer than n moves, etc. I have posted some of these binary files here previously, and the one I use most is in the supplementary materials of my IWCG 2010 paper.

Paper: http://www.dybot.com/papers/meval.htm
Supplementary materials: http://www.dybot.com/meval/

Then I scan this binary file very fast for:

* Full board patterns (fuseki)
* Corner/side patterns (joseki)
* Around the last move patterns (urgency)
* Statistics: e.g. how frequent tenuki is at move n, how frequently tenuki goes to the previous last move, how frequent extension from atari is at move n, or whatever I can figure out.

Now the bad news: All the extraction is in thousands of undocumented Pascal (Delphi) and x86 assembly source lines. It compiles for Windows. My new engine has its new parts in C++ only, but it still contains a lot of assembly and Pascal. And the offline learning tools are mainly in Pascal.

It is not free software, but if you think it is interesting, I may be very happy to cooperate.

Jacques.

_______________________________________________
Computer-go mailing list
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
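To make the pipeline described above concrete, here is a minimal Python sketch of the filtering step and one of the statistics. It is not the Pascal/assembly code mentioned above. The `Game` record, the threshold values, the numeric rank scale, and the definition of tenuki (a move more than `max_dist` lines away from the previous move) are all assumptions made for illustration only.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Game:
    """Hypothetical in-memory record for one parsed game (not the real binary format)."""
    board_size: int
    handicap: int
    min_rank: int      # weaker player's rank on an assumed numeric scale (higher = stronger)
    main_time_s: int   # main time in seconds
    moves: tuple       # tuple of (x, y) coordinates in play order


def keep(game, *, board_size=19, min_rank=1, min_time_s=600, min_moves=50):
    """Return True if the game passes all filters (thresholds are illustrative)."""
    return (game.board_size == board_size
            and game.handicap == 0
            and game.min_rank >= min_rank
            and game.main_time_s >= min_time_s
            and len(game.moves) >= min_moves)


def filter_games(games, **thresholds):
    """Drop games that fail the filters, plus games with an identical move sequence."""
    seen = set()
    kept = []
    for g in games:
        if not keep(g, **thresholds):
            continue
        if g.moves in seen:  # duplicate game
            continue
        seen.add(g.moves)
        kept.append(g)
    return kept


def tenuki_frequency(games, max_dist=3):
    """Map each 1-based move number n >= 2 to the fraction of games in which
    move n lies more than max_dist lines (Chebyshev distance) from move n - 1.
    This distance rule is an assumed stand-in for a real tenuki definition."""
    tenuki = Counter()
    total = Counter()
    for g in games:
        for i in range(1, len(g.moves)):
            (x0, y0), (x1, y1) = g.moves[i - 1], g.moves[i]
            n = i + 1
            total[n] += 1
            if max(abs(x1 - x0), abs(y1 - y0)) > max_dist:
                tenuki[n] += 1
    return {n: tenuki[n] / total[n] for n in total}
```

Typical use would be `kept = filter_games(all_games)` followed by `tenuki_frequency(kept)`. A real scanner over a packed binary file would of course be much faster than this; the sketch only shows the logic.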
