Hello List,

There has been some debate in science about making research more
reproducible and open. Recently, I have been thinking about creating a
standard, public, fixed dataset of Go games, mainly to ease comparison of
different methods, to make results more reproducible, and perhaps to free
authors from the burden of composing their own datasets. I think the
current practice leaves a lot of room for improvement.

Since the success of this endeavor depends crucially on how many authors
use the dataset, I would like to ask you (the potential authors) a few
questions:

1) Would this be welcomed and used? Would you personally use it? (Or am I
just reinventing the wheel?)

2) What parameters should the dataset have? In my opinion, the number of
dataset variants (if any) should be kept to a bare minimum to reduce
"fragmentation".

2a) Size: My current view is that at least two sizes are needed: a small
dataset (1,000-2,000 games?) and a large one (50,000-60,000 games).
2b) Strength & year span: At the moment I am thinking of including only
modern professional games (1970-2015). (See the rough selection sketch
after question 3 for how such a fixed subset could be drawn reproducibly.)

3) Do you have any other comments, requirements, or ideas for the dataset?
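
To make "fixed" and "reproducible" concrete, here is a rough Python sketch
of how the selection could be done deterministically, so that anyone
re-running it over the same SGF collection gets exactly the same small and
large subsets. The folder name, subset sizes, and seed below are just
placeholders for illustration, not a proposal for the actual values:

import random
import re
from pathlib import Path

SOURCE_DIR = Path("sgf_collection")      # placeholder: folder of professional SGF games
YEAR_RANGE = range(1970, 2016)           # 1970-2015, as suggested in 2b)
SIZES = {"small": 2000, "large": 60000}  # placeholder sizes from 2a)
SEED = 20150101                          # fixed seed -> the same selection for everyone

def game_year(sgf_path):
    # Read the year from the SGF DT[] property; return None if it is missing.
    text = sgf_path.read_text(errors="ignore")
    match = re.search(r"DT\[(\d{4})", text)
    return int(match.group(1)) if match else None

def in_range(sgf_path):
    year = game_year(sgf_path)
    return year is not None and year in YEAR_RANGE

# Sort the paths first so the shuffle below is deterministic across machines.
candidates = sorted(p for p in SOURCE_DIR.rglob("*.sgf") if in_range(p))

rng = random.Random(SEED)
rng.shuffle(candidates)

# The small set is a prefix of the large one, so results on both are comparable.
for name, size in SIZES.items():
    subset = candidates[:size]
    listing = Path("dataset_%s.txt" % name)
    listing.write_text("\n".join(str(p) for p in subset) + "\n")
    print("%s: %d games listed in %s" % (name, len(subset), listing))

The point is only that sorting the file list before a seeded shuffle makes
the resulting subsets independent of filesystem order; whether we would
distribute the SGF files themselves or just such game lists is open.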


Thanks for your attention,
Kind regards,
Josef Moudrik
