Re: [Computer-go] Standard Computer Go Datasets - Proposal

2015-11-13 Thread Erik van der Werf
On Fri, Nov 13, 2015 at 10:46 AM, Darren Cook wrote:
>
> The advantages of storing games:
> * accountability/traceability
> * for programs who want to learn sequences of moves.
>
Another advantage of storing games is that it is much more efficient; you only have to encode

Re: [Computer-go] Standard Computer Go Datasets - Proposal

2015-11-13 Thread Gonçalo Mendes Ferreira
I think if you start calculating the Zobrist hashes and scraping features yourself you will have a neverending variety of datasets. I would prefer datasets of whole, high quality games without SGF errors, perhaps cleaned of identifying information. Parsing an SGF is already trivial. I
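To illustrate the point that extracting moves from an SGF takes little code, here is a minimal sketch. It assumes a clean, single-line-of-play record in which each move node starts with its B or W property, and it ignores variations, setup stones, the older `tt` pass notation, and move-like text inside comments. The names `MOVE_RE` and `sgf_moves` are hypothetical and not taken from any existing tool.

```python
import re

# Matches move nodes such as ;B[pd] or ;W[dp]; an empty value ([]) is a pass.
# Assumption: the move property is the first property in its node.
MOVE_RE = re.compile(r";\s*([BW])\[([a-s]{2})?\]")

def sgf_moves(sgf_text):
    """Return moves as (color, (col, row)) tuples, 0-indexed; None marks a pass."""
    moves = []
    for color, coord in MOVE_RE.findall(sgf_text):
        if not coord:
            moves.append((color, None))
        else:
            col = ord(coord[0]) - ord("a")
            row = ord(coord[1]) - ord("a")
            moves.append((color, (col, row)))
    return moves

example = "(;GM[1]FF[4]SZ[19];B[pd];W[dp];B[];W[qq])"
print(sgf_moves(example))
```

A real dataset pipeline would more likely rely on an established SGF library, since archived records often contain variations and malformed properties that a regular expression like this one silently skips.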

Re: [Computer-go] Standard Computer Go Datasets - Proposal

2015-11-13 Thread Gonçalo Mendes Ferreira
At least in the past, some DCNNs made use of the players' ranks, so it would be best to keep them.
On 11/13/2015 10:27 AM, Josef Moudrik wrote:
On Fri, Nov 13, 2015 at 11:16 AM Erik van der Werf wrote:
On Fri, Nov 13, 2015 at 10:46 AM, Darren Cook

Re: [Computer-go] Standard Computer Go Datasets - Proposal

2015-11-13 Thread Petr Baudis
Hi!
On Fri, Nov 13, 2015 at 08:39:20AM +, Josef Moudrik wrote:
> There has been some debate in science about making the research more
> reproducible and open. Recently, I have been thinking about making a
> standard public fixed dataset of Go games, mainly to ease comparison of
> different

Re: [Computer-go] Standard Computer Go Datasets - Proposal

2015-11-13 Thread Darren Cook
> standard public fixed dataset of Go games, mainly to ease comparison of
> different methods, to make results more reproducible and maybe free the
> authors of the burden of composing a dataset.
Maybe the first question should be whether people want a database of *positions* or *games*. I

Re: [Computer-go] Standard Computer Go Datasets - Proposal

2015-11-13 Thread fotland
I would only use it if it is licensed for commercial use.
David
On Fri, 13 Nov 2015 08:39:20 +, Josef Moudrik wrote:
Hello List,
There has been some debate in science about making the research more reproducible and open. Recently, I have been thinking about making a standard

Re: [Computer-go] Standard Computer Go Datasets - Proposal

2015-11-13 Thread Josef Moudrik
Hello,
On Fri, Nov 13, 2015 at 10:13 AM wrote:
> I would only use it if it is licensed for commercial use.
Yes, I would like to licence this as such, please see below.
On Fri, Nov 13, 2015 at 10:23 AM Petr Baudis wrote:
> I think the current de facto

Re: [Computer-go] Standard Computer Go Datasets - Proposal

2015-11-13 Thread Dave Dyer
I was recently working on assigning final scores to completed games, using the large data set from Badukmovies.com. My observation is that the size of the data set (50,000 games) is not large enough to get good coverage of unusual situations occurring in real games. There's a definite need

Re: [Computer-go] Standard Computer Go Datasets - Proposal

2015-11-13 Thread Steven Clark
To answer the original question: yes, the curation of a dataset like this would be hugely beneficial to the community. Look at what ImageNet has done for computer vision. In fact, it might be good to emulate ImageNet further and pre-split the dataset into a publicly-available training set, and a
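One way to make such a pre-split reproducible is to derive each game's partition from a hash of a stable identifier, so that everyone who downloads the collection arrives at the same split without shipping a separate index file. The sketch below is only an illustration: the two-way train/test partition, the 10% default, and the `game_id` argument (for example, a file name) are assumptions and not part of the proposal.

```python
import hashlib

def split_of(game_id, test_percent=10):
    """Assign a game to 'train' or 'test' from a hash of its identifier.

    Assumption: game_id is a stable string that never changes between
    releases of the dataset, such as the SGF file name.
    """
    digest = hashlib.sha256(game_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a bucket in 0..99.
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "test" if bucket < test_percent else "train"

for name in ["game-0001.sgf", "game-0002.sgf", "game-0003.sgf"]:
    print(name, split_of(name))
```

Because the assignment depends only on the identifier, adding new games later does not move existing games between partitions.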

Re: [Computer-go] Standard Computer Go Datasets - Proposal

2015-11-13 Thread Petr Baudis
Hi!
On Fri, Nov 13, 2015 at 09:46:54AM +, Darren Cook wrote:
> (I did wonder about storing player ranks, e.g. if a given position has a
> move chosen by only a single 9p, and you can then extract each follow-up
> position, you could extract a game. But, IMHO, you cannot regenerate any
>

Re: [Computer-go] Standard Computer Go Datasets - Proposal

2015-11-13 Thread Josef Moudrik
On Fri, Nov 13, 2015 at 11:16 AM Erik van der Werf wrote:
> On Fri, Nov 13, 2015 at 10:46 AM, Darren Cook wrote:
>>
>> The advantages of storing games:
>> * accountability/traceability
>> * for programs who want to learn sequences of moves.
>>
>
>