Re: [Computer-go] Standard Computer Go Datasets - Proposal

Josef Moudrik Fri, 13 Nov 2015 02:06:07 -0800

Hello,

On Fri, Nov 13, 2015 at 10:13 AM <[email protected]> wrote:
> I would only use it if it is licensed for commercial use.

Yes, I would like to license it as such; please see below.

On Fri, Nov 13, 2015 at 10:23 AM Petr Baudis <[email protected]> wrote:

> I think the current de facto standard dataset is GoGoD (some year, not
> quite fixed). So I think it's useful to differentiate your proposal
> against this dataset - what are the current problems and what will be
> the advantage?

Yes, I know GoGoD is used frequently, but I think the lack of a "precise"
specification is the problem. An author has to make many choices when using
the GoGoD database: the year of release, the year span, whether to include
handicap games, and whether to include amateur or only professional games
(and how to tell them apart, since a pro rank is recorded as d, not p).
A related issue is that some of the games (if I remember my experience
correctly) cannot be parsed by some libraries, in which case they are
usually skipped. All of these are branching points that make "precise"
replication of results hard.
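
To illustrate the kind of explicit specification meant here, below is a minimal
sketch in Python. It is not part of the proposal and not tied to any real SGF
library: SelectionSpec, read_properties and select are hypothetical names, the
regex reader is a simplification that only looks at the first value of each
property, and it assumes the standard SGF properties DT (date) and HA
(handicap) are present in the records. The point is only that every choice is
written down and that skipped games are reported with a reason rather than
dropped silently.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectionSpec:
    """Hypothetical selection criteria; field names are illustrative only."""
    year_from: int
    year_to: int
    allow_handicap: bool


# Simplified SGF property matcher: IDENT[value], honouring backslash escapes.
_PROP = re.compile(r"([A-Z]+)\[((?:[^\]\\]|\\.)*)\]", re.DOTALL)


def read_properties(sgf_text):
    """Return the first value seen for each property, e.g. {'DT': '1992-03-01'}."""
    props = {}
    for key, value in _PROP.findall(sgf_text):
        props.setdefault(key, value)
    return props


def select(games, spec):
    """Split (name, sgf_text) pairs into kept names and (name, reason) skips."""
    kept, skipped = [], []
    for name, text in games:
        props = read_properties(text)
        year = re.match(r"\d{4}", props.get("DT", ""))
        if year is None:
            skipped.append((name, "no parsable year"))
        elif not spec.year_from <= int(year.group()) <= spec.year_to:
            skipped.append((name, "outside year span"))
        elif not spec.allow_handicap and props.get("HA", "0") not in ("", "0"):
            skipped.append((name, "handicap game"))
        else:
            kept.append(name)
    return kept, skipped
```

Publishing the spec together with the list of skipped games would let someone
else check that they ended up with exactly the same set.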

> One advantage would be of course if the dataset is freely available.
> But it's not clear how to achieve that, i.e. where to get a large
> professional game collection without copyright protection.

I consider this "negotiation" the hardest work I will have to do, but
before I start, I want to find out whether the dataset would even be used.
From the point of view of copyright law, I believe that what is protected
is the "collection of games" and the "additional materials" (comments,
etc.), not the individual games themselves (which, as records of historical
events, afaik cannot be copyrighted). The current collection owners' rights
to the "collection of games" and "additional materials" could be protected
by anonymizing the records and mixing different databases, if the current
owners agree.

From the licensing point of view, again provided that the owners agree, I
would like to release the dataset under something like a
free-for-all-purposes-with-attribution license. I have yet to research this.

> What's the usecase for a small dataset?

I had prototype testing in mind, so that authors can say "our method is slow,
so we only tested on the SmallGoDataset" instead of "we randomly took 1000
games from the BigGoDataset", but I assume there would be other use cases as
well. Anyway, I don't think having both a big and a small dataset would
cause much fragmentation of use, because their use cases would be
different. But maybe I am overthinking things and this would not be used
much.
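
As a rough illustration of why a fixed small set differs from "we randomly
took 1000 games", here is one possible way (an assumption for the sake of
example, not something decided in this proposal) to derive a small subset that
depends only on the game identifiers, so that anyone starting from the same
big set gets the same small one. The function name fixed_subset is
hypothetical.

```python
import hashlib


def fixed_subset(names, size):
    """Pick `size` names based only on a hash of each name.

    The result does not depend on the input order or on any random seed.
    """
    def key(name):
        return hashlib.sha256(name.encode("utf-8")).hexdigest()

    return sorted(sorted(names, key=key)[:size])
```

Shipping the small dataset as an explicit list of games would of course make
even this step unnecessary.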

Regards,
Josef

_______________________________________________
Computer-go mailing list
[email protected]
http://computer-go.org/mailman/listinfo/computer-go
