On 01/02/11 19:52, Benoit St-Pierre wrote:

Hi!

> This project dates to the days me and Alex were arguing with Pascal
> for the need to add unique IDs to games.

Right.

> Pascal said that adding IDs to games would slow Scid too much.

He does not like the size the indexes might grow to, right. But this actually targets quite another issue. Having unique IDs would allow one to cite games precisely, to refer to them in a unique way and so on. This is even far better than a high-quality base. Note also that those IDs would not live within a database but would be meta-metadata. One could envision a system like DOI or URI here. All documents referred to by a DOI also have other metadata like author, title, publication notes and so on. It is very similar.

> Without unique IDs, trying to have a *high-quality* database is way to
> error-prone to my own taste.

Not necessarily. The point is: without a manually sorted and cleaned database of games, the assignment of game IDs does not make any sense. So one would have to sort out the games and build a database first. The idea of the project, called CentriScid back then, was to set up a base of games where names are checked, tournaments are complete and dupes are removed. A human can do that pretty easily, e.g. you just spot the various spellings of Chalkidiki. As Gerd mentions, with the current state of game metadata you're almost lost with automatic procedures. The fact that there is no unique game ID to this day makes things complicated and requires a lot of work. IMHO it can only be pulled off by some Wikipedia-like effort.

> If we can design some objective ways to insure quality-control, I
> could be back in.

The pragmatic way would be something like: Chris works on Zurich Candidates 1953. He checks the games, cleans the data, and once he ticks off Zurich 1953 this tournament is done. Full stop. Next one... Unless someone spots an inconsistency it is never touched again.

That way I have a preliminary starting point of about 105,000 games here, contributed by some diligent users on this list. The basic infrastructure needed to do this is: "You work on this, I work on that". I think all you need is a Wiki to jot down which tournaments are already done. (This is the nice thing: you can work by years and tournaments instead of individual games. This gives some nice chunks and a natural order.) If I remember correctly, the guys who did the work so far just split it by time frames and then started to work through some game archives on the Net tourney by tourney, unified player names, corrected missing data as far as possible and so on.

Till here: no need for IDs or fancy infrastructure or deep technical knowledge. It is just a huge amount of work. Starting a community effort to set up such a base and make it available for download to our users would be great indeed. Nobody can do that alone.

cu
Alexander

PS: For the time being I'd leave out commented games to avoid any sort of trouble in advance. It could be worthwhile to collect them separately with identical PGN headers. (BTW: cross-referencing would then indeed be easy with some sort of ID.) There might be copyright issues here, and not all players might agree with Anand's views on this issue. (To the best of my knowledge, annotating a game of chess does not give you ANY copyright on that annotation. The legal reasoning here is AFAIK that chess is logic, so annotating chess does not involve any creative act, and thus you cannot obtain copyright on an annotation.)

_______________________________________________
Scid-users mailing list
Scid-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scid-users