> unique IDs to ensure a *high-quality* cannot be achieved. How should the ID 
> be calculated?

The ID is simply assigned and maintained by the database facility.  In
relational database theory, I believe that having unique IDs is
axiomatic.  I am not sure that relational databases can work at all
without unique IDs.
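
To illustrate what "given and maintained by the database" could look
like, here is a minimal sketch that uses SQLite (through Python's
standard sqlite3 module) as a stand-in.  The table name, the column
names and the insert_games helper are all hypothetical; this is not
how Scid stores games.

```python
import sqlite3

def insert_games(conn, games):
    """Insert (white, black, moves) tuples; return the IDs the database assigned."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS game ("
        " id INTEGER PRIMARY KEY,"  # SQLite assigns a unique integer here
        " white TEXT, black TEXT, moves TEXT)"
    )
    ids = []
    for white, black, moves in games:
        cur = conn.execute(
            "INSERT INTO game (white, black, moves) VALUES (?, ?, ?)",
            (white, black, moves),
        )
        ids.append(cur.lastrowid)  # ID chosen by the database, not by us
    conn.commit()
    return ids

# Hypothetical sample data: two rows with identical content.
conn = sqlite3.connect(":memory:")
ids = insert_games(conn, [
    ("Anderssen", "Kieseritzky", "1.e4 e5 2.f4 exf4"),
    ("Anderssen", "Kieseritzky", "1.e4 e5 2.f4 exf4"),
])
```

Note that the two rows get distinct IDs even though their content is
identical: the ID only identifies a row, it does not detect duplicates.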

This ID would guarantee the internal consistency of the database.  To
have something like external consistency, we would need a service akin
to URI or DOI conventions.

The way I envision it, we could completely kill off duplicates by
first ensuring that we have the proper metadata and the correct
game score.  Then, as soon as a game is sufficiently similar to these
corrected data, we would simply delete that game and replace it with
the sanitized data.
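
The replacement step could be sketched roughly as follows.  This is
only an illustration under loud assumptions: games are reduced to
their move text, similarity is measured with difflib.SequenceMatcher
from the Python standard library, and the 0.9 threshold is an
arbitrary placeholder.  The function names are hypothetical, and a
real tool would also have to compare metadata.

```python
from difflib import SequenceMatcher

def normalize(moves):
    """Collapse whitespace and case so trivial formatting differences vanish."""
    return " ".join(moves.lower().split())

def similarity(a, b):
    """Return a ratio in [0, 1]; 1.0 means the normalized scores are identical."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def sanitize(games, reference, threshold=0.9):
    """Replace each game close enough to a reference score with that reference.

    Games that match no reference are kept unchanged; exact repeats in the
    result are dropped.
    """
    result = []
    for game in games:
        match = next(
            (ref for ref in reference if similarity(game, ref) >= threshold),
            None,
        )
        replacement = match if match is not None else game
        if replacement not in result:
            result.append(replacement)
    return result
```

Calling sanitize(games, reference) with a list of collected scores and
a list of corrected scores returns the collection with near-duplicates
folded into their corrected version.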

That hundreds of thousands of chess players pick and correct game
scores and chess metadata by hand is beyond me.  It also runs against
every principle on which archiving is based.

In fact, that we can't yet download a sanitized database for Scid is
still beyond me.

More on that has already been said.  Search for CentriScid in the archives.

***

That said, I know that I am speaking from a theoretical standpoint.  I
have absolutely no idea whether that's possible in practice,
considering the meager resources we have for now and how deeply the
"hunt and gather" approach to chess games is rooted in chess culture.
So please bear in mind that I am in no way asking anyone to do
anything here.
