Hi,

You still have the problem that two users on two systems can enter the same
game. Judged by their internal database IDs these two games are distinct
records, but to me they are duplicates. Unique IDs in databases are mainly
useful for quick referencing inside the database. What we would need, and
what does not exist here, is what databases call a natural key. I guess a
composite natural key is what you are looking for. But there is no subset of
attributes that determines whether two games are identical. That is why only
smart duplicate detection can help.
That is also why, for example, discovering a computer network with all its
components is such a nice job. It is very similar to our problem.
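
To make the natural-key point concrete, here is a minimal sketch in Python.
The three game records, the field names and both key functions are invented
for illustration only; none of this is Scid code. A strict key misses a
duplicate that was typed in differently, while a looser key also matches a
different game between the same players.

```python
# Hypothetical records: a and b are the same game entered on two systems,
# c is a different game between the same players one day later.
a = {"white": "Player, Anna", "black": "Sample, Bob",
     "date": "2010.05.01", "event": "Example Open", "moves": "e4 e5 Nf3 Nc6"}
b = {"white": "A. Player", "black": "B. Sample",
     "date": "2010.05.??", "event": "Example op", "moves": "e4 e5 Nf3 Nc6"}
c = {"white": "Player, Anna", "black": "Sample, Bob",
     "date": "2010.05.02", "event": "Example Open", "moves": "d4 d5 c4 e6"}


def strict_key(game):
    """Candidate natural key: the header fields exactly as typed."""
    return (game["white"], game["black"], game["date"], game["event"])


def surname(name):
    """Crude surname guess for 'Surname, First' and 'F. Surname' spellings."""
    if "," in name:
        return name.split(",")[0].strip().lower()
    return name.split()[-1].lower()


def loose_key(game):
    """Looser candidate key: both surnames plus year and month."""
    return (surname(game["white"]), surname(game["black"]), game["date"][:7])


print(strict_key(a) == strict_key(b))  # same game, but the keys differ
print(loose_key(a) == loose_key(b))    # the looser key finds the duplicate
print(loose_key(a) == loose_key(c))    # ...and also matches a different game
```

Neither key separates "same game" from "different game" reliably, which is
why something smarter than a fixed attribute subset is needed.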

        Gerd



-----Original Message-----
From: Benoit St-Pierre [mailto:benbon...@gmail.com] 
Sent: Sunday, January 2, 2011 22:55
To: Scid Users List
Subject: Re: [Scid-users] ScidBase?

> unique IDs to ensure a *high-quality* cannot be achieved. How should the
> ID be calculated?

The ID is simply assigned and maintained by the database facility.  In
relational database theory, I believe that having unique IDs is
axiomatic.  I am not sure that it's possible to have relational
databases that work without unique IDs.
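
As a small illustration of an ID that is "given and maintained by the
database facility", here is a sketch using SQLite through Python's sqlite3
module.  The table layout and the sample values are assumptions made up for
this example and are not Scid's storage format.  It also shows that such an
ID says nothing about whether two rows describe the same game.

```python
import sqlite3


def insert_game(conn, white, black, date, moves):
    """Insert one game and return the ID the database assigned to it."""
    cur = conn.execute(
        "INSERT INTO games (white, black, date, moves) VALUES (?, ?, ?, ?)",
        (white, black, date, moves),
    )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
# INTEGER PRIMARY KEY lets SQLite pick a unique id for every new row.
conn.execute(
    "CREATE TABLE games ("
    " id INTEGER PRIMARY KEY,"
    " white TEXT, black TEXT, date TEXT, moves TEXT)"
)

# The same (invented) game entered twice gets two different IDs.
first_id = insert_game(conn, "Player, Anna", "Sample, Bob", "2010.05.01", "e4 e5")
second_id = insert_game(conn, "Player, Anna", "Sample, Bob", "2010.05.01", "e4 e5")
row_count = conn.execute("SELECT COUNT(*) FROM games").fetchone()[0]
print(first_id, second_id, row_count)
```

The IDs keep the table internally consistent, but both rows are accepted
even though they hold identical data.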

This ID would guarantee the internal consistency of the database.  To
have something like external consistency, we would need a service akin
to URI or DOI conventions.

The way I envision it, we could completely kill off duplicates by
first ensuring that we have the proper metadata and the correct
gamescore.  Then, as soon as a game is sufficiently similar to these
corrected data, we would simply delete this game and replace it with
the sanitized data.
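
A rough sketch of that idea in Python, assuming games are dictionaries with
a space-separated "moves" string.  The similarity measure (difflib's
SequenceMatcher over the move lists), the 0.9 threshold and the sample games
are my own placeholders, not an agreed definition of "sufficiently similar".

```python
from difflib import SequenceMatcher


def move_similarity(game, reference):
    """Ratio between 0 and 1 describing how closely two move lists agree."""
    return SequenceMatcher(
        None, game["moves"].split(), reference["moves"].split()
    ).ratio()


def sanitize(games, reference, threshold=0.9):
    """Drop every game close enough to the reference, then keep the
    reference itself as the single sanitized copy."""
    kept = [g for g in games if move_similarity(g, reference) < threshold]
    return kept + [reference]


# Invented sample data: a corrected reference, a truncated copy of it,
# and an unrelated game.
reference = {"white": "Player, Anna", "black": "Sample, Bob",
             "moves": "e4 e5 Nf3 Nc6 Bb5 a6 Ba4 Nf6 O-O Be7"}
near_copy = {"white": "A. Player", "black": "B. Sample",
             "moves": "e4 e5 Nf3 Nc6 Bb5 a6 Ba4 Nf6 O-O"}
other = {"white": "Third, Carl", "black": "Fourth, Dana",
         "moves": "d4 d5 c4 e6 Nc3 Nf6"}

cleaned = sanitize([near_copy, other], reference)
print(len(cleaned))
```

A real implementation would also have to compare the metadata and cope with
transpositions and truncated scores, which this sketch ignores.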

That hundreds of thousands of chess players pick and correct chess
scores and chess metadata by hand is beyond me.  It also runs against
every principle on which archiving is based.

In fact, that we can't yet download a sanitized database for Scid is
still beyond me.

More on that has already been said.  Search for CentriScid in the archives.

***

That said, I know that I am speaking from a theoretical standpoint.  I
have absolutely no idea whether that's possible in practice, considering
the meager resources we have for now and how deeply the "hunt and gather"
approach to chess games is rooted in chess culture.  So please bear in mind
that I am in no way asking anyone to do anything here.

----------------------------------------------------------------------------
--
Learn how Oracle Real Application Clusters (RAC) One Node allows customers
to consolidate database storage, standardize their database environment,
and, 
should the need arise, upgrade to a full multi-node Oracle RAC database 
without downtime or disruption
http://p.sf.net/sfu/oracle-sfdevnl
_______________________________________________
Scid-users mailing list
Scid-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scid-users

