Hi,

On 7/10/07, Tako Schotanus <[EMAIL PROTECTED]> wrote:
I understand the recommendation; I would just like to add that finding a good identifier is sometimes pretty difficult. A person's email address, for example, is just not stable enough: lots of people change email addresses continually, and many addresses don't contain any clue as to the person they belong to.
Note that names don't really need to be globally unique or stable. A name doesn't even need to reflect any specific attribute (real name, etc.) of the node.
And in the 1.5 years that I have lived in Spain, I'm still amazed at the number of people who have EXACTLY the same name! Even taking into account that they have 2 last names (from both parents) and normally several first names as well! (Probably due to the fact that it was customary to name children after grandparents.)
This is where the recommendation to avoid huge flat collections helps. A repository that models the population of Spain could (and should!) use some hierarchy. A geographic hierarchy would divide people by the area, city, street, block, etc. where they live. A genealogical hierarchy could follow either maternal or paternal lines, so that the repository hierarchy reflects the actual parent/child relations in the real world.

BR,

Jukka Zitting
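
To illustrate the geographic hierarchy described above, here is a minimal sketch in plain Python. It does not use the repository API at all; it only shows how two people with exactly the same name end up at different paths once the name is placed under area, city, and street levels. The `Person` class, the `slug` and `node_path` helpers, the `/people` root, and the sample names and addresses are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Hypothetical record; the fields mirror the geographic levels."""
    name: str
    area: str
    city: str
    street: str


def slug(text: str) -> str:
    """Lowercase the text and join its words with hyphens."""
    return "-".join(text.lower().split())


def node_path(person: Person) -> str:
    """Build a hierarchical path: /people/<area>/<city>/<street>/<name>."""
    segments = [person.area, person.city, person.street, person.name]
    return "/people/" + "/".join(slug(s) for s in segments)


# Two people sharing the same full name but living in different places.
a = Person("Maria Garcia Lopez", "Madrid", "Madrid", "Calle Mayor")
b = Person("Maria Garcia Lopez", "Andalucia", "Sevilla", "Calle Sierpes")

print(node_path(a))
print(node_path(b))
```

In this sketch the name only has to be distinguishable among its siblings on the same street, not across the whole population. Two namesakes on the same street would still collide, so a real model would need a further level or some other disambiguation.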
