Re: A draft proposal for UUIDs

Stephen Woodbridge Tue, 16 Aug 2011 18:17:09 -0700

On 8/16/2011 7:14 PM, Ron Savage wrote:

Hi Steve


On Tue, 2011-08-16 at 09:48 -0400, Stephen Woodbridge wrote:
[snip]

Now, if some interface code allows the user to create INDIs, say, then
they have to be flagged as having a different UUID. Or do they?

If the original UUID belonged to the source, then yes, since the new
INDIs are coming from a different source.


I think this is the correct answer. The UUID belongs to the source of
the import action that created the data when the import did not have
UUIDs of its own.

Adding a UUID for the import action would then allow all the data to be
purged later if need be, so there might be value in adding a UUID
to the import even if the imported data already has UUIDs.


I think we're getting things clearer now.

To summarize: For a given db, each source which contributes records must
be separately identified by a UUID, with that UUID attached (somehow) to
each record imported.

That means various types of reports:

o Pick 2 UUIDs and process (e.g. compare, update, export, delete) just
the records belonging to those UUIDs.

o Pick 2 UUIDs and flag records such that data with (from) UUID #1 is
deemed more reliable than data with UUID #2. Clearly both datasets are
preserved.

o Pick 1 UUID and process (e.g. update, export, delete, ...) just the
records belonging to it.

o Many other possibilities ...
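
A minimal sketch of the per-source tagging summarized above, assuming a
hypothetical in-memory record layout in which every record carries a
`source_uuid` field. The `Record` class and the helpers `records_from`
and `purge_source` are illustrative names only; nothing in this thread
or in GEDCOM defines them.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Record:
    tag: str                # e.g. "INDI" or "NOTE"
    data: str
    source_uuid: uuid.UUID  # hypothetical field: the source that contributed it


def records_from(db, *source_uuids):
    """Return only the records that belong to the given source UUIDs."""
    wanted = set(source_uuids)
    return [r for r in db if r.source_uuid in wanted]


def purge_source(db, source_uuid):
    """Return a copy of db without the records from one source."""
    return [r for r in db if r.source_uuid != source_uuid]


joe = uuid.uuid4()   # one UUID per contributing source
mine = uuid.uuid4()

db = [
    Record("INDI", "John Smith", joe),
    Record("INDI", "Jane Smith", mine),
    Record("NOTE", "Census 1881", joe),
]

joes_records = records_from(db, joe)    # the "pick 1 UUID" report
db_without_joe = purge_source(db, joe)  # delete everything from one source
```

The same filter covers the "pick 2 UUIDs" reports by passing both UUIDs
to `records_from`.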

Good stuff!


Hi Ron,

Yes exactly, but I'm not sure that it should be limited to sources,
because from a single source you might have some good data and some bad
data. So it is fine to say this is more reliable than that, but you also
need to be able to say this item is flat out wrong.

Obviously you can go crazy with this and tag everything with a UUID, but
here are the things I think are most important:


1. INDI, FAMI, and NOTE and individual items attached to these
2. actions performed on these or on the database

Oh! I just realized something: we are mixing two different needs here.

1. INDI, FAMI, SOUR and NOTE records need a UUID that does not change;
this is a persistent object identifier. In a given system, INDI::UUID=27
should always get me the same INDI regardless.

2. For history and object version tracking, so you can merge or re-merge
a data set, you need a version number that gets incremented every time
the object gets changed. So say I import Joe's GEDCOM and merge it with
my file in January, and then in August I get an update. I can ignore all
the UUIDs from Joe that have the same version as in the new import; only
UUIDs that I do not have, or that have new versions, need to be merged.
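
A rough sketch of the version check described in point 2, assuming each
system keeps a simple mapping from object UUID to the version number it
last saw from that source. The function name `records_to_merge` and the
shortened UUID strings are made up for illustration.

```python
def records_to_merge(local, incoming):
    """Return UUIDs from `incoming` that are new or have a different version.

    Both arguments map object UUID (str) -> version number (int).
    """
    return sorted(
        obj_id
        for obj_id, version in incoming.items()
        if local.get(obj_id) != version
    )


# What I recorded from Joe's GEDCOM in January (hypothetical, shortened UUIDs).
january = {"uuid-a": 1, "uuid-b": 1, "uuid-c": 2}
# Joe's August update: uuid-b was edited, uuid-d is new.
august = {"uuid-a": 1, "uuid-b": 2, "uuid-c": 2, "uuid-d": 1}

to_merge = records_to_merge(january, august)
print(to_merge)  # ['uuid-b', 'uuid-d']
```

Everything not in `to_merge` can be skipped on the re-import.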

So I think we have two separate needs here that should not get merged,
to avoid confusion: Objects need UUIDs, and Actions (add, import, edit,
delete, etc.) cause version changes to Objects. Is a version just another
UUID? If an object like an INDI or INDI::BIRT is in two separate systems
and is edited in both systems, you would not want them to be able to have
the same version number.

So a possible use case: I create an INDI in system A and it has UUID=x,
and this is exported to a GEDCOM and imported into another system B. I
assume it retains its UUID=x but also has some additional information
attached to it saying that it was imported. Now the BIRT record is
added/updated separately in both systems. Later I import system B back
into system A.
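
One way to picture this use case, purely as a sketch: it assumes that
each edit appends a fresh random token to a per-object history, which is
only one possible answer to the "is a version just another UUID?"
question and not something settled in this thread. The class `Versioned`
and the function `compare` are hypothetical.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Versioned:
    object_uuid: str   # persistent identifier, never changes
    value: str
    history: list = field(default_factory=list)  # version tokens, oldest first

    def edit(self, new_value):
        """Record a change by appending a fresh random version token."""
        self.value = new_value
        self.history.append(str(uuid.uuid4()))

    def copy(self):
        """Simulate export/import: same UUID, same history."""
        return Versioned(self.object_uuid, self.value, list(self.history))


def compare(a, b):
    """Classify b relative to a: 'same', 'b_newer', 'a_newer' or 'conflict'."""
    if a.history == b.history:
        return "same"
    if b.history[:len(a.history)] == a.history:
        return "b_newer"
    if a.history[:len(b.history)] == b.history:
        return "a_newer"
    return "conflict"


# System A creates the object; system B imports it with UUID=x intact.
birt_a = Versioned("x", "")
birt_a.edit("1 JAN 1900")
birt_b = birt_a.copy()

status_after_import = compare(birt_a, birt_b)   # "same"

# The BIRT record is then changed independently in both systems.
birt_a.edit("1 JAN 1901")
birt_b.edit("ABT 1900")

status_after_edits = compare(birt_a, birt_b)    # "conflict"
```

Because the tokens are random rather than counters, two independent
edits cannot end up with the same version, so importing system B back
into system A flags the BIRT record for a manual merge rather than
silently overwriting it.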

I'm just trying to think this through. There are obviously a lot of
additional nuances that can be put on this, like each BIRT record could
reference a SOUR record that would have a UUID and later identical or
similar SOUR records could be merged keeping both UUIDs.

Maybe this is getting to be overkill? Is anyone else following this
thread? I see these things as being a significant aid to managing data
and merging and updating data in an automated way. But then again maybe
no one else cares.


Thoughts?
  -Steve
