Hi Steve

On Tue, 2011-08-16 at 21:11 -0400, Stephen Woodbridge wrote:
> On 8/16/2011 7:14 PM, Ron Savage wrote:
> > Hi Steve
> >
> > On Tue, 2011-08-16 at 09:48 -0400, Stephen Woodbridge wrote:
> > [snip]
> >>> Now, if some interface code allows the user to create INDIs, say, then
> >>> they have to be flagged as having a different UUID. Or do they?
> >>>
> >>> If the original UUID belonged to the source, then yes, since the new
> >>> INDIs are coming from a different source.
> >>
> >> I think this is the correct answer. The UUID belongs to the source of
> >> the import action that created the data when the import did not have
> >> UUIDs of its own.
> >>
> >> Adding a UUID for the import action would then allow all the data to be
> >> later purged if it needed to be, so there might be value in adding a UUID
> >> to the import even if the imported data already has UUIDs.
> >
> > I think we're getting things clearer now.
> >
> > To summarize: For a given db, each source which contributes records must
> > be separately identified by a UUID, with that UUID attached (somehow) to
> > each record imported.
> >
> > That means various types of reports:
> >
> > o Pick 2 UUIDs and process (e.g. compare, update, export, delete) just
> > the records belonging to those UUIDs.
> >
> > o Pick 2 UUIDs and flag records such that data with (from) UUID #1 is
> > deemed more reliable than data with UUID #2. Clearly both datasets are
> > preserved.
> >
> > o Pick 1 UUID and process (e.g. update, export, delete, ...) just the
> > records belonging to it.
> >
> > o Many other possibilities ...
> >
> > Good stuff!
> >
>
> Hi Ron,
>
> Yes exactly, but I am not sure that it should be limited to sources because
> from a single source you might have some good data and some bad data. So
> it is fine to say this is more reliable than that, but you also need to
> be able to say this item is flat out wrong.
>
> Obviously you can go crazy with this and tag everything with a UUID, but here
> are the things I think are most important:
>
> 1. INDI, FAMI, and NOTE and individual items attached to these
> 2. actions performed on these or on the database
>
> Oh! I just realized something: we are mixing two different needs here.
>
> 1. INDI, FAMI, SOUR and NOTE records need a UUID that does not change;
> this is a persistent object identifier. In a given system INDI::UUID=27
> should always get me the same INDI regardless.
>
> 2. For history and object version tracking, so you can merge or re-merge
> a data set, you need a version number that gets incremented every time
> the object gets changed. So say I import Joe's GEDCOM and merge it with
> my file in January and then in August I get an update. I can ignore all
> the UUIDs from Joe that have the same version as in the new import, and
> only UUIDs that I do not have, or that have new versions, need to be merged.
>
> So I think we have two separate needs here that should not get merged, to
> avoid confusion: Objects need UUIDs, and Actions (add, import, edit,
> delete, etc) cause version changes to Objects. Is a version just another
> UUID? If an object like an INDI or INDI::BIRT is in two separate systems
> and is edited in both systems, you would not want them to be able to have
> the same version number.
>
> So a possible use case: I create an INDI in a system A and it has UUID=x
> and this is exported to a GEDCOM and imported into another system B. I
> assume it retains its UUID=x but also has some additional information
> attached to it saying that it was imported. Now the BIRT record is
> added/updated separately in both systems. Later I import
> system B back into system A.
>
> I'm just trying to think this through.
> There are obviously a lot of
> additional nuances that can be put on this, like each BIRT record could
> reference a SOUR record that would have a UUID, and later identical or
> similar SOUR records could be merged keeping both UUIDs.
>
> Maybe this is getting to be overkill? Is anyone else following this
> thread? I see these things as being a significant aid to managing data
> and merging and updating data in an automated way. But then again maybe
> no one else cares.
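
For reference, the rule in the quoted message (a persistent UUID per
record, plus a version that changes on every edit, so a re-import only
has to look at records that are new or changed) could be sketched
roughly as below. Everything in it is hypothetical: the Record class,
the uid/version fields and the needs_merge() helper are illustrative
names only, not part of GEDCOM or of any existing module, and using a
fresh random token instead of a counter for the version is just one way
to keep two systems from producing the same version independently.

```python
import uuid
from dataclasses import dataclass, field, replace


def new_token() -> str:
    """Return a fresh random identifier (a version-4 UUID as text)."""
    return str(uuid.uuid4())


@dataclass
class Record:
    """Hypothetical stand-in for an INDI/FAMI/SOUR/NOTE record."""
    tag: str                                          # e.g. "INDI"
    data: str                                         # payload, simplified to a string
    uid: str = field(default_factory=new_token)       # persistent, never changes
    version: str = field(default_factory=new_token)   # replaced on every edit

    def edit(self, data: str) -> None:
        """Change the payload and give the record a new version token."""
        self.data = data
        self.version = new_token()


def needs_merge(local, incoming):
    """Return the incoming records that are unknown locally or whose
    version differs from the local copy. Records whose uid and version
    both match are skipped."""
    local_versions = {r.uid: r.version for r in local}
    return [r for r in incoming if local_versions.get(r.uid) != r.version]


if __name__ == "__main__":
    # January: import two of Joe's records as-is.
    joe = [Record("INDI", "Joe Bloggs"), Record("INDI", "Mary Bloggs")]
    mine = [replace(r) for r in joe]    # copies keep the same uid and version

    # August: Joe has edited one record and added another.
    joe[0].edit("Joseph Bloggs")
    joe.append(Record("INDI", "Fred Bloggs"))

    for r in needs_merge(mine, joe):
        print(r.tag, r.data)
```

Note this only decides which records need attention. It says nothing
about how to reconcile a record edited on both sides, which is where the
comparison with a version control system starts.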

It's overkill.

We can't possibly design a mechanism for fiddling with UUIDs in order to
emulate a version control system such as git. That's utterly futile.

So, we need to design UUIDs to serve whatever purposes people need which
can't be provided by git/etc.

(I'll probably answer your email's points separately. I still have to
think about the non-version-control aspects of UUIDs :-).

-- 
Ron Savage
http://savage.net.au/
Ph: 0421 920 622