Sorry for the quite late reply - you can see how often I wade through secondary lists when I get busy. :)
I've thought about a more literal gedcom diff - you give it two gedcoms and a root pair, and it spits out exactly what the differences are in some easy-to-digest way. I was thinking regular diff-like output, actually, which could be interpreted by gedcom-savvy humans or by a program with a UI. The hard part was deciding which sort order to put the records in, because that greatly influences how the diff looks.

I remember thinking along the same lines as you about comparing trees, and I think that's the right approach, especially given a known root pair. I think you would still want to compare names or some other identifying characteristics in addition to just tree shape, which would keep you from wrongly merging distinct parents/spouses, not to mention children!

On Tue, 16 Nov 2004 at 06:49 -0700, [EMAIL PROTECTED] wrote:
> Has anyone come up with code that can diff two .ged files? I have two
> large files with data missing from each (two sides of the family). All the
> tools I have looked at seem to use soundex to find matches and my test
> merge has not gone well (I just used the paf5 match/merge).
>
> I was thinking if I wrote code where I would give it a reference person
> from each file "these two are the same", the code could build a tree from
> there and then compare the trees not the names themselves to tell me which
> nodes were missing. I am still not sure how I would handle multiple
> spouses, and multiple parents and such.... but I have been mulling it
> around for a while.
>
> Thought I would see if anyone had built anything along these lines.
>
> --
> Chuck
>

-- 
Hans Fugal ; http://hans.fugal.net

There's nothing remarkable about it. All one has to do is hit
the right keys at the right time and the instrument plays itself.
    -- Johann Sebastian Bach
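P.S. For what it's worth, here is a rough sketch of the root-pair idea discussed above: walk both trees in parallel from the two people you know are the same, and pair up relatives only when the relationship and the name both agree. Nothing here parses real .ged files. The dict layout (`name`, `parents`, `spouses`, `children`), the ids, and the `norm` and `diff_trees` helpers are all made up for illustration, and exact name comparison is only a stand-in for whatever identifying characteristics you would really want to use.

```python
from collections import deque

RELATIONS = ("parents", "spouses", "children")

def norm(name):
    # Crude normalization (an assumption): treat the GEDCOM surname
    # slashes as spaces, lowercase, and collapse whitespace.
    return " ".join(name.replace("/", " ").lower().split())

def diff_trees(a, b, root_a, root_b):
    """Walk two trees in parallel, starting from a known-identical pair.

    a and b map a person id to a dict with a "name" and optional
    "parents", "spouses" and "children" lists of ids (hypothetical layout).
    Returns (matched, only_a, only_b).
    """
    matched = {root_a: root_b}   # id in a -> id in b
    matched_b = {root_b}         # ids in b that already have a partner
    seen_a, seen_b = {root_a}, {root_b}
    queue = deque([(root_a, root_b)])
    while queue:
        ida, idb = queue.popleft()
        for rel in RELATIONS:
            # Candidates in b: same relation to the paired person, not yet taken.
            candidates = [y for y in b[idb].get(rel, []) if y not in matched_b]
            seen_b.update(b[idb].get(rel, []))
            for x in a[ida].get(rel, []):
                seen_a.add(x)
                if x in matched:
                    continue
                hit = next((y for y in candidates
                            if norm(b[y]["name"]) == norm(a[x]["name"])), None)
                if hit is not None:
                    matched[x] = hit
                    matched_b.add(hit)
                    candidates.remove(hit)
                    queue.append((x, hit))
    # Anyone reached during the walk but never paired is "missing" on the other side.
    only_a = sorted(seen_a - matched.keys())
    only_b = sorted(seen_b - matched_b)
    return matched, only_a, only_b

# Tiny made-up example: each file knows a relative the other lacks.
file_a = {
    "I1": {"name": "John /Smith/", "parents": ["I2", "I3"]},
    "I2": {"name": "Adam /Smith/"},
    "I3": {"name": "Mary /Jones/"},
}
file_b = {
    "P1": {"name": "John /Smith/", "parents": ["P2"], "children": ["P9"]},
    "P2": {"name": "Adam /Smith/"},
    "P9": {"name": "Ann /Smith/"},
}
matched, only_a, only_b = diff_trees(file_a, file_b, "I1", "P1")
```

In that toy data the two Johns and the two Adams pair up, Mary turns up only in the first file and Ann only in the second. Because relatives are paired by relation and name together, a second spouse or an extra parent just lands in one of the "only" lists instead of being merged into the wrong person.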