Hi Mikkel

On Thu, 2011-08-04 at 22:44 +0200, Mikkel Eide Eriksen wrote:
> Hi Ron,
>
> Speaking of genealogy formats, I'm working some on a completely source-based
> format: http://carthag.github.com/sourcemarkup/ (don't mind the ugly color
> scheme, it was just a random one I chose).

I'm glad to know you're doing this sort of work, but I have a complex set
of reactions to it:

o Isn't Cocoa an Apple-proprietary software thing? This implies anyone
trying to use data in this format outside the Apple cocoon must have a
separate set of code to import and manipulate the data. Who's going to
write that? Only someone who chooses to support your format.

o XML definitely handles nested data, so it can certainly be used as a
communication format between users, but is the idea to keep the data in
XML at all times? This requires an XML parser (which is a big topic I
don't wish to pursue), and either using something like XML::Twig to access
small parts of the file, or storing all the XML in a DOM-based structure,
which normally takes up 100 (sic) times the space of the file itself
(another big topic). This in turn leads to a discussion of speed of access
for practical web-based display, and hence deployment under web servers
such as Starman so that the code never exits, meaning the slow startup
costs for the XML processor, etc, are avoided.

o I'll say again I understand XML has its uses, but its proponents are
still trying to live down the XML fanaticism of the early days, when every
little thing was put in an XML file (the format of choice for control
freaks :-), which required a huge parser to be fired up just to read even
a 3 line file. As always, it's up to the proponents to support their
suggestions, rather than choosing a format first and claiming afterwards
that it's appropriate. That applies to my suggestions too :-(!

o So why go your own way anyway? Why not join the - very interesting -
Better Gedcom group? I do thank you for the reference. We should all think
about how that group and we Perl users can interact.

> The idea is to use transcribed sources and mark them up with all info that is
> contained therein, so as to force all information to be referable to an
> actual source.
> From this data, it should be possible to build family trees,
> data sheets for individuals, etc. It is still very much a work in progress
> and is just at this point an idea and a very fluid definition of what I want
> it to be able to do. The site has two unrelated examples, a birth (source1,
> recorded as prose) and a marriage (source2, recorded in a table).

It's good you've directly focused on one of the major issues - how to
handle textual material.

I should say I have a strong suspicion an ideal solution (if there is
one!) will end up being:

o Have basic info (individuals, families, and hence relationships [i/f/r])
in a db (e.g. Postgres). This means rapid access in a viewport-like way so
as to display a fragment of a family tree in a web page, and

o Have all other material in either (potentially huge) text/binary fields
in the db, or even in external files, all accessible via the i/f/r
records.

> Obviously it would be impossible to generate this data from a gedcom file,
> but it would be possible to (lossily) export from this format to gedcom.

Sure, but the whole point of my current attempt to stir people into
responding is to think outside the 'Gedcom-ordained square', and to focus
on what's needed, not what was defined in the past.

> Additionally, this might also interest you, I came across it last month:
> http://bettergedcom.wikispaces.com/

Yes, indeed. Probably they're way ahead of me on this matter... I'd better
lie low until I study their material.

-- 
Ron Savage
http://savage.net.au/
Ph: 0421 920 622
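
For illustration, the db layout suggested above (i/f/r records in the db,
all other material reachable through them) might be sketched roughly as
follows. This is only a sketch under assumptions: SQLite via Python's
sqlite3 module stands in for Postgres so the example is self-contained,
and every table, column and function name here is hypothetical, as is the
sample data.

```python
import sqlite3


def build_db():
    """Create an in-memory db with hypothetical i/f/r tables plus 'material'."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE individual (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE family (
            id INTEGER PRIMARY KEY,
            partner1_id INTEGER REFERENCES individual(id),
            partner2_id INTEGER REFERENCES individual(id)
        );
        CREATE TABLE child (
            family_id INTEGER NOT NULL REFERENCES family(id),
            individual_id INTEGER NOT NULL REFERENCES individual(id),
            PRIMARY KEY (family_id, individual_id)
        );
        CREATE TABLE material (
            id INTEGER PRIMARY KEY,
            individual_id INTEGER REFERENCES individual(id),
            family_id INTEGER REFERENCES family(id),
            body TEXT,          -- text stored inline in the db, or
            external_path TEXT  -- a pointer to a file kept outside the db
        );
    """)
    return conn


def load_sample(conn):
    """Insert made-up sample data: one couple, one child, one text record."""
    conn.executemany(
        "INSERT INTO individual (id, name) VALUES (?, ?)",
        [(1, "Anna"), (2, "Bent"), (3, "Clara")],
    )
    conn.execute("INSERT INTO family (id, partner1_id, partner2_id) VALUES (1, 1, 2)")
    conn.execute("INSERT INTO child (family_id, individual_id) VALUES (1, 3)")
    conn.execute(
        "INSERT INTO material (family_id, body) VALUES (1, ?)",
        ("Transcribed marriage record (sample text)",),
    )


def family_fragment(conn, family_id):
    """Fetch only one family's partners and children: a viewport-sized fragment."""
    partners = [row[0] for row in conn.execute(
        "SELECT i.name FROM family f JOIN individual i "
        "ON i.id IN (f.partner1_id, f.partner2_id) "
        "WHERE f.id = ? ORDER BY i.id", (family_id,))]
    children = [row[0] for row in conn.execute(
        "SELECT i.name FROM child c JOIN individual i ON i.id = c.individual_id "
        "WHERE c.family_id = ? ORDER BY i.id", (family_id,))]
    return partners, children


def material_for_family(conn, family_id):
    """Reach the free-form material through the family record."""
    return [row[0] for row in conn.execute(
        "SELECT body FROM material WHERE family_id = ? ORDER BY id", (family_id,))]
```

Calling build_db(), then load_sample(conn), then family_fragment(conn, 1)
pulls just the one family needed for a page, without touching the
(potentially huge) text held in the material table.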