Hi Mikkel

On Thu, 2011-08-04 at 22:44 +0200, Mikkel Eide Eriksen wrote:
> Hi Ron,
> 
> Speaking of genealogy formats, I'm working some on a completely source-based 
> format: http://carthag.github.com/sourcemarkup/ (don't mind the ugly color 
> scheme, it was just a random one I chose).

I'm glad to know you're doing this sort of work, but I have a complex
set of reactions to it:

o Isn't Cocoa an Apple-proprietary software thing? This implies anyone
trying to use data in this format outside the Apple cocoon must have a
separate set of code to import and manipulate the data. Who's going to
write that? Only someone who chooses to support your format.

o XML definitely handles nested data, so it can certainly be used as a
communication format between users, but is the idea to keep the data in
XML at all times? That requires an XML parser (a big topic I don't wish
to pursue), and then either something like XML::Twig to access small
parts of the file, or loading all the XML into a DOM-based structure,
which normally takes up 100 (sic) times the space of the file itself
(another big topic). This in turn leads to a discussion of access speed
for practical web-based display, and hence deployment under a persistent
web server such as Starman, so that the code never exits and the slow
startup cost of the XML processor, etc, is paid only once.

o I'll say again I understand XML has its uses, but its proponents are
still trying to live down the XML fanaticism of the early days, when
every little thing was put in an XML file (the format of choice for
control freaks :-), which required a huge parser to be fired up just to
read even a 3-line file. As always, it's up to the proponents to justify
their choice up front, rather than choosing it first and then claiming
afterwards it's appropriate. That applies to my suggestions too :-(.

o So why go your own way anyway? Why not join the - very interesting -
Better Gedcom group? I do thank you for the reference. We should all
think about how that group and we Perl users can interact.
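
To make the streaming-versus-DOM point above a bit more concrete, here is
a rough sketch of the streaming style. It is in Python rather than Perl
purely for brevity, and it is only similar in spirit to what XML::Twig
does. The element and attribute names (sources, source, person, role) are
invented for illustration and are not taken from your format.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical markup: these element and attribute names are assumptions,
# not the actual sourcemarkup vocabulary.
SAMPLE = b"""<sources>
  <source id="source1"><person role="child">Anna</person></source>
  <source id="source2"><person role="groom">Jens</person><person role="bride">Marie</person></source>
</sources>"""


def people_by_source(stream):
    """Yield (source id, [person names]) one source element at a time."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "source":
            names = [p.text for p in elem.iter("person")]
            yield elem.get("id"), names
            elem.clear()  # discard the children we have finished with


result = list(people_by_source(io.BytesIO(SAMPLE)))
```

The elem.clear() call is what keeps memory use down: each source is
handled and then emptied, instead of the whole file sitting in memory as
one tree.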

> The idea is to use transcribed sources and mark them up with all info that is 
> contained therein, so as to force all information to be referable to an 
> actual source. From this data, it should be possible to build family trees, 
> data sheets for individuals, etc.  It is still very much a work in progress 
> and is just at this point an idea and a very fluid definition of what I want 
> it to be able to do. The site has two unrelated examples, a birth (source1, 
> recorded as prose) and a marriage (source2, recorded in a table).

It's good you've directly focused on one of the major issues - how to
handle textual material.

I should say I have a strong suspicion an ideal solution (if there is
one!) will end up being:

o Have basic info (individuals, families, and hence relationships
[i/f/r]) in a db (e.g. Postgres). This allows rapid, viewport-like
access, so a fragment of a family tree can be displayed in a web
page, and

o Have all other material in either (potentially huge) text/binary
fields in the db, or even in external files, all accessible via the
i/f/r records.
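
As a very rough sketch of what I mean, assuming SQLite (via Python's
built-in sqlite3 module) as a stand-in for Postgres: every table and
column name below is made up for illustration, and a real schema would
need far more thought.

```python
import sqlite3

# In-memory SQLite stands in for Postgres; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE individual (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE family (id INTEGER PRIMARY KEY,
                     partner1_id INTEGER REFERENCES individual(id),
                     partner2_id INTEGER REFERENCES individual(id));
CREATE TABLE child (family_id INTEGER REFERENCES family(id),
                    individual_id INTEGER REFERENCES individual(id));
CREATE TABLE material (id INTEGER PRIMARY KEY,
                       individual_id INTEGER REFERENCES individual(id),
                       body TEXT,        -- transcription stored inline, or
                       file_path TEXT);  -- a pointer to an external file
""")
conn.executemany("INSERT INTO individual (id, name) VALUES (?, ?)",
                 [(1, "Jens"), (2, "Marie"), (3, "Anna")])
conn.execute("INSERT INTO family (id, partner1_id, partner2_id) VALUES (1, 1, 2)")
conn.execute("INSERT INTO child (family_id, individual_id) VALUES (1, 3)")
conn.execute("INSERT INTO material (individual_id, body) "
             "VALUES (3, 'Birth record transcription')")


def family_fragment(conn, family_id):
    """Fetch just one family (partners plus children) for display."""
    partners = conn.execute(
        "SELECT a.name, b.name FROM family f "
        "JOIN individual a ON a.id = f.partner1_id "
        "JOIN individual b ON b.id = f.partner2_id "
        "WHERE f.id = ?", (family_id,)).fetchone()
    children = [row[0] for row in conn.execute(
        "SELECT i.name FROM child c "
        "JOIN individual i ON i.id = c.individual_id "
        "WHERE c.family_id = ? ORDER BY i.id", (family_id,))]
    return {"partners": partners, "children": children}


fragment = family_fragment(conn, 1)
```

The point is that a web page showing one family only touches a handful of
small rows; the bulky transcriptions in the material table (or in external
files) are fetched only when someone asks for them.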

> Obviously it would be impossible to generate this data from a gedcom file, 
> but it would be possible to (lossily) export from this format to gedcom.

Sure, but the whole point of my current attempt to stir people into
responding is to think outside the 'Gedcom-ordained square', and to
focus on what's needed, not what was defined in the past.

> Additionally, this might also interest you, I came across it last month: 
> http://bettergedcom.wikispaces.com/

Yes, indeed. Probably they're way ahead of me on this matter... I'd
better lie low until I study their material.

-- 
Ron Savage
http://savage.net.au/
Ph: 0421 920 622
