Re: GEDCOM to Wiki, anyone?

Darren Duncan Sat, 29 May 2010 10:23:42 -0700

Mike Elston wrote:

Hi Darren,
Thanks for posting this to the list. I haven't included your conversation with Dave in this reply, to keep the posting short.
A few thoughts...
There already exist a number of systems which store genealogical information in relational databases, for display by wikis or in more traditional genealogical style. I have been using the open-source phpGedView (see phpgedview.sourceforge.net or www.phpgedview.net) for some while now, and use perl-gedcom utilities for various management tasks and for writing the occasional one-off processing tool.
phpGedView, GeneoTree, Oxy-Gen and other existing systems are all designed to parse GEDCOM files and convert them to SQL databases. It would seem to be more useful to contribute to existing open-source systems than to start trying to write a new one for one's own use: for example, phpGedView already provides tables which allow for multiple families, aliases, source information and quality by implementing the full GEDCOM structure.

Speaking just for my own intended project, not Dave's, my understanding is that what I intend to do is so fundamentally different from any existing genealogy projects that it really is best for me to start my own, though mine would be open source and it can still glean things from other projects.

For one thing, my project is more abstract and extensible, and actually is more of an ontology tool than a genealogy tool, though it would handle genealogy particularly well. For another thing, my project focuses on presenting chains of claims, he-said-she-said, with each link being fully described rather than reduced to a bibliography entry, which bears on how much stock one can put in the claims associated with the sources.
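As a rough illustration only, a chain of claims in which every link is fully described might be modelled like this. The class, field names, people and statement are all hypothetical; nothing here reflects the actual design of the project.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Claim:
    """One link in a chain of claims (all field names are hypothetical)."""
    claimant: str                            # who is making the statement
    statement: str                           # what they assert
    basis: str                               # how they say they know it
    relayed_from: Optional["Claim"] = None   # the claim being repeated, if any


def chain(claim: Claim) -> List[str]:
    """Walk back from the latest claim to the earliest known one."""
    links = []
    current: Optional[Claim] = claim
    while current is not None:
        links.append(f"{current.claimant} ({current.basis})")
        current = current.relayed_from
    return links


# Invented example data: one person retelling what another told them.
origin = Claim("Aunt Mary", "The family moved house in 1950", "first-hand experience")
retold = Claim("Cousin Bob", "The family moved house in 1950", "heard it from Aunt Mary", origin)

print(chain(retold))
```

Keeping each link as its own record, rather than a single source citation, is what lets a reader judge every step of the retelling separately.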

In any event, as I said, this project is temporarily shelved while I work on my SQL replacement first. In fact, if one were to look at my (ostensibly completely specified) relational Muldis D language and how its design could be used by DBMS access tools or ORMs as a front for SQL databases today, and see how my approach differs from every other SQL tool/ORM out there (which are, relatively, a lot more alike one another), this might give a hint as to how thoroughly different a genealogy tool I may present compared with those existing now. I'm no more constrained by GEDCOM than by SQL.

I must admit, I do like your distinction between "first-hand experience", "assumed most likely considering X" and "just heard it somewhere" as examples of differing quality of non-record-based sources. GEDCOM 5.5 (still the de facto standard, and the implementation on which this list is based) only has QUAY 0/1/2/3 (which it defines basically as "unreliable" / "questionable" / "secondary" / "primary or evidence-based"), and these are neither carefully enough defined nor sufficiently widely or consistently used to be of much value except to the person ascribing the quality to the source in the first place.
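For readers who have not met QUAY, here is a minimal sketch of reading it from a single GEDCOM line. The labels are the paraphrases given above, not the wording of the standard, and the function name and simple "level TAG value" parsing are assumptions made for illustration.

```python
from typing import Optional

# Paraphrased labels from the message above, not the standard's own text.
QUAY_LABELS = {
    0: "unreliable",
    1: "questionable",
    2: "secondary",
    3: "primary or evidence-based",
}


def quay_label(gedcom_line: str) -> Optional[str]:
    """Return the label for a 'level QUAY value' line, or None otherwise."""
    parts = gedcom_line.split()
    if len(parts) == 3 and parts[1] == "QUAY" and parts[2].isdigit():
        return QUAY_LABELS.get(int(parts[2]))
    return None


print(quay_label("3 QUAY 2"))
```

With only four coarse values, there is no way to tell "first-hand experience" apart from "assumed most likely considering X"; both would have to share one of these labels.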

Yes, each person using the database would be a source themselves, or a proxy for one, and can talk about their own first-hand experience. This isn't like an encyclopedia where information not first written down elsewhere is disallowed.

Regarding database implementations: I understand (as a software developer myself) that there is a camp where PostgreSQL is strongly preferred over MySQL, and the recent acquisition of the latter has added fuel to the cause. But (imho) you're heavily overstating the case to claim that PostgreSQL "is a much more reliable and better quality DBMS, which all the savvy people prefer over MySQL". One factor many people have to take into account is that many third-party web-space providers also provide MySQL servers: users may not have the choice. In my experience, they differ more in implementation than in quality, and MySQL works perfectly well (and reliably) for the sort of task genealogical databases require. It's widely-available, well-supported, industrial-strength, scalable, and I like it :-)
Just my two-penn'orth...

/mike

If you want to see an ostensibly objective example of how PostgreSQL is better than MySQL, and has been for a long time, just look at the release notes or change logs for both projects. PostgreSQL is much more stable and free of bugs, and its change logs dominantly deal in new features or performance enhancements, and its bug fixes tend towards the relatively minor, though there are some significant ones periodically. MySQL in contrast dominantly deals with fixing bugs, many of them serious, which would indicate that MySQL has a lot more bugs in it to begin with. The release notes or change logs reflect how each project's own developers see them, never mind other people.

MySQL has been known, for example in its version 5.1, to declare itself "generally available"/"production" quality despite having over a hundred serious bugs in it. In contrast, PostgreSQL would consider such to be "alpha" or maybe "beta" quality.

But even ignoring the change logs, the designs of the two DBMSs reflect different philosophies, where Postgres considers things like data integrity and consistent behavior to be highly important while MySQL is much more likely to accept something of a much lower standard as "good enough". MySQL didn't and still doesn't support transactions product-wide. Nor, I think, foreign key constraints. MySQL silently truncates inserted data that is too long for a field, saying things are okay even though it just lost some data, rather than raising an error citing 'input too long'. I have first-hand experience with that. MySQL considers the string 'foo ' to be = to 'foo', which it clearly isn't. Neither Postgres nor any proper relational DBMS does these things. And there are many other citable things. MySQL databases are much more likely to become corrupted.
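To make the two behaviours just described concrete, here is a small model of them in plain Python. It is not run against either DBMS, and real behaviour depends on version, column type and configuration; the function names are invented for this sketch.

```python
def store_truncating(value: str, max_len: int) -> str:
    """Model of the lossy behaviour: keep what fits and report no problem."""
    return value[:max_len]


def store_strict(value: str, max_len: int) -> str:
    """Model of the strict behaviour: refuse input that does not fit."""
    if len(value) > max_len:
        raise ValueError("input too long")
    return value


def equal_ignoring_trailing_spaces(a: str, b: str) -> bool:
    """Model of a comparison that treats 'foo ' and 'foo' as equal."""
    return a.rstrip(" ") == b.rstrip(" ")


def equal_exact(a: str, b: str) -> bool:
    """Model of a comparison in which every character counts."""
    return a == b


print(store_truncating("genealogy", 4))             # part of the data is lost
print(equal_ignoring_trailing_spaces("foo ", "foo"))
print(equal_exact("foo ", "foo"))
```

The point of the strict variants is that the caller finds out immediately, rather than discovering later that stored values no longer match what was entered.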

I liken MySQL to Microsoft Windows, which people use dominantly because it has a greater number of existing installations, or because, for that same reason, it is the only thing they have used before. While people who have solid experience with both MySQL and PostgreSQL and still prefer MySQL do exist, I would think those MySQL users are a minority compared to those for whom MySQL is the only thing they know, because it was pre-installed. As with MS Windows, while some people who have solid experience with other major OSs would like Windows more, I'd think a majority of Windows users are those who don't like it more, and just use it because it is either all they know or a program they need requires it.

If both MySQL and PostgreSQL were pre-installed on the same number of hosts, and the same number of people were solidly experienced with both, I would think that more people would choose PostgreSQL by default for their next project.

People who are savvy with databases and know both PostgreSQL and MySQL would prefer PostgreSQL for quality and features hands-down. And if their host doesn't provide PostgreSQL, they would demand that the host provide it, or install it themselves, or find another host, or else that particular data is unimportant.


-- Darren Duncan
