This has been a fruitful discussion, I think. If I could offer a few
thoughts from an LWG perspective (even though Velke and Anderson know a great
deal more than I do about it)....

* The GDM was never meant to be a database design. I know that you've said
that many times but it bears repeating. In this case it's useful to repeat
because you are concerned about redundant storage and the LWG was not
thinking about storage. At the same time, they were thinking about the
relationships between entities and perhaps this one is one that can be

If we oversimplify (because that helps me understand), let's instantiate
some of these classes.

Repository - Library.
Source - Book.

In theory, if I associate a book with a library I am describing their
collection. I could associate a lot of sources with a repository, including
call numbers and their condition, without being involved in a genealogical
search. I'm not certain, but I think that this association might best be
referred to as a CATALOG, which is a well-established model for that

I think that the LWG may have thought that all linking of sources to
repositories would take place as the result of a research activity, hence
the association of activity to this association of sources and repositories.

On reflection, it seems reasonable to have two separate associations - one
of SOURCE to REPOSITORY (called CATALOG?), and another of ACTIVITY to

I don't think that the LWG ever imagined that the Allen County Public
Library might ever publish an electronic catalog that was compatible with a
GDM-compatible client. Hey, it was 1996.

Now it doesn't seem so far-fetched that a GDM-compatible client could
contain links to online catalogs - assuming that they aren't being revised
in ways that break the links.

Does that complicate the issue sufficiently?


-----Original Message-----
From: Hans Fugal
Sent: Wednesday, July 10, 2002 11:08 PM
Subject: Re: [gdmxml] more thoughts on entering a source

I spent a while wrestling this out with my brother Jacob today. There
are situations where one would need to know more than just
repository-id and source-id. For instance, if a particular repository
had more than one copy of a source and you wanted to indicate which one
you had searched, repository-id and source-id are not sufficient - you
would also need to know the call-number. But the call-number itself is
not unique, so it can't be used as the primary key in repository-source.
Using activity-id as the third key doesn't seem to work though, because
of the extreme redundancy I pointed out. I think repository-source needs
an id field as a primary key, then search can reference that
repository-source-id instead of having repository-id and source-id, and
we take activity-id out of repository-source.
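
A minimal sketch of that arrangement, using Python's sqlite3 module purely
for illustration. The table and column names, the sample call numbers and
the activity ids are all assumptions made up for the example; they are not
taken from the GDM or from any agreed gdmxml schema.

```python
import sqlite3

# In-memory database; every name below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE repository (repository_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE source (source_id INTEGER PRIMARY KEY, title TEXT NOT NULL);

-- repository-source gets its own id as primary key; no activity-id here.
CREATE TABLE repository_source (
    repository_source_id INTEGER PRIMARY KEY,
    repository_id INTEGER NOT NULL REFERENCES repository(repository_id),
    source_id INTEGER NOT NULL REFERENCES source(source_id),
    call_number TEXT,
    description TEXT
);

-- search points at one repository-source row instead of carrying
-- repository-id and source-id itself.
CREATE TABLE search (
    search_id INTEGER PRIMARY KEY,
    activity_id INTEGER NOT NULL,
    repository_source_id INTEGER NOT NULL
        REFERENCES repository_source(repository_source_id)
);
""")

conn.execute("INSERT INTO repository VALUES (1, 'Example Library')")
conn.execute("INSERT INTO source VALUES (1, 'Example Census')")

# Two copies of the same source in the same repository, stored once each.
conn.executemany(
    "INSERT INTO repository_source VALUES (?, 1, 1, ?, ?)",
    [(1, "CALL-001", "copy 1"), (2, "CALL-002", "copy 2")],
)

# Three searches; the call number is never repeated per search.
conn.executemany(
    "INSERT INTO search (activity_id, repository_source_id) VALUES (?, ?)",
    [(101, 1), (102, 1), (103, 2)],
)

# Which activities searched the copy with a given call number?
rows = conn.execute(
    """
    SELECT s.activity_id
    FROM search AS s
    JOIN repository_source AS rs
      ON rs.repository_source_id = s.repository_source_id
    WHERE rs.call_number = ?
    ORDER BY s.activity_id
    """,
    ("CALL-001",),
).fetchall()
activities = [row[0] for row in rows]
print(activities)  # [101, 102]
```

With this layout each copy is described once in repository_source, and the
searches that used a given copy are found by joining from search.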

Jacob also helped me see the light on these associative tables (like
repository-source and source-group-source). While I understood their
importance in a database context, I was tempted to collapse them a bit
in an XML context. While that's possible to do while still keeping data
integrity, it is better to keep them separate.

As always, I welcome your feedback...

* Stan Mitchell [Tue,  9 Jul 2002 at 23:12 -0700]
> Yes, it does seem that your suggestion reduces redundancy
> without sacrificing search capability.
> Hans Fugal wrote:
> >But then you have to store call-numbers possibly many times. For
> >example, a professional researcher would doubtless perform many searches
> >in any particular US Census. For that Census the repository, source, call
> >number and description would all be the same for every repository-source
> >record. The only unique information in each record would be the
> >activity-id. Yet if we take out the activity-id from repository-source
> >we get rid of that redundancy. AFAICS there is no loss of querying power
> >when we do so - search has all three keys, so if you want to know which
> >searches you did on a particular call-number, you only have to query the
> >search table with the repository-id and source-id.  Or am I still
> >missing something?
> >
> >
> _______________________________________________
> gdmxml mailing list
> http://fugal.net/cgi-bin/mailman/listinfo/gdmxml

"Everybody is talking about the weather but nobody does anything about it."
        -- Mark Twain
