Hello Robert,

Thank you for sharing your insight regarding the REPRESENTATION
and SOURCE entities. I hope you don't mind if I ask a couple of
follow-up questions.

The first question is about the relationship between REPOSITORY and

The SOURCE data definition, on page 74 of the GDM spec, states
"One SOURCE is found in zero to many REPOSITORYs (through

To set up my question, let's say I have one SOURCE that I've viewed
in three REPOSITORYs. As a table it might look like this
(I've formatted the tables using a fixed-pitch font) ...

SOURCE               REPOSITORY-SOURCE           REPOSITORY
ID     Comment       Source-ID   Repository-ID   ID     Name
3358   1850 Cens.    3358        2415            2415   NARA
                     3358        2617            2617   FHL
                     3358        2932            2932   HQ

The REPRESENTATION data definition, on page 67 of the spec,
states "One SOURCE has zero to many REPRESENTATIONs."
So its table might look like this ...

Source-ID   Representation-Type-ID  Medium     Content
3358        JPEG                    microfilm  nara.jpg
3358        JPEG                    microfilm  fhl.jpg
3358        JPEG                    digitally  hq.jpg

My question: How does a REPRESENTATION link back to the
REPOSITORY to determine where the image came from?
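
A minimal Python sketch of the ambiguity, using the rows from the two
tables above. The class names, field names, and the helper
candidate_repositories are shorthand invented for this illustration and
are not taken from the GDM spec. The sketch only shows that following
Source-ID from a REPRESENTATION reaches every REPOSITORY that holds the
SOURCE, not the one the image came from.

```python
from dataclasses import dataclass


# Hypothetical shorthand for the GDM entities; field names are assumptions.
@dataclass(frozen=True)
class RepositorySource:
    source_id: int
    repository_id: int


@dataclass(frozen=True)
class Representation:
    source_id: int
    type_id: str
    medium: str
    content: str


# Rows copied from the example tables.
REPOSITORIES = {2415: "NARA", 2617: "FHL", 2932: "HQ"}
REPOSITORY_SOURCES = [
    RepositorySource(3358, 2415),
    RepositorySource(3358, 2617),
    RepositorySource(3358, 2932),
]
REPRESENTATIONS = [
    Representation(3358, "JPEG", "microfilm", "nara.jpg"),
    Representation(3358, "JPEG", "microfilm", "fhl.jpg"),
    Representation(3358, "JPEG", "digitally", "hq.jpg"),
]


def candidate_repositories(rep):
    """Return the name of every REPOSITORY linked to the REPRESENTATION's SOURCE."""
    return [
        REPOSITORIES[rs.repository_id]
        for rs in REPOSITORY_SOURCES
        if rs.source_id == rep.source_id
    ]


for rep in REPRESENTATIONS:
    # Each image yields the same three candidates, so its origin is ambiguous.
    print(rep.content, candidate_repositories(rep))
```

With only Source-ID on the REPRESENTATION, all three images resolve to
the same list of three REPOSITORYs.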

A different interpretation of the REPRESENTATION entity is
hinted at in another section of the specification. In the discussion
of the Evidence Submodel on page 28, it states "If there are multiple
copies of a SOURCE, break them out at the lower level of the
SOURCE hierarchy, and draw the ASSERTION from that level
or lower."

Does this mean that each REPRESENTATION corresponds to
a unique SOURCE? That seems to contradict the data definitions.

There is another relationship to REPRESENTATIONs.
In the data definition for REPOSITORY-SOURCE, on page 66 of
the spec, it states, "Each instance in this entity represents a
particular SOURCE in a specific REPOSITORY." This implies that
each instance of REPRESENTATION will be related to one instance
of REPOSITORY-SOURCE. This relationship is not discussed in the

I'm hoping that you can clear up my confusion about
what the "correct" interpretation should be.


This is my second question; it concerns the relationship between

Let's say I have a census source and it is recorded hierarchically
using this table (in reality, there would be many more SOURCEs
corresponding to all of the CITATION-PARTs) ...

ID      Higher-Source-ID  Subject-Date  Comments
2405    -                 1850          pub level data
2406    2405              June 4, 1850  page level data
2407    2406              -             household data
2408    2407              -             individual data
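
The hierarchy in this table can be sketched in Python as a simple
parent lookup. The dictionary HIGHER_SOURCE and the helper source_chain
are names made up for this illustration; the sketch assumes that a "-"
in Higher-Source-ID means the SOURCE is at the top level.

```python
# Higher-Source-ID for each SOURCE row in the table; None marks the top level
# (an assumption about what "-" means in that column).
HIGHER_SOURCE = {2405: None, 2406: 2405, 2407: 2406, 2408: 2407}


def source_chain(source_id):
    """Walk Higher-Source-ID links from a SOURCE up to the top of its hierarchy."""
    chain = []
    current = source_id
    while current is not None:
        chain.append(current)
        current = HIGHER_SOURCE[current]
    return chain


# From the individual-level SOURCE up to the publication-level SOURCE.
print(source_chain(2408))  # [2408, 2407, 2406, 2405]
```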

If the REPRESENTATION is for a page of the census, would the Source-ID
be 2406? If I am using the REPRESENTATION as supporting evidence for
an individual datum, say age at last birthday, could the Source-ID also
be 2408? In table form, it would look like this ...

Source-ID   Representation-Type-ID  Medium    Content
2406        JPEG                    microfilm 1850page.jpg
2408        JPEG                    microfilm 1850page.jpg

It seems this would contradict the data definition on page 67
which states: "One REPRESENTATION is a manifestation of one SOURCE".
Here we have two SOURCEs to one REPRESENTATION (or image file).
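
The conflict can be checked mechanically with a short Python sketch.
It treats the image file name in Content as the identity of a
REPRESENTATION, which is an assumption made for this illustration and
not something the spec states; the helper sources_per_content is
likewise hypothetical.

```python
from collections import defaultdict

# Rows from the table above: (Source-ID, Representation-Type-ID, Medium, Content).
REPRESENTATIONS = [
    (2406, "JPEG", "microfilm", "1850page.jpg"),
    (2408, "JPEG", "microfilm", "1850page.jpg"),
]


def sources_per_content(rows):
    """Group Source-IDs by Content, treating Content as the REPRESENTATION's identity."""
    by_content = defaultdict(set)
    for source_id, _type_id, _medium, content in rows:
        by_content[content].add(source_id)
    return by_content


# Any Content attached to more than one SOURCE breaks the one-SOURCE rule.
shared = {
    content: sorted(ids)
    for content, ids in sources_per_content(REPRESENTATIONS).items()
    if len(ids) > 1
}
print(shared)  # {'1850page.jpg': [2406, 2408]}
```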

Does this imply that I would need to create separate textual
REPRESENTATIONs to refer to the line and column items on
the census page? If so, these REPRESENTATIONs might be used
instead ...

Source-ID   Representation-Type-ID  Medium    Content
2406        JPEG                    microfilm 1850page.jpg
2408        TEXT                    microfilm "age, 45"

The last REPRESENTATION appears to duplicate what is
also recorded in the CITATION-PARTs.

My question: does a REPRESENTATION correspond to a particular
(single) level in the SOURCE hierarchy and when is a REPRESENTATION

I hope my questions don't come across as hair-splitting.
I'm trying to create a faithful UML representation of GDM
and not introduce too much of my own interpretation.

Stan Mitchell

----- Original Message -----
From: "RCA" <
To: "Hans Fugal" <
Sent: Friday, August 16, 2002 5:59 PM
Subject: Re: [gdmxml] Representation

> First, since I have not posted to this list before, let me introduce
> myself.
> I am Robert Charles Anderson, one of the "Principal Members" of the
> Working Group.  Everybody in the LWG had both genealogical and
> skills, in my case weighted more toward the former than the latter,
> I did learn a great deal about data modelling during the four years the
> group worked together.
> I'll try to answer the first of Hans's questions, and maybe that will
> clarify some of the later questions as well.  As an example, let us say we
> are working with a recorded deed, in a situation where the original deed
> does not survive, or at least we don't know where it is.  Then the
> deed is the SOURCE.  If I create a written (paper) abstract of that deed,
> then that is one REPRESENTATION.  Then perhaps I go back to the courthouse
> with my digital camera and take a photograph of that same deed, and now I
> have a second REPRESENTATION.  And then I make a complete transcript, as a
> Word document, of the deed, and now I have a third.
> All three of these might end up in electronic form, but the abstract is
> still a paper document, and could have a Physical-File-Code.  A
> copy of the deed, not in digital form, would be another REPRESENTATION of
> the same SOURCE, and could also have a Physical-File-Code.
> A second example of this would be a photograph of the family picnic on
> 4th of 1902, which your grandmother gave you when you were young.  This
> would be the SOURCE in this instance.  A restored version of the photo
> would be a REPRESENTATION, as would a digitized and stored version.
> To answer two of the sub-parts of your first question:
> "a photograph in my file cabinet" may be a SOURCE or a REPRESENTATION
> depending on its pedigree.
> REPRESENTATIONs are not "only encodings of the source that can be stored
> or transmitted" electronically.
> Hope this helps.
