Lee,

I've had a chance to really look at and examine your schema diagram, and
I've got a few comments...

By removing the attribute-source, event-source, child-source and
name-source tables, you've created many-to-many relationships.  The
problem here is that for any record in attribute, event, child or name,
the researcher will have an arbitrary list of references that describe it.
For example, for myself, my birth date is listed on my birth certificate
and on my marriage license.  By putting the source ID in the event table,
we restrict the user to entering just one reference for each event.  With
this schema, how would I list both sources for the birth?  If we have to
have two records in the event table that are duplicates except for the
source, this would be taking up much more room in the database than is
really necessary.

To show you what I'm thinking in another example, the application that I
maintain at work is one that is used to create sales orders and invoices
(among other things) for our business.  If we look at the sales order,
each order has a header and detail section.  In the header (the SOH
table), we have information about the customer and the shipping (UPS,
FedEx, etc.) and billing options (COD, Net 30, etc.).  Since each sales
order has exactly one customer, we can store the customer ID in the header
table.  We've also got another table (the INV table) that stores
information on each inventory item (part number, price, vendor,
manufacturer, etc.), but we don't have a field in the sales order header
for the part number.  We have a separate table (the SOI table) of sales
order detail that is used to list the parts ordered for each sales order
(part number, quantity, color, selling price, etc.).  The three tables
look like this:

  +---------------+      +-----------+      +------------+
  | SOH           |      | SOI       |      | INV        |
  +---------------+1    M+-----------+     1+------------+
  | soh_no        |------| soi_sono  |M  +--| inv_no     |
  | soh_cusno     |      | soi_invno |---+  | inv_descr  |
  | soh_date      |      | soi_qty   |      | inv_onhand |
  | soh_shipterms |      | soi_price |      | ...        |
  | soh_billterms |      | ...       |      +------------+
  | ...           |      +-----------+
  +---------------+

If we use this sales order analogy on our database for this project, the
event table is represented by SOH above, the source table is represented
by INV, and the event-source table is represented by SOI.  So, the four
tables that were removed, event-source, attribute-source, name-source and
child-source (I think that was all of them like this) are needed for this
database.  There is something to be said for the KISS (Keep It Simple,
Stupid) principle, but this is one situation where reducing the number of
the tables does not work.

I'm not so sure about putting dates in their own table.  Granted, this
could reduce the memory requirement for the database, since the researcher
will not know quite a lot of events as he moves back through the links in
generations.  However, if the reason for splitting dates into their own
table was to add a surety level on the date that is separate from the
surety on the event, for example, it seems that it would be simpler to
just add a evt_date_surety field to the event table.

Now, on combining the attribute-type and event-type into an object-type
table...  This could work.  The only change that I would make is to add a
field (oty_numpersons?) to identify if the event should have one primary
participant (like a death) or two primary participants (like a marriage).

Finally, looking at the name_list and words tables, this could also work,
but perhaps not like it is here.  Basically, we'd be using the name_list
table to create a linked list of words for either place or person names.
My biggest concern is that I want to be able to sort data in the
application (either on the screen or in reports) by place name or person
name (at any granularity level).  Using a table to build linked lists
could make writing these sorting algorithms a bit more difficult,
especially since we can build an index on each of the name fields in the
previous schema, and let the DBMS sort the data for us.

--
Sean Lamb
Chief Webmaster, SCWD, NMRA --- http://www.scwd-nmra.org
Historian and Webmaster, Capitol City "N"Gineers ---
http://www.msn.fullfeed.com/~slambo/capcity.htm
Madison Linux User Group --- http://www.madisonlinux.org


_______________________________________________
Genes-devel mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/mailman/listinfo/genes-devel

Reply via email to