I've been using TEI for a while now, both at http://www.nzetc.org/ and
http://www.oss-watch.ac.uk/. You may already know this, but:
* Make sure you're using the P5 release of TEI. P4 and earlier don't
work well as parts of larger documents (xml:lang, xml:id and schema
issues).
* TEI takes the document's logical structure as its starting point
(chapters, paragraphs, etc.) rather than the physical structure
(volumes, pages, etc.). If you're working in a larger, page-centric
system, pay special attention to how you're going to encode
pagebreaks in the TEI (see
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-pb.html) so the
two can be coordinated correctly. Also decide whether you want the links
to the page images to appear in the METS or the TEI (or both). We use
@corresp attributes to point from pagebreaks to the associated images.
In our texts pagebreaks can occur almost anywhere and are marked as:
...<pb xml:id="n112" n="98" corresp="#MarAmon112"/>...
The @n is the page number printed on the page. And in the header:
<notesStmt xml:id="notesStmt-0001">
  <note xml:id="page-images">
    <list>
      ...
      <item>
        <figure xml:id="MarAmon112">
          <graphic url="MarAmon112.gif" mimeType="image/gif"
                   xml:id="MarAmon112-g" n="fp98"/>
        </figure>
      </item>
      ...
    </list>
  </note>
</notesStmt>
The file this example is taken from is available at:
http://www.nzetc.org/tm/scholarly/tei-MarAmon.html (follow "Other formats" -> "TEI")
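To make the corresp convention concrete, here is a minimal Python sketch (not our production code) that resolves each pagebreak to its page image using only the standard library. SAMPLE is a cut-down stand-in assembled from the fragments above; the wrapper elements around them and the function name page_images are assumptions for illustration, and the sketch assumes each @corresp holds a single pointer.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"                   # TEI P5 namespace
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"     # how ElementTree exposes xml:id

# Hypothetical minimal document built around the fragments shown above.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc>
    <notesStmt xml:id="notesStmt-0001">
      <note xml:id="page-images">
        <list>
          <item>
            <figure xml:id="MarAmon112">
              <graphic url="MarAmon112.gif" mimeType="image/gif"
                       xml:id="MarAmon112-g" n="fp98"/>
            </figure>
          </item>
        </list>
      </note>
    </notesStmt>
  </fileDesc></teiHeader>
  <text><body>
    <p>some text <pb xml:id="n112" n="98" corresp="#MarAmon112"/> more text</p>
  </body></text>
</TEI>"""


def page_images(root):
    """Return (printed page number, image url) for every <pb> in the document."""
    # Index every element that carries an xml:id so pointers can be resolved.
    by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}
    pages = []
    for pb in root.iter(TEI + "pb"):
        # Assumption: @corresp is a single local pointer of the form "#id".
        target = by_id.get(pb.get("corresp", "").lstrip("#"))
        graphic = target.find(TEI + "graphic") if target is not None else None
        url = graphic.get("url") if graphic is not None else None
        pages.append((pb.get("n"), url))
    return pages


print(page_images(ET.fromstring(SAMPLE)))  # [('98', 'MarAmon112.gif')]
```

A pagebreak whose pointer cannot be resolved comes back with None as its image, which makes dangling links easy to spot.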
cheers
stuart
Christine Schwartz wrote:
Hi Mike,
I'm pretty comfortable with METS, but very new to TEI. So, what I write here
is just an attempt to reflect what my more experienced colleagues are
saying:
* It seems the structure of TEI documents can be problematic since they
follow a logical structure, by paragraphs/sections, whereas the structMaps
of all our METS documents are, so far, divided up by pages of text.
So the TEI structure does not fit nicely into METS the way we're using METS.
* We're also concerned with not having redundant metadata in the TEI header
and the dmdSec of the METS document. So, we're considering keeping the TEI
header very brief and relying on the METS doc for
descriptive/administrative/technical metadata. (We won't be deriving METS
from TEI which is another issue.)
* The other issue has already been raised by Liza Daly: performance. We've
been told by one of the programmers at Mark Logic that we should embed the
TEI docs into METS for good performance, but we have other reasons why we
don't want to embed the TEI (editing, maintenance, etc.). So, we are
considering writing a script that would integrate the METS and TEI at the
point a search is deployed.
* From the metadata standpoint, I want to keep the TEI docs separate and
link out to them from the METS docs, because I'm not convinced that library
metadata standards are stable. If we move away from using METS in the next
5-10 years, I think it would be easier if all the text/image files remained
separate from the metadata. So, I'd prefer links in the fileSec of METS that
link out to external TEI files.
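One possible shape for the "link out" approach, sketched in Python under stated assumptions: a fileSec entry whose FLocat points at an external TEI file, plus a page-based structMap in which each page div uses a METS area with BETYPE="IDREF" to point at the xml:id of a pagebreak inside that file. The function name build_mets, the file ID "tei-1", the USE value, the path "tei/MarAmon.xml" and the second page in the usage line are all hypothetical; only the METS element and attribute names come from the METS schema.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)


def q(ns, name):
    """Build a namespace-qualified name in ElementTree's {uri}name form."""
    return f"{{{ns}}}{name}"


def build_mets(tei_href, pages):
    """pages: (pagebreak xml:id, printed page number) pairs in reading order."""
    mets = ET.Element(q(METS, "mets"))

    # fileSec: the TEI stays in its own file and is only referenced here.
    file_sec = ET.SubElement(mets, q(METS, "fileSec"))
    grp = ET.SubElement(file_sec, q(METS, "fileGrp"), {"USE": "transcription"})
    tei_file = ET.SubElement(
        grp, q(METS, "file"), {"ID": "tei-1", "MIMETYPE": "application/tei+xml"}
    )
    ET.SubElement(
        tei_file, q(METS, "FLocat"), {"LOCTYPE": "URL", q(XLINK, "href"): tei_href}
    )

    # structMap: one div per page, each pointing at a pagebreak id in the TEI.
    struct = ET.SubElement(mets, q(METS, "structMap"), {"TYPE": "physical"})
    book = ET.SubElement(struct, q(METS, "div"), {"TYPE": "book"})
    for order, (pb_id, printed) in enumerate(pages, start=1):
        page = ET.SubElement(
            book,
            q(METS, "div"),
            {"TYPE": "page", "ORDER": str(order), "ORDERLABEL": printed},
        )
        fptr = ET.SubElement(page, q(METS, "fptr"))
        ET.SubElement(
            fptr,
            q(METS, "area"),
            {"FILEID": "tei-1", "BETYPE": "IDREF", "BEGIN": pb_id},
        )
    return mets


# "n112"/"98" come from the example earlier in the thread; "n113"/"99" is made up.
mets = build_mets("tei/MarAmon.xml", [("n112", "98"), ("n113", "99")])
ET.indent(mets)  # pretty-printing helper, Python 3.9+
print(ET.tostring(mets, encoding="unicode"))
```

Because each area points at a pagebreak id rather than at a copy of the text, the page-based structMap and the logically structured TEI can live in separate files.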
Chris
On Wed, Nov 4, 2009 at 2:32 PM, Michael J. Giarlo <
[email protected]> wrote:
On Wed, Nov 4, 2009 at 14:38, stuart yeates <[email protected]>
wrote:
Christine Schwartz wrote:
Should we consider embedding the TEI in the METS documents, or just link out
to them?
It depends on what you're doing and who the likely users of the METS are.
The trouble with separate files is that they inevitably get separated.
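If the files are kept separate, a periodic check can at least report when they have drifted apart. The sketch below is one hedged way to do that in Python: it lists every FLocat target in a METS document that no longer exists on disk. It assumes the xlink:href values are relative file paths rather than remote URLs; the function name missing_files, the sample METS and the file names in it are hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical METS fragment referencing one TEI file and one page image.
SAMPLE_METS = """<mets xmlns="http://www.loc.gov/METS/"
                       xmlns:xlink="http://www.w3.org/1999/xlink">
  <fileSec><fileGrp>
    <file ID="tei-1"><FLocat LOCTYPE="URL" xlink:href="MarAmon.xml"/></file>
    <file ID="img-112"><FLocat LOCTYPE="URL" xlink:href="MarAmon112.gif"/></file>
  </fileGrp></fileSec>
</mets>"""


def missing_files(mets_root, base_dir):
    """Return the FLocat hrefs that do not resolve to a file under base_dir."""
    missing = []
    for flocat in mets_root.iter(METS + "FLocat"):
        href = flocat.get(XLINK_HREF)
        # Assumption: href is a path relative to base_dir, not a remote URL.
        if href and not os.path.exists(os.path.join(base_dir, href)):
            missing.append(href)
    return missing


# Demo: only the TEI file is present, so the image is reported as missing.
with tempfile.TemporaryDirectory() as base:
    open(os.path.join(base, "MarAmon.xml"), "w").close()
    print(missing_files(ET.fromstring(SAMPLE_METS), base))  # ['MarAmon112.gif']
```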
Not to threadjack, but I am curious, Christine: how would you handle
linking between the TEI and the METS?
It could be there's an obvious answer and I'm having a "duh" moment
(or lifetime), of course.
-Mike
--
Stuart Yeates
http://www.nzetc.org/ New Zealand Electronic Text Centre
http://researcharchive.vuw.ac.nz/ Institutional Repository