Re: [CODE4LIB] Creating pdfs from images and their text

2014-01-18 Thread Dan Muresan
You could try to programmatically match up each hOCR text block to a corresponding fragment from the transcripts, based on textual similarity (then replace the hOCR text with the "real" text). There's monotonicity in terms of XY coordinates vs offset in the transcript, i.e. (X1,Y1) < (X2,Y2) => text
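
The matching idea in this message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's code: the hOCR blocks are assumed to be already extracted as plain strings in reading order (so the XY ordering is implicit in the list order), the transcript is assumed to be a single string, and the names `align_blocks`, `max_skip`, and `threshold` are hypothetical. Similarity is measured with `difflib.SequenceMatcher` from the Python standard library, and the search only ever moves forward through the transcript, which is how the monotonicity constraint is used.

```python
from difflib import SequenceMatcher


def align_blocks(ocr_blocks, transcript, max_skip=5, threshold=0.6):
    """Greedily replace each OCR block with its best-matching transcript span.

    Assumptions (not from the original message): blocks are plain strings in
    reading order, and the transcript covers the same text in the same order.
    """
    words = transcript.split()
    pos = 0  # index of the first transcript word not yet consumed
    aligned = []
    for block in ocr_blocks:
        n = max(1, len(block.split()))
        best_score, best_start, best_end = 0.0, pos, pos
        # Monotonicity: only consider spans starting at or after `pos`.
        for start in range(pos, min(len(words), pos + max_skip + 1)):
            # Allow the span to be one word shorter or longer than the block.
            for length in range(max(1, n - 1), n + 2):
                end = start + length
                if end > len(words):
                    break
                candidate = " ".join(words[start:end])
                score = SequenceMatcher(
                    None, block.lower(), candidate.lower()
                ).ratio()
                if score > best_score:
                    best_score, best_start, best_end = score, start, end
        if best_score >= threshold:
            # Good match: use the transcript text and advance the cursor.
            aligned.append(" ".join(words[best_start:best_end]))
            pos = best_end
        else:
            # No convincing match: keep the OCR text and do not advance.
            aligned.append(block)
    return aligned


if __name__ == "__main__":
    blocks = ["Tbe quick brovvn fox", "jumps ovcr the", "lazy d0g"]
    text = "The quick brown fox jumps over the lazy dog"
    for fixed in align_blocks(blocks, text):
        print(fixed)
```

A greedy pass like this is simple but can drift if an early block is matched badly; a dynamic-programming alignment over all blocks would be more robust, at the cost of more code.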

[CODE4LIB] Call for speakers (deadline March 20): Taxonomy Boot Camp 2014, Washington DC, November 4-5

2014-01-18 Thread DCMI Announce
*** Apologies for cross-posting *** *10th annual Taxonomy Boot Camp*, November 4-5, as part of KMWorld, Washington, DC. *Website:* http://www.taxonomybootcamp.com/2014/ *Call for Speakers:* http://www.taxonomybootcamp.com/2014/CallForSpeakers.asp *Deadline:* March 20, 2014

[CODE4LIB] rdf ontologies for archival descriptions

2014-01-18 Thread Eric Lease Morgan
If you were to select a set of RDF ontologies intended to be used in the linked data of archival descriptions, then what ontologies would you select? For simplicity's sake, RDF ontologies are akin to the fields in MARC records or the entities in EAD/XML files. Articulated more accurately, they a

Re: [CODE4LIB] COinS metadata format support

2014-01-18 Thread Karen Coyle
Roy, I'm not sure what tips you over into sarcasm mode (unless it's anything I say), but 1) the answer is a few posts down, albeit not in any detail 2) as a member-based organization that exists to serve its members, I would think that OCLC would want to encourage the gathering of information