Re: [CODE4LIB] HTML mark-up in MARC records

2009-06-22 Thread Danielle Plumer
Michael,

For institutions that catalog digital objects in MARC or link to digital 
surrogates as UT Arlington does, my recommendation is to use the 856 as follows:

856 41 $u http://www.uta.edu/library/ccon/images/thumbs/00384Thumb.jpg $3 thumbnail image
856 41 $u http://www.uta.edu/library/ccon/mrsid_images/ccon/00384.sid $3 access image
856 42 $u http://libraries.uta.edu/ccon/scripts/ShowMap.asp?accession=00384 $3 Cartographic Connections web site

If there were a finding aid, it would go in as

856 42 $u http://library.uta.edu/findingAids/maps.jsp $3 finding aid

There have been conversations on the AUTOCAT list about subfield 3; there's 
no controlled vocabulary or even best practices for how to use it, which makes 
it very difficult to use as a guide to what exactly you're linking to. We're 
working on a formal set of best practices for digitization projects in Texas 
that will include a recommendation similar to this.

From a set of 856s like this, I can create a stylesheet to display the 
thumbnail image and link out to the website appropriately in our statewide 
image search tool Texas Heritage Online. I access UT Arlington's collections 
over Z39.50, btw -- see 
http://www.texasheritageonline.org/search.tkl?focus=target-utar-ccon.tklcclquery=mapoffset=1.
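
As a rough illustration of that kind of processing (a Python sketch, not the stylesheet actually used for Texas Heritage Online), the 856 fields above can be parsed and turned into a linked thumbnail. The function names are hypothetical. The sketch assumes the thumbnail is labelled "thumbnail image" in subfield 3 and that the link-out target is the field whose second indicator is 2 (related resource):

```python
import html
import re

# The three 856 fields from the example above, in display form.
SAMPLE_856 = [
    "856 41 $u http://www.uta.edu/library/ccon/images/thumbs/00384Thumb.jpg $3 thumbnail image",
    "856 41 $u http://www.uta.edu/library/ccon/mrsid_images/ccon/00384.sid $3 access image",
    "856 42 $u http://libraries.uta.edu/ccon/scripts/ShowMap.asp?accession=00384 $3 Cartographic Connections web site",
]


def parse_856(line):
    """Split a display-form 856 line into indicators and a subfield dict."""
    tag, indicators, rest = line.split(None, 2)
    if tag != "856":
        raise ValueError("not an 856 field: %r" % line)
    # Each subfield starts with "$" plus a one-character code.
    parts = re.split(r"(?:^|\s)\$(\w)\s+", rest)
    subfields = dict(zip(parts[1::2], (p.strip() for p in parts[2::2])))
    return {"ind1": indicators[0], "ind2": indicators[1], "subfields": subfields}


def render_thumbnail_link(lines):
    """Build an HTML link-out wrapped around the thumbnail image."""
    fields = [parse_856(line) for line in lines]
    # Assumption: the thumbnail is identified by its subfield 3 label.
    thumb = next(f for f in fields if f["subfields"].get("3") == "thumbnail image")
    # Assumption: second indicator 2 (related resource) marks the web site.
    site = next(f for f in fields if f["ind2"] == "2")
    return '<a href="%s"><img src="%s" alt="%s"></a>' % (
        html.escape(site["subfields"]["u"], quote=True),
        html.escape(thumb["subfields"]["u"], quote=True),
        html.escape(site["subfields"]["3"], quote=True),
    )


print(render_thumbnail_link(SAMPLE_856))
```

The markup is generated at display time, so the records themselves carry only URLs and labels. The lookup on the subfield 3 text is exactly the part that breaks when labels vary from one institution to the next.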
 

Having HTML tags in the MARC is unnecessary, and might break things in normal 
catalog displays. What I need most is consistency so that I don't have to 
figure out every possible variation for every possible system, which gets a bit 
old.

Danielle Cunniff Plumer, Coordinator
Texas Heritage Digitization Initiative
Texas State Library and Archives Commission
512.463.5852 (phone) / 512.936.2306 (fax)
dplu...@tsl.state.tx.us

-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of
Doran, Michael D
Sent: Sunday, June 21, 2009 5:09 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] HTML mark-up in MARC records


Hi Stuart,

> A couple of quick questions:

I'd be glad to answer, but I suspect these really only have relevance *after* 
the main issue (Is embedding HTML mark-up in MARC records a good/bad idea?) 
is decided.  ;-)

> (1) When you say HTML which version of HTML are you using?

For the HTML markup in the record, there's obviously no version explicitly 
specified.  Some img tags have an end tag (i.e. <img src="URL" />), so could 
be said to conform to XHTML 1.0, others have no end tag, so are generic HTML.  
The ILS in question declared pages to be HTML 4.0 Transitional in older 
versions of the online catalog but HTML standards compliance was wishful 
thinking.  The current version declares pages to be HTML 4.01 Transitional 
and comes a lot closer to conforming.

This does bring up the issue, though, of a potential mismatch in 
conformance to a declared DOCTYPE between the HTML mark-up in the record and 
the online catalog's HTML mark-up. 

> (2) What tool are you using to validate the HTML inside the MARC?

None that I am aware of.  (Note I'm not in the cataloging department, so am not 
familiar with all their workflow.)

> (3) Since HTML can use character encodings that MARC doesn't understand, how 
> are you escaping the non-ASCII characters in the HTML?

I'm not sure what you are asking here.  I'm not aware of any HTML elements 
and/or attributes that contain non-ASCII characters.  Perhaps you are referring 
to data (or perhaps attribute values) rather than to the HTML mark-up code.  
Our MARC records are encoded in Unicode UTF-8, so potentially any character can 
be represented.  For display of the data on the web, the online catalog is 
declaring that character set in a meta tag: <META http-equiv="Content-Type" 
content="text/html; charset=UTF-8">.
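
A minimal Python sketch of the distinction between markup and data drawn above. The title string is hypothetical, not taken from a real record. Non-ASCII characters in the data need no escaping when the page is served as UTF-8; only the characters that are significant to HTML itself do:

```python
import html

# Hypothetical field data containing non-ASCII characters and an ampersand.
title = "Dvořák & the New World"

# Escape only the characters that are special in HTML (&, <, >, quotes).
escaped = html.escape(title)

# Declare the character set in a meta tag, as the online catalog does.
meta = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
page = meta + "\n<p>" + escaped + "</p>"

# The page is sent over the wire as UTF-8 bytes; "ř" and "á" pass through intact.
payload = page.encode("utf-8")

print(escaped)
```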

-- Michael

# Michael Doran, Systems Librarian
# University of Texas at Arlington
# 817-272-5326 office
# 817-688-1926 mobile
# do...@uta.edu
# http://rocky.uta.edu/doran/

From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of stuart yeates 
[stuart.yea...@vuw.ac.nz]
Sent: Sunday, June 21, 2009 4:05 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] HTML mark-up in MARC records

Doran, Michael D wrote:
> Is anybody else embedding HTML mark-up code in MARC records [1]?  We're 
> currently including an img tag in some MARC Holdings records in the 856z 
> [2].   I'm inclined to think that HTML mark-up does not belong anywhere in 
> MARC records, but am looking for other opinions (preferably with the 
> reasoning behind the opinions), both pro and con.

A couple of quick questions:

(1) When you say HTML which version of HTML are you using?
(2) What tool are you using to validate the HTML inside the MARC?
(3) Since HTML can use character encodings that MARC doesn't understand,
how are you escaping the non-ASCII characters in the HTML?

cheers
stuart
--
Stuart Yeates
http://www.nzetc.org/   New Zealand Electronic Text Centre
http://researcharchive.vuw.ac.nz/ Institutional Repository


Re: [CODE4LIB] HTML mark-up in MARC records

2009-06-22 Thread Cloutman, David
From the perspective of a programmer, rather than a cataloguer, my opinion is 
firmly no, HTML does not belong in your MARC records. 

In application development, general best practice is to separate information 
systems into layers, splitting data from business logic and presentation 
logic. MARC stores data, and HTML belongs to presentation. Though it may sound 
like a good idea today to put HTML into a MARC record, that tag may be 
meaningless down the road when some other technology is used to present your 
record data. If you wish to present data in HTML, you are much better off 
leaving the HTML out of your MARC, and allowing the application to generate 
tags.
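
A small Python sketch of that layering, with hypothetical function names. The record data is just a URL and a label (taken from the finding-aid example earlier in the thread), and each presentation layer generates its own output from it:

```python
import html

# Data layer: what the record stores. No markup here.
link = {
    "url": "http://library.uta.edu/findingAids/maps.jsp",
    "label": "finding aid",
}


def as_html(link):
    """Presentation for a web catalog: the application generates the tags."""
    return '<a href="%s">%s</a>' % (
        html.escape(link["url"], quote=True),
        html.escape(link["label"]),
    )


def as_text(link):
    """Presentation for a plain-text display: same data, no tags to strip."""
    return "%s: %s" % (link["label"], link["url"])


print(as_html(link))
print(as_text(link))
```

If the record had stored the anchor tag itself, the plain-text display (or whatever replaces HTML later) would first have to parse the markup back out.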


-----Original Message-----
From: Code for Libraries on behalf of Doran, Michael D
Sent: Sun 6/21/2009 1:12 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] HTML mark-up in MARC records
 
Is anybody else embedding HTML mark-up code in MARC records [1]?  We're 
currently including an img tag in some MARC Holdings records in the 856z 
[2].   I'm inclined to think that HTML mark-up does not belong anywhere in MARC 
records, but am looking for other opinions (preferably with the reasoning 
behind the opinions), both pro and con.  

I'm asking on code4lib as well as the voyager-l list in order to get a mix of 
ILS-specific and ILS-agnostic opinions (I'm not on any cataloging lists, or 
would probably ask there, too).  I tried googling this topic, but couldn't find 
anything of consequence; so if I've missed something there, and you could point 
me to it, I'd be obliged.

-- Michael

[1] http://en.wikipedia.org/wiki/HTML

[2] http://www.loc.gov/marc/holdings/hd856.html
  
# Michael Doran, Systems Librarian
# University of Texas at Arlington
# 817-272-5326 office
# 817-688-1926 mobile
# do...@uta.edu
# http://rocky.uta.edu/doran/




Re: [CODE4LIB] HTML mark-up in MARC records

2009-06-22 Thread Eric Lease Morgan

On Jun 22, 2009, at 5:53 PM, Cloutman, David wrote:

> From the perspective of a programmer, rather than a cataloguer, my
> opinion is firmly no, HTML does not belong in your MARC records.
>
> In application development, general best practice is to separate
> information systems into layers, splitting data from business
> logic and presentation logic. MARC stores data, and HTML belongs
> to presentation. Though it may sound like a good idea today to put
> HTML into a MARC record, that tag may be meaningless down the road
> when some other technology is used to present your record data. If
> you wish to present data in HTML, you are much better off leaving
> the HTML out of your MARC, and allowing the application to generate
> tags.




I whole-heartedly concur, and I could hardly have said it any  
better than David. Adding mark-up to a data structure like that only  
confuses the issue and is asking for trouble down the road.


--
Eric Lease Morgan


Re: [CODE4LIB] HTML mark-up in MARC records

2009-06-22 Thread Alexander Johannesen
Hiya,

I guess I'm the one who's got to step up to the self-slaughtering
altar, but the fact that a lot of our systems break or don't know how
to handle HTML is despicable. I'm sure you guys are familiar with RSS
/ Atom, and because in there we *expect* HTML and therefore make sure
our back-ends can grok it, it enhances the meta data *greatly*.

Don't think for a second that purity of the data format in any shape
or form is the definition of its usefulness. Mixed content models
might be complex to work with, but their value is immense. I can fully
understand *why* people say don't do it, because, yes, it ups the
complexity, and perhaps dinosaur technologies like MARC and our ILSs
breaking under the pressure of more modern technologies reinforce
that view, but I don't think we should shun it for that reason.

If your back-end can't grok HTML, I'd suggest you fix it immediately!
If your ILS chokes on XML and / or HTML snippets, I suggest you
replace it. You seriously shouldn't allow this rigidity into your
infrastructure, and it's depressing to watch how we as complex users
of MARC don't dare to extend it to become a format that does what it
should and needs to do.

Even *if* HTML in MARC records probably is a bad idea.


Regards,

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] HTML mark-up in MARC records

2009-06-22 Thread Kyle Banerjee
> Don't think for a second that purity of the data format in any shape
> or form is the definition of its usefulness.

We'd be screwed if that were the case. ISBD punctuation has been in the
MARC record from the very beginning. Theoretically, it should be
totally unnecessary since the data is already structured.

kyle


Re: [CODE4LIB] HTML mark-up in MARC records

2009-06-22 Thread Roy Tennant
On 6/22/09 4:17 PM, Alexander Johannesen
alexander.johanne...@gmail.com wrote:

> Even *if* HTML in MARC records probably is a bad idea.

Yes, it's such a bad idea it's hard to know where to begin. I'd like to
thank Kyle Banerjee for bringing up ISBD. This is like the HTML of the 60s,
in the sense that MARC is now saddled with markup from the 60s that we
have ALL KINDS of trouble dealing with going forward. If we've learned
anything at all, it should be to not mix presentation with data. Let
succeeding generations (and us too!) decide how they wish to depict the data
-- but don't saddle them (and us) with the depictions of preceding
generations.
Roy