Re: [CODE4LIB] HTML mark-up in MARC records
Michael,

For institutions that catalog digital objects in MARC or link to digital surrogates, as UT Arlington does, my recommendation is to use the 856 as follows:

856 41 $u http://www.uta.edu/library/ccon/images/thumbs/00384Thumb.jpg $3 thumbnail image
856 41 $u http://www.uta.edu/library/ccon/mrsid_images/ccon/00384.sid $3 access image
856 42 $u http://libraries.uta.edu/ccon/scripts/ShowMap.asp?accession=00384 $3 Cartographic Connections web site

If there were a finding aid, it would go in as:

856 42 $u http://library.uta.edu/findingAids/maps.jsp $3 finding aid

There have been conversations on the AUTOCAT list about subfield 3; there's no controlled vocabulary or even a set of best practices for how to use it, which makes it very difficult to use as a guide to what exactly you're linking to. We're working on a formal set of best practices for digitization projects in Texas that will include a recommendation similar to this.

From a set of 856s like this, I can create a stylesheet to display the thumbnail image and link out to the website appropriately in our statewide image search tool, Texas Heritage Online. I access UT Arlington's collections over Z39.50, btw -- see http://www.texasheritageonline.org/search.tkl?focus=target-utar-ccon.tklcclquery=mapoffset=1.

Having HTML tags in the MARC is unnecessary, and might break things in normal catalog displays. What I need most is consistency, so that I don't have to figure out every possible variation for every possible system, which gets a bit old.
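As a rough illustration of how a display layer could work from plain 856 fields like the ones above, here is a minimal Python sketch. It is not the Texas Heritage Online stylesheet. The text-line input format, the `Field856` class, and the `parse_856` and `render_thumbnail_link` functions are all assumptions made for this example, and it relies on the $3 label being exactly "thumbnail image", which is the kind of consistency the message asks for.

```python
import re
from dataclasses import dataclass
from html import escape


@dataclass
class Field856:
    ind1: str        # first indicator ("4" = HTTP)
    ind2: str        # second indicator ("1" = version of resource, "2" = related resource)
    subfields: dict  # subfield code -> value, e.g. {"u": url, "3": label}


def parse_856(line):
    """Parse a text line such as '856 41 $u URL $3 label' (toy format, not ISO 2709)."""
    match = re.match(r"856 (\S)(\S) (.*)", line.strip())
    if match is None:
        raise ValueError(f"not an 856 line: {line!r}")
    ind1, ind2, rest = match.groups()
    # Assumes '$' never occurs inside a value; real MARC uses a delimiter byte instead.
    pairs = re.findall(r"\$(\w)\s*([^$]*)", rest)
    return Field856(ind1, ind2, {code: value.strip() for code, value in pairs})


def render_thumbnail_link(fields):
    """Return an HTML snippet: the thumbnail image, linked to the related web site."""
    thumb = next((f for f in fields if f.subfields.get("3") == "thumbnail image"), None)
    site = next((f for f in fields if f.ind2 == "2"), None)
    if thumb is None or site is None:
        return ""  # nothing to show if either piece is missing
    return '<a href="{}"><img src="{}" alt="{}"></a>'.format(
        escape(site.subfields.get("u", ""), quote=True),
        escape(thumb.subfields.get("u", ""), quote=True),
        escape(site.subfields.get("3", ""), quote=True),
    )


record = [parse_856(line) for line in (
    "856 41 $u http://www.uta.edu/library/ccon/images/thumbs/00384Thumb.jpg $3 thumbnail image",
    "856 41 $u http://www.uta.edu/library/ccon/mrsid_images/ccon/00384.sid $3 access image",
    "856 42 $u http://libraries.uta.edu/ccon/scripts/ShowMap.asp?accession=00384 $3 Cartographic Connections web site",
)]
print(render_thumbnail_link(record))
```

The markup is produced at display time from the URL and label alone, so the record itself carries no HTML.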
Danielle Cunniff Plumer, Coordinator
Texas Heritage Digitization Initiative
Texas State Library and Archives Commission
512.463.5852 (phone) / 512.936.2306 (fax)
dplu...@tsl.state.tx.us

-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] on Behalf Of Doran, Michael D
Sent: Sunday, June 21, 2009 5:09 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] HTML mark-up in MARC records

Hi Stuart,

A couple of quick questions:

I'd be glad to answer, but I suspect these really only have relevance *after* the main issue (Is embedding HTML mark-up in MARC records a good/bad idea?) is decided. ;-)

(1) When you say HTML which version of HTML are you using?

For the HTML markup in the record, there's obviously no version explicitly specified. Some img tags have an end tag (i.e. <img src="URL" />), so could be said to conform to XHTML 1.0, others have no end tag, so are generic HTML. The ILS in question declared pages to be HTML 4.0 Transitional in older versions of the online catalog but HTML standards compliance was wishful thinking. The current version declares pages to be HTML 4.01 Transitional and comes a lot closer to conforming. This does bring up the issue, though, of the potential for a mis-match in conformation to a declared DOCTYPE between the HTML mark-up in the record, and the online opac's HTML mark-up.

(2) What tool are you using to validate the HTML inside the MARC?

None that I am aware of. (Note I'm not in the cataloging department, so am not familiar with all their workflow.)

(3) Since HTML can use character encodings that MARC doesn't understand, how are you escaping the non-ASCII characters in the HTML?

I'm not sure what you are asking here. I'm not aware of any HTML elements and/or attributes that contain non-ASCII characters. Perhaps you are referring to data (or perhaps attribute values) rather than to the HTML mark-up code. Our MARC records are encoded in Unicode UTF-8, so potentially any character can be represented.
For display of the data on the web, the online catalog is declaring that character set in a meta tag: <META http-equiv="Content-Type" content="text/html; charset=UTF-8">.

-- Michael

# Michael Doran, Systems Librarian
# University of Texas at Arlington
# 817-272-5326 office
# 817-688-1926 mobile
# do...@uta.edu
# http://rocky.uta.edu/doran/

From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of stuart yeates [stuart.yea...@vuw.ac.nz]
Sent: Sunday, June 21, 2009 4:05 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] HTML mark-up in MARC records

Doran, Michael D wrote:

Is anybody else embedding HTML mark-up code in MARC records [1]? We're currently including an img tag in some MARC Holdings records in the 856z [2]. I'm inclined to think that HTML mark-up does not belong anywhere in MARC records, but am looking for other opinions (preferably with the reasoning behind the opinions), both pro and con.

A couple of quick questions:

(1) When you say HTML which version of HTML are you using?
(2) What tool are you using to validate the HTML inside the MARC?
(3) Since HTML can use character encodings that MARC doesn't understand, how are you escaping the non-ASCII characters in the HTML?

cheers
stuart
--
Stuart Yeates
http://www.nzetc.org/ New Zealand Electronic Text Centre
http://researcharchive.vuw.ac.nz/ Institutional Repository
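On question (2), where the answer in the thread is that no validation tool is in use: the following is a minimal Python sketch of a first audit step, not a validator. It only reports which HTML start tags appear inside a subfield value, so that records carrying markup can be found. The `TagFinder` class, the `embedded_tags` function, and the sample 856 $z values are hypothetical and chosen for illustration.

```python
from html.parser import HTMLParser


class TagFinder(HTMLParser):
    """Collect the names of any HTML start tags found in a string."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <img ... />
        self.tags.append(tag)


def embedded_tags(value):
    """Return the lowercased names of HTML tags embedded in a subfield value."""
    finder = TagFinder()
    finder.feed(value)
    finder.close()
    return finder.tags


# Hypothetical 856 $z values, one with markup and one without
samples = [
    'View map <img src="http://example.org/thumb.jpg" />',
    "Connect to online resource",
]
for value in samples:
    print(embedded_tags(value), "-", value)
```

A check like this says nothing about whether the markup conforms to the DOCTYPE the catalog declares; it only shows where markup exists.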
Re: [CODE4LIB] HTML mark-up in MARC records
From the perspective of a programmer, rather than a cataloguer, my opinion is firmly no, HTML does not belong in your MARC records. In application development, general best practice is to separate information systems into layers, splitting data from business logic and presentation logic. MARC stores data, and HTML belongs to presentation. Though it may sound like a good idea today to put HTML into a MARC record, that tag may be meaningless down the road when some other technology is used to present your record data. If you wish to present data in HTML, you are much better off leaving the HTML out of your MARC, and allowing the application to generate tags.

-----Original Message-----
From: Code for Libraries on behalf of Doran, Michael D
Sent: Sun 6/21/2009 1:12 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] HTML mark-up in MARC records

Is anybody else embedding HTML mark-up code in MARC records [1]? We're currently including an img tag in some MARC Holdings records in the 856z [2]. I'm inclined to think that HTML mark-up does not belong anywhere in MARC records, but am looking for other opinions (preferably with the reasoning behind the opinions), both pro and con.

I'm asking on code4lib as well as the voyager-l list in order to get a mix of ILS-specific and ILS-agnostic opinions (I'm not on any cataloging lists, or would probably ask there, too). I tried googling this topic, but couldn't find anything of consequence; so if I've missed something there, and you could point me to it, I'd be obliged.

-- Michael

[1] http://en.wikipedia.org/wiki/HTML
[2] http://www.loc.gov/marc/holdings/hd856.html

# Michael Doran, Systems Librarian
# University of Texas at Arlington
# 817-272-5326 office
# 817-688-1926 mobile
# do...@uta.edu
# http://rocky.uta.edu/doran/

Email Disclaimer: http://www.co.marin.ca.us/nav/misc/EmailDisclaimer.cfm
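The layering argument above can be sketched in a few lines of Python. This is only an illustration under assumed names: the `record` dictionary stands in for data pulled from a MARC record, and `to_html` and `to_text` are hypothetical presentation functions. The point it tries to show is that the same stored data can feed more than one output, and that the tags exist only in the presentation layer.

```python
from html import escape

# Data layer: plain values only, no markup (hypothetical sample record)
record = {
    "title": "Map of Texas & adjacent regions",
    "url": "http://example.org/maps/1",
}


def to_html(rec):
    """Presentation for a web page: tags are generated here, values are escaped."""
    return '<a href="{}">{}</a>'.format(
        escape(rec["url"], quote=True),
        escape(rec["title"]),
    )


def to_text(rec):
    """Presentation for a plain-text context such as email: no tags at all."""
    return "{} <{}>".format(rec["title"], rec["url"])


print(to_html(record))
print(to_text(record))
```

If the title had been stored with tags already in it, `to_text` would have to strip them first, and any future output format would inherit the same problem.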
Re: [CODE4LIB] HTML mark-up in MARC records
On Jun 22, 2009, at 5:53 PM, Cloutman, David wrote:

From the perspective of a programmer, rather than a cataloguer, my opinion is firmly no, HTML does not belong in your MARC records. In application development, general best practice is to separate information systems into layers, splitting data from business logic and presentation logic. MARC stores data, and HTML belongs to presentation. Though it may sound like a good idea today to put HTML into a MARC record, that tag may be meaningless down the road when some other technology is used to present your record data. If you wish to present data in HTML, you are much better off leaving the HTML out of your MARC, and allowing the application to generate tags.

I whole-heartedly concur, and I could hardly have said it any better than David. Adding mark-up to a data structure like that only confuses the issue and is asking for trouble down the road.

--
Eric Lease Morgan
Re: [CODE4LIB] HTML mark-up in MARC records
Hiya,

I guess I'm the one who's got to step up to the self-slaughtering altar, but the fact that a lot of our systems break or don't know how to handle HTML is despicable. I'm sure you guys are familiar with RSS / Atom, and because in there we *expect* HTML and therefore make sure our back-ends can grok it, it enhances the metadata *greatly*.

Don't think for a second that purity of the data format in any shape or form is the definition of its usefulness. Mixed content models might be complex to work with, but their value is immense. I can fully understand *why* people say don't do it, because, yes, it ups the complexity, and perhaps the way dinosaur technologies like MARC and our ILSs break under the pressure of more modern technologies reinforces that view, but I don't think we should shun it because of that.

If your back-end can't grok HTML, I'd suggest you fix it immediately! If your ILS chokes on XML and / or HTML snippets, I suggest you replace it. You seriously shouldn't allow this rigidity into your infrastructure, and it's depressing to watch how we as complex users of MARC don't dare to extend it to become a format that does what it should and needs to do.

Even *if* HTML in MARC records probably is a bad idea.

Regards,
Alex
--
Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] HTML mark-up in MARC records
Don't think for a second that purity of the data format in any shape or form is the definition of its usefulness.

We'd be screwed if that was the case. ISBD punctuation has been in the MARC record from the very beginning. Theoretically, it should be totally unnecessary since the data is already structured.

kyle
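To make the ISBD point concrete: in a typical title field the separators travel inside the subfield values themselves (for example a title ending in " :" before a subtitle ending in " /"), so software that wants a different display has to remove them first. The Python sketch below is a rough illustration only. The `strip_isbd` function and the sample values are hypothetical, and trailing periods are deliberately left alone because they cannot be told apart from abbreviations.

```python
import re

# Trailing separators that ISBD places before the next element:
# " /", " :", " ;", " =" and ",". Periods are not touched.
_TRAILING_ISBD = re.compile(r"\s*[/:;=,]\s*$")


def strip_isbd(value):
    """Remove one trailing ISBD separator from a descriptive subfield value.

    Intended for descriptive text only; a URL ending in "/" would be damaged.
    """
    return _TRAILING_ISBD.sub("", value)


# Hypothetical subfield values as they might be stored, punctuation included
stored = ["Cartographic connections :", "a guide to the maps /", "Arlington, Tex. :"]
print([strip_isbd(value) for value in stored])
```

Even this small cleanup needs special cases, which is the cost of keeping display punctuation in the data.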
Re: [CODE4LIB] HTML mark-up in MARC records
On 6/22/09 4:17 PM, Alexander Johannesen alexander.johanne...@gmail.com wrote:

Even *if* HTML in MARC records probably is a bad idea.

Yes, it's such a bad idea it's hard to know where to begin. I'd like to thank Kyle Banerjee for bringing up ISBD. This is like the HTML of the 60s in the sense that now MARC is saddled with markup from the 60s that we have ALL KINDS of trouble dealing with going forward.

If we've learned anything at all, it should be to not mix presentation with data. Let succeeding generations (and us too!) decide how they wish to depict the data -- but don't saddle them (and us) with the depictions of preceding generations.

Roy