[CODE4LIB] Archivists' Toolkit: Adding Digital Objects via MySQL

2012-04-18 Thread Rosalyn Metz
Hi Everyone, I posted this over on the Archivists' Toolkit listserv and got no response (yet), so I thought I might try here as well. I have a large quantity (300+) of digital objects that I need to add to Archivists' Toolkit. I think I've figured out what queries I need to run in order

[CODE4LIB] Job: Two-Year Research Fellowship in Digital Curation at University of Colorado at Boulder

2012-04-18 Thread jobs
Two-Year Research Fellowship in Digital Curation Journalism and Mass Communication University of Colorado at Boulder We are seeking to hire a research fellow with a degree in Library and/or Information Science, or an arts, humanities or social science discipline in which the candidate has

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Tod Olson
It has to mean UTF-8. ISO 2709 is very byte-oriented, from the directory structure to the byte-offsets in the fixed fields. The values in these places all assume 8-bit character data; it's completely baked into the file format. -Tod On Apr 17, 2012, at 6:55 PM, Jonathan Rochkind wrote:

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Peter Noerr
We cried our eyes out in 1976 when this first came to our attention at the BL. Even more crying when we couldn't get rid of it in the MARC-I to MARC-II conversion (well before MARC21 was even a twinkle) - a lot of tears are gathering somewhere. Peter -Original Message- From: Code

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Jonathan Rochkind
On 4/18/2012 6:04 AM, Tod Olson wrote: It has to mean UTF-8. ISO 2709 is very byte-oriented, from the directory structure to the byte-offsets in the fixed fields. The values in these places all assume 8-bit character data; it's completely baked into the file format. I'm not sure that

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Doran, Michael D
Hi Tod, I'm not understanding how UTF-8 would be considered 8-bit character data (other than the ASCII-range of the Unicode repertoire, natch). I don't think ISO 2709 knows from characters, only bytes. -- Michael # Michael Doran, Systems Librarian # University of Texas at Arlington #

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread LeVan,Ralph
In fact, I worry that the standard may pre-date UTF-8, with its reference to UCS --- if I understand things right, at one point there was only one Unicode encoding, called UCS, which is basically a backwards-compatible subset of what became UTF-16. So I worry the standard really means

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Karen Coyle
UTF-8 was the MARC standard from the beginning: http://www.loc.gov/marc/marbi/1998/98-18.html The first proposals were a character mapping between Unicode and MARC-8 and didn't mention the character encodings, thus the term UCS which was a common term for Unicode at that time. (see:

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Huwig,Steve
I could be mistaken (never having had the pleasure of reading it), but isn't ISO-2709 specified as a fixed number of characters, and any conflation of characters and 8-bit bytes is on the part of users and implementations? I think ISO 2709 might not know from bytes, only characters.

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Doran, Michael D
I could be mistaken (never having had the pleasure of reading it), but isn't ISO-2709 specified as a fixed number of characters, and any conflation of characters and 8-bit bytes is on the part of users and implementations? I don't believe that is the case. Take UTF-8 out of the picture, and

Re: [CODE4LIB] Archivists' Toolkit: Adding Digital Objects via MySQL

2012-04-18 Thread Mennerich, Donald
Rosalyn, I've written a number of scripts of this nature. Here's a quick one I wrote recently to add DAOs to our AT for an audio digitization project (note it does not include file versions, just Components, Instances and DAOs). It starts at the ResourceComponent identified by the long at the

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Andy Kohler
I don't know about ISO 2709 itself, but the MARC21 implementation of it refers to octets, aka 8-bit bytes: http://www.loc.gov/marc/specifications/specrecstruc.html Characters may be encoded using one or more than one octet, depending on the character set. All ASCII characters are encoded using

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Houghton,Andrew
-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jonathan Rochkind Sent: Tuesday, April 17, 2012 19:55 To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21 Okay, forget XML for a

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Doran, Michael D
ISO 2709 doesn't care how many bytes your characters are. The directory and offsets and other things count bytes, not characters. That was exactly my point. (Which I am stating since you quoted me and I couldn't tell if you were refuting my point, or using it to support your conclusion.)
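
The byte-versus-character point above can be sketched in a few lines of Python. This is only an illustration, not code from the thread: the helper name `directory_entry` is hypothetical, indicators and subfield codes are left out for brevity, and the 12-character entry layout (3-character tag, 4-digit field length, 5-digit starting position) and the 0x1E field terminator are taken from the MARC 21 record structure specification.

```python
# Sketch: a MARC 21 style directory entry counts octets, not characters.
FIELD_TERMINATOR = b"\x1e"  # field terminator byte per MARC 21


def directory_entry(tag: str, field_data: str, start: int) -> str:
    """Build a 12-character directory entry (hypothetical helper).

    The length is measured on the UTF-8 encoded bytes plus the
    field terminator, not on the number of characters.
    """
    encoded = field_data.encode("utf-8") + FIELD_TERMINATOR
    return f"{tag}{len(encoded):04d}{start:05d}"


# Escapes are used so the precomposed code points are unambiguous.
title = "\u00c9dition fran\u00e7aise"

char_count = len(title)                  # characters (code points)
byte_count = len(title.encode("utf-8"))  # octets once encoded as UTF-8
entry = directory_entry("245", title, 0)

print(char_count, byte_count, entry)
```

The two accented letters each take two octets in UTF-8, so the byte count is higher than the character count, and it is the byte count (plus the terminator) that ends up in the directory.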

[CODE4LIB] Islandora Camp 2012 Registration Public Brainstorm/Call for Proposals

2012-04-18 Thread David Wilcox
* Apologies for cross-posting * We're excited to invite you all to the third annual Islandora Camp (Aug 1-3, 2012). Islandora Camp welcomes developers, administrators, and users of Islandora to meet, learn, and grow the ecosystem! Registration for Islandora Camp is now open, and is available

[CODE4LIB] Representing geographic hierarchy in linked data

2012-04-18 Thread Ethan Gruber
No Message Collected

Re: [CODE4LIB] Job: Senior Application Developer at New York Public Library

2012-04-18 Thread Ross Singer
No Message Collected

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Tod Olson
In practice it seems to mean UTF-8. At least I've only seen UTF-8, and I can't imagine the code that processes this stuff being safe for UTF-16 or UTF-32. All of the offsets are byte-oriented, and there's too much legacy code that makes assumptions about null-terminated strings. -Tod On Apr
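
A rough illustration of the null-terminated-string concern, assuming nothing about any particular MARC library: `c_string_view` is a hypothetical helper that mimics how C-style code stops reading at the first zero byte. UTF-16 encodes ASCII-range characters with a zero byte in each code unit, while UTF-8 does not.

```python
# Sketch: why UTF-16 data is unsafe for code that assumes null-terminated strings.

def c_string_view(buf: bytes) -> bytes:
    """Return what C-style string code would see: bytes up to the first 0x00."""
    return buf.split(b"\x00", 1)[0]


text = "MARC"
utf8 = text.encode("utf-8")       # one byte per ASCII character, no zero bytes
utf16 = text.encode("utf-16-le")  # two bytes per character, high byte is 0x00

has_null_utf8 = b"\x00" in utf8
has_null_utf16 = b"\x00" in utf16

seen_utf8 = c_string_view(utf8)    # the whole string survives
seen_utf16 = c_string_view(utf16)  # cut off after the first character's low byte

print(has_null_utf8, has_null_utf16, seen_utf8, seen_utf16)
```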

[CODE4LIB] JCDL 2012 registration opens today, April 5

2012-04-18 Thread Howard, Barrie
No Message Collected

Re: [CODE4LIB] Job: Senior Application Developer at New York Public Library

2012-04-18 Thread Cary Gordon
No Message Collected

[CODE4LIB] Google Scholar Indexing Guidelines: Highwire Press vs. Eprints vs. BE Press vs. PRISM?

2012-04-18 Thread Brett Bonfield
No Message Collected

[CODE4LIB] Job: Records Management Archivist at Johns Hopkins University

2012-04-18 Thread jobs
The Johns Hopkins University Sheridan Libraries is hiring a Records Management Archivist to work with the University Archivist to develop an innovative approach to records management with the purpose of improving our stewardship of a university history that exists in print, digitized, and