Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Doran, Michael D
> Is it really true that newline characters are not allowed in a marc > value? Yes. CONTROL FUNCTION CODES [1] Eight characters are specifically designated as control characters for MARC 21 use: - escape character, 1B(hex) in MARC-8 and Unicode encoding - subfield delimiter, 1F(hex) i

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Jonathan Rochkind
Thanks Michael. So one weird thing is that at least some of those characters "specifically designated as control characters" aren't ordinarily what everyone else considers "control characters". To me, "control character" means ASCII less than 20. Which the last four aren't. So now it's unclear

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Reese, Terry
It's been a while since I looked at the ISO spec (which I still can't believe I had to buy to read) -- but you can certainly infer by looking at legal characters laid out by LC. In reality -- only a handful of unprintable characters are technically allowed in a MARC record -- but you have to re

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Jonathan Rochkind
On 5/19/2011 2:33 PM, Reese, Terry wrote: Jonathan, Karen is correct -- CR/LF are invalid characters within a MARC record. This has nothing to do if the character is valid in the set -- the format itself doesn't allow it. I'm curious where in the spec it says this -- of course, it's an int

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Jonathan Rochkind
On 5/19/2011 2:33 PM, Kyle Banerjee wrote: However, what would be the use case for including them as you don't know how they'll be interpreted by the app that you hand the data to? Only when the destination is an app you have complete control over too. One use case I was idly turning over in

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Reese, Terry
Jonathan, Karen is correct -- CR/LF are invalid characters within a MARC record. This has nothing to do if the character is valid in the set -- the format itself doesn't allow it. --TR -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jonatha

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Kyle Banerjee
Is it really true that newline characters are not allowed in a marc value? > I thought they were, not with any special meaning, just as ordinary data. > If they're not, that's useful to know, so I don't put any there! > This is also my understanding. However, what would be the use case for incl

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Jonathan Rochkind
I wonder if it depends on if your record is in Marc8 or UTF-8, if I'm reading Karen right to say that CR/LF aren't in the Marc8 character set. They're certainly in UTF-8! And a Marc record can be in UTF-8. On 5/19/2011 2:27 PM, Jonathan Rochkind wrote: Is it really true that newline characters

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Jonathan Rochkind
Is it really true that newline characters are not allowed in a marc value? I thought they were, not with any special meaning, just as ordinary data. If they're not, that's useful to know, so I don't put any there! I'd ask for a reference to the standard that says this, but I suspect it's go

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Karen Coyle
Quoting Andreas Orphanides : Anyway, I think having these two parts of the same URL data on separate lines is definitely Not Right, but I am not sure if it adds up to invalid MARC. Exactly. The CR and LF characters are NOT defined as valid in the MARC character set and should not be use
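
A minimal sketch of how one might screen field values for the CR and LF characters discussed above before writing them into a record. The helper name `line_break_offsets` is hypothetical, and this is plain Ruby rather than anything provided by ruby-marc; it only reports where the characters occur and leaves the decision about what to do with them to the caller.

```ruby
# Hypothetical helper (not part of ruby-marc): returns the character offsets
# of every CR ("\r") or LF ("\n") found in a field or subfield value.
def line_break_offsets(value)
  offsets = []
  value.each_char.with_index do |ch, i|
    offsets << i if ch == "\r" || ch == "\n"
  end
  offsets
end

# Illustrative value only: a URL that was broken across two lines.
p line_break_offsets("http://example.org/\r\nitem")  # => [19, 20]
p line_break_offsets("http://example.org/item")      # => []
```

An empty array means the value contains neither character; anything else gives the positions to inspect.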

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Ross Singer
On Thu, May 19, 2011 at 1:33 PM, Bill Dueber wrote: > record['856'] is defined to return the *first* 856 in the record, which, if > you look at the documentation...er...ok. Which is not documented as such in > MARC::Record (http://rubydoc.info/gems/marc/0.4.2/MARC/Record) > > To get them all, you

Re: [CODE4LIB] MARCXML to MODS: 590 Field

2011-05-19 Thread Meehleib, Tracy
Jon and Karen are correct. LC doesn't map/convert local fields because usage varies. Tracy Tracy Meehleib Network Development and MARC Standards Office Library of Congress 101 Independence Ave SE Washington, DC 20540-4402 +1 202 707 0121 (voice) +1 202 707 0115 (fax) t...@loc.gov -Origina

Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-19 Thread Jonathan Rochkind
On 5/19/2011 1:23 PM, Ryan Engel wrote: There are some who argue that if it's valuable to others, then others should pay for it (even when the improved access benefits your institution first and foremost, and distribution of the improvements is an arguably beneficial side effect) . Why should

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Jonathan Rochkind
I believe that with the ruby-marc API, when you do record['856'], you just get the first 856 if there is more than one. You have to use another API call (I forget which offhand) to get more than one; the ['856'] is just a shortcut for when you will only have one or only care about the first one. So I don't think

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Bill Dueber
record['856'] is defined to return the *first* 856 in the record, which, if you look at the documentation...er...ok. Which is not documented as such in MARC::Record (http://rubydoc.info/gems/marc/0.4.2/MARC/Record) To get them all, you need to do something like sixfifties = record.fields '650'
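
The difference between the two lookups described above can be sketched as follows. To keep the example self-contained it does not load the marc gem: `SketchRecord` and `SketchField` are hypothetical stand-ins that only mimic the behaviour reported in this thread (`record['856']` gives the first match, `record.fields '856'` gives all of them), and the subfield values are made up for illustration.

```ruby
# Hypothetical stand-ins, NOT the ruby-marc classes: just enough structure
# to mimic the lookup behaviour described in the thread.
class SketchField
  attr_reader :tag

  def initialize(tag, subfields)
    @tag = tag
    @subfields = subfields # array of [code, value] pairs
  end

  # field['u'] gives the value of the first subfield with that code, or nil.
  def [](code)
    pair = @subfields.find { |c, _value| c == code }
    pair && pair[1]
  end
end

class SketchRecord
  def initialize(fields)
    @fields = fields
  end

  # record['856'] gives only the FIRST field with that tag (or nil).
  def [](tag)
    @fields.find { |f| f.tag == tag }
  end

  # record.fields('856') gives EVERY field with that tag.
  def fields(tag)
    @fields.select { |f| f.tag == tag }
  end
end

# Collect a subfield's values across all repeats of a tag,
# skipping repeats that do not carry that subfield.
def subfield_values(record, tag, code)
  record.fields(tag).map { |f| f[code] }.compact
end

# Made-up data: two 856 fields, $q in the first and $u in the second.
def sample_record
  SketchRecord.new([
    SketchField.new('856', [['q', 'text/html']]),
    SketchField.new('856', [['u', 'http://example.org/item']])
  ])
end

record = sample_record
p record['856']['q']                   # => "text/html"
p record['856']['u']                   # => nil (first 856 has no $u)
p subfield_values(record, '856', 'u')  # => ["http://example.org/item"]
```

With data shaped like this, a nil from `record['856']['u']` does not mean the URL is missing from the record, only that it is not in the first 856.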

Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-19 Thread Ryan Engel
There are some who argue that if it's valuable to others, then others should pay for it (even when the improved access benefits your institution first and foremost, and distribution of the improvements is an arguably beneficial side effect) . Why should one institution carry the financial burd

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Jon Gorman
You've gotten some other good responses, but I thought I'd mention the LoC and OCLC sites on MARC if you haven't seen them yet. First, the LoC site at http://www.loc.gov/marc/. This is what I use as a guide and a reference. Some folks prefer the OCLC docs http://www.oclc.org/bibformats/en/, part

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread James Lecard
I'll dig in this one, thanks for this input Jonathan... I'm not so so familiar with the library yet, I'll do some more debugging but in fact what is happening is that I have no value with an access such as record['856']['u'] field, while I get one for record['856']['q'] And the marc you are seeing

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Jonathan Rochkind
I'm curious what's going on here, it doesn't make any sense. Do you just mean that your MARC file has more than one 856 in it? That's what your pasted marc looks like, but that is definitely legal, AND I've parsed many many marc files with more than one 856 in them, with ruby-marc, it was not

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Jonathan Rochkind
Now whether it _means_ what you want it to mean is another question, yeah. As Andreas said, I don't think that particular example _ought_ to have two 856's. But it ought to be perfectly parseable marc. If your 'patch' is to make ruby-marc combine those multiple 856's into one -- that is not r

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Andreas Orphanides
In my last message, some of my "subfield"s should of course read "indicator". Still digesting lunch -dre. On 5/19/2011 12:37 PM, James Lecard wrote: I'm using ruby-marc ruby parser (v.0.4.2) to parse some marc files I get from a partner. The 856 field is split over 2 lines, causing the

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread James Lecard
Thanks a lot Richard, So I guess my patch could be ported to the source code of ruby-marc, Let me know if interested, James 2011/5/19 Richard, Joel M > I'm no MARC expert, but I've learned enough to say that yes, this is valid > in that what you're seeing is the $q (Electronic format type) an

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Andreas Orphanides
From the MARC documentation [1]: "Field 856 is repeated when the location data elements vary (the URL in subfield $u or subfields $a, $b, $d, when used). It is also repeated when more than one access method is used, different portions of the item are available electronically, mirror sites are

Re: [CODE4LIB] MARCXML to MODS: 590 Field

2011-05-19 Thread Richard, Joel M
Thanks, Karen and Jon! That's what I suspected, but I couldn't find anything on the web about the thought process behind ignoring the 590 altogether. We'll likely end up using a local version of the XSLT to map it to mods:note as you suggested. We simply don't want this information to be lost

Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Richard, Joel M
I'm no MARC expert, but I've learned enough to say that yes, this is valid in that what you're seeing is the $q (Electronic format type) and $u (Uniform Resource Identifier ) subfields of the 856 field. http://www.oclc.org/bibformats/en/8xx/856.shtm You'll see other things when you get multipl

[CODE4LIB] is this valid marc ?

2011-05-19 Thread James Lecard
I'm using ruby-marc ruby parser (v.0.4.2) to parse some marc files I get from a partner. The 856 field is split over 2 lines, causing the ruby library to ignore it (I've patched it to overcome this issue) but I want to know if this kind of marc is valid ? =LDR 00638nam 2200181uu 4500 =001 c

Re: [CODE4LIB] MARCXML to MODS: 590 Field

2011-05-19 Thread Karen Miller
Joel, The 590 is indeed defined for local use, so whatever your local institution uses it for should guide your mapping to MODS. There are some examples of what it's used for on the OCLC Bibliographic Formats and Standards pages: http://www.oclc.org/bibformats/en/5xx/590.shtm Frequently it's use

Re: [CODE4LIB] MARCXML to MODS: 590 Field

2011-05-19 Thread Jon Stroop
I'm going to guess that it's because 59x fields are defined for local use: http://www.loc.gov/marc/bibliographic/bd59x.html ...but someone from LC should be able to confirm. -Jon -- Jon Stroop Metadata Analyst Firestone Library Princeton University Princeton, NJ 08544 Email: jstr...@princeton.

Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-19 Thread Luciano Ramalho
On Thu, May 19, 2011 at 8:31 AM, Andreas Orphanides wrote: > - As Graham says, there's a sunk-cost issue: you're going to prioritize the > stuff you paid for over free stuff since you've already invested resources in > it. Everybody who believes in sunk-cost should learn to play Go, the ancient

Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-19 Thread Luciano Ramalho
On Thu, May 19, 2011 at 6:24 AM, graham wrote: > 2. It is hard to justify spending time on improving access to free stuff > when the end result would be good for everyone, not just the institution > doing the work (unless it can be kept in a consortium and outside-world > access limited) Why is i

[CODE4LIB] MARCXML to MODS: 590 Field

2011-05-19 Thread Richard, Joel M
Dear hive-mind, Does anyone know why the Library of Congress-supplied MARCXML to MODS XSLT [1] does not handle the MARC 590 Local Notes field? It seems to handle everything else, not that I've done an exhaustive search... :) Granted, I could copy/create my own XSLT and add this functionality i

Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-19 Thread Jonathan Rochkind
On 5/19/2011 11:01 AM, graham wrote: Replying to Jonathan's mail rather at random, since several people are saying similar things. 1. 'Free resources can vanish any time.' But so can commercial ones, which is why LOCKSS was created. This isn't an insoluble issue or one unique to free resources.

Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-19 Thread Mike Taylor
There is no such thing as a zero-cost lunch; but there is such a thing as a freedom lunch. I concur with Karen that (once again) much confusion is being generated here by the English language's lamentable use of the same word "free" to mean two such different things. -- Mike. On 19 May 2011 16

Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-19 Thread Karen Coyle
I wonder if we aren't conflating a diverse set of issues here. - free (no cost) - free and online - free = not peer reviewed - online As Jonathan notes, we already face problems with online materials, even those we subscribe to. And libraries do take in free hard-copy books in the form of do

Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-19 Thread graham
Replying to Jonathan's mail rather at random, since several people are saying similar things. 1. 'Free resources can vanish any time.' But so can commercial ones, which is why LOCKSS was created. This isn't an insoluble issue or one unique to free resources. 2. 'Managing 100s of paid resources is

[CODE4LIB] Job Posting: Web Developer, Smithsonian Institution Libraries

2011-05-19 Thread Richard, Joel M
The Smithsonian Institution Libraries is recruiting for a web developer position. We are in the midst of many interesting projects right now, including working with linked open data, building a new digital library, moving to Drupal, mass-digitization, and other projects. The Libraries serves a

Re: [CODE4LIB] wikipedia/author disambiguation

2011-05-19 Thread Jonathan Rochkind
Curious what script you've used that isn't production ready -- I don't think you meant to post in the URL for the JQuery library? On 5/19/2011 10:39 AM, Karen Coyle wrote: This sounds like a great way to "translate" from library forms to wikipedia name forms. But for on-the-fly use I wonder if

Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-19 Thread Jonathan Rochkind
Another problem with free online resources is not just 'collection selection', but maintenance/support once selected. A resource hosted elsewhere can stop working at any time, which is a management challenge. The present environment is ALREADY a management challenge, of course. But consider the p

Re: [CODE4LIB] wikipedia/author disambiguation

2011-05-19 Thread Karen Coyle
This sounds like a great way to "translate" from library forms to wikipedia name forms. But for on-the-fly use I wonder if it wouldn't be more efficient to eliminate the "middle man." Karen, can you say a little about what it took to link library names to WP? Was it a one-step, two-step, et

Re: [CODE4LIB] wikipedia/author disambiguation

2011-05-19 Thread Jonathan Rochkind
In addition to the approaches you note, might be worth investigating this tool that came up in a thread just a few days ago on this list: http://wikipedia-miner.sourceforge.net/ I think nobody's done enough with this yet to be sure what will work best, I think you're going to have to experime

Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-19 Thread Yitzchak Schaffer
On 2011-05-18 20:30, Eric Hellman wrote: Exactly. I apologize if my comment was perceived as coy, but I've chosen to invest in the possibility that Creative Commons licensing is a viable way forward for libraries, authors, readers, etc. Here's a link to the last of a 5 part series on open-access

Re: [CODE4LIB] wikipedia/author disambiguation

2011-05-19 Thread Karen Coombs
Graham, I'd advocate using WorldCat Identities to get to the appropriate url for dbpedia. Each Identity record has a wikipedia element in it that you could use to link to either Wikipedia or dbpedia. If you want to see an example of this in action you can check out the Author Info demo I did for

Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-19 Thread Bill Dueber
My short answer: It's too damn expensive to check out everything that's available for free to see if it's worth selecting for inclusion, and libraries (at least as I see them) are supposed to be curated, not comprehensive. My long answer: The most obvious issue is that the OPAC is traditionally a

Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-19 Thread Andreas Orphanides
On 5/19/2011 7:36 AM, Mike Taylor wrote: I dunno. How do you assess the whole realm of proprietary stuff? Wouldn't the same approach work for free stuff? -- Mike. A fair question. I think there's maybe at least two parts: marketing and bundling. Marketing is of course not ideal, and likely

Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-19 Thread Mike Taylor
On 19 May 2011 12:31, Andreas Orphanides wrote: > - I think there's a fear of a slippery slope and/or information overload: How > do you assess the whole realm of freely-available stuff? I dunno. How do you assess the whole realm of proprietary stuff? Wouldn't the same approach work for free st

Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-19 Thread Andreas Orphanides
Quoting Karen Coyle 05/19/11 1:32 AM >>> > Eric, > > In what ways do you think that libraries today are not friendly to free stuff? > > kc From my own (rather limited) experience, I think collection developers see free/open source/open access stuff as a bit of a management challenge: - As

[CODE4LIB] Materio and modules

2011-05-19 Thread Tony Mattsson
Hi, After about a year of development, we (a hospital library in Sweden) have published some programs that might be of interest for other libraries. They include: Materio - publication platform which gives a common login system, where one can install modules (programs) which do stuff. Modules

[CODE4LIB] wikipedia/author disambiguation

2011-05-19 Thread graham
I need to be able to take author data from a catalogue record and use it to look up the author on Wikipedia on the fly. So I may have birth date and possibly year of death in addition to (one spelling of) the name, the title of one book the author wrote etc. I know there are various efforts in pro

Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-19 Thread graham
Not replying for Eric but I hope he doesn't mind me butting in too. As a newcomer to (academic) libraries from a software background, some of the things that first struck me were: 1. The amount of money spent on non-free stuff means it has to be emphasized over free stuff in publicity to try to