Re: [CODE4LIB] is this valid marc ?

Jonathan Rochkind Thu, 19 May 2011 10:59:52 -0700

I believe that with the ruby-marc API, when you do record['856'], you just get the first 856 if there is more than one. You have to use another API (I forget which offhand) to get more than one; ['856'] is just a shortcut for when you will only have one or only care about the first one.
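
To make the first-match behaviour concrete, here is a small plain-Ruby model of it. It is only a sketch and does not use ruby-marc: the `Field` and `Record` classes and the `all` method are made-up stand-ins, and the two 856 fields are the ones from James's example.

```ruby
# Stand-in model only; these classes are NOT ruby-marc's.
class Field
  attr_reader :tag

  def initialize(tag, subfields)
    @tag = tag
    @subfields = subfields # e.g. { 'u' => 'http://...' }
  end

  # Value of one subfield, or nil when the field does not carry it.
  def [](code)
    @subfields[code]
  end
end

class Record
  def initialize(fields)
    @fields = fields
  end

  # Shortcut lookup: only the FIRST field with this tag, or nil.
  def [](tag)
    @fields.find { |f| f.tag == tag }
  end

  # Hypothetical helper: every field with this tag, in record order.
  def all(tag)
    @fields.select { |f| f.tag == tag }
  end
end

record = Record.new([
  Field.new('856', { 'q' => 'text/html' }),
  Field.new('856', { 'u' => 'http://www.this-link-will-not-be-retrieved-by-ruby-marc-library' })
])

p record['856']['q']                           # value from the first 856
p record['856']['u']                           # nil: the first 856 has no $u
p record.all('856').map { |f| f['u'] }.compact # the URL, found in the second 856
```

With the real library, the last lookup would go through whatever call returns all matching fields. If I remember right, a ruby-marc record is enumerable over its fields, so something like record.find_all { |f| f.tag == '856' } should work, but check the docs for your version.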


So I don't think there's any bug in ruby-marc.

Your data example is _odd_ though; it's not usual to record 856's like that, and it probably shouldn't be recorded like that. Multiple 856's can exist where there are in fact multiple URLs recorded.


On 5/19/2011 1:16 PM, James Lecard wrote:

I'll dig into this one, thanks for this input Jonathan... I'm not so
familiar with the library yet. I'll do some more debugging, but in fact what
is happening is that I get no value from an access such as
record['856']['u'], while I get one for record['856']['q'].
And the marc you are seeing is copy/pasted from a marc editor GUI; it's not
the actual marc record. I edited it so that its data is not recognisable
(for confidentiality).

James


2011/5/19 Jonathan Rochkind<[email protected]>

Now whether it _means_ what you want it to mean is another question, yeah.
As Andreas said, I don't think that particular example _ought_ to have two
856's.

But it ought to be perfectly parseable marc.

If your 'patch' is to make ruby-marc combine those multiple 856's into one
-- that is not right, two separate 856's are two separate 856's, same as any
other marc field. Applying that patch would mess up ruby-marc, not fix it.

You need to be more specific about what you're doing and what you mean
exactly by 'causing the ruby library to ignore it'.  I wonder if you are
just using a method in ruby-marc which only returns the first field
matching a given tag when there is more than one.




On 5/19/2011 12:51 PM, Andreas Orphanides wrote:

 From the MARC documentation [1]:

"Field 856 is repeated when the location data elements vary (the URL in
subfield $u or subfields $a, $b, $d, when used). It is also repeated when
more than one access method is used, different portions of the item are
available electronically, mirror sites are recorded, different
formats/resolutions with different URLs are indicated, and related items are
recorded."

So it looks like however the URL is handled, a single 856 field should be
used to indicate the location [2]. I am not familiar enough with MARC to say
how it "should" have been done, but it looks like $q and $u would probably
be sufficient (if they're in the same line).

However, since the field is repeatable, the parser shouldn't be choking on
it, unless it's choking on it for a sophisticated reason (e.g., "These
aren't the subfield tags I expect to be seeing"). It also looks like if $u
is provided, the first indicator should give the access method (in this case
"4" for HTTP). Maybe that's what's causing the problem? [3]

Anyway, I think having these two parts of the same URL data on separate
lines is definitely Not Right, but I am not sure if it adds up to invalid
MARC.
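
To see what the two-line form looks like to a program, here is a rough plain-Ruby sketch that splits mnemonic lines like the ones James pasted into tag, indicators and subfields. The `parse_mnemonic_line` helper is made up for illustration (it is not part of ruby-marc). It assumes the `=TAG  II$a...` layout of the pasted data fields, with `\` standing for a blank indicator, and it will break on values that contain `$`. The single-field line is only a guess at what the record might look like, not checked cataloging.

```ruby
# Hypothetical helper, not part of ruby-marc: parse one "=TAG  II$x..." data field line.
def parse_mnemonic_line(line)
  body = line[6..-1].to_s                    # skip "=", the 3-char tag and 2 spaces
  indicators = body[0, 2].to_s.tr('\\', ' ') # "\" stands for a blank indicator
  subfields = body[2..-1].to_s.split('$').reject(&:empty?)
                  .map { |s| [s[0], s[1..-1]] } # [code, value] pairs
  { tag: line[1, 3], ind1: indicators[0], ind2: indicators[1], subfields: subfields }
end

# The two 856 lines as pasted (raw heredoc, so backslashes stay literal).
as_pasted = <<~'MRK'.lines.map(&:chomp)
  =856  \2$qtext/html
  =856  \\$uhttp://www.this-link-will-not-be-retrieved-by-ruby-marc-library
MRK

# A guessed single-field form: $u and $q together, first indicator 4 (HTTP).
guess = <<~'MRK'.chomp
  =856  4\$uhttp://www.this-link-will-not-be-retrieved-by-ruby-marc-library$qtext/html
MRK

as_pasted.each { |l| p parse_mnemonic_line(l) } # two fields: one with only $q, one with only $u
p parse_mnemonic_line(guess)                    # one field carrying both $u and $q
```

In the pasted form the $q and the $u end up in different fields, so any lookup that stops at the first 856 sees the MIME type but never the URL.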

-dre.

[1] http://www.loc.gov/marc/bibliographic/bd856.html
[2] I am not a cataloger. Don't hurt me.
[3] I am not an expert on MARC ingest or on ruby-marc. I could be wrong.

On 5/19/2011 12:37 PM, James Lecard wrote:

I'm using ruby-marc ruby parser (v.0.4.2) to parse some marc files I get
from a partner.

The 856 field is split over 2 lines, causing the ruby library to ignore
it (I've patched it to overcome this issue), but I want to know if this
kind of marc is valid?

=LDR  00638nam  2200181uu 4500
=001  cla-MldNA01
=008  080101s2008\\\\\\\|||||||||||||||||fre||
=040  \\$aMy Provider
=041  0\$afre
=245  10$aThis Subject
=260  \\$aParis$bJ. Doe$c2008
=490  \\$aSome topic
=650  1\$aNarratif, Autre forme
=655  \7$abook$2lcsh
=752  \\$aA Place on earth
=776  \\$dParis: John Doe and Cie, 1973
=856  \2$qtext/html
=856  \\$uhttp://www.this-link-will-not-be-retrieved-by-ruby-marc-library

Thanks,

James L.
