Re: [CODE4LIB] MARC field lengths

2013-10-17 Thread Karen Coyle
Thanks, Bill. What you say about "assumptions" is a good part of what is motivating me to try to instigate a discussion. As you know, both FRBR and RDA were developed by the cataloging community with no input from technologists. There are sweeping statements about FRBR being "more efficient" th

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Bill Dueber
My guess is that traversing the WEM structure for display of a single record (e.g., in a librarian's ILS client or what not) will not be a problem at all, because the volume is so low. In terms of the OPAC interface itself, well, there are lots and lots of ways to denormalize the data (meaning "cop
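One way to picture the denormalization mentioned above is to walk the Work/Expression/Manifestation links once, at index time, and store a flat document per manifestation. The sketch below is only an illustration: the dictionaries, keys, and the `denormalize` function are hypothetical and do not come from any ILS or from the thread.

```python
# Hypothetical normalized WEM store: each level points to its parent by id.
works = {"w1": {"title": "Moby Dick", "creator": "Melville, Herman"}}
expressions = {"e1": {"work": "w1", "language": "eng"}}
manifestations = {"m1": {"expression": "e1", "publisher": "Harper", "year": 1851}}


def denormalize(manifestation_id):
    """Copy Work and Expression data into one flat record for search/display."""
    m = manifestations[manifestation_id]
    e = expressions[m["expression"]]
    w = works[e["work"]]
    return {
        "id": manifestation_id,
        "title": w["title"],        # copied down from the Work
        "creator": w["creator"],    # copied down from the Work
        "language": e["language"],  # copied down from the Expression
        "publisher": m["publisher"],
        "year": m["year"],
    }


flat = denormalize("m1")
```

The trade-off is the usual one: the flat record repeats Work-level data for every manifestation, but a search or display needs only a single lookup.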

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Karen Coyle
On 10/16/13 4:22 PM, Kyle Banerjee wrote: In some ways, FRBR strikes me as the catalogers' answer to the miserable seven layer OSI model which often confuses rather than clarifies -- largely because it doesn't reflect reality very well. Agreed. I am having trouble seeing FRBR as being benefici

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Kyle Banerjee
Depends on how many requests the service has to accommodate. Up to a point, it's no big deal. After a certain point, servicing lots of calls gets expensive and bang for the buck is brought into question. My bigger concern would be getting data encoded/structured consistently. Even though FRBR has

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Karen Coyle
Yes, that's my take as well, but I think it's worth quantifying if possible. There is the usual trade-off between time and space -- and I'd be interested in hearing whether anyone here thinks that there is any concern about traversing the WEM structure for each search and display. Does it matte

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Bill Dueber
If anyone out there is really making a case for FRBR based on whether or not it saves a few characters in a database, well, she should give up the library business and go make money off her time machine. Maybe -- *maybe* -- 15 years ago. But I have to say, I'm sitting on 10m records right now, an

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Bill Dueber
For the HathiTrust catalog's 6,046,746 bibs and looking at only the lengths of the subfields $a and $b in 245s, I get an average length of 62.0 On Wed, Oct 16, 2013 at 3:24 PM, Kyle Banerjee wrote: > 245 not including $c, indicators, or delimiters, |h (which occurs before > |b), |n, |p, with tr
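A measurement like the one above (average combined length of 245 $a and $b) can be sketched roughly as follows. This is not the script used against the HathiTrust catalog. The in-memory record layout, the sample titles, and the function names are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical layout: a record is a list of fields, and a field is
# (tag, indicators, [(subfield_code, value), ...]).
records = [
    [("245", "10", [("a", "Moby Dick /"), ("c", "Herman Melville.")])],
    [("245", "14", [("a", "The whale :"), ("b", "a novel /"), ("c", "H. Melville.")])],
]


def title_length(field, codes=("a", "b")):
    """Total characters in the selected subfields of one field."""
    return sum(len(value) for code, value in field[2] if code in codes)


def average_title_length(records, tag="245"):
    """Mean of title_length over every field with the given tag."""
    lengths = [title_length(f) for rec in records for f in rec if f[0] == tag]
    return mean(lengths) if lengths else 0.0


average = average_title_length(records)
```

Note that this counts ISBD punctuation stored inside $a and $b (the trailing " /" or " :"). Whether the 62.0 figure does the same is not stated in the thread.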

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Karen Coyle
On 10/16/13 12:33 PM, Kyle Banerjee wrote: BTW, I don't think 240 is a good substitute as the content is very different than in the regular title. That's where you'll find music, laws, selections, translations and it's totally littered with subfields. The 70.1 figure from the stripped 245 is prob

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Kyle Banerjee
BTW, I don't think 240 is a good substitute as the content is very different than in the regular title. That's where you'll find music, laws, selections, translations and it's totally littered with subfields. The 70.1 figure from the stripped 245 is probably closer to the mark IMO, what you stand

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Nicolas Franck
Karen Coyle [li...@kcoyle.net] Sent: Wednesday, October 16, 2013 7:06 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] MARC field lengths Anybody have data for the average length of specific MARC fields in some reasonably representative database? I mainly need 100, 245, 6xx. Thanks, kc -- Karen C

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Kyle Banerjee
245 not including $c, indicators, or delimiters, |h (which occurs before |b), |n, |p, with trailing slash preceding |c stripped for about 9 million records for Orbis Cascade collections is 70.1 kyle On Wed, Oct 16, 2013 at 12:00 PM, Karen Coyle wrote: > Thanks, Roy (and others!) > > It looks l
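The stripping rules described above could look something like the sketch below: drop $c, $h, $n and $p, ignore indicators and delimiters, and remove the trailing slash that precedes $c. It is an approximation, not the query run on the Orbis Cascade data. Joining the kept subfields with a single space is an assumption, as are the names `EXCLUDED` and `stripped_245_length`.

```python
# Subfield codes left out of the count, per the description above.
EXCLUDED = {"c", "h", "n", "p"}


def stripped_245_length(subfields):
    """Length of a 245 after dropping excluded subfields and the trailing slash.

    `subfields` is a list of (code, value) pairs. Indicators and delimiters
    are never part of the input, so they are not counted.
    """
    kept = [value for code, value in subfields if code not in EXCLUDED]
    text = " ".join(kept).rstrip()  # assumption: one space between subfields
    if text.endswith("/"):
        text = text[:-1].rstrip()   # drop the slash that introduces $c
    return len(text)


example = stripped_245_length(
    [("a", "The whale :"), ("b", "a novel /"), ("c", "H. Melville.")]
)
```

Averaging this value over all 245 fields would give a figure comparable in spirit to the one quoted, though the exact treatment of spacing may differ.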

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Karen Coyle
Thanks, Roy (and others!) It looks like the 245 is including the $c - dang! I should have been more specific. I'm mainly interested in the title, which is $a $b --  I'm looking at the gains and losses of bytes should one implement FRBR. As a hedge, could I ask what you've got for the 240? that

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Kyle Banerjee
Argh. Must learn to write at third grade level. I wanted to say I like breaking up 6XX as Roy has done because 6XX fields vary in purpose and tag frequency varies considerably. On Wed, Oct 16, 2013 at 11:08 AM, Kyle Banerjee wrote: > This squares with what I'm seeing. Data for all holdings o

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Kyle Banerjee
This squares with what I'm seeing. Data for all holdings of the Orbis Cascade Alliance is: 100: 30.1 245: 114.1 6XX: 36.1 My values include indicators (2 characters) as well as any delimiters but not the tag number itself. I breaking up 6XX up as Roy has as 6XX's are far from created equal and fr
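For comparison, a count that includes the two indicator characters and the subfield delimiters but not the tag, averaged per tag, might be sketched like this. It assumes each subfield contributes two extra characters (the delimiter plus the one-letter code). The thread does not say whether the code letter was counted, and the field layout and function names here are hypothetical.

```python
from collections import defaultdict


def field_length(indicators, subfields):
    """Indicators + (delimiter + code + value) per subfield; tag not counted."""
    return len(indicators) + sum(2 + len(value) for _code, value in subfields)


def average_by_tag(fields):
    """Map each tag to the mean field_length of its occurrences."""
    totals = defaultdict(lambda: [0, 0])  # tag -> [total length, field count]
    for tag, indicators, subfields in fields:
        totals[tag][0] += field_length(indicators, subfields)
        totals[tag][1] += 1
    return {tag: total / count for tag, (total, count) in totals.items()}


# Hypothetical sample fields: (tag, indicators, [(code, value), ...]).
fields = [
    ("100", "1 ", [("a", "Melville, Herman,"), ("d", "1819-1891.")]),
    ("650", " 0", [("a", "Whaling"), ("v", "Fiction.")]),
    ("650", " 0", [("a", "Whales")]),
]
averages = average_by_tag(fields)
```

Keying on the full tag rather than a "6XX" bucket keeps 600, 650, 655 and so on separate, which is the breakdown the thread favours.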

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Roy Tennant
I don't even have to fire it up. That's a statistic that we generate quarterly (albeit via Hadoop). Here you go:

100 - 30.3
245 - 103.1
600 - 41
610 - 48.8
611 - 61.4
630 - 40.8
648 - 23.8
650 - 35.1
651 - 39.6
653 - 33.3
654 - 38.1
655 - 22.5
656 - 30.6
657 - 27.4
658 - 30.7
662 - 41.7

Roy On

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Bill Dueber
I'm running it against the HathiTrust catalog right now. It'll just take a while, given that I don't have access to Roy's Hadoop cluster :-) On Wed, Oct 16, 2013 at 1:38 PM, Sean Hannan wrote: > That sounds like a request for Roy to fire up the ole OCLC Hadoop. > > -Sean > > > > On 10/16/13 1:0

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Sean Hannan
That sounds like a request for Roy to fire up the ole OCLC Hadoop. -Sean On 10/16/13 1:06 PM, "Karen Coyle" wrote: >Anybody have data for the average length of specific MARC fields in some >reasonably representative database? I mainly need 100, 245, 6xx. > >Thanks, >kc > >-- >Karen Coyle >kc

[CODE4LIB] MARC field lengths

2013-10-16 Thread Karen Coyle
Anybody have data for the average length of specific MARC fields in some reasonably representative database? I mainly need 100, 245, 6xx. Thanks, kc -- Karen Coyle kco...@kcoyle.net http://kcoyle.net m: 1-510-435-8234 skype: kcoylenet