Re: [RDA-L] Completeness of records

2011-08-11 Thread Moore, Richard
Hal 

The initial work of correlating the data from the LC/NAF and the German
authority files and the associated bibliographic records was so effective
that it revealed thousands of errors in the LC/NAF -- duplicates, false
attributions, errors with undifferentiated name records.

I didn't know that. What was done about the errors?

Regards
Richard

_
Richard Moore 
Authority Control Team Manager 
The British Library

Tel.: +44 (0)1937 546806
E-mail: richard.mo...@bl.uk
 


Re: [RDA-L] Completeness of records

2011-08-11 Thread Danskin, Alan
Neither did I - although any large database (as we know) is likely to
contain errors.  You may like to draw Anthony's attention to Hal's
message, in case LC wish to rebut, or explain further.

Alan 



[RDA-L] FW: [RDA-L] Completeness of records

2011-08-11 Thread Danskin, Alan
Apologies for sharing this with the list.  It was intended to be a
reply to Richard Moore only.

Alan



Re: [RDA-L] Completeness of records

2011-08-11 Thread hecain

Quoting Moore, Richard richard.mo...@bl.uk:


Hal


The initial work of correlating the data from the LC/NAF and the German
authority files and the associated bibliographic records was so effective
that it revealed thousands of errors in the LC/NAF -- duplicates, false
attributions, errors with undifferentiated name records.


I didn't know that. What was done about the errors?


My information is from a presentation by OCLC's Ed O'Neill, at the  
ACOC (Australian Committee on Cataloguing) seminar "What's in a Name?"  
held in Sydney (N.S.W.) in January 2005.


The formal presentation is available (Powerpoint) on the ACOC website  
www.nla.gov.au/lis/stndrds/grps/acoc/viaf2005.ppt and of course  
relates to the early stages of the project.  I've just reviewed that,  
but the observations I referred to are not part of it, so they must  
have been delivered off the cuff; since my notes seem not to be  
findable, I have only recollection to guide me, and cannot be more  
precise. I was struck by the figures Ed presented, as they confirmed  
impressions I had formed over the previous several years about lurking  
errors in the LC/NAF and the LC catalog, and the OCLC database.


Anyway, my recollection is that Ed told us that these apparent errors  
had been reported to (then) CPSO at LC and were to be reviewed and,  
where found justified, corrected.  IIRC at this time LC had still not  
completely refined the tools they use today for bulk changes of  
headings in their bib records to match authority changes (including  
reported BFM changes), so the task could have proved very laborious  
and may never have been carried through.  I guess one might inquire of  
the Policy and Standards Division at LC, the chief of which is Dr.  
Barbara Tillett, herself a member of the VIAF project team and heavily  
involved, of course, in RDA.


VIAF relies for identifying matches between separate authority files  
not only on the information in the authority records but (at least in  
the initial work, matching DB and LC/NAF names) also on the  
bibliographic (resource) records in the DB and LC catalogues  
respectively -- Ed O'Neill's presentation gives a fascinating account  
of this.  I haven't paid enough attention recently to understand how  
far this technique has been continued in the expanded VIAF.


At the time I attended Ed O'Neill's presentation, I was more concerned  
with ideas of applying similar techniques (I suppose I might call them  
data mining?) to help identify and consolidate duplicate bibliographic  
records in the ANBD (Australian National Bibliographic Database) which  
supports the Libraries Australia service. Therefore perhaps I didn't  
pay as much attention as I might have to the authority-resolving  
details.  But it seems clear to me from what we were given that, by  
taking broad categories of data (names in headings but also in text  
fields: 245 $c, 505, 508; publisher names in 260 $b and  
corporates/conferences in 11X/71X; titles in 245 $a, 505, 440/490,  
7XX/8XX $t, 830), machine grouping can go a long way towards  
record matching, and do a lot to identify bad matches or distinguish  
falsely-matched entities, even when working across different data  
formats (DB data was not in MARC 21, and BNF data isn't MARC 21).  And  
therefore I'm left with doubts about whether very fine granularity in  
our data, as codified in RDA, is really worth the trouble it seems to  
be causing.  Fuzzy logic may even do the job better than too-scarce  
skilled humans.
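
As a rough illustration of the kind of machine grouping described above, here is a minimal Python sketch. It is not the VIAF algorithm, whose details are not given in this message. It assumes records are flat dictionaries keyed by hypothetical field labels such as "100a" or "245c", pools the name-like and title-like fields, and scores two records by word overlap.

```python
import re
import unicodedata

# Hypothetical field groupings, loosely following the categories in the message.
NAME_FIELDS = ("100a", "245c", "260b")
TITLE_FIELDS = ("245a", "490a")

def tokens(text):
    """Lowercase, strip diacritics and punctuation, and return a set of words."""
    decomposed = unicodedata.normalize("NFKD", text)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return set(re.findall(r"[a-z0-9]+", plain.lower()))

def pooled(record, fields):
    """Union of the tokens from every listed field that the record contains."""
    pool = set()
    for field in fields:
        pool |= tokens(record.get(field, ""))
    return pool

def jaccard(a, b):
    """Share of words common to both sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity(rec1, rec2):
    """Average of name overlap and title overlap, between 0.0 and 1.0."""
    names = jaccard(pooled(rec1, NAME_FIELDS), pooled(rec2, NAME_FIELDS))
    titles = jaccard(pooled(rec1, TITLE_FIELDS), pooled(rec2, TITLE_FIELDS))
    return (names + titles) / 2
```

Because the comparison works on pooled words rather than on exact field-by-field equality, two records can score highly even when one spells a name with diacritics and the other does not, or when the name appears in different fields. A real matcher would need many more fields and weights.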


Hal Cain, whose involvement is now minimal
Melbourne, Australia
hec...@dml.vic.edu.au


This message was sent using IMP, the Internet Messaging Program.


Re: [RDA-L] Completeness of records

2011-08-11 Thread Moore, Richard
Hal

Fuzzy logic may even do the job better than too-scarce skilled humans.

It can also throw up false equivalences of its own, and create compound
problems when datasets are matched against each other. You do have to
set the barrier for matching very high.
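
The point about the matching barrier can be sketched in a few lines of Python. This is only an illustration using the standard library's difflib, not the method used by VIAF or any library system. The two headings are invented and stand for two different people whose names differ by a single letter.

```python
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Character-level similarity of two headings, between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def is_match(a, b, threshold=0.95):
    """Treat two headings as the same entity only above the threshold."""
    return name_similarity(a, b) >= threshold

# Invented headings for two distinct persons.
person_1 = "Smith, John, 1950-"
person_2 = "Smith, Joan, 1950-"
```

With a threshold of 0.9 these two headings would be merged as one person, a false equivalence; with the stricter default of 0.95 they stay apart. The cost of the stricter setting is that genuine variants of one name may also fail to match, which is where human review comes in.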

_
Richard Moore 
Authority Control Team Manager 
The British Library

Tel.: +44 (0)1937 546806
E-mail: richard.mo...@bl.uk
 


Re: [RDA-L] Completeness of records

2011-08-11 Thread Reser, Dave
Dear Hal and others:

It is true that every time you see your data in a new environment, the 
anomalies jump out-- the addition of LC/NAF records to VIAF is no exception.  
With regard to VIAF, it is true that OCLC Research has sent numerous lists of 
errors, possible errors, or even just things that might benefit from a second 
look.  Ana Cristan, in the Policy and Standards Division here at LC, is kept 
mighty busy analyzing these reports, fixing errors, explaining practices, 
deleting duplicates, etc.  Fuzzy matches are one type of report that comes 
from OCLC Research.  Ana is currently working through 1,736 such fuzzy matches 
(out of 3 million personal names from LC/NAF in VIAF, if that helps to put it 
in perspective).  These reports sometimes result in changes to authority 
records and/or bibliographic records, thus improving the matching and 
clustering in VIAF.  The users of VIAF, and those who make daily use of the 
improved LC/NAF, benefit from the cooperation of the data magicians at OCLC and 
the human review/correction at LC and elsewhere.  In a typical month PSD 
revises in excess of 3,000 name authority records, revises over 25,000 related 
bibliographic records, and deletes about 500 duplicate name authority 
records. 

In addition to the reports related to VIAF, LC processes regular error reports 
on the daily distribution file that goes to the other NACO nodes, reports of 
duplicates from a variety of sources, as well as regular reports from our 
friends and colleagues around the world (special kudos to Tom Gilbert (Library 
Technologies, Inc)  and Gary Strawn (Northwestern University) as the most 
frequent correspondents).  

Since 2005 LC's programmatic ability to check for duplicates has greatly 
improved and we have had several projects to clean up older authority records. 
OCLC also undertakes many batch changes, under the watchful eye of Robert 
Bremer. PSD also collaborates with OCLC to eliminate exact NAR duplicates every 
month, and OCLC sends LC a list of changed 1XXs for bibliographic file 
maintenance to keep bibliographic headings in sync with the authority file.

It goes without saying, as long as humans are involved in the cataloging 
process, errors will sneak in, particularly to a file as large as the LC/NAF 
with contributors around the world.  Given multiple contributing nodes, 
duplicates are also an inevitable cost of doing business.  LC, like the other 
LC/NAF partners big and small, believes that maintenance of the file is a 
critical activity, and appreciates all of the contributions to that end.

Thanks,
Dave Reser
LC Policy and Standards Division





Re: [RDA-L] Completeness of records

2011-08-10 Thread James Weinheimer

On 09/08/2011 15:23, Brenndorfer, Thomas wrote:
snip
Perhaps the problem stems from the words you use, such as "allow 
WEMI". FRBR is based on an entity-relationship analysis tool that 
first appeared in about the 1970s, not the 19th century. Catalogs 
don't "allow" entities -- the FRBR modelling exercise shows what 
entities have been the basis behind the conventions and mechanisms in 
traditional catalogs. Part II of AACR2 is very heavy on the concept of 
"work", but related data about the work is scattered all over the 
place in Part I of AACR2. FRBR says "here's what we have always 
intended -- let's present it in a way that can be sufficiently 
abstracted so we can do it differently, do it better, do it in a 
machine-friendly way, do it in a way that is consistent with other, 
more modern technical standards, do it in a way that can be extended 
and modified, and, to boot, do it in a way that is compatible with the 
existing record structure." 

/snip

Well, I have studied databases as well and created a number of them. I 
personally don't care one whit whether we do something that is friendly 
to machines. They can scream for all I care, so long as the job is done. 
I would much rather do something in a librarian- or cataloger-friendly 
way and let the machines do more work. This includes being able to 
achieve some notable successes now, not putting our faith in vague 
promises of the future, and using the machines to their fullest 
potential, whether it happens to be friendly or not.


snip
So, no, it is not moot. I ran into the FRBR issue in the late 1990s 
when customizing my first web-based version of the catalog. I couldn't 
do things, not because of the limits of the technology (of which there 
were and still are many), but because of the limitations in the 
underlying data structure in traditional AACR/MARC records. It was one 
of those "the emperor has no clothes" moments. FRBR made more sense 
than the traditional catalog, because it was written in the modern 
language of databases. And in studying database design in courses, it 
was quite embarrassing to compare the comfort level students had with 
concepts such as primary keys and relationships between tables. In 
describing the traditional catalog-- well, we have relationship 
designators, but they mess up our displays, so we have traditionally 
made decisions since the 19th century, not based upon efficient 
database design, but on the vagaries of a medley of display 
conventions and encoding conventions that are overly contingent and 
conditional on extraneous factors.

/snip

Relational databases are not the only choice today, and we must keep our 
options open and limit our concerns of "this is not the way it is 
supposed to be done". Designing an RDBMS has a certain sense of what I 
call "computer aesthetics", which I believe should be irrelevant. There 
is now the very efficient and powerful option of the Lucene search 
engine indexing (with its variants), which forgoes a relational database 
altogether and indexes flat files. I have read quite a bit on it, 
although much of it is highly technical and beyond my capabilities, but 
"the proof is in the pudding". I have already mentioned that this must 
be how Worldcat is indexed. But here is an even better implementation (I 
believe) by our Australian colleagues: http://ll01.nla.gov.au/index.jsp. 
(They have some links to papers and one is broken although I found it 
here 
http://www.nla.gov.au/openpublish/index.php/nlasp/article/viewArticle/1047)


This works on a database of 16 million records. It is very fast and 
provides the extracted headings for further refinements, a major step 
forward in catalog technology. Even doing a ridiculous search for "of 
the a" and limiting it to online resources took less than 20 seconds! 
(It's so fast, you don't need a stopword list.) It's important to note 
that Lucene indexing does not use a database at all! This 
Australian project says that it also uses Lucene to store the data, and 
from what I have read, a melding of the two is best: an RDBMS for 
storage and maintenance, and the full-text search engine for the public, 
just as Koha is designed. As an added bonus, Lucene-type technologies 
are open source!
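
For readers unfamiliar with how a search engine can answer queries without a relational database, the core idea is an inverted index: a map from each word to the records that contain it. The Python sketch below is a toy version of that idea only. It is not Lucene's implementation or API, and the sample records are invented.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercased word to the set of record ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return ids of records containing every word of the query (AND search)."""
    postings = [index.get(word, set()) for word in query.lower().split()]
    return set.intersection(*postings) if postings else set()

# Invented flat records, keyed by record id.
docs = {
    1: "The history of the book",
    2: "A history of printing",
    3: "Gardening",
}
index = build_index(docs)
```

A query such as "of the" is answered by intersecting the two word lists, so even very common words cost only a set operation. The records themselves can live anywhere, including in an RDBMS kept for storage and maintenance.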


Relevance ranking is a part of these technologies, and this is why (I 
think!) Eric Hellman in his talk "Library Data, why bother?" 
http://www.facebook.com/l.php?u=http%3A%2F%2Fbit.ly%2FipVVoHh=4AQDVSolC 
says that tweaking relevance ranking (i.e. search engine optimization) 
and adding microdata should matter more to libraries than anything 
else. This was discussed in Autocat (I disagreed in part), but 
Eric Hellman got involved too. It was a very enlightening exchange of ideas.


A lot of this reminds me of my researches into the library catalog of 
the future built under Ernest Richardson when he was at Princeton 
University back in the 1920s. It was built using the latest technology 
at the time (linotype slugs). I finally 

Re: [RDA-L] Completeness of records

2011-08-10 Thread Kevin M Randall
James Weinheimer wrote:

 So we shouldn't make our data *fit*
 into the tool and then go on to explain why this is the way it must be
 done (as Richardson and his researchers did with the 180 characters),
 but instead, fashion tools to fit your data. Lucene allows this.

But a tool can only use what is there for it to use.  Data that aren't 
sufficiently atomized are not going to be able to work as well as data that 
*are* sufficiently atomized.  If the data structure does not allow for 
determining unambiguous relationships between the pieces of data, that places 
limits on *any* kind of search engine.  As wonderful as Lucene may be, it 
cannot possibly determine the relationships between pieces of data in a 
document if that document's structure does not label those relationships.  A 
computer cannot work with something that simply isn't there.

Kevin M. Randall
Principal Serials Cataloger
Bibliographic Services Dept.
Northwestern University Library
1970 Campus Drive
Evanston, IL  60208-2300
email: k...@northwestern.edu
phone: (847) 491-2939
fax:   (847) 491-4345


Re: [RDA-L] Completeness of records

2011-08-10 Thread J. McRee Elrod
Kevin said:

But a tool can only use what is there for it to use.  Data that
aren't sufficiently atomized are not going to be able to work as well
...

Our present tools don't *begin* to use the atomization our present
data contains.  Perhaps it is the tools which need our attention?


   __   __   J. McRee (Mac) Elrod (m...@slc.bc.ca)
  {__  |   / Special Libraries Cataloguing   HTTP://www.slc.bc.ca/
  ___} |__ \__


Re: [RDA-L] Completeness of records

2011-08-10 Thread Kevin M Randall
Mac Elrod wrote:

 Our present tools don't *begin* to use the atomization our present
 data contains.  Perhaps it is the tools which need our attention?

Attention is needed on *both* fronts.  We need to be able to work with the 
detail already in our current records, but we also need more detail than our 
current records provide us.

Kevin M. Randall
Principal Serials Cataloger
Bibliographic Services Dept.
Northwestern University Library
1970 Campus Drive
Evanston, IL  60208-2300
email: k...@northwestern.edu
phone: (847) 491-2939
fax:   (847) 491-4345 


Re: [RDA-L] Completeness of records

2011-08-10 Thread Brenndorfer, Thomas

And the pile of new MARC proposals http://www.loc.gov/marc/marbi/list-p.html 
moves more of the data into atomic and granular placeholders.

I particularly like the fix to the not-so-atomic "(large print)" placement from 
http://www.loc.gov/marc/marbi/2011/2011-08.html :

300 $a ix, 253 p. (large print)


becomes in RDA


Extent: ix, 253 pages
Font size: large print


and would be coded in MARC as


300 $a ix, 253 pages
340 $n large print

What's needed are more connections to normalized forms of the data, such as 
008/23=d for Large print.
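
The conversion above can be sketched in Python. This is only an illustration of the idea, not a tool from the MARC proposal. The function name and the dictionary keys ("300$a", "340$n", "008/23") are labels chosen here for readability, and only the "(large print)" qualifier and the abbreviation "p." are handled.

```python
import re

def split_large_print(extent_300a):
    """Split a legacy 300 $a string into separate extent and font-size values."""
    fields = {}
    qualifier = re.search(r"\s*\(large print\)\s*$", extent_300a, flags=re.IGNORECASE)
    if qualifier:
        # Move the qualifier out of the extent and into its own elements.
        extent_300a = extent_300a[:qualifier.start()]
        fields["340$n"] = "large print"
        fields["008/23"] = "d"  # normalized code for large print
    # RDA spells out the abbreviation "p." as "pages".
    fields["300$a"] = re.sub(r"\bp\.(?=\s|$)", "pages", extent_300a).strip()
    return fields
```

Given "ix, 253 p. (large print)", the function yields the extent "ix, 253 pages", the font size "large print", and the coded value "d", so the same fact is available both as display text and as a normalized code.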


The RDA element set at http://metadataregistry.org/ also moves us into a more 
granular and atomic future, with these defined elements:

Extent of text:
http://rdvocab.info/Elements/extentOfText
(subproperty of Extent: http://rdvocab.info/Elements/extent )

and

Font size:
http://rdvocab.info/Elements/fontSize


with vocabulary also registered, such as for Font size the value Large print 
registered as:
http://rdvocab.info/termList/fontSize/1002


Linking registered vocabulary with parallel vocabulary (such as in other 
languages) into registered elements, which in turn can be linked hierarchically 
with other elements, which collectively can be linked to entities and 
registered access points looks like a good example of appropriate 
machine-actionable atomic data.
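
A small Python sketch of what that linking might look like in practice follows. The element and vocabulary URIs are the registered ones quoted above. The manifestation identifier, the triple list, and the helper function are assumptions made purely for illustration.

```python
EXTENT = "http://rdvocab.info/Elements/extent"
EXTENT_OF_TEXT = "http://rdvocab.info/Elements/extentOfText"
FONT_SIZE = "http://rdvocab.info/Elements/fontSize"
LARGE_PRINT = "http://rdvocab.info/termList/fontSize/1002"

# Hypothetical identifier for the manifestation being described.
BOOK = "http://example.org/manifestation/1"

# Element hierarchy: "Extent of text" is a subproperty of "Extent".
SUBPROPERTY_OF = {EXTENT_OF_TEXT: EXTENT}

# Statements as (subject, element, value) triples.
TRIPLES = [
    (BOOK, EXTENT_OF_TEXT, "ix, 253 pages"),
    (BOOK, FONT_SIZE, LARGE_PRINT),
]

def values(subject, element):
    """Values recorded under an element or under any of its subproperties."""
    wanted = {element}
    wanted |= {child for child, parent in SUBPROPERTY_OF.items() if parent == element}
    return [obj for subj, prop, obj in TRIPLES if subj == subject and prop in wanted]
```

Asking for the general element "Extent" also returns what was recorded under the narrower "Extent of text", and the font size is a registered term rather than a text string, so labels in other languages can be attached to the same identifier.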


Thomas Brenndorfer
Guelph Public Library


Re: [RDA-L] Completeness of records

2011-08-10 Thread hecain

Quoting Kevin M Randall k...@northwestern.edu (in part):

If the data structure does not allow for determining unambiguous  
relationships between the pieces of data, that places limits on  
*any* kind of search engine.  As wonderful as Lucene may be, it  
cannot possibly determine the relationships between pieces of data  
in a document if that document's structure does not label those  
relationships.  A computer cannot work with something that simply  
isn't there.


It is however possible for data encoded in somewhat different systems  
to be cross-correlated.  For an example, see VIAF  http://viaf.org/  
and try the name of a favorite author.  The initial work of  
correlating the data from the LC/NAF and the German authority files  
and the associated bibliographic records was so effective that it  
revealed thousands of errors in the LC/NAF -- duplicates, false  
attributions, errors with undifferentiated name records.  There are  
limits, of course.


It's not always necessary to bring existing data exactly into line.   
For the future, of course, a standard format consistently applied is  
clearly the way to go; and reprocessing existing data to achieve a  
closer match to the new standard may be worthwhile -- but at whose cost?


Hal Cain
Melbourne, Australia
hec...@dml.vic.edu.au


This message was sent using IMP, the Internet Messaging Program.


Re: [RDA-L] Completeness of records

2011-08-09 Thread James Weinheimer

On 08/08/2011 23:42, Kevin M Randall wrote:
snip

James Weinheimer wrote:

On 08/08/2011 19:00, Kevin M Randall wrote:

I was really hoping for something that could become part of the
conversation *here*.  I'm sure there are others who would appreciate it too.
/snip

That means redoing an awful lot which I really don't feel like doing or
have time for.

I specifically stated that just one example would do.  If you don't feel like 
contributing something to explain an argument that you have been making for a 
long time, what choice have I but to question your commitment to serious 
involvement in the conversation?

/snip

That is really unfair. I have spent many hours discussing my opinions of 
FRBR, on this list and several others, plus doing a number of podcasts, 
each of which takes some time. Therefore, to conclude that I am not 
serious, just because you don't want to look at those things and I am 
supposed to redo them, is unfair and I must protest. So far as I know, I 
was the first one to attack the FRBR sacred cows and for some time, I 
was alone. Many out there don't agree with me and that is fine. We can 
agree to disagree.


One thing I want to point out (again!) is that I am *absolutely not* 
claiming that no one, ever, wants works, expressions, manifestations, 
and items because they do. I have said this over and over and over 
again, so many times I am thinking about making a macro for it. You 
mentioned that you have wanted WEMI, and I have said I have wanted it 
too. So what? We are both library-types. Knowing how the public searches 
is what is of the highest importance. In any case, I consider that the 
argument is moot since our catalogs allow WEMI *right now* and they have 
for almost two centuries (if not much longer). The problem is that the 
structures for this type of access worked much better in a printed 
environment where people were forced to browse pre-arranged individual 
records (of varying types). In several ways this structure simply fell 
apart with the transfer to computers because of keyword access, the 
weird alphabetization of the computer and the problem of adding 
cross-references to the headings in a keyword environment. The catalog 
became even less comprehensible to the average person. Add the fact that 
people now search library catalogs like they do Google (very 
understandable) and they necessarily get inferior results. It is no 
wonder that things have broken down.


Finally, these problems with the online catalogs have begun to be 
recognized and they must be corrected. So how do we go about it? Do we 
recreate the original ideas from the 1840s as FRBR envisions? While that 
would satisfy my historical sensibility, does it make sense to create 
something like that for our users? It does only *if* you claim that our 
users want the FRBR user tasks. If you claim otherwise, it makes no 
sense. Creating a tool *for the public* is of primary importance to the 
future of the catalog, and I believe, to the future of the library 
itself. Therefore, the vital question of whether people really want the 
FRBR user tasks so badly should be researched and answered very 
seriously, and the claim that they do should *not* be taken as a sacred 
commandment handed down from our forefathers that cannot be questioned. 
The future of the profession is at stake.


Additionally, it has been demonstrated that the FRBR user tasks could be 
made operable in today's environment *right now* by systems people who 
can create the correct queries and views, and that this is *not* a matter of 
reworking our rules and formats. I personally think it would be a highly 
positive achievement to claim victory and then move on.


I will skip to a major point in your message:
snip
That's an awfully self-centered way of looking at the bibliographic 
universe. So the only information that should be in there is the 
information you want to have in this one particular instance? How in the 
world is a cataloger right now in Library X going to know what exactly 
it is (and nothing more, apparently) that you want to know five years 
from now when you do a search? *ALL* metadata has meaning depending on 
the context. The FRBR report acknowledges this, and that's the whole 
point of the tables in chapters 6 and 7 of FRBR. The elements are 
analyzed in terms of their general value (high, moderate, low) in 
meeting the user tasks. If you're interested in research on FRBR, an 
excellent first step is the FRBR Bibliography at 
http://www.ifla.org/en/node/881

/snip

That is *precisely the point* of the new information environment: it is 
a personal one. This must be understood and accepted, whether we like it 
or not, and it is an environment where the library has sharply 
decreasing control over anything at all. I personally do not care for 
this environment and explain why in my podcast on search, but I 
realize--and say as much--that my feelings are 100% irrelevant. This is 
where many say that the one size fits all 

Re: [RDA-L] Completeness of records

2011-08-09 Thread Bernhard Eversberg

08.08.2011 23:42,  Kevin M Randall:


 I know the validity of the FRBR user tasks from my own personal
 experience over a lifetime, plus interactions with other people who
 have apparently had the same kinds of experiences over their
 respective lifetimes.

 The FRBR user tasks are:

 FIND - ...
 IDENTIFY - ...
 SELECT - ...
 OBTAIN - ...

 In all of my life, through primary and secondary school, college, and
 graduate school, and in my general day-to-day life, whenever I
 approach a library catalog (or any catalog or web site for that
 matter), I have been doing these things.


Yes, whenever seeking out books or, more generally, recorded knowledge,
these steps - as by our experience - are what it takes to get there:
let's call it the FISO technique.

Now, for information seeking on the web, those 4 steps many times
happen all in one fell swoop, or so is the experience, and
certainly the expectation, of search engine users. That means
those steps are not, as such, perceived as separate stages of
a search activity.

Ask anyone entering a library what tasks they are hoping to get
done there. Will FISO be their reply? Most often, I'm fairly
sure, they have questions and need answers. How and from whence
these come is secondary. Only after it turns out the answer will be
somewhat complex and maybe only in this or that book, or any book,
do they go about aforementioned tasks in one way or other, stepwise,
as guided by a clever system or intermediary, even though their
expectation, based on experience, makes them believe it ought to
be a lot easier, quicker, more direct.

How large is the subset of questions that should end up in a
catalog search as the best or only way of searching?
Esp. if the FRBR entities of class 1 and 2 are all we are dealing
with in a catalog - and it is all RDA is up to right now - this
fraction of questions is presumably not very big. And fewer still
are those that could use a WEMI model. There was a time when it
was necessary for most any question to first ponder in what book
or category of literature the answer might be hidden.
And then of course, i.e. almost always, the procedure was FISO.
Today, it is what one has to follow less and less frequently.
The practical relevance of catalogs, and their rules, with regard
to the body of questions people are out to solve is going
down ever further, I'm afraid. The decline may possibly be
protracted but not reversed, if we enrich catalogs and endow
them with new functions and features, most of which do not
figure in the current RDA or FRBR.

B.Eversberg

As an aside: To insist on FISO and FRBR reminds me of a scene in
Goethe's Faust I, where Mephisto tells the student about what
to expect from Collegium logicum:

...
Then many a day they'll teach you how
The mind's spontaneous acts, till now
As eating and as drinking free,
Require a process;---one! two! three!
In truth the subtle web of thought
Is like the weaver's fabric wrought:
...

http://www.gutenberg.org/cache/epub/3023/pg3023.html










Re: [RDA-L] Completeness of records

2011-08-08 Thread James Weinheimer

On 08/08/2011 01:49, Brenndorfer, Thomas wrote:
snip
There's a difference when data is controlled by identifiers or control 
numbers vs text strings. I've gone through several libraries and library 
systems, and currently I am able to do a lot of authority updating and 
maintenance based upon control numbers that I couldn't do before with 
earlier, less capable systems. However, once I move closer to cleaning 
up the bibliographic records I have to switch to more manual 
operations, manual checking, crude global update methods and deduping 
algorithms, etc. (such as all that annoying checking of changed 
headings in name-title forms, and with added subject subdivisions). 
It's like the last mile in broadband connectivity. Fast fibre optic 
everywhere except when one gets closer to home where antiquated 
technology slows things down. It would be wonderful if everything 
worked perfectly *right now* but it emphatically does not work as 
simply as you suggest. It's only when data is modelled out thoroughly 
and correctly that we can start talking about new functionality. 

/snip

Once again, I point out that the primary objective, going beyond 
textual strings or identifiers, is that first, the information is 
*entered*, and second, *entered consistently*. When the information 
actually exists and can be reliably found, then it is possible to do all 
kinds of things with it, including converting to identifiers or whatever 
else you want, if it is desired. If it is either not entered, or entered 
in unpredictable ways, while you can still work with it, it becomes far 
more difficult and the results will be far less satisfactory. But 
ultimately, it doesn't matter if this consistency consists of a number 
or text because it makes absolutely no difference to the computer.


snip
It would be wonderful if the functionality could be extended more 
deeply, showing the user for example, related works that are actually 
available in the library based upon the relationship clustering 
inherent in FRBR.

/snip

Would it be wonderful? I believe very little will change in library 
cataloging until the metadata creators divorce themselves from this 
official, traditional dogma that what our users want is the FRBR user 
tasks. The information companies building the new tools do not talk 
about those tasks and are not weighed down by such preconceived ideas. 
Therefore, they are free to discover what their users really want, 
build new tools that approximate what they have discovered, do more 
research on what users like and dislike about the new tools, discover 
new needs, and so continue the process, concentrating on providing 
those things.


snip
Good data input up front saves everyone time down the road. Some 
library users don't really care about the format details for what 
they're after. Other library users are very particular, and can be 
quite canny in figuring things out, and be quite vocal about system 
functionality. And other library users are quite pleased when they 
discover new things while searching for something else-- such as 
different formats, and different expressions (we recently got in some 
wonderful new Shakespeare play expressions and adaptations, based upon 
different vocabulary levels, graphic novel versions, side-by-side 
renderings with modern English, etc.). Staff are always requesting 
that at-a-glance kind of functionality in the catalog, rather than 
having to examine each record in detail. The more element-based the 
data is, and the more tabular it is, and the more groupings and 
relationships are shown clearly (and we have a quasi-FRBR-like 
breakdown already in the title browse index), the happier everyone is. 
And with most popular items checked out at any point in time, such as 
DVDs and bestsellers (there's lots of great stuff not in e-book 
format), the catalog is the ONLY mechanism endusers have to find, 
identify, select and obtain what they want, so the more functionality 
based upon cleanly delineated data, the better. Even with e-books, 
holds are often still necessary, and that can only be done in a 
discovery layer of some sort.

/snip

I agree about the good data input, but that is only another way of 
saying that standards are important. If standards are not enforced, it 
doesn't matter if the standards themselves are great or lousy--anybody 
can do whatever they want anyway. It's so very sad that there seems to 
be in the library world the idea that:
If only *those others* had done things differently before, *then* I 
could do all these wonderful things, therefore, *those others* have to 
change everything they do before I can really begin to start on my 
wonderful things

as opposed to:
We are facing a serious problem. We have *these resources* at our 
disposal right now. Perhaps it's true that different decisions should 
have been made in the past so that we have 

Re: [RDA-L] Completeness of records

2011-08-08 Thread Bernhard Eversberg

08.08.2011 10:01, James Weinheimer:


 The Worldcat example that I gave before for searching the work of
 Cicero's Pro Archia
 http://www.worldcat.org/search?q=au%3Acicero+ti%3Apro+archia,
 allowing the searcher to limit by format, by other authors (editors),
 by date of publication, language, etc. overfulfills those 19th
 century FRBR user tasks without the need for redoing, retraining,
 reconceptualizing, re-everything. It can be done today, right now for
 *no extra money*--just let your systems people devise some queries.
 ...

 If this bit of reality could be accepted, perhaps we could claim
 success: FRBR is now implemented! And at no real costs! Wouldn't
 THAT be nice to claim?!. Then we could move on to other discussions
 that would be more relevant to the genuine needs of the vast majority
 of our patrons.
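
For illustration only, the search quoted above can be assembled from
fielded terms. This is a sketch: the au: and ti: index prefixes and the
q parameter are copied from the URL in the message, the helper name
worldcat_search_url is made up for this example, and nothing here is an
official WorldCat API.

```python
from urllib.parse import urlencode


def worldcat_search_url(author, title):
    # Hypothetical helper: combine fielded terms the way the quoted URL does.
    query = f"au:{author} ti:{title}"
    # urlencode percent-encodes the colons and turns spaces into "+".
    return "http://www.worldcat.org/search?" + urlencode({"q": query})


print(worldcat_search_url("cicero", "pro archia"))
```

Limiting by format, editor, date or language would then be further
parameters or facets on the same query, which is the "let your systems
people devise some queries" point.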

Right, AND let's not forget we need consistent data, esp. with the 
uniform titles.


Add to this the AACR2 updates done by M. Gorman and Mac, and there is
indeed, and I think this bears repeating, no urgent need to venture on a big
migration of both code and format. The results of that herculean act
would just not go far enough beyond what can already be done without it.
(Furthermore, cataloging codes that are not under open access cannot succeed
anyway.)

I think VIAF could be extended to include uniform titles. Better
integration of VIAF into cataloging interfaces would then go a long
way towards improved consistency.

For countries, such as Germany, hitherto not under the star-spangled banner
of AACR2, the need for migration can also be obviated by intensified and
clever use of VIAF. [Though this is not an open access tool either, but
there's nothing to replace it, whatever code and format we use.]

B.Eversberg




Re: [RDA-L] Completeness of records

2011-08-08 Thread James Weinheimer

On 07/08/2011 17:32, Karen Coyle wrote:
snip
In the Open Library, where they decided to gather manifestations under 
works (as usual, expression was harder to do), all it took was one 
record for the manifestation to have a uniform title. I'll illustrate:


Mann, Thomas
[Der zauberberg]
Magic Mountain

Mann, Thomas
[Der zauberberg]
Montagna incantata

Mann, Thomas
Magic Mountain

Mann, Thomas
Montagna incantata

These give you the information you need to bring them together into a 
single work even though some records don't have a direct link to the 
work. I could imagine a kind of switching file with links between 
original and translated titles that would remove the need for uniform 
titles in the process of work-ifying a set of bib records. (Not 
unlike OCLC's xISBN service, BTW, only based on titles not identifiers.)
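
A rough sketch of how that grouping could work, assuming each record is
reduced to an author plus whatever titles it carries (uniform and
transcribed). Records that share an author and any one title are merged,
so a single record carrying the uniform title is enough to pull the
others in. The record layout and function names are invented for
illustration; this is not the Open Library code.

```python
from collections import defaultdict

# Each record: (author, uniform title or None, title as transcribed).
records = [
    ("Mann, Thomas", "Der zauberberg", "Magic Mountain"),
    ("Mann, Thomas", "Der zauberberg", "Montagna incantata"),
    ("Mann, Thomas", None, "Magic Mountain"),
    ("Mann, Thomas", None, "Montagna incantata"),
]


def cluster_into_works(records):
    # Union-find over record positions.
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # (author, title) -> first record that carried it
    for i, (author, uniform, title) in enumerate(records):
        for t in (uniform, title):
            if t is None:
                continue
            key = (author.lower(), t.lower())
            if key in first_seen:
                # Same author and a shared title: treat as the same work.
                parent[find(i)] = find(first_seen[key])
            else:
                first_seen[key] = i

    works = defaultdict(list)
    for i in range(len(records)):
        works[find(i)].append(i)
    return list(works.values())


print(cluster_into_works(records))  # all four records end up in one cluster
```

The third and fourth records have no uniform title, yet they join the
cluster because their transcribed titles also appear on records that do.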

/snip

So, the links to the individual records are gathered in the collective 
record for the work? e.g. http://openlibrary.org/works/OL14866824W.rdf I 
see:
<rdf:Description rdf:about="http://openlibrary.org/books/OL14227095M/">
  <rdrel:workManifested>http://openlibrary.org/works/OL14866824W/</rdrel:workManifested>
  <dcterms:title>The magic mountain =: der Zauberberg</dcterms:title>
  <dcterms:date>1939</dcterms:date>
</rdf:Description>


with the link to the manifestation in the rdf:about. I don't see a 
reciprocal link from the single item 
(http://openlibrary.org/books/OL14227095M/) to the work record but that 
would be overkill.


Why did you choose that structure? Is it a more efficient use of 
computer resources? It seems to work as well as making the links the 
other way. The only problem I could see with this type of structure is 
that if someone took a copy of the individual record, there would be no 
link back to the work record. But within the database, everything seems 
fine. Still, if they did as you mentioned, turning it into a switching 
file (or whatever it is called), making that openly available, it may 
work even then.


--
James Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules: http://sites.google.com/site/opencatalogingrules/


Re: [RDA-L] Completeness of records

2011-08-08 Thread Jonathan Rochkind
You _can_ do things this way, out of necessity, but it's definitely not 
preferable from a data management point of view, right?  We're talking 
about the difference between three approaches: a single 'foreign key' in 
each record stating that it's part of a certain work (preferable from a 
data management point of view); heuristics for guessing from 
as-written-on-title-page (or as-entered-by-a-user) title/author 
combinations (less preferable from a data management point of view, but 
possibly necessary to avoid the expense of human data control); and 
this idea of a switching file, which is sort of just a 
human-controlled enhancement to the heuristics. But if you're going to 
spend human time doing that, why not just spend human time doing it 
right, with the foreign key approach?  The switching file approach is, 
to my mind, a less efficient encoding, not a more efficient one.
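
To make the contrast concrete, here is a small sketch with made-up data.
In the first structure each record carries a work key directly; in the
second, the work has to be looked up through a separately maintained
title table (the switching file). The identifier W1, the work_id field
and the table layout are assumptions for illustration only.

```python
# Foreign-key approach: every record states its work directly.
records_fk = [
    {"title": "Magic Mountain", "work_id": "W1"},
    {"title": "Montagna incantata", "work_id": "W1"},
]

# Switching-file approach: records carry only transcribed data, and a
# separately maintained table maps (author, title) to a work.
records_plain = [
    {"author": "Mann, Thomas", "title": "Magic Mountain"},
    {"author": "Mann, Thomas", "title": "Montagna incantata"},
    {"author": "Mann, Thomas", "title": "The magic mountain"},
]
switching_file = {
    ("mann, thomas", "magic mountain"): "W1",
    ("mann, thomas", "montagna incantata"): "W1",
}


def work_via_fk(record):
    return record["work_id"]


def work_via_switching_file(record):
    key = (record["author"].lower(), record["title"].lower())
    return switching_file.get(key)  # None when the variant is not listed


print([work_via_fk(r) for r in records_fk])
print([work_via_switching_file(r) for r in records_plain])
```

The third plain record shows the cost: every title variant has to be
listed in the table by a human, whereas a key stored in the record does
not depend on how the title was transcribed.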


On 8/7/2011 11:32 AM, Karen Coyle wrote:

Quoting James Weinheimer weinheimer.ji...@gmail.com:


if the purpose is to get the FRBR-type results
to show what works, expressions, manifestations and items exist. For 
those records that do not have the uniform title entered, they fall 
outside, and there is nothing to do except to add the uniform titles 
(or URIs or whatever),


In the Open Library, where they decided to gather manifestations under 
works (as usual, expression was harder to do), all it took was one 
record for the manifestation to have a uniform title. I'll illustrate:


Mann, Thomas
[Der zauberberg]
Magic Mountain

Mann, Thomas
[Der zauberberg]
Montagna incantata

Mann, Thomas
Magic Mountain

Mann, Thomas
Montagna incantata

These give you the information you need to bring them together into a 
single work even though some records don't have a direct link to the 
work. I could imagine a kind of switching file with links between 
original and translated titles that would remove the need for uniform 
titles in the process of work-ifying a set of bib records. (Not 
unlike OCLC's xISBN service, BTW, only based on titles not identifiers.)


Not every bit of information has to be in every record. We can have 
information outside of individual bib records that helps us make 
decisions or do things with the records. One of the benefits given for 
FRBR is that it makes it easier for us to share this common knowledge, 
and to make use of it. I think that even without a formal adoption of 
FRBR we could gain efficiencies in bib record creation and system 
functionality by having a place (undoubtedly on the web) where we 
share this knowledge. If you look at what DBPedia is doing with 
general information from Wikipedia and other resources, then you get 
the idea. DBPedia is messy and rather ad hoc, but a LIBPedia could be 
made up of authoritative sources only.


kc


Re: [RDA-L] Completeness of records

2011-08-08 Thread Karen Coyle

Quoting James Weinheimer weinheimer.ji...@gmail.com:




Why did you choose that structure? Is it a more efficient use of  
computer resources?


To begin with, I'm just an observer of the Open Library development,  
not a designer, so I can't give any detail on the WHY of things. You  
can, however, see the guts by going to


http://openlibrary.org/type/

This is a list of all of the structures and data elements. The two  
most relevant here are:


http://openlibrary.org/type/edition  -- the manifestation/expression
http://openlibrary.org/type/work   -- the work

There are links in each to the other, as you can see there. Obviously,  
how you handle the links will depend on your database management  
system and your record structure and the flow of search and display.


The use of uniform titles that I demonstrated is part of the  
application that merges editions into works. That is buried somewhere  
in the github repo:


https://github.com/openlibrary

Finding particular areas of the code isn't easy, but if you want that  
kind of thing, it's all there. My guess is that it's somewhere in this  
path:


https://github.com/openlibrary/openlibrary/tree/master/openlibrary/catalog/works

Enjoy!
kc



--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet


Re: [RDA-L] Completeness of records

2011-08-08 Thread J. McRee Elrod
Karen said:

We have a lot of information, collectively, that shouldn't have to be  
re-done by every cataloger.
 
This was one of the objectives of the UK PRECIS.  It was a disaster.

The mismatches some put down to the ambiguities of language, others to
the complexity of the bibliographic universe, including moving images.  
(The Canadian National Film Board tried PRECIS.)

It's interesting to see the same ideas recycle in differing forms over
the decades.


   __   __   J. McRee (Mac) Elrod (m...@slc.bc.ca)
  {__  |   / Special Libraries Cataloguing   HTTP://www.slc.bc.ca/
  ___} |__ \__

 


Re: [RDA-L] Completeness of records

2011-08-08 Thread Kevin M Randall
James Weinheimer wrote:
 Would it be wonderful? I believe very little will change in library
 cataloging until the metadata creators divorce themselves from this
 official, traditional dogma that what our users want is the FRBR user
 tasks, [...]

James, you have continually made the assertion that users are not interested in 
the FRBR user tasks, that what they want is something else.  In order that we 
may be able to communicate more clearly about FRBR, I respectfully request a 
simple example of something the users want that does not fit into one of the 
FRBR user tasks.  Just one will do.

Thanks!

Kevin M. Randall
Principal Serials Cataloger
Bibliographic Services Dept.
Northwestern University Library
1970 Campus Drive
Evanston, IL  60208-2300
email: k...@northwestern.edu
phone: (847) 491-2939
fax:   (847) 491-4345


Re: [RDA-L] Completeness of records

2011-08-08 Thread James Weinheimer

On 08/08/2011 18:30, Kevin M Randall wrote:
snip
James, you have continually made the assertion that users are not 
interested in the FRBR user tasks, that what they want is something 
else. In order that we may be able to communicate more clearly about 
FRBR, I respectfully request a simple example of something the users 
want that does not fit into one of the FRBR user tasks. Just one will do.

/snip

I suggest you listen to my podcast on Search. 
http://catalogingmatters.blogspot.com/2010/12/cataloging-matters-podcast-no-7-search.html, 
and my latest podcast 
http://catalogingmatters.blogspot.com/2011/08/cataloging-matters-podcast-12.html 
for a more humorous view. Concerning the latter one, lots of people have 
sent messages saying that this is how they feel about library catalogs.


I also suggest the writings of John Battelle, who wrote the book 
Search. Here is one article 
http://searchengineland.com/john-battelle-on-the-future-of-search-38382, 
and there are a lot of his talks online too. Determining what the public 
wants and expects from searching is a major topic now, potentially with 
lots of money riding on the outcome. My point is: for better or worse, 
that is the future and there is little we can do about it. Therefore, 
how can we fit into that scenario using the resources available now? 
What do we have to offer that no one else does?


--
James Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules: http://sites.google.com/site/opencatalogingrules/


Re: [RDA-L] Completeness of records

2011-08-08 Thread Karen Coyle

I'll briefly give you my objections to the FRBR tasks, which are summed up by:

they start when the user approaches the library, and they stop once  
the user *obtains* a library resource. They don't include, for  
example, linking catalog entries to wikipedia articles so that users  
discover library resources while in a non-library environment, and  
they also don't include things like formulating citations, downloading  
citations into writings or databases, organizing bibliographic data,  
comparing items in the catalog, sharing with colleagues, using  
retrieved items to find more information on the web, etc etc.


It *may* be possible to shoe-horn those activities into the FRBR-4,  
but I think that would be artificial. The catalog should be part of a  
whole range of services outside of a catalog search. That requirement  
*could* require changes to *cataloging*, that is, the creation of the  
catalog entry.


kc

Quoting James Weinheimer weinheimer.ji...@gmail.com:


On 08/08/2011 18:30, Kevin M Randall wrote:
snip
James, you have continually made the assertion that users are not  
interested in the FRBR user tasks, that what they want is something  
else. In order that we may be able to communicate more clearly  
about FRBR, I respectfully request a simple example of something  
the users want that does not fit into one of the FRBR user tasks.  
Just one will do.

/snip

I suggest you listen to my podcast on Search.  
http://catalogingmatters.blogspot.com/2010/12/cataloging-matters-podcast-no-7-search.html, 
and my latest podcast 
http://catalogingmatters.blogspot.com/2011/08/cataloging-matters-podcast-12.html 
for a more humorous view. Concerning the latter one, lots of people have 
sent messages saying that this is how they feel about library catalogs.


I also suggest the writings of John Battelle, who wrote the book  
Search. Here is one article  
http://searchengineland.com/john-battelle-on-the-future-of-search-38382, and  
there are a lot of his talks online too. Determining what the public  
wants and expects from searching is a major topic now, potentially  
with lots of money riding on the outcome. My point is: for better or  
worse, that is the future and there is little we can do about it.  
Therefore, how can we fit into that scenario using the resources  
available now? What do we have to offer that no one else does?


--
James Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules:  
http://sites.google.com/site/opencatalogingrules/






--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet


Re: [RDA-L] Completeness of records

2011-08-08 Thread Kevin M Randall
James Weinheimer wrote:

 I suggest you listen to my podcast on Search.

I was really hoping for something that could become part of the conversation 
*here*.  I'm sure there are others who would appreciate it too.

Kevin M. Randall
Principal Serials Cataloger
Bibliographic Services Dept.
Northwestern University Library
1970 Campus Drive
Evanston, IL  60208-2300
email: k...@northwestern.edu
phone: (847) 491-2939
fax:   (847) 491-4345


Re: [RDA-L] Completeness of records

2011-08-08 Thread Karen Coyle

Quoting J. McRee Elrod m...@slc.bc.ca:





Changes to the ILS seem more to the point to me.


Of course the ILS will also change accordingly. However, cataloging  
provides the data. As I've said here before, systems have to work with  
the data they have. Those examples I gave? Many of them cannot be done  
with the data we have today, and others are made difficult by our data  
structure. It's not just whether the data is there, but whether it can  
be used efficiently. An ILS cannot read the mind of either the  
cataloger or the user. It's just a dumb computer.


kc

--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet


Re: [RDA-L] Completeness of records

2011-08-08 Thread Brenndorfer, Thomas




 -Original Message-
 From: Resource Description and Access / Resource Description and Access
 [mailto:RDA-L@LISTSERV.LAC-BAC.GC.CA] On Behalf Of Karen Coyle
 Sent: August 8, 2011 1:06 PM
 To: RDA-L@LISTSERV.LAC-BAC.GC.CA
 Subject: Re: [RDA-L] Completeness of records

 I'll briefly give you my objections to the FRBR tasks, which are summed
 up by:

 they start when the user approaches the library, and they stop once
 the user *obtains* a library resource. They don't include, for
 example, linking catalog entries to wikipedia articles so that users
 discover library resources while in a non-library environment, and
 they also don't include things like formulating citations, downloading
 citations into writings or databases, organizing bibliographic data,
 comparing items in the catalog, sharing with colleagues, using
 retrieved items to find more information on the web, etc etc.

 It *may* be possible to shoe-horn those activities into the FRBR-4,
 but I think that would be artificial. The catalog should be part of a
 whole range of services outside of a catalog search. That requirement
 *could* require changes to *cataloging*, that is, the creation of the
 catalog entry.

 kc

That makes a lot of sense, as there are multiple things we can do or should be 
able to do with catalog data.



There are some distinctions I think. The user tasks also presuppose a granular 
element set, as specific elements are assigned values based upon the relative 
importance for the user tasks. The organizing and retrieving of data can be 
enhanced by simply better and more specific data, without necessarily 
anticipating their ultimate use by users (although, logically, we would still 
want the user to actually work with the data in some way).



As Mac says, we need to improve our ILS's. The ILS's look like they will be 
improved with all the RDA-based MARC tags that exist and are being proposed, 
since they tackle the poor organization and lack of granularity in MARC. I 
already make use of the new RDA authority record 3XX fields in quickly 
identifying a Person (I think all of these RDA-based 3XX fields in authority 
records are not dependent on RDA implementation decisions - from what I 
understand they're good to go today, and are now part of the ever-changing and 
ever-expanding family of MARC fields).



There's also a "say what you mean, mean what you say" aspect to FRBR that is 
often missed. For example, are users "comparing items" in a particular instance, 
or do they really mean works? A site like LibraryThing has been built up around 
the work concept, and ties in user-generated and social networking content 
around that work entity level, to great effect and with all the efficiencies 
that effort represents.


Also, the full range of user tasks hasn't really been looked at. The 
consolidation of the three FR models (FRBR, FRAD, FRSAD) is, I believe, 
underway or being planned. In FRSAD, there's the user task of Explore (to 
explore any relationships between entities (thema or nomen), correlations to 
other subject vocabularies and structure of a subject domain). That looks like a 
massive undertaking, but it does reflect the purposes to which a lot of catalog 
effort is already directed, in all the work in controlled vocabulary for 
subjects that is done today.




Thomas Brenndorfer

Guelph Public Library




Re: [RDA-L] Completeness of records

2011-08-08 Thread James Weinheimer

On 08/08/2011 19:00, Kevin M Randall wrote:
snip

James Weinheimer wrote:

I suggest you listen to my podcast on Search.

I was really hoping for something that could become part of the conversation 
*here*.  I'm sure there are others who would appreciate it too.

/snip

That means redoing an awful lot which I really don't feel like doing or 
have time for. May I suggest the opposite: would you point out why 
people do want the FRBR user tasks? Where is the evidence? Where is the 
research? Especially, why do we assume that they want works, 
expressions, manifestations, and items? How often have you yourself (not 
as a cataloger) needed a specific printing or needed to know the number 
of pages of a book? I have seen no evidence that very many people want 
this, but they definitely want other capabilities. As I had it in my 
"Dialog between a patron and the library catalog", the patron says:


Well, I'm a user too and I need something else [i.e. besides the FRBR 
user tasks]. In full-text databases, I can do all kinds of searches and 
analyze the texts themselves and make decisions. I guess I can 
understand that if you don't have any full text and that you cannot 
examine the items immediately, somebody will need to make a choice among 
similar resources. But if I am to make a meaningful choice, I need 
meaningful information. Giving me publication dates and page numbers 
doesn't help me make a decent decision. If I can look at a thing 
directly, I can decide which one I want, so if I am able to examine the 
versions, I can decide that one is easier to read or one has pages 
falling out, or I just choose any one I want. Otherwise, I am being 
forced to choose texts based on information that means nothing to me at 
all. How am I supposed to decide I want something published in 1923 or 
another from 1962 without knowing what the differences are? Why is this 
information supposed to have meaning for me?


Exactly the same arguments (other than the references to full text!) 
were made by several people, using different words of course, in the 
famous Royal Commission report discussing Panizzi's catalog, so the 
complaint is nothing new. In addition, the information universe is 
growing very far away from our traditional tools, concentrating on 
different aspects of search.


Since I personally am interested in the history of bibliography, I 
actually want to know different printings and page numbers--once in 
a while. In fact, now that I have an ebook, I have discovered that 
scan/print size has become important to me, and even margin width 
because I can see some pdfs more comfortably on my reader. Should we 
start putting in the width of the printed text on the page? Of course 
not, but it would come in handy for me now.


It has become clear to me that even in Panizzi's time, the task of the 
catalog as *inventory tool* for librarians was absolutely critical and 
because of the ways the Library of the British Museum functioned in the 
1840s, it was more important still. Today, the catalog as inventory tool 
is still vital and I don't question its importance for librarians for a 
single moment. But that same function of inventory control is *NOT* 
important to the vast majority of users.


So, why do we have this strange situation? I think it is because 
*everybody* has always had to use the same tool: the library catalog 
where the needs of the librarians and the collection necessarily and 
*correctly* trumped those of the users (in spite of what everyone has 
said). For instance, who are these users? A huge group with so many 
different needs they cannot be lumped together at all. That is 
completely obvious. This is one aspect that the information companies 
understand *very well* and are exploiting to the full, I think, at our 
expense. And we have to confess that they are right. Expecting all to 
use a one-size-fits-all catalog has never made much sense, and makes 
even less today. Formerly, there was no choice though since making 
separate catalogs for the people (i.e. various types of printed 
catalogs) became impossible both financially and practically.


Today, we do NOT all have to use the same tools. We librarians can 
retain our tools to maintain management of the collection and go on to 
improve those tools in whatever ways we want, without caring about the 
impact on the public, because the records themselves can be ported out 
into Drupal or Moodle or all sorts of other systems, so that people and 
developers can go crazy with them. *That* will be when we can begin to 
discover what people really and truly want and need from the information 
in our catalog records.


I hope to write an article on the historical aspects of this somewhere 
along the way, but it is still in development. Still, I've pointed to 
several things discussing search, etc. Can you point me to modern 
evidence done among the public (i.e. *not* asking library students or 
librarians!) that the *public* wants WEMI?


--
James 

Re: [RDA-L] Completeness of records

2011-08-07 Thread James Weinheimer

On 06/08/2011 19:00, Brenndorfer, Thomas wrote:
snip
But it's not true FRBR, and it doesn't do translations well, and so it 
requires extra effort to answer patron queries about titles in our small 
language collections. And part of the problem with translations stems 
from removing fields like 240 for display purposes when that destroys 
the only mechanism left to relate those resources. It's that tangling of 
display and user task functionality in fields that causes so much grief. 
That's why those aspects of catalog design need to be separated. 
Fortunately, FRBR absolutely does NOT depend upon those antiquated 
methods, such as collocation by uniform titles, to specify 
relationships. As the FRBR report 
(http://archive.ifla.org/VII/s13/frbr/frbr2.htm#5) indicates, the 
current methods of creating relationships in catalog records are haphazard.

/snip

FRBR does need the uniform title in some form, that is, some bit of data 
that brings the different records together. How that data is to be 
encoded, using a 130/240/etc. textual string, or some kind of URI, URJ, 
URK, L M N O or P, the final product will be to bring the metadata 
records together in some way, just as the heading did in the card 
catalog. The primary task is to ensure that it is consistently entered 
and then many things can happen. If the information is inconsistent, or 
does not exist in textual or some kind of form, there is not enough 
information to bring everything together.


As I demonstrated with searching Worldcat, for those records that have 
the uniform title entered, collocation of those records can be done 
*right now* and there is no reason to change any of our current records 
or procedures if the purpose is to get the FRBR-type results to show 
what works, expressions, manifestations and items exist. For those 
records that do not have the uniform title entered, they fall outside, 
and there is nothing to do except to add the uniform titles (or URIs or 
whatever), that is, *if* it can be demonstrated that this provides the 
public with what they really want (which should not be accepted on 
faith) and it is judged worthwhile to edit those records at the cost of 
doing other things that our patrons would prefer, such as cataloging 
more items, or perhaps cataloging more deeply, with better and more 
useful subjects and/or analysing more collections.


--
James Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules: http://sites.google.com/site/opencatalogingrules/


Re: [RDA-L] Completeness of records

2011-08-07 Thread Brenndorfer, Thomas

From: Resource Description and Access / Resource Description and Access 
[RDA-L@LISTSERV.LAC-BAC.GC.CA] On Behalf Of James Weinheimer 
[weinheimer.ji...@gmail.com]
Sent: August-07-11 6:08 AM
To: RDA-L@LISTSERV.LAC-BAC.GC.CA
Subject: Re: [RDA-L] Completeness of records

On 06/08/2011 19:00, Brenndorfer, Thomas wrote:
snip
But it's not true FRBR, and it doesn't do translations well, and so it
requires extra effort to answer patron queries about titles in our small
language collections. And part of the problem with translations stems
from removing fields like 240 for display purposes when that destroys
the only mechanism left to relate those resources. It's that tangling of
display and user task functionality in fields that causes so much grief.
That's why those aspects of catalog design need to be separated.
Fortunately, FRBR absolutely does NOT depend upon those antiquated
methods, such as collocation by uniform titles, to specify
relationships. As the FRBR report
(http://archive.ifla.org/VII/s13/frbr/frbr2.htm#5) indicates, the
current methods of creating relationships in catalog records are haphazard.
/snip

FRBR does need the uniform title in some form, that is, some bit of data
that brings the different records together.


There's a difference when data is controlled by identifiers or control numbers 
vs text strings. I've gone through several libraries and library systems, and 
currently I am able to do a lot of authority updating and maintenance based 
upon control numbers that I couldn't do before with earlier, less capable 
systems. However, once I move closer to cleaning up the bibliographic records I 
have to switch to more manual operations, manual checking, crude global update 
methods and deduping algorithms, etc. (such as all that annoying checking of 
changed headings in name-title forms, and with added subject subdivisions).
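
A rough sketch of that difference, using made-up data: one bibliographic record points at the authority record by control number, the other only carries the heading as text. The names, control number and field layout are all invented for illustration.

```python
# Hypothetical authority file keyed by control number.
authorities = {"n001": "Smith, Jane"}

# One bib record links by control number; the other only stores the string.
bibs = [
    {"id": 1, "author_id": "n001", "author_text": None},
    {"id": 2, "author_id": None, "author_text": "Smith, Jane"},
]


def display_heading(bib):
    """Resolve the heading through the authority link when there is one."""
    if bib["author_id"]:
        return authorities[bib["author_id"]]
    return bib["author_text"]


# The heading is changed once, in the authority file.
authorities["n001"] = "Smith, Jane, 1950-"

headings = [display_heading(b) for b in bibs]
```

The linked record picks up the new heading automatically; the string-only record keeps the old form until someone finds and fixes it by hand or by a global update.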

It's like the last mile in broadband connectivity. Fast fibre optic everywhere 
except when one gets closer to home where antiquated technology slows things 
down. It would be wonderful if everything worked perfectly *right now* but it 
emphatically does not work as simply as you suggest.

It's only when data is modelled out thoroughly and correctly that we can start 
talking about new functionality. An example is the Item-level functionality in 
the latest library systems I've worked with. Holdings displays can be fine-tuned 
based upon user location and item availability attributes. This saves the user 
time by ranking holdings according to library branch location and item 
availability. A functional requirement like this, based upon user tasks, starts 
with data modelling. Judging by the database fields involved, the designers 
quite likely asked: what entities do I need (item, branch, workstation), what 
attributes do I need (item availability status), and what relationships do I 
need? And this is popular, and would be emphasized on any RFP for a system.
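
As an illustration only, the ranking could be sketched like this. The Item fields and the ordering rule (available copies first, then the user's own branch) are my assumptions, not a description of any vendor's system.

```python
from dataclasses import dataclass


@dataclass
class Item:
    barcode: str
    branch: str
    available: bool


def rank_holdings(items, user_branch):
    """Available items first, then items at the user's branch, then the rest."""
    def key(item):
        # False sorts before True, so "available" and "own branch" come first.
        return (not item.available, item.branch != user_branch,
                item.branch, item.barcode)
    return sorted(items, key=key)


items = [
    Item("b3", "North", False),
    Item("b1", "Central", True),
    Item("b2", "North", True),
]
ranked = [i.barcode for i in rank_holdings(items, user_branch="North")]
```

The point is that the display logic is trivial once item, branch and availability exist as separate pieces of data.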

It would be wonderful if the functionality could be extended more deeply, 
showing the user for example, related works that are actually available in the 
library based upon the relationship clustering inherent in FRBR.

We see something similar with the integration of the NoveList readers' advisory 
service in the catalog. The linking is done by manifestation identifier (ISBN), 
but this is crude, because different manifestations (U.S., Canada, UK 
publishers), and different expressions (e-book, audiobook) can be missed.
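
A small sketch of why the ISBN match is crude, assuming each manifestation also carried a work identifier. The ISBNs, the "work" key and both helper functions are placeholders invented for the example.

```python
# Hypothetical catalogue: every manifestation has an ISBN and a work identifier.
catalogue = [
    {"isbn": "111", "work": "w1", "form": "print (U.S. publisher)"},
    {"isbn": "222", "work": "w1", "form": "print (UK publisher)"},
    {"isbn": "333", "work": "w1", "form": "audiobook"},
    {"isbn": "444", "work": "w2", "form": "print"},
]


def match_by_isbn(isbn):
    """Manifestation-level match: only the exact ISBN is found."""
    return [m for m in catalogue if m["isbn"] == isbn]


def match_by_work(isbn):
    """Work-level match: every manifestation of the same work is found."""
    works = {m["work"] for m in catalogue if m["isbn"] == isbn}
    return [m for m in catalogue if m["work"] in works]


isbn_hits = [m["isbn"] for m in match_by_isbn("111")]
work_hits = [m["isbn"] for m in match_by_work("111")]
```

Matching on the ISBN alone finds one record; matching through the work finds the UK edition and the audiobook as well.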

We see similar issues with the new ebook interfaces with the various new ebook 
services we're promoting. The ebook services are not that great for searching: 
the larger the collection gets, the weaker the tool becomes. The MARC records 
have the highest quality data, but the catalog records are missing the 
item-level attributes found only in the ebook service interface. In addition, 
changes to manifestation level details such as DRM changes and format changes 
(MP3 vs WMA) are better handled in the ebook service.

The bulk of the staff time nowadays in helping people with e-books goes to 
manifestation selection details with all the different confusing formats, as 
well as assistance with the system requirements of the different intermediary 
devices. The more explicit and better arranged the data, the easier it is on 
staff and end users.

But even outside of all the new technology like e-books, library users can 
still be very insistent on specific aspects that relate to the different FRBR 
bibliographic levels. Last week, a library user I dealt with was absolutely 
insistent on getting a book-on-tape version of a title (our collection is 
dwindling and being replaced by books-on-CD, e-audiobooks, and Playaways). But 
there's no harm in promoting the other formats-- the library is there to help 
in getting people set up with the different formats.

I recall the library user who absolutely wanted Seamus Heaney's

Re: [RDA-L] Completeness of records (was: Browse and search BNB open data)

2011-08-05 Thread J. McRee Elrod
Karen said:

It is easy to find records for translations that do not have a uniform  
title for the original.
 
Our smaller clients strongly object to a 240 for translations,
particularly if the foreign language text is not on the title page;
they say it confuses patrons.  We change the 240 to 246 3  
$iTranslation of:$a.  They accept 240 for classical music and
Shakespeare, but little else.

There is also the case in Canada of simultaneous publications in English
and French.  There is no way of knowing which is a translation of the other.

Don't assume failure on the part of the cataloguer; it may be patron
desire.  Patron convenience seems to be the forgotten factor in much
of our discussions.  
  
My preference would be to address the problem through systems, rather than
changing records, e.g., to have 240s suppressed in display and
hitlists, but that would remove 240s from classical music and
Shakespeare as well.


   __   __   J. McRee (Mac) Elrod (m...@slc.bc.ca)
  {__  |   / Special Libraries Cataloguing   HTTP://www.slc.bc.ca/
  ___} |__ \__


Re: [RDA-L] Completeness of records (was: Browse and search BNB open data) (fwd)

2011-08-05 Thread J. McRee Elrod
I said:

Our smaller clients strongly object to a 240 for translations ...

I should have added they are not very fond of 130s either,
particularly when the 130 says (motion picture) and the 245 says
[videorecording].  They say patrons see it as a contradiction.  

They will accept 130s for Bible, and we've had no complaints about
Arabian nights.

There seems to be a gap between those who make these decisions, and
the resulting experience of many library users.  RDA seems even further
removed than AACR2, since it does not address display.


   __   __   J. McRee (Mac) Elrod (m...@slc.bc.ca)
  {__  |   / Special Libraries Cataloguing   HTTP://www.slc.bc.ca/
  ___} |__ \__


Re: [RDA-L] Completeness of records (was: Browse and search BNB open data)

2011-08-05 Thread Karen Coyle

Quoting J. McRee Elrod m...@slc.bc.ca:



Don't assume failure on the part of the cataloguer; it may be patron
desire.  Patron convenience seems to be the forgotten factor in much
of our discussions.


Not only do I not assume failure on the part of the cataloguer, I  
don't assume failure at all. But the fact is that we can only work  
with the data we have in our bibliographic records regardless of what  
data *possibilities* there are in the MARC record. I believe this is  
indisputable.




My preference would be to address the problem through systems, rather than
changing records, e.g., to have 240s suppressed in display and
hitlists, but that would remove 240s from classical music and
Shakespeare as well.


It's not rocket science to keep 240's in music records, as long as  
they are coded as music records, and drop them from text records. It's  
not even rocket science to display uniform titles for items with  
multiple Expressions. There are a lot of possibilities, but for these  
possibilities to become realities we have to get the data out of MARC  
and into a more manipulable format. These things are a pain to do with  
our current data, but I think they become much more plausible with a  
format that is less based on the structure of the display and more on  
the meaning of the data. In fact, the RDA elements, as defined, are  
closer to this concept of manipulable data elements than MARC is.  
That's not to say that RDA is perfect as a cataloging code, but it is  
based on more modern data concepts than AACR/MARC was.
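
A minimal sketch of the first idea, assuming records are held as plain dictionaries with a MARC leader string and a list of (tag, value) pairs. That layout and the display_fields() helper are invented for illustration; the only MARC detail relied on is leader position 06 (type of record), where c, d and j are the music codes.

```python
# MARC leader/06 codes: c = notated music, d = manuscript notated music,
# j = musical sound recording.
MUSIC_TYPES = {"c", "d", "j"}


def display_fields(record):
    """Return the fields to show, dropping 240 unless the record is coded as music."""
    is_music = record["leader"][6] in MUSIC_TYPES
    return [(tag, value) for tag, value in record["fields"]
            if tag != "240" or is_music]


# Hypothetical records in an invented dictionary layout.
text_rec = {
    "leader": "00000cam a2200000 a 4500",  # leader/06 = "a", language material
    "fields": [("240", "Voina i mir. English"), ("245", "War and peace")],
}
music_rec = {
    "leader": "00000ccm a2200000 a 4500",  # leader/06 = "c", notated music
    "fields": [("240", "Symphonies, no. 5, op. 67, C minor"),
               ("245", "Fifth symphony")],
}

text_display = display_fields(text_rec)
music_display = display_fields(music_rec)
```

The 240 stays in the stored record either way; only the display decision changes.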


kc




--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet


Re: [RDA-L] Completeness of records (was: Browse and search BNB open data)

2011-08-05 Thread Daniel CannCasciato
On 8/5/11 at 2:16 PM, Karen Coyle wrote in part:

But the fact is that we can only work  with the data we have in our 
bibliographic records regardless of what data *possibilities* there are in the 
MARC record. I believe this is indisputable.

I like this.  I just hope that this indisputable fact begins to register with 
the admin folk who make budget and staffing decisions - - often, it seems to 
me, they ignore the simple fact that if no one creates metadata, then the shiny 
discovery interface only appears to be aiding our patrons because it's built 
over a shallow/poor resource.  We should talk much less (in other professional 
areas) about baseline/standard records and more about enriched and quality 
records.  

Daniel




-- 

Daniel CannCasciato
Head of Cataloging
Central Washington University Brooks Library
Ellensburg, WA
 
We offer solid services that people need, and we do so wearing sensible 
shoes. -- MT