Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)

2013-12-12 Thread Bernhard Eversberg

12.12.2013 14:07, James Weinheimer:


... still remains unproven, or the question of how much implementing FRBR/RDA
will ultimately cost. The costs are already beyond many libraries.


Not just the cost of implementing but even the costs of just reading
the rules.
Therefore, we are heading into a two-class library environment. This
will be further diversified when it comes to whether or not a library
can or wants to or is allowed to buy a systems upgrade to support FRBR
and/or BIBFRAME and/or to have their local data upgraded to take
advantage of new options, and so on. Patrons should love it when they
are no longer all of them so boringly alike ...

B.Eversberg

To unsubscribe from RDA-L send an e-mail to the following address from the 
address you are subscribed under to:
lists...@listserv.lac-bac.gc.ca
In the body of the message:
SIGNOFF RDA-L


Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)

2013-12-12 Thread James Weinheimer

On 12/12/2013 12:12 AM, Kevin M Randall wrote:

This statement proves the point that you do not understand what the 
FRBR tasks are. The FRBR tasks are not "methods". They are objectives. 
Relevance ranking and algorithmic connections are examples of methods 
which are used to accomplish the user tasks. The statement you made is 
as nonsensical as saying "People don't like to travel from Los Angeles 
to New York. They would rather take an airplane." 



I don't understand your reasoning here, but I guess it is yet another 
example of the feebleness of my intellect--which, as is well-known, is 
proven by my disagreeing with the FRBR/RDA library gods. By focusing on 
me, there is no need to address the problems of whether users actually 
want the user tasks (by the way, no matter how stupid I may be, whether 
people want to do those things so desperately still remains unproven and 
quite obviously, is a question that cannot be asked or you will be 
pilloried for it) or the question of how much implementing FRBR/RDA will 
ultimately cost. The costs are already beyond many libraries. But of 
course, all of those issues are so irrelevant they can just be ignored.


I have grown weary of the personal attacks and so will not answer any of 
those for awhile.


One point of substance however:


The OLAC Movie & Video Credit Annotation Experiment which you decried is 
addressing exactly this point. As they say: "Eventually, we intend to 
automate most of this conversion. For now, we need help from human 
volunteers, who can train our software to recognize the many ways names 
and roles have been listed in library records for movies."



I didn't "decry" the project at all. I merely asked the question that 
would pop into the head of any 5 or 6 year old child: "Why not Google 
it?" So I did and the results were very, very good--at least as good as 
any library catalog could ever hope to offer. So then I asked another 
natural question: what is the purpose of the project? I am not decrying 
anything but asking some very natural questions.


I mentioned that if it is considered an experiment, everything is fine 
and perhaps we could learn something from it. To be more specific now: 
since the film roles already exist in Wikipedia and IMDB, they could 
serve as "control groups" to help estimate how accurate any automatic 
conversions would be, how much manual cleanup would be needed and how 
much everything would cost. The results may be applicable to other 
materials.
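
The control-group idea above could be sketched roughly as follows: treat the credits from an outside source as a reference set and measure how many automatically extracted (name, role) pairs agree with it. This is only an illustrative sketch. The function name `score_against_reference` is hypothetical, the sample pairs are made up for the example, and nothing here reflects the OLAC project's actual software or any real IMDB or Wikipedia API.

```python
def score_against_reference(extracted, reference):
    """Return (precision, recall) of extracted pairs against a reference set."""
    extracted = set(extracted)
    reference = set(reference)
    matched = extracted & reference
    # Precision: share of extracted pairs that the reference confirms.
    precision = len(matched) / len(extracted) if extracted else 0.0
    # Recall: share of reference pairs that the extraction found.
    recall = len(matched) / len(reference) if reference else 0.0
    return precision, recall


# Sample data made up for illustration only (hypothetical).
reference = {
    ("John Huston", "director"),
    ("Humphrey Bogart", "actor"),
    ("Mary Astor", "actor"),
}
extracted = {
    ("John Huston", "director"),
    ("Humphrey Bogart", "actor"),
    ("Warner Bros.", "director"),  # a wrong guess by the converter
}

precision, recall = score_against_reference(extracted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

The pairs that fail to match are the ones that would need manual cleanup, so counting them over a sample of records gives a rough basis for estimating that cost.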


In the specific case of films, for the final result however, it would 
seem much more efficient to use the information that already exists by 
implementing the APIs. Otherwise it would quite clearly be duplicating what 
already exists.


--
James Weinheimer weinheimer.ji...@gmail.com
First Thus http://catalogingmatters.blogspot.com/
First Thus Facebook Page https://www.facebook.com/FirstThus
Cooperative Cataloging Rules 
http://sites.google.com/site/opencatalogingrules/
Cataloging Matters Podcasts 
http://blog.jweinheimer.net/p/cataloging-matters-podcasts.html




Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)

2013-12-11 Thread Kevin M Randall
James Weinheimer wrote:

> But people have been backing away from those
> user tasks for awhile now as it becomes more and more obvious that
> people prefer other methods such as relevance ranking and algorithmic
> connections that work in completely different ways.

This statement proves the point that you do not understand what the FRBR tasks 
are.  The FRBR tasks are not "methods".  They are objectives.  Relevance 
ranking and algorithmic connections are examples of methods which are used to 
accomplish the user tasks.  The statement you made is as nonsensical as saying 
"People don't like to travel from Los Angeles to New York.  They would rather 
take an airplane."

> And as for the relationships, there remains that uncomfortable fact that
> *if* they are to be implemented, then quite literally millions of
> records will have to be updated by cataloging staffs that are decreasing
> in numbers; staffs who are already overworked and in many cases with
> morale not doing all that well.

The OLAC Movie & Video Credit Annotation Experiment which you decried is 
addressing exactly this point.  As they say:  "Eventually, we intend to 
automate most of this conversion. For now, we need help from human volunteers, 
who can train our software to recognize the many ways names and roles have been 
listed in library records for movies."

> I just wish that we could declare victory for FRBR now because modern
> computing has allowed people to do them right now, as I have shown
> often enough with using the facets in Worldcat.

What you fail to realize is that the faceting will become better and more 
powerful as the data is refined.

But if you're happy with how things are now, so be it.  It would just be nice 
if we didn't hear constant complaints about us hoping for and working toward a 
better future.

Kevin M. Randall
Principal Serials Cataloger
Northwestern University Library
k...@northwestern.edu
(847) 491-2939

Proudly wearing the sensible shoes since 1978!



Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)

2013-12-11 Thread James Weinheimer

On 11/12/2013 15.44, Brenndorfer, Thomas wrote:


Why are you repeating your egregious misinterpretation of FRBR? FRBR is not 
Find-Identify-Select-Obtain by author, title, subject headings.

If I "select" a book because it won the Booker Prize then I am engaging in one 
of the FRBR user tasks. The user tasks apply to the entire spectrum of attributes and 
relationships-- not just the ones found and implemented in traditional catalogs through 
the heading structure. In addition, the entity-relationship model can be applied to many 
other data systems used in libraries. For example, the circulation module in my system is 
a feature-rich implementation of entity-relationship principles that if anything 
showcases how much the potential of the data locked in AACR2-MARC records lies untapped.
...


It is so remarkable that these basic facts are glaringly absent in your posts, 
and yet not surprising since many of your cited sources are to your own blog 
posts and podcasts.

For example, in emphasizing the power of free text searching in your blog post, 
why would not that same power be brought to bear on the topic at hand-- 
retrospective conversion?

If the technology is so "incredible" then how is it that some problems are then 
so insurmountable that we should give up?


From your blog post -- "it is the reliance on alphabetical order that has become 
obsolete in our new environment" indicates that your interpretation of FRBR is 
incorrect as FRBR and RDA are not just repeating the alphabetical structure of 
traditional catalogs. Notably in RDA, the instructions for authorized access points (the 
equivalent to headings) are relegated to second-class citizen status by being put at the 
back of chapters, after discrete data elements are covered (with the expectation that 
these discrete data elements, along with new forms of controlled identifiers which are 
always emphasized first in RDA, will become the primary operative pieces in catalogs in 
the future).



I have been on ruder lists, but this is one of the rudest lists I have ever 
been on. Once again, there is a correlation with "understanding" and 
"agreement": if someone disagrees, there must be something wrong with 
*that person's* understanding. Because, if the person understands, he or 
she must see the light and be in agreement. Of course, that is a very 
modern interpretation. Or it could be argued that it is actually a very 
old attitude, harking back to the medieval church.


Still, when questioning an *unproven system* among a group of true 
believers, I realize it can become quite difficult.


Yes, I quote myself, but otherwise I would keep writing many of the same 
things. I *link* to what already exists. After all, that is the essence 
of what linked data is all about and that is supposed to be the 
salvation of us all. Still, it is true that I also quote lots of others. 
For instance, I quoted Amanda Cossham's excellent paper where she in 
turn quoted many others who all have major problems with FRBR and have 
offered dozens of alternative models. Therefore, it seems that I am far 
from alone in questioning the utility of FRBR.


So now, it seems that FRBR has nothing to do with the user tasks? That's 
news. Makes it hard to justify those entities and attributes and 
relationships since that is the very first step in building an 
entity-relationship model. But people have been backing away from those 
user tasks for awhile now as it becomes more and more obvious that 
people prefer other methods such as relevance ranking and algorithmic 
connections that work in completely different ways. The public is moving 
on at a terrifying rate.


And as for the relationships, there remains that uncomfortable fact that 
*if* they are to be implemented, then quite literally millions of 
records will have to be updated by cataloging staffs that are decreasing 
in numbers; staffs who are already overworked and in many cases with 
morale not doing all that well. Plus, money does not seem to be pouring 
in for any of this. And do it all without the *slightest evidence* that 
adding those relationships will bring anybody back to our catalogs. Why 
shouldn't people question? It is logical to assume that adding the 
relationships may have as much of an impact on the public as did those 
hundreds of thousands of updates to the authorized headings that 
occurred recently, where the cataloging abbreviations were spelled out. 
We know what kind of impact that had: that was a "shot *not* heard 
around the world".


But I guess there is the assumption that, as you say, "... in 
emphasizing the power of free text searching in your blog post, why 
would not that same power be brought to bear on the topic at hand-- 
retrospective conversion?" Technology will come in as a deus ex machina, 
and save RDA and FRBR. And not in 15 or 20 years. True, it may, but it 
is just as probable that after all those relationships will exist and it 
will still not make a difference to t

Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)

2013-12-11 Thread Myers, John
Of course if one wants to answer the question "What movies did John Huston
direct?" one goes to a reference source.  If, however, one wants to answer
the question "What movies does the library provide access to that John
Huston directed?" then one would go to the library catalog, PROVIDED such
information is present.  I recall reports of some early video tape
collections organized by director, then title, in an attempt to address
this information need.  Recording this information in the record provides a
much more useful and adaptable framework than a mere physical arrangement.
 And while the specifics of a "post-MARC" world are still tenuous, the
place of relational structures, such as those articulated in a linked data
scenario, would appear to be promising.  Improving our existing data,
however crudely, is a key step to making this future transition.

I do like the idea of leveraging existing data sources outside the catalog
that already specify this information.  How amenable the organizations
providing such data are, and how amenable the data is to being leveraged,
are important caveats to my enthusiasm.

John Myers, Catalog Librarian
Schaffer Library, Union College
Schenectady NY 12308

518-388-6623
mye...@union.edu


On Tue, Dec 10, 2013 at 7:34 AM, James Weinheimer <
weinheimer.ji...@gmail.com> wrote:

>  [snip]
> I hesitate to bring this up because most probably everybody already thinks
> of me as a purveyor of doom and gloom, but I still believe that we must
> consider these things in realistic terms. Although the attempt is laudable,
> I still say that we must first of all see through the eyes of the users who
> would be interested in this kind of information. For instance, if I am a
> regular user and I wanted to know the movies directed by John Huston, what
> would be the first thing I would think of?
>
> [snip]
> There is also the option that the library catalog could interact with the
> IMDB (and/or Wikipedia) using the APIs.
>
> This opens up a highly pertinent question for me: I don't even know what a
> library catalog is supposed to provide in today's semi-total information
> environment. This is a great example. We can't ignore these wonderful
> sites. What should the catalog do today?
>
>



Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)

2013-12-11 Thread Brenndorfer, Thomas
>From: Resource Description and Access / Resource Description and Access 
>[mailto:RDA-L@LISTSERV.LAC-BAC.GC.CA] On Behalf Of James Weinheimer
>Sent: December-11-13 6:22 AM
>To: RDA-L@LISTSERV.LAC-BAC.GC.CA
>Subject: Re: [RDA-L] Automatically adding relationship designators (was Cost 
>of Retrospective Conversion for Legacy Data...)

>On 12/10/2013 8:52 PM, Kyrios, Alex (akyr...@uidaho.edu) wrote:
>
>James, would it be too cynical of me to summarize your position as "Our data 
>isn't good enough, so why bother improving it?" Is it wrong to hope that a 
>catalog can do more than help someone locate an item on a shelf?
>

...
>When you went to a library in the past where all the librarians had done their 
>jobs in a professional manner (and had high morale!), the public saw what I 
>described above, even though they were probably not aware of it. The 
>information the public got wasn't necessarily always the "best" information or 
>the "newest" or today's strange idea of the "most relevant"--but the library 
>always offered something different. What people found in a library was also 
>not the FRBR user tasks, which in the past was only one type of a 
>*method*--and LOTS of people complained loudly about that method from the 
>beginning and were happy when it bothered them no more. With earlier 
>technology however, there was little room for flexibility and genuine 
>cooperation. Today, there are *many* methods in addition to the FRBR user 
>tasks that lead to the same goals I laid out above.


Why are you repeating your egregious misinterpretation of FRBR? FRBR is not 
Find-Identify-Select-Obtain by author, title, subject headings.

If I "select" a book because it won the Booker Prize then I am engaging in one 
of the FRBR user tasks. The user tasks apply to the entire spectrum of 
attributes and relationships-- not just the ones found and implemented in 
traditional catalogs through the heading structure. In addition, the 
entity-relationship model can be applied to many other data systems used in 
libraries. For example, the circulation module in my system is a feature-rich 
implementation of entity-relationship principles that if anything showcases how 
much the potential of the data locked in AACR2-MARC records lies untapped.

In addition, the other user tasks in FRAD and FRSAD will likely be incorporated 
into a single Functional Requirements model, and these altogether will apply to 
bibliographic "data" of all kinds and not just "records." The last "R" of FRBR 
will be replaced by a "D" for data, as it already has in FRAD and FRSAD. The 
user tasks apply to every single entity, every single attribute or bit of data, 
and every single relationship (even those not implemented in traditional 
catalogs).


It is so remarkable that these basic facts are glaringly absent in your posts, 
and yet not surprising since many of your cited sources are to your own blog 
posts and podcasts.

For example, in emphasizing the power of free text searching in your blog post, 
why would not that same power be brought to bear on the topic at hand-- 
retrospective conversion?

If the technology is so "incredible" then how is it that some problems are then 
so insurmountable that we should give up?


From your blog post -- "it is the reliance on alphabetical order that has 
become obsolete in our new environment" indicates that your interpretation of 
FRBR is incorrect as FRBR and RDA are not just repeating the alphabetical 
structure of traditional catalogs. Notably in RDA, the instructions for 
authorized access points (the equivalent to headings) are relegated to 
second-class citizen status by being put at the back of chapters, after 
discrete data elements are covered (with the expectation that these discrete 
data elements, along with new forms of controlled identifiers which are always 
emphasized first in RDA, will become the primary operative pieces in catalogs 
in the future).


Thomas Brenndorfer
Guelph Public Library 



Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)

2013-12-11 Thread James Weinheimer

On 12/10/2013 8:52 PM, Kyrios, Alex (akyr...@uidaho.edu) wrote:



James, would it be too cynical of me to summarize your position as 
"Our data isn't good enough, so why bother improving it?" Is it wrong 
to hope that a catalog can do more than help someone locate an item on 
a shelf?





No, that is not at all my position and I have always been concerned that 
this is how it may seem when I criticize something. I blame myself. That 
is always the danger of criticism.


I believe that our catalogs--*if they worked correctly*--and they 
haven't for many, many years now--would give the public something that 
is unique today and something that the Googles and Yahoos and Bings will 
not and cannot provide. And that is: reliable, consistent, and 
confidential access to materials that have been specially selected and 
organized by experts, all who work without any goals of personal 
monetary gain, and who do not advance personal or organizational ideals 
that are political or moral or religious. All methods of selection and 
arrangement we use are open to anyone who is interested in examining them.


It seems to me that this represents the antithesis of what people 
experience with the web today and also the antithesis of the future 
direction the web promises to take. I think people are concerned about 
that today. Perhaps 5 or 10 years ago, a person would have read 
something like what I just wrote and would roll their eyes because they 
would have thought it was too backward or just plain silly, but now I 
believe that some may be beginning to think, "If only I could have 
something like that!"


When you went to a library in the past where all the librarians had done 
their jobs in a professional manner (and had high morale!), the public 
saw what I described above, even though they were probably not aware of 
it. The information the public got wasn't necessarily always the "best" 
information or the "newest" or today's strange idea of the "most 
relevant"--but the library always offered something different. What 
people found in a library was also not the FRBR user tasks, which in the 
past was only one type of a *method*--and LOTS of people complained 
loudly about that method from the beginning and were happy when it 
bothered them no more. With earlier technology however, there was little 
room for flexibility and genuine cooperation. Today, there are *many* 
methods in addition to the FRBR user tasks that lead to the same goals I 
laid out above. With the power and flexibility of today's tools, and the 
fact that cooperation among people of all levels is much simpler than 
ever before, the possibility of employing multiple methods of access can 
be considered very seriously. This was never possible before.


The public has actually been crying for a tool that does exactly what I 
described. I say: let's give it to them! But in a form that means 
something to the society of today.


Achieving this would take work and genuine cooperation, plus a sense of 
humility. I have tried my best to illustrate the "unique" kind of access 
made possible by catalogs along with the problems achieving that kind of 
access today in my podcast "Cataloging Matters Podcast no. 18: Problems 
with Library Catalogs" 
http://blog.jweinheimer.net/2013/02/catalog-matters-podcast-no-18-problems.html


So yes, I think there is a lot that the catalog can offer. As I keep 
pointing out: the problem is the *catalog* and not the *catalog records*.

--
James Weinheimer weinheimer.ji...@gmail.com
First Thus http://catalogingmatters.blogspot.com/
First Thus Facebook Page https://www.facebook.com/FirstThus
Cooperative Cataloging Rules 
http://sites.google.com/site/opencatalogingrules/
Cataloging Matters Podcasts 
http://blog.jweinheimer.net/p/cataloging-matters-podcasts.html




Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)

2013-12-10 Thread Cindy Wolff


I don't think of you as a doom and gloom person James. I understand your
concern. I also think that the library catalog and the people who catalog
in them play a part as defenders of truth and accuracy.

A few years ago, an actor sued IMDb because they printed her real date of
birth as opposed to the "younger" one she distributes on resumes to
casting directors. She felt when her real age was revealed, her
opportunities dropped. The suit was dismissed, but IMDb had to spend the
time and money to defend itself. What if IMDb just caved? This probably
won't be the last time IMDb gets sued either.

I think the important thing about the catalog (as it is used) is that it
isn't a commercial enterprise. I also think there is a lot to be said for
subject specialists who may debunk popular assumptions about information
regarding popular media such as films. If something is repeated often
enough, it will be regarded as truth, which will have been overwritten by
popularity as opposed to accuracy.

People don't seem to consult librarians anymore because others have
convinced them librarians are not needed. I think it is our job as
catalogers to take that argument back and defend the academic and
educational value of the catalog.

So, what should the catalog do today? Keep the bullshitters honest.

Cindy Wolff



> I hesitate to bring this up because most probably everybody already
> thinks of me as a purveyor of doom and gloom, but I still believe that
> we must consider these things in realistic terms. Although the attempt
> is laudable, I still say that we must first of all see through the eyes
> of the users who would be interested in this kind of information. For
> instance, if I am a regular user and I wanted to know the movies
> directed by John Huston, what would be the first thing I would think of?
>
> "Google it". I am sure almost everybody would. So I did a natural
> language search: "what movies did john huston direct" and what happens?
> https://www.google.it/search?q=what+movies+did+john+huston+direct (This
> is linked data in action!) We find that down below in the links area (at
> least in the results I get), #1 is a link to John Huston in Wikipedia,
> #2 goes to "Category:Films directed by John Huston" also in Wikipedia,
> and #3 goes into his page at the IMDB (which I personally prefer). All
> have lists of the movies he directed. This is incredibly easy to do and
> free to all.
>
> Putting aside for the moment the linked data result, the 3 links perform
> exactly the same function as in the past when someone would ask a
> reference librarian, "I need a list of the movies John Huston directed"
> and the knowledgeable reference librarian would reply: "Here. You can
> find the list in this book." and would hand the user the latest issue of
> this title http://lccn.loc.gov/sn99044419 (or something similar) which
> was very possibly shelved in the reference collection for quick and easy
> access.
>
> Therefore, just as the reference librarian would take the user's
> question and convert it into, "He needs to look in Film directors : a
> complete guide", today a reference librarian would do the same thing but
> answer/include, "He needs to look in the IMDB". Without any doubt, that
> is the ethical answer for such a question and will remain so for a long,
> long time in the future.
>
> The huge difference is that today, people rarely consult reference
> librarians. The librarian would already know that if you want to find
> the films of specific directors, the library catalog is currently not
> the right place to look for this information and when viewed
> realistically, it never will be the right place. There is nothing at all
> wrong with that. Not every tool is good for every use, just as if you
> want the latest business news or to find out why your XML won't
> validate, the best place is not JSTOR, and it never will be. That
> doesn't mean JSTOR is no good--it just means that you have to look in
> other places for that kind of information. Today, the correct place to
> look for the films people have directed is the IMDB or perhaps a few
> other places on the web. We are *really lucky* that we have such options
> for free today. The reference librarians would be able to help the
> searcher in these directions *if* they were asked, but sadly, that is
> happening less and less.
>
> So, adding the relator codes automatically will still demand manual
> cleanup, perhaps (probably) on a massive scale, if it is ever to become
> as good as IMDB is *right now*. I suggest that the correct method for a
> library catalog is to lead the person to the *right resource* that he or
> she wants and perhaps even do it *better* than Google. In this case of
> film directors, I find it very difficult even to imagine how we could do
> better than Google because the Google search works so incredibly well.
> Perhaps a film librarian could discover that the IMDB and Wikipedia are
> incorrect or incomplete. In that respec

Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)

2013-12-10 Thread Kyrios, Alex (akyr...@uidaho.edu)
James, would it be too cynical of me to summarize your position as "Our data 
isn't good enough, so why bother improving it?" Is it wrong to hope that a 
catalog can do more than help someone locate an item on a shelf?

Alex Kyrios
Metadata and Catalog Librarian
University of Idaho
208-885-2513
akyr...@uidaho.edu

From: Resource Description and Access / Resource Description and Access 
[mailto:RDA-L@LISTSERV.LAC-BAC.GC.CA] On Behalf Of James Weinheimer
Sent: Tuesday, December 10, 2013 4:35 AM
To: RDA-L@LISTSERV.LAC-BAC.GC.CA
Subject: Re: [RDA-L] Automatically adding relationship designators (was Cost of 
Retrospective Conversion for Legacy Data...)

On 09/12/2013 0.04, Kelley McGrath wrote:

OLAC is attempting a project of this sort for film and video credits. We are 
trying to teach a computer to recognize the names and roles that appear in 
245$c, 260+$b, 508 and 511 (and if we get really brave maybe 505) and also 
connect them to the correct 1xx/7xx if present. The current program, which uses 
natural language processing (NLP) techniques, is reasonably successful with 
personal names and with roles given in English. We are working on building a 
multilingual vocabulary. It tends to choke on complicated statements that 
involve a lot of corporate bodies.
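
As a rough illustration of the kind of conversion described above, the sketch below pulls (name, role) pairs out of a credits note of the sort found in a 508 field. It is a deliberately crude regular-expression stand-in, not the NLP approach the OLAC program uses, and it assumes the note follows a tidy "role, name ; role, name" pattern. The sample note, the `ROLE_TERMS` table, and the function name `extract_credits` are all hypothetical.

```python
import re

# Hypothetical mapping from role words seen in notes to designator labels.
ROLE_TERMS = {
    "director": "director",
    "producer": "producer",
    "music": "composer",
}


def extract_credits(note):
    """Return (name, role) pairs from a 'role, name ; role, name' note."""
    credits = []
    # Drop the closing full stop, then split on the ISBD-style semicolons.
    for part in note.rstrip(". ").split(";"):
        match = re.match(r"\s*([^,]+),\s*(.+?)\s*$", part)
        if not match:
            continue  # statement does not fit the assumed pattern
        role_text = match.group(1).strip().lower()
        name = match.group(2)
        role = ROLE_TERMS.get(role_text)
        if role:
            credits.append((name, role))
    return credits


# Sample note written for this example (hypothetical).
note = "Director, John Huston ; producer, Hal B. Wallis ; music, Adolph Deutsch."
print(extract_credits(note))
```

Statements that do not fit the assumed pattern, including unrecognized role words, roles in other languages, or long strings of corporate bodies, are simply skipped here, which is where more capable language processing and manual review would come in.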


I hesitate to bring this up because most probably everybody already thinks of 
me as a purveyor of doom and gloom, but I still believe that we must consider 
these things in realistic terms. Although the attempt is laudable, I still say 
that we must first of all see through the eyes of the users who would be 
interested in this kind of information. For instance, if I am a regular user 
and I wanted to know the movies directed by John Huston, what would be the 
first thing I would think of?

"Google it". I am sure almost everybody would. So I did a natural language 
search: "what movies did john huston direct" and what happens? 
https://www.google.it/search?q=what+movies+did+john+huston+direct (This is 
linked data in action!) We find that down below in the links area (at least in 
the results I get), #1 is a link to John Huston in Wikipedia, #2 goes to 
"Category:Films directed by John Huston" also in Wikipedia, and #3 goes into 
his page at the IMDB (which I personally prefer). All have lists of the movies 
he directed. This is incredibly easy to do and free to all.

Putting aside for the moment the linked data result, the 3 links perform 
exactly the same function as in the past when someone would ask a reference 
librarian, "I need a list of the movies John Huston directed" and the 
knowledgeable reference librarian would reply: "Here. You can find the list in 
this book." and would hand the user the latest issue of this title 
http://lccn.loc.gov/sn99044419 (or something similar) which was very possibly 
shelved in the reference collection for quick and easy access.

Therefore, just as the reference librarian would take the user's question and 
convert it into, "He needs to look in Film directors : a complete guide", today 
a reference librarian would do the same thing but answer/include, "He needs to 
look in the IMDB". Without any doubt, that is the ethical answer for such a 
question and will remain so for a long, long time in the future.

The huge difference is that today, people rarely consult reference librarians. 
The librarian would already know that if you want to find the films of specific 
directors, the library catalog is currently not the right place to look for 
this information and when viewed realistically, it never will be the right 
place. There is nothing at all wrong with that. Not every tool is good for 
every use, just as if you want the latest business news or to find out why your 
XML won't validate, the best place is not JSTOR, and it never will be. That 
doesn't mean JSTOR is no good--it just means that you have to look in other 
places for that kind of information. Today, the correct place to look for the 
films people have directed is the IMDB or perhaps a few other places on the 
web. We are *really lucky* that we have such options for free today. The 
reference librarians would be able to help the searcher in these directions 
*if* they were asked, but sadly, that is happening less and less.

So, adding the relator codes automatically will still demand manual cleanup, 
perhaps (probably) on a massive scale, if it is ever to become as good as IMDB 
is *right now*. I suggest that the correct method for a library catalog is to 
lead the person to the *right resource* that he or she wants and perhaps even 
do it *better* than Google. In this case of film directors, I find it very 
difficult even to imagine how we could do better than Google because the Google 
search works so incredibly well. Perhaps a film librarian could discover that 
the IMDB and Wikipedia are incorrect or incomplete. In that respect perhaps 
library e

Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)

2013-12-10 Thread Kevin M Randall
The OLAC project is a wonderful example of how people are finding 
imaginative ways to automate the creation of metadata.  Yes, there is much data 
in places like IMDb and Wikipedia that duplicates data in library catalogs; and 
there is also much data in those resources that is unique to those resources.  
By the same token, there is also data in library catalogs that cannot be found 
in IMDb or Wikipedia.  The OLAC project will help greatly toward perfecting the 
semantic links between library metadata and data elsewhere.  It's visionary 
stuff like this that helps to further the advancement of information science 
and technology.

Kevin M. Randall
Principal Serials Cataloger
Northwestern University Library
k...@northwestern.edu
(847) 491-2939

Proudly wearing the sensible shoes since 1978!

From: Resource Description and Access / Resource Description and Access 
[mailto:RDA-L@LISTSERV.LAC-BAC.GC.CA] On Behalf Of James Weinheimer
Sent: Tuesday, December 10, 2013 6:35 AM
To: RDA-L@LISTSERV.LAC-BAC.GC.CA
Subject: Re: [RDA-L] Automatically adding relationship designators (was Cost of 
Retrospective Conversion for Legacy Data...)

On 09/12/2013 0.04, Kelley McGrath wrote:

OLAC is attempting a project of this sort for film and video credits. We are 
trying to teach a computer to recognize the names and roles that appear in 
245$c, 260+$b, 508 and 511 (and if we get really brave maybe 505) and also 
connect them to the correct 1xx/7xx if present. The current program, which uses 
natural language processing (NLP) techniques, is reasonably successful with 
personal names and with roles given in English. We are working on building a 
multilingual vocabulary. It tends to choke on complicated statements that 
involve a lot of corporate bodies.



Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)

2013-12-10 Thread James Weinheimer

On 09/12/2013 0.04, Kelley McGrath wrote:

OLAC is attempting a project of this sort for film and video credits. 
We are trying to teach a computer to recognize the names and roles 
that appear in 245$c, 260+$b, 508 and 511 (and if we get really brave 
maybe 505) and also connect them to the correct 1xx/7xx if present. 
The current program, which uses natural language processing (NLP) 
techniques, is reasonably successful with personal names and with 
roles given in English. We are working on building a multilingual 
vocabulary. It tends to choke on complicated statements that involve a 
lot of corporate bodies.
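
To make the idea of recognizing names and roles in a credits statement concrete, here is a minimal sketch. It is not the OLAC program, which is described above as using NLP techniques; this swaps in a plain regular-expression approach, and everything in it is an assumption made for illustration: the small English phrase table, the " ; " segment separator, the function names extract_credits and match_heading, and the sample strings.

```python
import re

# Hypothetical mapping from English credit phrases to relator terms.
# A tiny illustrative subset, not the vocabulary used by the OLAC project.
ROLE_PHRASES = {
    "directed by": "film director",
    "produced by": "film producer",
    "screenplay by": "screenwriter",
    "music by": "composer",
}

# Matches a segment that starts with one of the known phrases.
ROLE_PATTERN = re.compile(
    r"^(" + "|".join(re.escape(p) for p in ROLE_PHRASES) + r")\s+(.+)$",
    re.IGNORECASE,
)


def extract_credits(statement):
    """Return (relator term, name) pairs found in a credits statement.

    Assumes segments are separated by ";" and that several names in one
    segment are separated by commas or " and ".
    """
    credits = []
    for segment in statement.split(";"):
        match = ROLE_PATTERN.match(segment.strip().rstrip("."))
        if match is None:
            continue  # unrecognized phrasing is skipped, left for a human
        role = ROLE_PHRASES[match.group(1).lower()]
        for name in re.split(r",\s*|\s+and\s+", match.group(2)):
            if name:
                credits.append((role, name.strip()))
    return credits


def match_heading(name, headings):
    """Find an inverted heading that begins with the direct-order name.

    "John Huston" is turned into "huston, john" and compared against the
    start of each heading, so trailing dates do not block a match.
    """
    parts = name.split()
    if len(parts) < 2:
        return None
    inverted = f"{parts[-1]}, {' '.join(parts[:-1])}".lower()
    for heading in headings:
        if heading.lower().startswith(inverted):
            return heading
    return None
```

Called on a made-up statement such as "directed by John Huston ; screenplay by Jane Doe and Richard Roe.", extract_credits yields one pair per name, and match_heading can then try to tie "John Huston" to a heading like "Huston, John, 1906-1987". Anything the pattern does not recognize is silently dropped, which is exactly where roles in other languages and long corporate-body statements would need either a richer vocabulary or manual review.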



I hesitate to bring this up because most probably everybody already 
thinks of me as a purveyor of doom and gloom, but I still believe that 
we must consider these things in realistic terms. Although the attempt 
is laudable, I still say that we must first of all see through the eyes 
of the users who would be interested in this kind of information. For 
instance, if I were a regular user and wanted to know the movies 
directed by John Huston, what would be the first thing I would think of?


"Google it". I am sure almost everybody would. So I did a natural 
language search: "what movies did john huston direct" and what happens? 
https://www.google.it/search?q=what+movies+did+john+huston+direct (This 
is linked data in action!) We find that down below in the links area (at 
least in the results I get), #1 is a link to John Huston in Wikipedia, 
#2 goes to "Category:Films directed by John Huston" also in Wikipedia, 
and #3 goes into his page at the IMDB (which I personally prefer). All 
have lists of the movies he directed. This is incredibly easy to do and 
free to all.


Putting aside for the moment the linked data result, the 3 links perform 
exactly the same function as in the past when someone would ask a 
reference librarian, "I need a list of the movies John Huston directed" 
and the knowledgeable reference librarian would reply: "Here. You can 
find the list in this book." and would hand the user the latest issue of 
this title http://lccn.loc.gov/sn99044419 (or something similar) which 
was very possibly shelved in the reference collection for quick and easy 
access.


Therefore, just as the reference librarian would take the user's 
question and convert it into, "He needs to look in Film directors : a 
complete guide", today a reference librarian would do the same thing but 
answer/include, "He needs to look in the IMDB". Without any doubt, that 
is the ethical answer for such a question and will remain so for a long, 
long time in the future.


The huge difference is that today, people rarely consult reference 
librarians. The librarian would already know that if you want to find 
the films of specific directors, the library catalog is currently not 
the right place to look for this information and when viewed 
realistically, it never will be the right place. There is nothing at all 
wrong with that. Not every tool is good for every use, just as if you 
want the latest business news or to find out why your XML won't 
validate, the best place is not JSTOR, and it never will be. That 
doesn't mean JSTOR is no good--it just means that you have to look in 
other places for that kind of information. Today, the correct place to 
look for the films people have directed is the IMDB or perhaps a few 
other places on the web. We are *really lucky* that we have such options 
for free today. The reference librarians would be able to help the 
searcher in these directions *if* they were asked, but sadly, that is 
happening less and less.


So, adding the relator codes automatically will still demand manual 
cleanup, perhaps (probably) on a massive scale, if it is ever to become 
as good as IMDB is *right now*. I suggest that the correct method for a 
library catalog is to lead the person to the *right resource* that he or 
she wants and perhaps even do it *better* than Google. In this case of 
film directors, I find it very difficult even to imagine how we could do 
better than Google because the Google search works so incredibly well. 
Perhaps a film librarian could discover that the IMDB and Wikipedia are 
incorrect or incomplete. In that respect perhaps library efforts could 
be better focused on improving IMDB and Wikipedia than adding relator 
codes.


There is also the option that the library catalog could interact with 
the IMDB (and/or Wikipedia) using the APIs.
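
As a rough sketch of what such an interaction could look like on the Wikipedia side, the following builds a request for the members of "Category:Films directed by John Huston" through the MediaWiki Action API (action=query with list=categorymembers) and pulls the titles out of a decoded JSON response. The function names category_members_url and titles_from_response are made up for illustration, the code deliberately performs no network request, and nothing here is meant to describe an IMDB interface.

```python
from urllib.parse import urlencode

# Endpoint of the MediaWiki Action API on English Wikipedia.
WIKIPEDIA_API = "https://en.wikipedia.org/w/api.php"


def category_members_url(category, limit=50):
    """Build a URL listing the pages in a Wikipedia category.

    `category` is the category name without the "Category:" prefix.
    """
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmlimit": limit,
        "format": "json",
    }
    return f"{WIKIPEDIA_API}?{urlencode(params)}"


def titles_from_response(payload):
    """Extract page titles from a decoded categorymembers response."""
    members = payload.get("query", {}).get("categorymembers", [])
    return [member["title"] for member in members]
```

A catalog display could fetch category_members_url("Films directed by John Huston"), decode the JSON, and show titles_from_response(...) next to its own holdings. Long categories come back in batches, so a real integration would also have to follow the continuation values the API returns.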


This opens up a highly pertinent question for me: I don't even know what 
a library catalog is supposed to provide in today's semi-total 
information environment. This is a great example. We can't ignore these 
wonderful sites. What should the catalog do today?

--
James Weinheimer weinheimer.ji...@gmail.com
First Thus http://catalogingmatters.blogspot.com/
First Thus Facebook Page https://www.facebook.com/FirstThus
Cooperative Cataloging Rules 
http://sites.google.com/site/opencatalogingrules/
Cataloging Matters Podcasts 
http:/