Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)
12.12.2013 14:07, James Weinheimer: ... still remains unproven, or the question of how much implementing FRBR/RDA will ultimately cost. The costs are already beyond many libraries. Not just the cost of implementing but even the costs of just reading the rules. Therefore, we are heading into a two-class library environment. This will be further diversified when it comes to whether or not a library can or wants to or is allowed to buy a systems upgrade to support FRBR and/or BIBFRAME and/or to have their local data upgraded to take advantage of new options, and so on. Patrons should love it when they are no longer all of them so boringly alike ... B.Eversberg To unsubscribe from RDA-L send an e-mail to the following address from the address you are subscribed under to: lists...@listserv.lac-bac.gc.ca In the body of the message: SIGNOFF RDA-L
Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)
On 12/12/2013 12:12 AM, Kevin M Randall wrote: This statement proves the point that you do not understand what the FRBR tasks are. The FRBR tasks are not "methods". They are objectives. Relevance ranking and algorithmic connections are examples of methods which are used to accomplish the user tasks. The statement you made is as nonsensical as saying "People don't like to travel from Los Angeles to New York. They would rather take an airplane." I don't understand your reasoning here, but I guess it is yet another example of the feebleness of my intellect--which, as is well-known, is proven by my disagreeing with the FRBR/RDA library gods. By focusing on me, there is no need to address the problems of whether users actually want the user tasks (by the way, no matter how stupid I may be, whether people want to do those things so desperately still remains unproven and quite obviously, is a question that cannot be asked or you will be pilloried for it) or the question of how much implementing FRBR/RDA will ultimately cost. The costs are already beyond many libraries. But of course, all of those issues are so irrelevant they can just be ignored. I have grown weary of the personal attacks and so will not answer any of those for awhile. One point of substance however: The OLAC Movie & Video Credit Annotation Experiment which you decried is addressing exactly this point. As they say: "Eventually, we intend to automate most of this conversion. For now, we need help from human volunteers, who can train our software to recognize the many ways names and roles have been listed in library records for movies." I didn't "decry" the project at all. I merely asked the question that would pop into the head of any 5 or 6 year old child: "Why not Google it?" So I did and the results were very, very good--at least as good as any library catalog could ever hope to offer. So then I asked another natural question: what is the purpose of the project? 
I am not decrying anything but asking some very natural questions. I mentioned that if it is considered an experiment, everything is fine and perhaps we could learn something from it. To be more specific now: since the film roles already exist in Wikipedia and IMDB, they could serve as "control groups" to help estimate how accurate any automatic conversions would be, how much manual cleanup would be needed and how much everything would cost. The results may be applicable to other materials. In the specific case of films, for the final result however, it would seem much more efficient to use the information that already exists by implementing the APIs. Otherwise it would quite clearly be duplicating what already exists. -- James Weinheimer weinheimer.ji...@gmail.com First Thus http://catalogingmatters.blogspot.com/ First Thus Facebook Page https://www.facebook.com/FirstThus Cooperative Cataloging Rules http://sites.google.com/site/opencatalogingrules/ Cataloging Matters Podcasts http://blog.jweinheimer.net/p/cataloging-matters-podcasts.html
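The "control group" idea in the message above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the (name, role) pair format, and the sample data are all hypothetical, and nothing here reflects how OLAC, IMDB or Wikipedia actually store or expose credits. It simply shows how automatically extracted credits could be scored against a trusted reference list to estimate accuracy and the amount of manual cleanup.

```python
def score_against_reference(extracted, reference):
    """Compare automatically extracted (name, role) pairs with a reference set.

    Returns precision, recall, the extracted pairs that are not confirmed by
    the reference (candidates for manual review), and the reference pairs
    that the extraction missed.
    """
    extracted, reference = set(extracted), set(reference)
    matched = extracted & reference
    precision = len(matched) / len(extracted) if extracted else 0.0
    recall = len(matched) / len(reference) if reference else 0.0
    needs_review = sorted(extracted - reference)
    missing = sorted(reference - extracted)
    return precision, recall, needs_review, missing


# Hypothetical sample for one film: the "extracted" list stands in for the
# output of an automatic conversion and deliberately contains one wrong role.
extracted_sample = [
    ("John Huston", "director"),
    ("Humphrey Bogart", "actor"),
    ("Mary Astor", "director"),  # deliberate error, to be caught by the check
]
# The "reference" list stands in for credits taken from an outside source.
reference_sample = [
    ("John Huston", "director"),
    ("Humphrey Bogart", "actor"),
    ("Mary Astor", "actor"),
]

if __name__ == "__main__":
    print(score_against_reference(extracted_sample, reference_sample))
```

Run over a sample of records, the share of pairs landing in the review list would give a rough basis for estimating cleanup effort, which is the cost question raised above. Exact matching on name strings is a simplification; real data would need name normalization first.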
Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)
James Weinheimer wrote: > But people have been backing away from those > user tasks for awhile now as it becomes more and more obvious that > people prefer other methods such as relevance ranking and algorithmic > connections that work in completely different ways. This statement proves the point that you do not understand what the FRBR tasks are. The FRBR tasks are not "methods". They are objectives. Relevance ranking and algorithmic connections are examples of methods which are used to accomplish the user tasks. The statement you made is as nonsensical as saying "People don't like to travel from Los Angeles to New York. They would rather take an airplane." > And as for the relationships, there remains that uncomfortable fact that > *if* they are to be implemented, then quite literally millions of > records will have to be updated by cataloging staffs that are decreasing > in numbers; staffs who are already overworked and in many cases with > morale not doing all that well. The OLAC Movie & Video Credit Annotation Experiment which you decried is addressing exactly this point. As they say: "Eventually, we intend to automate most of this conversion. For now, we need help from human volunteers, who can train our software to recognize the many ways names and roles have been listed in library records for movies." > I just wish that we could declare victory for FRBR now because modern > computing has allowed people to do them right now, as I have shown > often > enough with using the facets in Worldcat. What you fail to realize is that the faceting will become better and more powerful as the data is refined. But if you're happy with how things are now, so be it. It would just be nice if we didn't hear constant complaints about us hoping for and working toward a better future. Kevin M. Randall Principal Serials Cataloger Northwestern University Library k...@northwestern.edu (847) 491-2939 Proudly wearing the sensible shoes since 1978! 
Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)
On 11/12/2013 15.44, Brenndorfer, Thomas wrote: Why are you repeating your egregious misinterpretation of FRBR? FRBR is not Find-Identify-Select-Obtain by author, title, subject headings. If I "select" a book because it won the Booker Prize then I am engaging in one of the FRBR user tasks. The user tasks apply to the entire spectrum of attributes and relationships-- not just the ones found and implemented in traditional catalogs through the heading structure. In addition, the entity-relationship model can be applied to many other data systems used in libraries. For example, the circulation module in my system is a feature-rich implementation of entity-relationship principles that if anything showcases how much the potential of the data locked in AACR2-MARC records lies untapped. ... It is so remarkable that these basic facts are glaringly absent in your posts, and yet not surprising since many of your cited sources are to your own blog posts and podcasts. For example, in emphasizing the power of free text searching in your blog post, why would not that same power be brought to bear on the topic at hand-- retrospective conversion? If the technology is so "incredible" then how is it that some problems are then so insurmountable that we should give up? From your blog post -- "it is the reliance on alphabetical order that has become obsolete in our new environment" indicates that your interpretation of FRBR is incorrect as FRBR and RDA are not just repeating the alphabetical structure of traditional catalogs. Notably in RDA, the instructions for authorized access points (the equivalent to headings) are relegated to second-class citizen status by being put at the back of chapters, after discrete data elements are covered (with the expectation that these discrete data elements, along with new forms of controlled identifiers which are always emphasized first in RDA, will become the primary operative pieces in catalogs in the future). 
I have been on ruder lists, but this is one of the rudest lists I have ever been on. Once again, there is a correlation with "understanding" and "agreement": if someone disagrees, there must be something wrong with *that person's* understanding. Because, if the person understands, he or she must see the light and be in agreement. Of course, that is a very modern interpretation. Or it could be argued that it is actually a very old attitude, harking back to the medieval church. Still, when questioning an *unproven system* among a group of true believers, I realize it can become quite difficult. Yes, I quote myself, but otherwise I would keep writing many of the same things. I *link* to what already exists. After all, that is the essence of what linked data is all about and that is supposed to be the salvation of us all. Still, it is true that I also quote lots of others. For instance, I quoted Amanda Cossham's excellent paper where she in turn quoted many others who all have major problems with FRBR and have offered dozens of alternative models. Therefore, it seems that I am far from alone in questioning the utility of FRBR. So now, it seems that FRBR has nothing to do with the user tasks? That's news. Makes it hard to justify those entities and attributes and relationships since that is the very first step in building an entity-relationship model. But people have been backing away from those user tasks for awhile now as it becomes more and more obvious that people prefer other methods such as relevance ranking and algorithmic connections that work in completely different ways. The public is moving on at a terrifying rate. And as for the relationships, there remains that uncomfortable fact that *if* they are to be implemented, then quite literally millions of records will have to be updated by cataloging staffs that are decreasing in numbers; staffs who are already overworked and in many cases with morale not doing all that well. 
Plus, money does not seem to be pouring in for any of this. And do it all without the *slightest evidence* that adding those relationships will bring anybody back to our catalogs. Why shouldn't people question? It is logical to assume that adding the relationships may have as much of an impact on the public as did those hundreds of thousands of updates to the authorized headings that occurred recently, where the cataloging abbreviations were spelled out. We know what kind of impact that had: that was a "shot *not* heard around the world". But I guess there is the assumption that, as you say, "... in emphasizing the power of free text searching in your blog post, why would not that same power be brought to bear on the topic at hand-- retrospective conversion?" Technology will come in as a deus ex machina, and save RDA and FRBR. And not in 15 or 20 years. True, it may, but it is just as probable that after all those relationships will exist and it will still not make a difference to t
Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)
Of course if one wants to answer the question "What movies did John Huston direct?" one goes to a reference source. If, however, one wants to answer the question "What movies does the library provide access to that John Huston directed?" then one would go to the library catalog, PROVIDED such information is present. I recall reports of some early video tape collections organized by director, then title, in an attempt to address this information need. Recording this information in the record provides a much more useful and adaptable framework than a mere physical arrangement. And while the specifics of a "post-MARC" world are still tenuous, the place of relational structures, such as those articulated in a linked data scenario, would appear to be promising. Improving our existing data, however crudely, is a key step to making this future transition. I do like the idea of leveraging existing data sources outside the catalog that already specify this information. How amenable the organizations providing such data are, and how amenable the data is to being leveraged, are important caveats to my enthusiasm. John Myers, Catalog Librarian Schaffer Library, Union College Schenectady NY 12308 518-388-6623 mye...@union.edu On Tue, Dec 10, 2013 at 7:34 AM, James Weinheimer < weinheimer.ji...@gmail.com> wrote: > [snip] > I hesitate to bring this up because most probably everybody already thinks > of me as a purveyor of doom and gloom, but I still believe that we must > consider these things in realistic terms. Although the attempt is laudable, > I still say that we must first of all see through the eyes of the users who > would be interested in this kind of information. For instance, if I am a > regular user and I wanted to know the movies directed by John Huston, what > would be the first thing I would think of? > > [snip] > There is also the option that the library catalog could interact with the > IMDB (and/or Wikipedia) using the APIs. 
> > This opens up a highly pertinent question for me: I don't even know what a > library catalog is supposed to provide in today's semi-total information > environment. This is a great example. We can't ignore these wonderful > sites. What should the catalog do today?
Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)
> From: Resource Description and Access / Resource Description and Access [mailto:RDA-L@LISTSERV.LAC-BAC.GC.CA] On Behalf Of James Weinheimer > Sent: December-11-13 6:22 AM > To: RDA-L@LISTSERV.LAC-BAC.GC.CA > Subject: Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...) > On 12/10/2013 8:52 PM, Kyrios, Alex (akyr...@uidaho.edu) wrote: > > James, would it be too cynical of me to summarize your position as "Our data isn't good enough, so why bother improving it?" Is it wrong to hope that a catalog can do more than help someone locate an item on a shelf? > ... > When you went to a library in the past where all the librarians had done their jobs in a professional manner (and had high morale!), the public saw what I described above, even though they were probably not aware of it. The information the public got wasn't necessarily always the "best" information or the "newest" or today's strange idea of the "most relevant"--but the library always offered something different. What people found in a library was also not the FRBR user tasks, which in the past was only one type of a *method*--and LOTS of people complained loudly about that method from the beginning and were happy when it bothered them no more. With earlier technology however, there was little room for flexibility and genuine cooperation. Today, there are *many* methods in addition to the FRBR user tasks that lead to the same goals I laid out above. Why are you repeating your egregious misinterpretation of FRBR? FRBR is not Find-Identify-Select-Obtain by author, title, subject headings. If I "select" a book because it won the Booker Prize then I am engaging in one of the FRBR user tasks. The user tasks apply to the entire spectrum of attributes and relationships-- not just the ones found and implemented in traditional catalogs through the heading structure. 
In addition, the entity-relationship model can be applied to many other data systems used in libraries. For example, the circulation module in my system is a feature-rich implementation of entity-relationship principles that if anything showcases how much the potential of the data locked in AACR2-MARC records lies untapped. In addition, the other user tasks in FRAD and FRSAD will likely be incorporated into a single Functional Requirements model, and these altogether will apply to bibliographic "data" of all kinds and not just "records." The last "R" of FRBR will be replaced by a "D" for data, as it already has in FRAD and FRSAD. The user tasks apply to every single entity, every single attribute or bit of data, and every single relationship (even those not implemented in traditional catalogs). It is so remarkable that these basic facts are glaringly absent in your posts, and yet not surprising since many of your cited sources are to your own blog posts and podcasts. For example, in emphasizing the power of free text searching in your blog post, why would not that same power be brought to bear on the topic at hand-- retrospective conversion? If the technology is so "incredible" then how is it that some problems are then so insurmountable that we should give up? From your blog post -- "it is the reliance on alphabetical order that has become obsolete in our new environment" indicates that your interpretation of FRBR is incorrect as FRBR and RDA are not just repeating the alphabetical structure of traditional catalogs. Notably in RDA, the instructions for authorized access points (the equivalent to headings) are relegated to second-class citizen status by being put at the back of chapters, after discrete data elements are covered (with the expectation that these discrete data elements, along with new forms of controlled identifiers which are always emphasized first in RDA, will become the primary operative pieces in catalogs in the future). 
Thomas Brenndorfer Guelph Public Library
Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)
On 12/10/2013 8:52 PM, Kyrios, Alex (akyr...@uidaho.edu) wrote: James, would it be too cynical of me to summarize your position as "Our data isn't good enough, so why bother improving it?" Is it wrong to hope that a catalog can do more than help someone locate an item on a shelf? No, that is not at all my position and I have always been concerned that this is how it may seem when I criticize something. I blame myself. That is always the danger of criticism. I believe that our catalogs--*if they worked correctly*--and they haven't for many, many years now--would give the public something that is unique today and something that the Googles and Yahoos and Bings will not and cannot provide. And that is: reliable, consistent, and confidential access to materials that have been specially selected and organized by experts, all who work without any goals of personal monetary gain, and who do not advance personal or organizational ideals that are political or moral or religious. All methods of selection and arrangement we use are open to anyone who is interested in examining them. It seems to me that this represents the antithesis of what people experience with the web today and also the antithesis of the future direction the web promises to take. I think people are concerned about that today. Perhaps 5 or 10 years ago, a person would have read something like what I just wrote and would roll their eyes because they would have thought it was too backward or just plain silly, but now I believe that some may be beginning to think, "If only I could have something like that!" When you went to a library in the past where all the librarians had done their jobs in a professional manner (and had high morale!), the public saw what I described above, even though they were probably not aware of it. 
The information the public got wasn't necessarily always the "best" information or the "newest" or today's strange idea of the "most relevant"--but the library always offered something different. What people found in a library was also not the FRBR user tasks, which in the past was only one type of a *method*--and LOTS of people complained loudly about that method from the beginning and were happy when it bothered them no more. With earlier technology however, there was little room for flexibility and genuine cooperation. Today, there are *many* methods in addition to the FRBR user tasks that lead to the same goals I laid out above. With the power and flexibility of today's tools, and the fact that cooperation among people of all levels is much simpler than ever before, the possibility of employing multiple methods of access can be considered very seriously. This was never possible before. The public has actually been crying for a tool that does exactly what I described. I say: let's give it to them! But in a form that means something to the society of today. Achieving this would take work and genuine cooperation, plus a sense of humility. I have tried my best to illustrate the "unique" kind of access made possible by catalogs along with the problems achieving that kind of access today in my podcast "Cataloging Matters Podcast no. 18: Problems with Library Catalogs" http://blog.jweinheimer.net/2013/02/catalog-matters-podcast-no-18-problems.html So yes, I think there is a lot that the catalog can offer. As I keep pointing out: the problem is the *catalog* and not the *catalog records*. 
-- James Weinheimer weinheimer.ji...@gmail.com First Thus http://catalogingmatters.blogspot.com/ First Thus Facebook Page https://www.facebook.com/FirstThus Cooperative Cataloging Rules http://sites.google.com/site/opencatalogingrules/ Cataloging Matters Podcasts http://blog.jweinheimer.net/p/cataloging-matters-podcasts.html
Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)
I don't think of you as a doom and gloom person James. I understand your concern. I also think that the library catalog and the people who catalog in them play a part as defenders of truth and accuracy. A few years ago, an actor sued IMDb because they printed her real date of birth as opposed to the "younger" one she distributes on resumes to casting directors. She felt when her real age was revealed, her opportunities dropped. The suit was dismissed, but IMDb had to spend the time and money to defend itself. What if IMDb just caved? This probably won't be the last time IMDb gets sued either. I think the important thing about the catalog (as it is used) is that it isn't a commercial enterprise. I also think there is a lot to be said for subject specialists who may debunk popular assumptions about information regarding popular media such as films. If something is repeated often enough, it will be regarded as truth, which will have been overwritten by popularity as opposed to accuracy. People don't seem to consult librarians anymore because others have convinced them librarians are not needed. I think it is our job as catalogers to take that argument back and defend the academic and educational value of the catalog. So, what should the catalog do today? Keep the bullshitters honest. Cindy Wolff > I hesitate to bring this up because most probably everybody already > thinks of me as a purveyor of doom and gloom, but I still believe that > we must consider these things in realistic terms. Although the attempt > is laudable, I still say that we must first of all see through the eyes > of the users who would be interested in this kind of information. For > instance, if I am a regular user and I wanted to know the movies > directed by John Huston, what would be the first thing I would think of? > > "Google it". I am sure almost everybody would. So I did a natural > language search: "what movies did john huston direct" and what happens? 
> https://www.google.it/search?q=what+movies+did+john+huston+direct (This > is linked data in action!) We find that down below in the links area (at > least in the results I get), #1 is a link to John Huston in Wikipedia, > #2 goes to "Category:Films directed by John Huston" also in Wikipedia, > and #3 goes into his page at the IMDB (which I personally prefer). All > have lists of the movies he directed. This is incredibly easy to do and > free to all. > > Putting aside for the moment the linked data result, the 3 links perform > exactly the same function as in the past when someone would ask a > reference librarian, "I need a list of the movies John Huston directed" > and the knowledgeable reference librarian would reply: "Here. You can > find the list in this book." and would hand the user the latest issue of > this title http://lccn.loc.gov/sn99044419 (or something similar) which > was very possibly shelved in the reference collection for quick and easy > access. > > Therefore, just as the reference librarian would take the user's > question and convert it into, "He needs to look in Film directors : a > complete guide", today a reference librarian would do the same thing but > answer/include, "He needs to look in the IMDB". Without any doubt, that > is the ethical answer for such a question and will remain so for a long, > long time in the future. > > The huge difference is that today, people rarely consult reference > librarians. The librarian would already know that if you want to find > the films of specific directors, the library catalog is currently not > the right place to look for this information and when viewed > realistically, it never will be the right place. There is nothing at all > wrong with that. Not every tool is good for every use, just as if you > want the latest business news or to find out why your XML won't > validate, the best place is not JSTOR, and it never will be. 
That > doesn't mean JSTOR is no good--it just means that you have to look in > other places for that kind of information. Today, the correct place to > look for the films people have directed is the IMDB or perhaps a few > other places on the web. We are *really lucky* that we have such options > for free today. The reference librarians would be able to help the > searcher in these directions *if* they were asked, but sadly, that is > happening less and less. > > So, adding the relator codes automatically will still demand manual > cleanup, perhaps (probably) on a massive scale, if it is ever to become > as good as IMDB is *right now*. I suggest that the correct method for a > library catalog is to lead the person to the *right resource* that he or > she wants and perhaps even do it *better* than Google. In this case of > film directors, I find it very difficult even to imagine how we could do > better than Google because the Google search works so incredibly well. > Perhaps a film librarian could discover that the IMDB and Wikipedia are > incorrect or incomplete. In that respec
Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)
James, would it be too cynical of me to summarize your position as "Our data isn't good enough, so why bother improving it?" Is it wrong to hope that a catalog can do more than help someone locate an item on a shelf? Alex Kyrios Metadata and Catalog Librarian University of Idaho 208-885-2513 akyr...@uidaho.edu From: Resource Description and Access / Resource Description and Access [mailto:RDA-L@LISTSERV.LAC-BAC.GC.CA] On Behalf Of James Weinheimer Sent: Tuesday, December 10, 2013 4:35 AM To: RDA-L@LISTSERV.LAC-BAC.GC.CA Subject: Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...) On 09/12/2013 0.04, Kelley McGrath wrote: OLAC is attempting a project of this sort for film and video credits. We are trying to teach a computer to recognize the names and roles that appear in 245$c, 260+$b, 508 and 511 (and if we get really brave maybe 505) and also connect them to the correct 1xx/7xx if present. The current program, which uses natural language processing (NLP) techniques, is reasonably successful with personal names and with roles given in English. We are working on building a multilingual vocabulary. It tends to choke on complicated statements that involve a lot of corporate bodies. I hesitate to bring this up because most probably everybody already thinks of me as a purveyor of doom and gloom, but I still believe that we must consider these things in realistic terms. Although the attempt is laudable, I still say that we must first of all see through the eyes of the users who would be interested in this kind of information. For instance, if I am a regular user and I wanted to know the movies directed by John Huston, what would be the first thing I would think of? "Google it". I am sure almost everybody would. So I did a natural language search: "what movies did john huston direct" and what happens? https://www.google.it/search?q=what+movies+did+john+huston+direct (This is linked data in action!) 
We find that down below in the links area (at least in the results I get), #1 is a link to John Huston in Wikipedia, #2 goes to "Category:Films directed by John Huston" also in Wikipedia, and #3 goes into his page at the IMDB (which I personally prefer). All have lists of the movies he directed. This is incredibly easy to do and free to all. Putting aside for the moment the linked data result, the 3 links perform exactly the same function as in the past when someone would ask a reference librarian, "I need a list of the movies John Huston directed" and the knowledgeable reference librarian would reply: "Here. You can find the list in this book." and would hand the user the latest issue of this title http://lccn.loc.gov/sn99044419 (or something similar) which was very possibly shelved in the reference collection for quick and easy access. Therefore, just as the reference librarian would take the user's question and convert it into, "He needs to look in Film directors : a complete guide", today a reference librarian would do the same thing but answer/include, "He needs to look in the IMDB". Without any doubt, that is the ethical answer for such a question and will remain so for a long, long time in the future. The huge difference is that today, people rarely consult reference librarians. The librarian would already know that if you want to find the films of specific directors, the library catalog is currently not the right place to look for this information and when viewed realistically, it never will be the right place. There is nothing at all wrong with that. Not every tool is good for every use, just as if you want the latest business news or to find out why your XML won't validate, the best place is not JSTOR, and it never will be. That doesn't mean JSTOR is no good--it just means that you have to look in other places for that kind of information. 
Today, the correct place to look for the films people have directed is the IMDB or perhaps a few other places on the web. We are *really lucky* that we have such options for free today. The reference librarians would be able to help the searcher in these directions *if* they were asked, but sadly, that is happening less and less. So, adding the relator codes automatically will still demand manual cleanup, perhaps (probably) on a massive scale, if it is ever to become as good as IMDB is *right now*. I suggest that the correct method for a library catalog is to lead the person to the *right resource* that he or she wants and perhaps even do it *better* than Google. In this case of film directors, I find it very difficult even to imagine how we could do better than Google because the Google search works so incredibly well. Perhaps a film librarian could discover that the IMDB and Wikipedia are incorrect or incomplete. In that respect perhaps library e
Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)
The OLAC project is a wonderful example of how people are finding imaginative ways to automate the creation of metadata. Yes, there is much data in places like IMDb and Wikipedia that duplicates data in library catalogs; and there is also much data in those resources that is unique to those resources. By the same token, there is also data in library catalogs that cannot be found in IMDb or Wikipedia. The OLAC project will help greatly toward perfecting the semantic links between library metadata and data elsewhere. It's visionary stuff like this that helps to further the advancement of information science and technology.

Kevin M. Randall
Principal Serials Cataloger
Northwestern University Library
k...@northwestern.edu
(847) 491-2939
Proudly wearing the sensible shoes since 1978!

From: Resource Description and Access / Resource Description and Access [mailto:RDA-L@LISTSERV.LAC-BAC.GC.CA] On Behalf Of James Weinheimer
Sent: Tuesday, December 10, 2013 6:35 AM
To: RDA-L@LISTSERV.LAC-BAC.GC.CA
Subject: Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)

On 09/12/2013 0.04, Kelley McGrath wrote:

OLAC is attempting a project of this sort for film and video credits. We are trying to teach a computer to recognize the names and roles that appear in 245$c, 260+$b, 508 and 511 (and if we get really brave maybe 505) and also connect them to the correct 1xx/7xx if present. The current program, which uses natural language processing (NLP) techniques, is reasonably successful with personal names and with roles given in English. We are working on building a multilingual vocabulary. It tends to choke on complicated statements that involve a lot of corporate bodies.

I hesitate to bring this up because most probably everybody already thinks of me as a purveyor of doom and gloom, but I still believe that we must consider these things in realistic terms.
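To make the task Kelley McGrath describes a little more concrete, here is a minimal sketch of pulling role/name pairs out of a 508-style credits note and linking each name to a 7xx-style heading. It is not the OLAC program, which is described as using NLP techniques; this is a much simpler rule-based stand-in. The sample note, the sample headings, and the function names `parse_credits` and `match_heading` are all made up for illustration, and the code assumes the note follows a tidy "role, name ; role, name" pattern.

```python
# Hand-made sample data for illustration only; not copied from a real record.
CREDITS_508 = "Producer, Sam Spiegel ; director, John Huston ; music, Allan Gray."
HEADINGS_7XX = ["Huston, John, 1906-1987", "Spiegel, Sam, 1901-1985"]


def parse_credits(note):
    """Split a 'role, name ; role, name' note into (role, name) pairs."""
    pairs = []
    for chunk in note.rstrip(". ").split(";"):
        role, sep, name = chunk.strip().partition(",")
        if sep:  # ignore chunks that do not follow the 'role, name' pattern
            pairs.append((role.strip().lower(), name.strip()))
    return pairs


def match_heading(name, headings):
    """Return the first heading that starts with the inverted form of a direct-order name."""
    parts = name.split()
    if len(parts) < 2:
        return None
    # Naive inversion: treat the last word as the surname ("John Huston" -> "huston, john").
    inverted = f"{parts[-1]}, {' '.join(parts[:-1])}".lower()
    for heading in headings:
        if heading.lower().startswith(inverted):
            return heading
    return None


for role, name in parse_credits(CREDITS_508):
    print(role, "|", name, "|", match_heading(name, HEADINGS_7XX))
```

A pattern this simple would break on exactly the cases mentioned above: roles in other languages, free-text statements, and notes that list several corporate bodies, which is presumably why the real project needs human volunteers to train it.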
Re: [RDA-L] Automatically adding relationship designators (was Cost of Retrospective Conversion for Legacy Data...)
On 09/12/2013 0.04, Kelley McGrath wrote:

OLAC is attempting a project of this sort for film and video credits. We are trying to teach a computer to recognize the names and roles that appear in 245$c, 260+$b, 508 and 511 (and if we get really brave maybe 505) and also connect them to the correct 1xx/7xx if present. The current program, which uses natural language processing (NLP) techniques, is reasonably successful with personal names and with roles given in English. We are working on building a multilingual vocabulary. It tends to choke on complicated statements that involve a lot of corporate bodies.

I hesitate to bring this up because most probably everybody already thinks of me as a purveyor of doom and gloom, but I still believe that we must consider these things in realistic terms. Although the attempt is laudable, I still say that we must first of all see through the eyes of the users who would be interested in this kind of information.

For instance, if I am a regular user and I wanted to know the movies directed by John Huston, what would be the first thing I would think of? "Google it". I am sure almost everybody would. So I did a natural language search: "what movies did john huston direct" and what happens? https://www.google.it/search?q=what+movies+did+john+huston+direct (This is linked data in action!)

We find that down below in the links area (at least in the results I get), #1 is a link to John Huston in Wikipedia, #2 goes to "Category:Films directed by John Huston" also in Wikipedia, and #3 goes into his page at the IMDB (which I personally prefer). All have lists of the movies he directed. This is incredibly easy to do and free to all.

Putting aside for the moment the linked data result, the 3 links perform exactly the same function as in the past when someone would ask a reference librarian, "I need a list of the movies John Huston directed" and the knowledgeable reference librarian would reply: "Here. You can find the list in this book." and would hand the user the latest issue of this title http://lccn.loc.gov/sn99044419 (or something similar) which was very possibly shelved in the reference collection for quick and easy access. Therefore, just as the reference librarian would take the user's question and convert it into, "He needs to look in Film directors : a complete guide", today a reference librarian would do the same thing but answer/include, "He needs to look in the IMDB". Without any doubt, that is the ethical answer for such a question and will remain so for a long, long time in the future. The huge difference is that today, people rarely consult reference librarians.

The librarian would already know that if you want to find the films of specific directors, the library catalog is currently not the right place to look for this information and when viewed realistically, it never will be the right place. There is nothing at all wrong with that. Not every tool is good for every use, just as if you want the latest business news or to find out why your XML won't validate, the best place is not JSTOR, and it never will be. That doesn't mean JSTOR is no good--it just means that you have to look in other places for that kind of information.

Today, the correct place to look for the films people have directed is the IMDB or perhaps a few other places on the web. We are *really lucky* that we have such options for free today. The reference librarians would be able to help the searcher in these directions *if* they were asked, but sadly, that is happening less and less.

So, adding the relator codes automatically will still demand manual cleanup, perhaps (probably) on a massive scale, if it is ever to become as good as IMDB is *right now*. I suggest that the correct method for a library catalog is to lead the person to the *right resource* that he or she wants and perhaps even do it *better* than Google.

In this case of film directors, I find it very difficult even to imagine how we could do better than Google because the Google search works so incredibly well. Perhaps a film librarian could discover that the IMDB and Wikipedia are incorrect or incomplete. In that respect perhaps library efforts could be better focused on improving IMDB and Wikipedia than adding relator codes. There is also the option that the library catalog could interact with the IMDB (and/or Wikipedia) using the APIs.

This opens up a highly pertinent question for me: I don't even know what a library catalog is supposed to provide in today's semi-total information environment. This is a great example. We can't ignore these wonderful sites. What should the catalog do today?

--
James Weinheimer weinheimer.ji...@gmail.com
First Thus http://catalogingmatters.blogspot.com/
First Thus Facebook Page https://www.facebook.com/FirstThus
Cooperative Cataloging Rules http://sites.google.com/site/opencatalogingrules/
Cataloging Matters Podcasts http:/
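
As a rough illustration of the suggestion that a catalog could interact with Wikipedia through its APIs, the sketch below builds a MediaWiki Action API request for the members of "Category:Films directed by John Huston" and reads titles out of a response of the documented shape. The helper names `category_members_url` and `titles_from_response` are hypothetical, and the sample response is a hand-written stub rather than real API output. No network call is made here; how a catalog would actually fetch, cache, and display such data is left open.

```python
from urllib.parse import urlencode

# Public endpoint of the English Wikipedia's MediaWiki Action API.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def category_members_url(category, limit=50):
    """Build a 'list=categorymembers' query URL for one Wikipedia category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmlimit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


def titles_from_response(payload):
    """Extract page titles from a decoded categorymembers JSON response."""
    members = payload.get("query", {}).get("categorymembers", [])
    return [member["title"] for member in members]


# Hand-written stub in the shape the API returns; page ids are placeholders.
SAMPLE_RESPONSE = {
    "query": {
        "categorymembers": [
            {"pageid": 0, "ns": 0, "title": "The African Queen (film)"},
            {"pageid": 0, "ns": 0, "title": "The Maltese Falcon (1941 film)"},
        ]
    }
}

print(category_members_url("Films directed by John Huston"))
print(titles_from_response(SAMPLE_RESPONSE))
```

In a real catalog the URL would be fetched and the JSON decoded before calling `titles_from_response`; the titles could then be shown next to the library's own holdings for that director.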