OLAC is attempting a project of this sort for film and video credits. We are 
trying to teach a computer to recognize the names and roles that appear in 
245$c, 260+$b, 508 and 511 (and, if we get really brave, maybe 505) and also 
connect them to the correct 1xx/7xx if present. The current program, which uses 
natural language processing (NLP) techniques, is reasonably successful with 
personal names and with roles given in English. We are working on building a 
multilingual vocabulary. The program tends to choke on complicated statements 
that involve a lot of corporate bodies.
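
To give a rough idea of the task, here is a toy sketch in Python. It is not 
our program: it uses plain pattern matching rather than NLP, and the role 
vocabulary, function names and sample data are all made up for illustration. 
It splits a statement of responsibility on ISBD semicolons, looks for a known 
role phrase at the start of each piece, and then tries to match each name 
against the 1xx/7xx headings in inverted form.

```python
import re

# Hypothetical role vocabulary: phrase found in the credit -> relator term.
ROLE_PHRASES = {
    "directed by": "director",
    "produced by": "producer",
    "written by": "screenwriter",
    "music by": "composer",
}


def invert(name):
    """Turn 'Jane Doe' into 'Doe, Jane' so it can be compared to a heading."""
    parts = name.split()
    if len(parts) < 2:
        return name
    return f"{parts[-1]}, {' '.join(parts[:-1])}"


def find_heading(name, headings):
    """Return the first heading equal to the inverted name, or the inverted
    name followed by a comma (dates etc.); None if nothing matches."""
    inverted = invert(name)
    for heading in headings:
        if heading == inverted or heading.startswith(inverted + ","):
            return heading
    return None


def parse_credits(statement, headings):
    """Return a list of (name, role, matched heading or None) tuples."""
    results = []
    # ISBD punctuation separates individual statements with ' ; '.
    for segment in re.split(r"\s*;\s*", statement.strip().rstrip(".")):
        lowered = segment.lower()
        for phrase, role in ROLE_PHRASES.items():
            if lowered.startswith(phrase):
                names_part = segment[len(phrase):].strip()
                # Names sharing one role are joined by commas or 'and'.
                for name in re.split(r"\s*,\s*|\s+and\s+", names_part):
                    if name:
                        results.append((name, role, find_heading(name, headings)))
                break
    return results


credits = parse_credits(
    "directed by John Smith ; produced by Jane Doe and Mary Roe.",
    ["Smith, John, 1950-", "Doe, Jane"],
)
for name, role, heading in credits:
    print(name, "|", role, "|", heading)
```

A name that comes back with no matching heading (Mary Roe in the sample) is 
the kind of case that would be passed to a person. A sketch like this breaks 
down quickly on corporate bodies, names with particles, and roles that follow 
the name, which is part of why we are using NLP instead.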

There are a lot of enhancements that can be made to this process, but it will 
never be 100% accurate. I do think it will be good enough to be useful and 
hopefully also effective at identifying statements of responsibility that need 
human intervention.

As part of this process, we are hand-annotating a large pool of credits with 
the correct answers. This will enable us to assess the effectiveness of 
different strategies and may be useful for machine learning. It would be 
wonderful if you could help us out, especially if you are able to translate 
credits from other languages into English. I am challenging people to annotate 
ten credits per week for six weeks. This should take less than ten minutes per 
week. Go to http://olac-annotator.org to get started. (If you want to do 
non-English credits, please email me off-list as it is probably more effective 
for me to send you a list each week. The credits are separated by language of 
the film, but all the files have lots of English language credits from notes. 
We have lots of languages from Arabic to Urdu.)

I have also started a list to discuss problems with interpreting credits, which 
you're welcome to join. It's at 
https://lists.uoregon.edu/mailman/listinfo/olac-credits.

Please feel free to share this information with anyone else who might be 
interested in contributing.

Kelley


On Mon, Nov 25, 2013 at 1:54 PM, Benjamin A Abrahamse <babra...@mit.edu> wrote:

I would be very curious to know if anyone with a systems background has thought 
about ways to batch-apply relators to existing records. Perhaps by making use 
of existing statements of responsibility? It seems to me, given the number of 
pre-RDA records out there, that no one will ever have the time or money to 
update them manually.
