Author, title, and publication year... that won't get you many false positives, but might get you lots of false negatives.

It's certainly true that there is no good "naive" approach to matching without identifiers that strikes a good balance between false positives and false negatives. There are trickier ways to approach it that I haven't really tried yet, but you can sometimes get closer to "good enough" than you think with just author/title or author/title/year.
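
To make the author/title/year idea concrete, here is a rough sketch of a normalized match key. The function names (normalize, match_key) and the normalization rules are my own assumptions for illustration, not anything from an existing tool, and a real matcher would need more care than this.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_accents.lower())
    return " ".join(cleaned.split())


def match_key(author, title, year=None):
    """Build a hypothetical author/title/year key for exact comparison.

    Author tokens are sorted so "Melville, Herman" and "Herman Melville"
    produce the same key. This is an illustrative assumption, not a standard.
    """
    author_tokens = sorted(normalize(author).split())
    return (" ".join(author_tokens), normalize(title), str(year) if year else "")


a = match_key("Melville, Herman", "Moby-Dick; or, The Whale", 1851)
b = match_key("Herman Melville", "Moby Dick, or the whale", "1851")
c = match_key("Melville, H.", "Moby-Dick; or, The Whale", 1851)

print(a == b)  # same work, different punctuation and name order
print(a == c)  # abbreviated forename: a false negative with exact keys
```

Exact comparison on a key like this rarely matches two different works, but anything the normalization doesn't anticipate (an initial instead of a forename, a subtitle dropped, a reprint year) shows up as a false negative.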

Depends on the source of your data too. If you have an AACR2/NAF controlled heading for an author, instead of just a free-text author entry field, that certainly makes it easier.

Jonathan

Kyle Banerjee wrote:
So, the purpose of this would be to discover where a given item represented
by the OpenURL was held. A secondary purpose would be as a source of
bibliographic citation information. This could be quite a useful discovery
tool, especially for materials that are not widely held.


Still trying to wrap my mind around your use case. First of all, are you
thinking about journals as well as other materials? Or do you just want to
find matches based on title, author, and other OpenURL elements?

Be aware that for any search that doesn't involve a known identifier, you're
going to run into major issues with false matches and duplicate entries.
Also, quality/completeness of data in records for obscure items is often
poor.

kyle
