Dear all,

At Ghent University, the Department of Telecommunications and Information Processing is brainstorming a project on citation linking. They have considerable expertise in flexible querying and information retrieval, and would like to try out their algorithms on public training sets of references and bibliographic data. The task is to train the algorithms to find all matches between citations and a corpus of publications. The challenge (as we all know from related projects and products) is to match the 'bad' citation data with the 'good' publication data.

Are there public datasets available that have been checked by humans, so that one can obtain reliable precision/recall numbers for the proposed algorithms? In particular, are there datasets that are, or could be, used in current or future shootouts between citation matching algorithms?

Thanks,
Patrick

Skype: patrick.hochstenbach
Patrick Hochstenbach
Digital Architect
University Library
+32(0)92647980
Ghent University * Rozier 9 * 9000 * Gent