[CODE4LIB] Test dataset for evaluation of automatic classification of research documents according to FAST and DDC

2012-06-29 Thread Arash.Joorabchi
Hi all, I am working on developing a software system designed to analyze the content of research documents (e.g., research papers and articles) archived in scientific repositories (e.g., http://citeseerx.ist.psu.edu/, http://arxiv.org/) and automatically

Re: [CODE4LIB] Test dataset for evaluation of automatic classification of research documents according to FAST and DDC

2012-06-29 Thread Rene Wiermer
You might want to check out the BASE system from Bielefeld, Germany (http://www.base-search.net), which has access to a lot of OA sources and has implemented its own classification system (metadata + full text) for DDC, based on a semi-automatically generated training corpus across all disciplines

Re: [CODE4LIB] Test dataset for evaluation of automatic classification of research documents according to FAST and DDC

2012-06-29 Thread Nathan Tallman
Not sure if this is what you want, but the FAST dataset is available online from OCLC: http://www.oclc.org/research/activities/fast/download.htm Nathan On Fri, Jun 29, 2012 at 9:29 AM, Arash.Joorabchi arash.joorab...@ul.ie wrote: Hi all, I am working on developing a software system