Hi all,
I am working on developing a software system designed to analyze the
content of research documents (e.g., research papers, articles, etc.)
archived in scientific repositories (e.g., http://citeseerx.ist.psu.edu/ ,
http://arxiv.org/ ) and automatically
You might want to check out the BASE system from Bielefeld, Germany
(http://www.base-search.net), which has access to a lot of OA sources. They
implemented their own classification system (metadata + full text) for DDC,
trained on a semi-automatically generated corpus across all disciplines.
Not sure if this is what you want, but the FAST dataset is available online
from OCLC: http://www.oclc.org/research/activities/fast/download.htm
Nathan
On Fri, Jun 29, 2012 at 9:29 AM, Arash.Joorabchi <arash.joorab...@ul.ie> wrote:
Hi all,
I am working on developing a software system