Hi,

I am starting to play a bit with the EuroVoc 

<http://eurovoc.europa.eu/>

ontology in order to integrate it into the OpenAIRE Orphan Record
Repository, for automatic keyword extraction from EU documents.

This ontology is *big*, and multilingual! I can't even load it with
rdflib on my laptop (4GB of RAM).

I am currently trying to open it on a 24GB machine: it has already
filled up 8GB and is still loading!

I was wondering if it makes sense at all to try to store a huge RDF/SKOS
ontology in a database table (see:

<http://code.google.com/p/rdflib/wiki/SQL_Backend>

) to improve performance. Would this be useless given the cache that
BibClassify already builds? Maybe it would help before the cache has
been created?
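
To make the question concrete, here is a minimal sketch of what I mean by
"a database table". It is not the rdflib SQL backend linked above: it only
uses the standard sqlite3 module, and the table layout, the helper names
(open_triple_store, add_triples, objects) and the urn:example:... concepts
are all made up for illustration. Language tags are left out, so a real
multilingual EuroVoc import would need at least one more column.

```python
import sqlite3

# SKOS namespace, used here only to build predicate URIs.
SKOS = "http://www.w3.org/2004/02/skos/core#"


def open_triple_store(path=":memory:"):
    """Open (or create) a hypothetical one-table triple store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS triples ("
        "subject TEXT NOT NULL, predicate TEXT NOT NULL, object TEXT NOT NULL)"
    )
    # Index the lookup pattern used below, so queries do not scan the table.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_subject_predicate "
        "ON triples (subject, predicate)"
    )
    return conn


def add_triples(conn, triples):
    """Insert an iterable of (subject, predicate, object) tuples in one transaction."""
    with conn:
        conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)


def objects(conn, subject, predicate):
    """Return the sorted objects stored for a given subject and predicate."""
    rows = conn.execute(
        "SELECT object FROM triples WHERE subject = ? AND predicate = ? "
        "ORDER BY object",
        (subject, predicate),
    )
    return [row[0] for row in rows]


conn = open_triple_store()
add_triples(conn, [
    ("urn:example:concept1", SKOS + "prefLabel", "agriculture"),
    ("urn:example:concept1", SKOS + "altLabel", "farming"),
    ("urn:example:concept2", SKOS + "prefLabel", "fisheries"),
])
labels = objects(conn, "urn:example:concept1", SKOS + "prefLabel")
print(labels)
```

The point is only that lookups go through an index on disk instead of
requiring the whole graph in memory; whether that beats the BibClassify
cache is exactly what I am asking.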

Cheers,
Sam

P.S. I noticed in the BibClassify code that the cache is created using
cPickle with protocol version 1. Just out of curiosity, why wasn't
protocol version 2 (or -1) used? Have you experienced some degradation
in performance with higher protocol versions?
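
For what it's worth, a quick way to compare protocols would be something
like the sketch below. It uses the pickle module (in Python 2, cPickle
exposes the same dumps/loads calls), and sample_cache is a made-up
stand-in, not the real BibClassify cache structure, so the sizes it
prints say nothing definitive about the actual cache.

```python
import pickle

# Hypothetical stand-in for the cached keyword data; the real
# BibClassify cache structure is not reproduced here.
sample_cache = {
    "label-%d" % i: {
        "id": i,
        "alt_labels": ["alt-%d-%d" % (i, j) for j in range(3)],
    }
    for i in range(1000)
}


def pickled_sizes(obj, protocols=(1, 2, pickle.HIGHEST_PROTOCOL)):
    """Return {protocol: size in bytes} of obj serialized with each protocol."""
    return {p: len(pickle.dumps(obj, protocol=p)) for p in protocols}


sizes = pickled_sizes(sample_cache)
for protocol, size in sorted(sizes.items()):
    print("protocol %d: %d bytes" % (protocol, size))

# protocol=-1 selects the highest protocol available; loads() detects
# the protocol from the data itself, so readers need no extra argument.
restored = pickle.loads(pickle.dumps(sample_cache, protocol=-1))
```

Timing dumps() and loads() on the real cache would be the actual test;
this only shows how the protocol is selected.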
-- 
Samuele Kaplun
Invenio Developer ** <http://invenio-software.org/>
