Dear all,
I am an R&D engineer at a French national institute (INRIA), working on the
Semanticpedia project ( http://www.semanticpedia.org/ ).
The idea of the project is to provide the French version of DBpedia and related
services (among them DBpedia Spotlight in French). For the moment I am focusing
on DBpedia Spotlight, trying to configure a French instance of it.
While exploring the process of building a Spotlight service with my own data, I
came across the following two issues, which I think are worth sharing with you:
1- The internationalization page
(https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Internationalization-(Lucene-backed-core))
gives an example configuration for Spanish and links to the
indexing.es.properties file (
https://dl.dropboxusercontent.com/u/99877231/dbpedia/conf/indexing.es.properties
) with the settings needed for this language. The data file names given there
point to the uncompressed files, for example:
[a] org.dbpedia.spotlight.data.wikipediaDump =
/usr/local/spotlight/dbpedia_data/es/eswiki-latest-pages-articles.xml
But when running the index.sh script provided with the service, there seems to
be a problem with this data format: the script expects the data to be in
compressed format (bz2).
My suggestion would be either to change the index script delivered with the
service so that it accepts both compressed and uncompressed files, or to update
the Spanish example (
https://dl.dropboxusercontent.com/u/99877231/dbpedia/conf/indexing.es.properties
) by using, for example, line [b] below instead of [a] (and likewise for all
the data files declared in the indexing file):
[b] org.dbpedia.spotlight.data.wikipediaDump =
/usr/local/spotlight/dbpedia_data/es/eswiki-latest-pages-articles.xml.bz2
I agree that this is not a major issue, but fixing it could save developers some
time.
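
To illustrate the second option, here is a minimal Python sketch of how the
paths in an indexing properties file could be switched to their .bz2
counterparts. It is only an illustration and not part of DBpedia Spotlight: the
function name prefer_compressed is hypothetical, and the key prefix
org.dbpedia.spotlight.data. is an assumption based on the single key quoted in
[a]. A path is rewritten only when the compressed file actually exists.

```python
import os


def prefer_compressed(lines, key_prefix="org.dbpedia.spotlight.data.",
                      exists=os.path.exists):
    """Return properties lines with '.bz2' appended to matching data paths.

    Hypothetical helper: `key_prefix` is an assumption taken from the one key
    shown in the example, not a documented Spotlight convention.
    """
    out = []
    for line in lines:
        stripped = line.strip()
        # Leave blank lines, comments, and lines without '=' untouched.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            out.append(line)
            continue
        key, value = (part.strip() for part in stripped.split("=", 1))
        # Rewrite only data paths that are not yet compressed and whose
        # compressed counterpart is actually present.
        if (key.startswith(key_prefix) and value
                and not value.endswith(".bz2") and exists(value + ".bz2")):
            out.append(f"{key} = {value}.bz2")
        else:
            out.append(line)
    return out
```

It could be applied to the lines of the properties file, for example those
obtained with open(path).read().splitlines(), and the result written back after
checking it by hand.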
2- I encountered the problem described in this old post:
https://github.com/dbpedia-spotlight/dbpedia-spotlight/issues/138 regarding the
infinite loop generated by ExtractCandidateMap.scala. I would appreciate it if
somebody could advise me on how to solve it, given that the solutions proposed
in that post are quite old and I am not sure they are still valid.
I hope this feedback is useful to you, and I look forward to an answer
regarding the second issue.
Best,
Pajolma
Pajolma RUPI
Research and Development Engineer
Service de l'e-Information Scientifique et Multimédia (SEISM)
Research Centre INRIA Grenoble - Rhône-Alpes
655 Avenue de l'Europe
38330 Montbonnot-Saint-Martin
France