Re: [Dbp-spotlight-users] Issues while generating the data for Spotlight

Pajolma Rupi Wed, 13 May 2015 07:51:33 -0700

Hi David, 

Yes, I tried the 'Lucene' version. I was not aware that it is no longer
maintained. Thiago (in cc) kindly informed me about it in his reply to this
email.
I am going to try the statistical version.
In the meantime, it might be good to note on the internationalization page that
the 'Lucene' version is not currently supported, if you see fit.


Thank you for your answer; I will keep you posted.

Best, 
Pajolma 

Pajolma RUPI 
Research and Development Engineer 
Service de l'e-Information Scientifique et Multimédia (SEISM) 
Research Centre INRIA Grenoble - Rhône-Alpes 
655 Avenue de l'Europe 
38330 Montbonnot-Saint-Martin 
France 

----- Original Message -----

> From: "David Przybilla" <[email protected]>
> To: "Pajolma Rupi" <[email protected]>
> Cc: [email protected]
> Sent: Wednesday, May 13, 2015 3:57:37 PM
> Subject: Re: [Dbp-spotlight-users] Issues while generating the data for
> Spotlight

> Hi Pajolma,

> Are you required to build a Lucene model? It is kind of outdated.
> Otherwise, building a statistical model (the most recent version
> and the one you have probably played with on the demo endpoint) should be
> the way to go. It is less of a hassle, since much of the process is
> semi-automatic (e.g. downloading the dumps, redirects, etc.).

> If you are willing to try the statistical model, take a look at:
> https://github.com/dbpedia-spotlight/model-quickstarter

> Keep us posted if you get stuck.

> On Wed, May 13, 2015 at 1:48 PM, Pajolma Rupi < [email protected] >
> wrote:

> > Dear all,
> 

> > I am an R&D engineer at a French national institute (INRIA), working on the
> > Semanticpedia project ( http://www.semanticpedia.org/ ).
> 
> > The idea of the project is to provide the French version of DBpedia and
> > related services (among them DBpedia Spotlight in French). For the moment I
> > am focusing on DBpedia Spotlight, trying to configure a French instance of
> > it.
> 

> > While exploring the process of building a Spotlight service with my own
> > data, I came across the following two issues, which I think might be useful
> > to share with you:
> 

> > 1- The internationalization page (
> > https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Internationalization-(Lucene-backed-core)
> > ) gives an example configuration for Spanish and a link to the
> > index.properties file (
> > https://dl.dropboxusercontent.com/u/99877231/dbpedia/conf/indexing.es.properties
> > ) with the settings needed for this language. The data file names in it
> > point to the uncompressed files, for example:
> 

> > [a] org.dbpedia.spotlight.data.wikipediaDump =
> > /usr/local/spotlight/dbpedia_data/es/eswiki-latest-pages-articles.xml
> 
> > But when running the index.sh provided with the service, this uncompressed
> > format seems to cause a problem: the script expects the data to be in the
> > compressed format (bz2).
> 
> > My suggestion would be either to change the index script delivered with
> > the service so that it accepts both formats, or to update the Spanish
> > example (
> > https://dl.dropboxusercontent.com/u/99877231/dbpedia/conf/indexing.es.properties
> > ) to use, for example, line [b] below instead of [a] (and likewise for all
> > the data files declared in the indexing file):
> 

> > [b] org.dbpedia.spotlight.data.wikipediaDump =
> > /usr/local/spotlight/dbpedia_data/es/eswiki-latest-pages-articles.xml.bz2
> 
> > I agree that this is not a major issue, but fixing it could save developers
> > some time.
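
The change from [a] to [b] could also be applied to a properties file with a short script rather than by hand. The following is only a minimal sketch in Python, not part of Spotlight: the helper name `use_compressed_paths` is hypothetical, and it assumes that every data file key starts with `org.dbpedia.spotlight.data.` (only the `wikipediaDump` key is confirmed above) and that only values ending in `.xml` need the `.bz2` suffix.

```python
def use_compressed_paths(properties_text,
                         key_prefix="org.dbpedia.spotlight.data.",
                         extensions=(".xml",),
                         suffix=".bz2"):
    """Return the properties text with `suffix` appended to matching paths.

    Hypothetical helper: the key prefix and the extension list are
    assumptions, not something the Spotlight documentation specifies.
    `extensions` must be a tuple, because str.endswith only accepts a
    string or a tuple of strings.
    """
    updated = []
    for line in properties_text.splitlines():
        stripped = line.strip()
        # Comments, blank lines, and unrelated keys pass through unchanged.
        if stripped.startswith(key_prefix) and "=" in stripped:
            key, value = (part.strip() for part in stripped.split("=", 1))
            # Paths that already end in .bz2 do not match and are left alone.
            if value.endswith(extensions):
                line = f"{key} = {value}{suffix}"
        updated.append(line)
    return "\n".join(updated)


if __name__ == "__main__":
    line_a = ("org.dbpedia.spotlight.data.wikipediaDump = "
              "/usr/local/spotlight/dbpedia_data/es/"
              "eswiki-latest-pages-articles.xml")
    print(use_compressed_paths(line_a))
```

Running it on line [a] should print line [b]; lines that already point to a `.bz2` file are not modified, so the script can be run more than once.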
> 

> > 2- I encountered the problem described in this old post:
> > https://github.com/dbpedia-spotlight/dbpedia-spotlight/issues/138 regarding
> > the infinite loop in ExtractCandidateMap.scala. I would appreciate it if
> > somebody could advise me on how to solve it, since the solutions proposed
> > in that post are quite old and I am not sure they still apply.
> 

> > I hope my feedback is useful to you, and I hope to get an answer regarding
> > the second issue.
> 

> > Best,
> 
> > Pajolma
> 

> > Pajolma RUPI
> 
> > Research and Development Engineer
> 
> > Service de l'e-Information Scientifique et Multimédia (SEISM)
> 
> > Research Centre INRIA Grenoble - Rhône-Alpes
> 
> > 655 Avenue de l'Europe
> 
> > 38330 Montbonnot-Saint-Martin
> 
> > France
> 

> > ------------------------------------------------------------------------------
> 
> > One dashboard for servers and applications across Physical-Virtual-Cloud
> 
> > Widest out-of-the-box monitoring support with 50+ applications
> 
> > Performance metrics, stats and reports that give you Actionable Insights
> 
> > Deep dive visibility with transaction tracing using APM Insight.
> 
> > http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
> 
> > _______________________________________________
> 
> > Dbp-spotlight-users mailing list
> 
> > [email protected]
> 
> > https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users
>

