Hi Giosia, Andrea,
I just opened a pull request on the dbpedia/lookup project that may interest you.
It adds a script that generates the files needed by the Lookup Indexer from the
files produced by Spotlight indexing (the surface forms).
I have added a few clarifications to Andrea's answers below.
----- Original message -----
> From: "Andrea Di Menna" <[email protected]>
> To: "Giosia Gentile" <[email protected]>
> Cc: [email protected]
> Sent: Monday, 25 March 2013 19:40:52
> Subject: Re: [Dbpedia-discussion] Create my Dbpedia lookup index
> Hi Giosia,
> some answers inline (maybe the other guys will give you more
> appropriate answers)
> 2013/3/25 Giosia Gentile < [email protected] >
> > Hi, I installed a local dbpedia lookup with success but now
>
> > I want to recreate the index, using my own local DBpedia mirror.
>
> > From the documentation I've read it should be done with the
> > command:
>
> > mvn scala:run -Dlauncher=Indexer
> > "-DaddArgs=indexDir|redirectsFile|data"
>
> > But it's not very clear to me the meaning of these parameters:
>
> > 1- indexDir is the output directory that will hold the searchable
> > structures --> is this the dir where it put the new index files?
>
> That's correct.
> > 2- redirectsFile is the homonymous dbpedia data set --> what does
> > it mean?
>
> That refers to redirects which are extracted from the DBpedia
> framework, e.g. [1]
I did not try with a compressed file, but I noticed that comment lines (lines
beginning with #) crash the Indexer (the NxParser, in fact).
So I decompressed the file and ran grep -v "^#" on it to get a cleaned redirects file.
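
The same cleanup can be done in one pass over the compressed file. What follows
is only a sketch in Python, not the script from the pull request: the function
name and the file paths are made up for illustration, and it assumes the input
is a bz2-compressed N-Triples file such as [1].

```python
import bz2


def strip_comment_lines(src_bz2, dst_nt):
    """Decompress a .bz2 file and drop every line starting with '#'.

    Rough equivalent of: bzcat src_bz2 | grep -v "^#" > dst_nt
    Returns the number of lines written.
    """
    kept = 0
    # Binary mode keeps the bytes untouched, so no encoding guesswork.
    with bz2.open(src_bz2, "rb") as src, open(dst_nt, "wb") as dst:
        for line in src:
            if line.startswith(b"#"):
                continue  # comment line, the kind that upsets the parser
            dst.write(line)
            kept += 1
    return kept
```

For example, strip_comment_lines("redirects_en.nt.bz2", "redirects_en.nt")
would write the cleaned file next to the download.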
> > 3- data is a collection of files with the properties you want to
> > index --> what does it mean? In what format must these files be?
> > Collection of files? How can I pass a collection of files?
>
> I think there is a typo in the doc you have read.
> If you refer to this page [2], you will see that you are expected to
> sort and merge together input files.
> (anyway, the scala code here [3] shows that you could specify a
> collection of files, using a pipe to separate the filenames, since
> the main method iterates over the remaining arguments)
In [2] (section rebuilding-the-index), the files are actually merged, and that
is what I did in the script.
Beware: the merged file must be sorted, otherwise data will be skipped.
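
A rough sketch of that merge-and-sort step, in Python rather than the script
from the pull request. The function name is hypothetical. I am assuming that
plain byte-order sorting (what "LC_ALL=C sort" would give) is good enough, so
check it against the sort command shown in [2]. It also keeps every line in
memory, so it only suits inputs that fit in RAM.

```python
def merge_and_sort(input_paths, output_path):
    """Concatenate several N-Triples files and write their lines sorted.

    Assumption: byte-order sorting is what the Indexer expects.
    Returns the number of lines written.
    """
    lines = []
    for path in input_paths:
        with open(path, "rb") as f:
            for line in f:
                if not line.strip():
                    continue  # skip blank lines
                if not line.endswith(b"\n"):
                    line += b"\n"  # last line of a file may lack a newline
                lines.append(line)
    lines.sort()  # bytes compare byte by byte
    with open(output_path, "wb") as out:
        out.writelines(lines)
    return len(lines)
```

Since every line starts with its subject URI, sorting whole lines keeps all
the triples for one resource next to each other in the merged file.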
From my reading of the code, you should also be able to pass a list of files
in the "-DaddArgs" parameter of Maven, something like this:
mvn scala:run -Dlauncher=Indexer
"-DaddArgs=c:\lucene_lookup_index|c:\redirects_en.nt|ref_counts.nt|surface_forms.nt|..."
I have not tried it, and I wonder whether the sorting required for the merged
file would be a problem here.
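
To make the shape of that argument explicit, here is a small Python sketch that
assembles the Maven call quoted above. The function name is hypothetical, and
passing more than one data file remains untested, so treat it as an
illustration of the pipe-separated layout only.

```python
def build_indexer_command(index_dir, redirects_file, data_files):
    """Return the argument list for the Maven call quoted above.

    The Indexer arguments are joined with '|' into a single -DaddArgs
    value: index directory first, then the redirects file, then the data
    file(s). Several data files is the untested case.
    """
    add_args = "|".join([index_dir, redirects_file, *data_files])
    return ["mvn", "scala:run", "-Dlauncher=Indexer", "-DaddArgs=" + add_args]
```

Calling it with "idx", "redirects_en.nt" and ["merged_sorted.nt"] gives a
-DaddArgs value of idx|redirects_en.nt|merged_sorted.nt, which is the single
merged file form described in [2].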
> Hope this helps a bit.
> Cheers
> Andrea
I hope this helps a bit too.
Good luck,
Julien
> > Please help me, if possible with some example.
>
> > Thank you
>
> > ------------------------------------------------------------------------------
>
> > Everyone hates slow websites. So do we.
>
> > Make your web apps faster with AppDynamics
>
> > Download AppDynamics Lite for free today:
>
> > http://p.sf.net/sfu/appdyn_d2d_mar
>
> > _______________________________________________
>
> > Dbpedia-discussion mailing list
>
> > [email protected]
>
> > https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>
> [1] http://downloads.dbpedia.org/3.8/en/redirects_en.nt.bz2
> [2] https://github.com/dbpedia/lookup
> [3] https://github.com/dbpedia/lookup/blob/master/src/main/scala/org/dbpedia/lookup/lucene/Indexer.scala#L121