Author: ogrisel
Date: Fri May 20 13:54:43 2011
New Revision: 1125397
URL: http://svn.apache.org/viewvc?rev=1125397&view=rev
Log:
commit
Modified:
incubator/stanbol/trunk/entityhub/indexing/dbpedia/README.md
Modified: incubator/stanbol/trunk/entityhub/indexing/dbpedia/README.md
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/entityhub/indexing/dbpedia/README.md?rev=1125397&r1=1125396&r2=1125397&view=diff
==============================================================================
--- incubator/stanbol/trunk/entityhub/indexing/dbpedia/README.md (original)
+++ incubator/stanbol/trunk/entityhub/indexing/dbpedia/README.md Fri May 20 13:54:43 2011
@@ -42,10 +42,18 @@ but before doing this please note the po
### (2) Download the dbPedia Dump Files:
-All RDF dumps need to be copied to the directory
+All RDF dumps need to be copied to the directory:
indexing/resources/rdfdata
+The files do not need to be decompressed. The raw ".nt.bz2" files from
+DBpedia can be downloaded to that folder directly.
+
+At the time of writing, version 3.6 is the latest release. All available
+archives are referenced on this page:
+
+<http://wiki.dbpedia.org/Downloads36>
+
The RDF dump of DBpedia.org is split up into a number of different files.
The actual files needed depend on the configuration of the mappings
(indexing/config/mappings.txt). Generally one needs to make sure that
@@ -65,14 +73,10 @@ interesting dump files:
* <http://downloads.dbpedia.org/3.6/en/category_labels_en.nt.bz2>
* <http://downloads.dbpedia.org/3.6/en/skos_categories_en.nt.bz2>
-At the time of writing, version 3.6 is the latest release. All available
-dumps are hence referenced on this page:
-
-<http://wiki.dbpedia.org/Downloads36>
-
-During the initialisation of the Indeing all the RDF files within the
-"indexing/resources/rdfdata" directory will be imported to an Jena TDB
-RDF triple store. The imported data are stored under:
+During the first part of the indexing (a.k.a. the initialisation step)
+all the RDF files within the "indexing/resources/rdfdata" directory
+will be imported to a Jena TDB RDF triple store. The imported data are
+stored under:
indexing/resources/tdb
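
The download step described in the README above could be scripted roughly as follows. This is only a sketch: the helper names (`target_path`, `download_dumps`) and the `fetch` parameter are hypothetical and not part of Stanbol, the URL list contains just the two dumps visible in this diff, and the 3.6 URLs may no longer resolve. The files are saved compressed, as the README says they do not need to be decompressed.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve

# Directory named in the README, relative to the current working directory.
RDF_DATA_DIR = os.path.join("indexing", "resources", "rdfdata")

# The two dump URLs visible in this diff; extend the list to match
# whatever indexing/config/mappings.txt actually requires.
DUMP_URLS = [
    "http://downloads.dbpedia.org/3.6/en/category_labels_en.nt.bz2",
    "http://downloads.dbpedia.org/3.6/en/skos_categories_en.nt.bz2",
]


def target_path(url, dest_dir=RDF_DATA_DIR):
    """Return the local path for a dump URL, keeping the original file name."""
    name = os.path.basename(urlparse(url).path)
    return os.path.join(dest_dir, name)


def download_dumps(urls, dest_dir=RDF_DATA_DIR, fetch=urlretrieve):
    """Save each dump into dest_dir, skipping files that already exist.

    `fetch` is a hypothetical hook (defaulting to urllib's urlretrieve)
    so the download can be replaced, for example in a dry run.
    The files are stored as served, without decompression.
    """
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = target_path(url, dest_dir)
        if not os.path.exists(path):
            fetch(url, path)
        saved.append(path)
    return saved
```

Calling `download_dumps(DUMP_URLS)` from the directory that contains `indexing/` would place the `.nt.bz2` files where the initialisation step expects to find them.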