Hi Robert,
Thanks for your reply, but my problem persists.
I will try to explain my whole configuration, because I must be doing
something wrong:
*config.properties*
dumpDir=/home/rober/Escritorio/dbpedia/datos/pages
outputDir=/home/rober/Escritorio/dbpedia/output
extractors=org.dbpedia.extraction.mappings.LabelExtractor \
org.dbpedia.extraction.mappings.WikiPageExtractor \
org.dbpedia.extraction.mappings.InfoboxExtractor \
org.dbpedia.extraction.mappings.PageLinksExtractor \
org.dbpedia.extraction.mappings.GeoExtractor
extractors.en=org.dbpedia.extraction.mappings.CategoryLabelExtractor \
org.dbpedia.extraction.mappings.ArticleCategoriesExtractor \
org.dbpedia.extraction.mappings.ImageExtractor \
org.dbpedia.extraction.mappings.ExternalLinksExtractor \
org.dbpedia.extraction.mappings.HomepageExtractor \
org.dbpedia.extraction.mappings.DisambiguationExtractor \
org.dbpedia.extraction.mappings.PersondataExtractor \
org.dbpedia.extraction.mappings.PndExtractor \
org.dbpedia.extraction.mappings.SkosCategoriesExtractor \
org.dbpedia.extraction.mappings.RedirectExtractor \
org.dbpedia.extraction.mappings.MappingExtractor
languages=es
As I understand it, my dumps are in these paths:
- COMMONS:
/home/rober/Escritorio/dbpedia/datos/pages/20100311/commons/commonswiki-20100319-pages-articles.xml.bz2
- SPANISH:
/home/rober/Escritorio/dbpedia/datos/pages/20100319/eswiki/eswiki-20100311-pages-articles.xml.bz2
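One thing worth checking in the two paths above: the date in the directory name and the date in the file name do not match in either of them. A minimal sketch that makes the comparison explicit (`dates_in_path` is a hypothetical helper, not part of the framework, and it only assumes that dates are written as eight digits):

```python
import re
from pathlib import PurePosixPath


def dates_in_path(path):
    """Return (directory_date, file_date) found in a dump path.

    Either element is None when no eight-digit date is present.
    """
    p = PurePosixPath(path)
    # Directory components that consist of exactly eight digits.
    dir_dates = [part for part in p.parts[:-1] if re.fullmatch(r"\d{8}", part)]
    # Date embedded in the file name, e.g. eswiki-20100311-pages-articles.xml.bz2
    file_match = re.search(r"-(\d{8})-", p.name)
    return (
        dir_dates[-1] if dir_dates else None,
        file_match.group(1) if file_match else None,
    )


paths = [
    "/home/rober/Escritorio/dbpedia/datos/pages/20100311/commons/commonswiki-20100319-pages-articles.xml.bz2",
    "/home/rober/Escritorio/dbpedia/datos/pages/20100319/eswiki/eswiki-20100311-pages-articles.xml.bz2",
]

for path in paths:
    dir_date, file_date = dates_in_path(path)
    status = "ok" if dir_date == file_date else "MISMATCH"
    print(f"{status}: directory {dir_date}, file {file_date}")
```

Whether the loader actually requires the two dates to agree is not confirmed here; the check only flags the inconsistency.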
But this error appeared:
[INFO] Checking for multiple versions of scala
[INFO] launcher 'Extract' selected => org.dbpedia.extraction.Extract
Exception in thread "Thread-1" java.lang.Exception: Dump directory not found: /home/rober/Escritorio/dbpedia/datos/pages/commons
    at org.dbpedia.extraction.ConfigLoader$Config.getDumpFile(ConfigLoader.scala:93)
    at org.dbpedia.extraction.ConfigLoader$Config.<init>(ConfigLoader.scala:85)
    at org.dbpedia.extraction.ConfigLoader$.load(ConfigLoader.scala:28)
    at org.dbpedia.extraction.Extract$ExtractionThread.run(Extract.scala:26)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] ------------------------------------------------------------------------
I also tried it with:
- /home/rober/Escritorio/dbpedia/datos/pages/commons/20100311/commonswiki-20100319-pages-articles.xml.bz2
- /home/rober/Escritorio/dbpedia/datos/pages/eswiki/20100319/eswiki-20100311-pages-articles.xml.bz2
and:
- /home/rober/Escritorio/dbpedia/datos/pages/commons/commonswiki-20100319-pages-articles.xml.bz2
- /home/rober/Escritorio/dbpedia/datos/pages/eswiki/eswiki-20100311-pages-articles.xml.bz2
But in this case:
[INFO] launcher 'Extract' selected => org.dbpedia.extraction.Extract
Exception in thread "Thread-1" java.lang.Exception: Dump not found: /home/rober/Escritorio/dbpedia/datos/pages/commons/20100319/commonswiki-20100319-pages-articles.xml
    at org.dbpedia.extraction.ConfigLoader$Config.getDumpFile(ConfigLoader.scala:102)
    at org.dbpedia.extraction.ConfigLoader$Config.<init>(ConfigLoader.scala:85)
    at org.dbpedia.extraction.ConfigLoader$.load(ConfigLoader.scala:28)
    at org.dbpedia.extraction.Extract$ExtractionThread.run(Extract.scala:26)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] ------------------------------------------------------------------------
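The path in this last error hints at the layout the loader is looking for: the wiki directory first, then a date directory, then an uncompressed .xml file. A minimal sketch for checking that layout before launching the extraction, assuming the pattern {dumpDir}/{code}/{date}/{code}wiki-{date}-pages-articles.xml. This pattern is inferred only from the error message and from the `sc` example in the documentation; `expected_dump_path` is a hypothetical helper, not part of the framework, and the `es` directory name and the dates used are guesses:

```python
from pathlib import Path


def expected_dump_path(dump_dir, wiki_code, date):
    """Build the dump path under the ASSUMED layout
    {dumpDir}/{code}/{date}/{code}wiki-{date}-pages-articles.xml
    (uncompressed, as in the "Dump not found" message).
    """
    file_name = f"{wiki_code}wiki-{date}-pages-articles.xml"
    return Path(dump_dir) / wiki_code / date / file_name


dump_dir = "/home/rober/Escritorio/dbpedia/datos/pages"

# (wiki code, dump date) pairs; both values are assumptions for illustration.
wanted = [("commons", "20100319"), ("es", "20100311")]

for code, date in wanted:
    path = expected_dump_path(dump_dir, code, date)
    state = "found" if path.is_file() else "missing"
    print(f"{state}: {path}")
```

For `commons` and `20100319` this reproduces exactly the path printed in the error, which suggests (but does not prove) that the .bz2 file would have to be unpacked into that location.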
Any ideas? It must be a silly mistake on my part.
Thanks for everything
2010/3/23 Robert Isele <[email protected]>
> Hi Roberto,
>
> you can get the latest Wikipedia Commons dump at
>
> http://download.wikimedia.org/commonswiki/20100319/commonswiki-20100319-pages-articles.xml.bz2
> .
> The file is expected to be found in the directory
> {dumpDir}/20100319/commons/commonswiki-20100319-pages-articles.xml.bz2.
>
> Cheers
> Robert
>
> On Tue, Mar 23, 2010 at 10:27 AM, Roberto Nieto <[email protected]>
> wrote:
> > Hi everyone,
> >
> > I'm trying to use the Information Extraction Framework, but I must be
> > doing something wrong, and I'm having problems with the dumps.
> >
> > I downloaded the dump "eswikisource-20100317-pages-articles.xml.bz2", saved
> > it in a folder, set the dumpDir configuration to that folder and tried to
> > run the extraction, but...
> >
> > [INFO] launcher 'Extract' selected => org.dbpedia.extraction.Extract
> > Exception in thread "Thread-1" java.lang.Exception: Dump directory not found: /home/rober/Escritorio/dbpedia/datos/pages/commons
> >     at org.dbpedia.extraction.ConfigLoader$Config.getDumpFile(ConfigLoader.scala:93)
> >     at org.dbpedia.extraction.ConfigLoader$Config.<init>(ConfigLoader.scala:85)
> >     at org.dbpedia.extraction.ConfigLoader$.load(ConfigLoader.scala:28)
> >     at org.dbpedia.extraction.Extract$ExtractionThread.run(Extract.scala:26)
> > [INFO] ------------------------------------------------------------------------
> > [INFO] BUILD SUCCESSFUL
> >
> >
> > Reading the docs I saw this: "The dump files should be organized in the way
> > as they are on the wikipedia servers.
> > e.g. {dumpDir}/sc/20100306/scwiki-20100306-pages-articles.xml.bz2. In
> > addition to the dumps of the configured languages, you'll need the
> > Wikipedia Commons Dump."
> >
> > Now I'm not sure what "the Wikipedia Commons Dump" is, or whether I'm
> > using the wrong dump.
> >
> > Can anyone help me?
> >
> > Thanks for the attention.
> >
> >
> ------------------------------------------------------------------------------
> > Download Intel® Parallel Studio Eval
> > Try the new software tools for yourself. Speed compiling, find bugs
> > proactively, and fine-tune applications for parallel performance.
> > See why Intel Parallel Studio got high marks during beta.
> > http://p.sf.net/sfu/intel-sw-dev
> > _______________________________________________
> > Dbpedia-discussion mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
> >
> >
>