I checked, and I did not have that line in my properties file.

In the download properties file, I have

download=hy:pages-articles.xml.bz2

I also had:

unzip=false

and I changed it to

unzip=true

but the result remains the same.

I have attached the properties files that I am using.

The commands that I execute are:

$ ../run download config=download.hy.minimal.properties

[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Building DBpedia Dump Extraction
[INFO]    task-segment: [scala:run]
[INFO] ------------------------------------------------------------------------
[INFO] Preparing scala:run
[INFO] [resources:resources {execution: default-resources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/labra/workspace/extraction_framework/dump/src/main/resources
[INFO] [scala:compile {execution: process-resources}]
[INFO] Checking for multiple versions of scala
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[INFO] Nothing to compile - all classes are up to date
[INFO] [compiler:compile {execution: default-compile}]
[INFO] Nothing to compile - all classes are up to date
[INFO] [scala:compile {execution: compile}]
[INFO] Checking for multiple versions of scala
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[INFO] Nothing to compile - all classes are up to date
[INFO] [resources:testResources {execution: default-testResources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/labra/workspace/extraction_framework/dump/src/test/resources
[INFO] [compiler:testCompile {execution: default-testCompile}]
[INFO] No sources to compile
[INFO] [scala:testCompile {execution: test-compile}]
[INFO] Checking for multiple versions of scala
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[WARNING] No source files found.
[INFO] [scala:run {execution: default-cli}]
[INFO] Checking for multiple versions of scala
[INFO] launcher 'download' selected => org.dbpedia.extraction.dump.download.Download
done: 0 -
todo: 1 - wiki=hy,locale=hy
downloading 'http://dumps.wikimedia.org/hywiki/' to '/home/labra/DBPedia/WikipediaDumps/hywiki/index.html'
read 6.203125 KB of 6.203125 KB in 0.002 seconds (3.0288694 MB/s)
did not download any files to '/home/labra/DBPedia/WikipediaDumps/hywiki/20121012' - all files already complete
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4 seconds
[INFO] Finished at: Sun Oct 14 18:32:05 CEST 2012
[INFO] Final Memory: 22M/162M
[INFO] ------------------------------------------------------------------------

$ ls -l /home/labra/DBPedia/WikipediaDumps/hywiki/20121012
total 258538
-rw-rw-r-- 1 labra labra         0 oct 14 18:26 hywiki-20121012-download-complete
-rw-rw-r-- 1 labra labra 263695080 oct 12 05:16 hywiki-20121012-pages-articles.xml
-rw-rw-r-- 1 labra labra     12220 oct 12 06:32 index.html

$ ../run extraction extraction.hy.default.properties
[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Building DBpedia Dump Extraction
[INFO]    task-segment: [scala:run]
[INFO] ------------------------------------------------------------------------
[INFO] Preparing scala:run
[INFO] [resources:resources {execution: default-resources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/labra/workspace/extraction_framework/dump/src/main/resources
[INFO] [scala:compile {execution: process-resources}]
[INFO] Checking for multiple versions of scala
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[INFO] Nothing to compile - all classes are up to date
[INFO] [compiler:compile {execution: default-compile}]
[INFO] Nothing to compile - all classes are up to date
[INFO] [scala:compile {execution: compile}]
[INFO] Checking for multiple versions of scala
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[INFO] Nothing to compile - all classes are up to date
[INFO] [resources:testResources {execution: default-testResources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/labra/workspace/extraction_framework/dump/src/test/resources
[INFO] [compiler:testCompile {execution: default-testCompile}]
[INFO] No sources to compile
[INFO] [scala:testCompile {execution: test-compile}]
[INFO] Checking for multiple versions of scala
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[WARNING] No source files found.
[INFO] [scala:run {execution: default-cli}]
[INFO] Checking for multiple versions of scala
[INFO] launcher 'extraction' selected => org.dbpedia.extraction.dump.extract.Extraction
oct 14, 2012 6:31:37 PM org.dbpedia.extraction.mappings.Redirects$ loadFromCache
INFO: Loading redirects from cache file /home/labra/DBPedia/WikipediaDumps/hywiki/20121012/hywiki-20121012-template-redirects.obj
oct 14, 2012 6:31:37 PM org.dbpedia.extraction.mappings.Redirects$ load
INFO: Will extract redirects from source for hy wiki, could not load cache file '/home/labra/DBPedia/WikipediaDumps/hywiki/20121012/hywiki-20121012-template-redirects.obj': java.io.FileNotFoundException: /home/labra/DBPedia/WikipediaDumps/hywiki/20121012/hywiki-20121012-template-redirects.obj (No existe el archivo o el directorio)
oct 14, 2012 6:31:37 PM org.dbpedia.extraction.mappings.Redirects$ loadFromSource
INFO: Loading redirects from source (hy)
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org_scala_tools_maven_executions.MainHelper.runMain(MainHelper.java:161)
        at org_scala_tools_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,249]
Message: expected <mediawiki> with namespace [http://www.mediawiki.org/xml/export-0.6/], found [http://www.mediawiki.org/xml/export-0.7/]
        at org.dbpedia.util.text.xml.XMLStreamUtils.requireElement(XMLStreamUtils.java:120)
        at org.dbpedia.util.text.xml.XMLStreamUtils.requireStartElement(XMLStreamUtils.java:81)
        at org.dbpedia.extraction.sources.WikipediaDumpParser.requireStartElement(WikipediaDumpParser.java:411)
        at org.dbpedia.extraction.sources.WikipediaDumpParser.readDump(WikipediaDumpParser.java:130)
        at org.dbpedia.extraction.sources.WikipediaDumpParser.run(WikipediaDumpParser.java:114)
        at org.dbpedia.extraction.sources.XMLReaderSource.foreach(XMLSource.scala:64)
        at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:239)
        at org.dbpedia.extraction.sources.XMLReaderSource.flatMap(XMLSource.scala:60)
        at org.dbpedia.extraction.mappings.Redirects$.loadFromSource(Redirects.scala:165)
        at org.dbpedia.extraction.mappings.Redirects$.load(Redirects.scala:116)
        at org.dbpedia.extraction.dump.extract.ConfigLoader$$anon$1.<init>(ConfigLoader.scala:96)
        at org.dbpedia.extraction.dump.extract.ConfigLoader.org$dbpedia$extraction$dump$extract$ConfigLoader$$createExtractionJob(ConfigLoader.scala:51)
        at org.dbpedia.extraction.dump.extract.ConfigLoader$$anonfun$getExtractionJobs$1.apply(ConfigLoader.scala:36)
        at org.dbpedia.extraction.dump.extract.ConfigLoader$$anonfun$getExtractionJobs$1.apply(ConfigLoader.scala:36)
        at scala.collection.Iterator$$anon$19.next(Iterator.scala:401)
        at scala.collection.Iterator$class.foreach(Iterator.scala:772)
        at scala.collection.Iterator$$anon$19.foreach(Iterator.scala:399)
        at scala.collection.IterableViewLike$Transformed$class.foreach(IterableViewLike.scala:41)
        at scala.collection.IterableViewLike$$anon$3.foreach(IterableViewLike.scala:80)
        at org.dbpedia.extraction.dump.extract.Extraction$.main(Extraction.scala:29)
        at org.dbpedia.extraction.dump.extract.Extraction.main(Extraction.scala)
        ... 6 more
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 240(Exit value: 240)

[INFO] ------------------------------------------------------------------------
[INFO] For more information, run Maven with the -e switch
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4 seconds
[INFO] Finished at: Sun Oct 14 18:31:37 CEST 2012
[INFO] Final Memory: 21M/162M
[INFO] ------------------------------------------------------------------------
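
The parse error in the log above says the parser expected the export-0.6 namespace but found export-0.7 in the dump. As a quick way to confirm which namespace a dump file declares, here is a minimal sketch, assuming Python 3 is available. The helper name root_namespace is hypothetical and not part of the extraction framework, and the sample document is a tiny stand-in rather than the real dump:

```python
import io
import xml.etree.ElementTree as ET

def root_namespace(source):
    """Return the namespace URI of the root element of an XML stream.

    `source` is an open binary file object. Parsing is incremental and
    stops at the root start tag, so the dump is not loaded into memory.
    """
    for _event, elem in ET.iterparse(source, events=("start",)):
        tag = elem.tag  # ElementTree writes qualified names as "{uri}local"
        return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""
    return ""

# Stand-in for the first bytes of a dump; not the real file.
sample = io.BytesIO(
    b'<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.7/">'
    b"<siteinfo/></mediawiki>"
)
print(root_namespace(sample))
```

For the real file, one would open hywiki-20121012-pages-articles.xml in binary mode and pass the file object to the same function.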





On Sun, Oct 14, 2012 at 4:48 PM, Dimitris Kontokostas <[email protected]> wrote:
> Can you check if you have this in your properties file?
> source=pages-articles.xml.bz2
>
> It seems like it is searching for an unzipped version but your download
> config doesn't unzip it
>
>
> On Sun, Oct 14, 2012 at 12:48 AM, Jose Emilio Labra Gayo <[email protected]>
> wrote:
>>
>> Thanks for your answer...I have tried again with that change and now I
>> get the following error:
>>
>> INFO: Will extract redirects from source for hy wiki, could not load
>> cache file
>> '/home/labra/DBPedia/WikipediaDumps/hywiki/20121012/hywiki-20121012-template-redirects.obj':
>> java.io.FileNotFoundException:
>>
>> /home/labra/DBPedia/WikipediaDumps/hywiki/20121012/hywiki-20121012-template-redirects.obj
>> (No existe el archivo o el directorio)
>>
>> I noticed that this error is similar to the one posted in this message
>> (http://sourceforge.net/mailarchive/message.php?msg_id=29830233) but I
>> could not find an answer...
>>
>> The command that I executed is:
>>
>> ../run extraction extraction.hy.default.properties
>>
>>
>> Best regards, Labra
>>
>> The full output is the following:
>>
>> [INFO] Scanning for projects...
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO] Building DBpedia Dump Extraction
>> [INFO]    task-segment: [scala:run]
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO] Preparing scala:run
>> [INFO] [resources:resources {execution: default-resources}]
>> [INFO] Using 'UTF-8' encoding to copy filtered resources.
>> [INFO] skip non existing resourceDirectory
>> /home/labra/workspace/extraction_framework/dump/src/main/resources
>> [INFO] [scala:compile {execution: process-resources}]
>> [INFO] Checking for multiple versions of scala
>> [INFO] includes = [**/*.scala,**/*.java,]
>> [INFO] excludes = []
>> [INFO] Nothing to compile - all classes are up to date
>> [INFO] [compiler:compile {execution: default-compile}]
>> [INFO] Nothing to compile - all classes are up to date
>> [INFO] [scala:compile {execution: compile}]
>> [INFO] Checking for multiple versions of scala
>> [INFO] includes = [**/*.scala,**/*.java,]
>> [INFO] excludes = []
>> [INFO] Nothing to compile - all classes are up to date
>> [INFO] [resources:testResources {execution: default-testResources}]
>> [INFO] Using 'UTF-8' encoding to copy filtered resources.
>> [INFO] skip non existing resourceDirectory
>> /home/labra/workspace/extraction_framework/dump/src/test/resources
>> [INFO] [compiler:testCompile {execution: default-testCompile}]
>> [INFO] No sources to compile
>> [INFO] [scala:testCompile {execution: test-compile}]
>> [INFO] Checking for multiple versions of scala
>> [INFO] includes = [**/*.scala,**/*.java,]
>> [INFO] excludes = []
>> [WARNING] No source files found.
>> [INFO] [scala:run {execution: default-cli}]
>> [INFO] Checking for multiple versions of scala
>> [INFO] launcher 'extraction' selected =>
>> org.dbpedia.extraction.dump.extract.Extraction
>> oct 13, 2012 11:43:05 PM org.dbpedia.extraction.mappings.Redirects$
>> loadFromCache
>> INFO: Loading redirects from cache file
>>
>> /home/labra/DBPedia/WikipediaDumps/hywiki/20121012/hywiki-20121012-template-redirects.obj
>> oct 13, 2012 11:43:05 PM org.dbpedia.extraction.mappings.Redirects$ load
>> INFO: Will extract redirects from source for hy wiki, could not load
>> cache file
>> '/home/labra/DBPedia/WikipediaDumps/hywiki/20121012/hywiki-20121012-template-redirects.obj':
>> java.io.FileNotFoundException:
>>
>> /home/labra/DBPedia/WikipediaDumps/hywiki/20121012/hywiki-20121012-template-redirects.obj
>> (No existe el archivo o el directorio)
>> oct 13, 2012 11:43:05 PM org.dbpedia.extraction.mappings.Redirects$
>> loadFromSource
>> INFO: Loading redirects from source (hy)
>> java.lang.reflect.InvocationTargetException
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>         at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:601)
>>         at
>> org_scala_tools_maven_executions.MainHelper.runMain(MainHelper.java:161)
>>         at
>> org_scala_tools_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
>> Caused by: java.io.FileNotFoundException:
>>
>> /home/labra/DBPedia/WikipediaDumps/hywiki/20121012/hywiki-20121012-pages-articles.xml
>> (No existe el archivo o el directorio)
>>         at java.io.FileInputStream.open(Native Method)
>>         at java.io.FileInputStream.<init>(FileInputStream.java:138)
>>         at
>> org.dbpedia.extraction.dump.extract.ConfigLoader$$anonfun$org$dbpedia$extraction$dump$extract$ConfigLoader$$reader$1.apply(ConfigLoader.scala:132)
>>         at
>> org.dbpedia.extraction.dump.extract.ConfigLoader$$anonfun$org$dbpedia$extraction$dump$extract$ConfigLoader$$reader$1.apply(ConfigLoader.scala:132)
>>         at
>> org.dbpedia.extraction.sources.XMLReaderSource.foreach(XMLSource.scala:63)
>>         at
>> scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:239)
>>         at
>> org.dbpedia.extraction.sources.XMLReaderSource.flatMap(XMLSource.scala:60)
>>         at
>> org.dbpedia.extraction.mappings.Redirects$.loadFromSource(Redirects.scala:165)
>>         at
>> org.dbpedia.extraction.mappings.Redirects$.load(Redirects.scala:116)
>>         at
>> org.dbpedia.extraction.dump.extract.ConfigLoader$$anon$1.<init>(ConfigLoader.scala:96)
>>         at
>> org.dbpedia.extraction.dump.extract.ConfigLoader.org$dbpedia$extraction$dump$extract$ConfigLoader$$createExtractionJob(ConfigLoader.scala:51)
>>         at
>> org.dbpedia.extraction.dump.extract.ConfigLoader$$anonfun$getExtractionJobs$1.apply(ConfigLoader.scala:36)
>>         at
>> org.dbpedia.extraction.dump.extract.ConfigLoader$$anonfun$getExtractionJobs$1.apply(ConfigLoader.scala:36)
>>         at scala.collection.Iterator$$anon$19.next(Iterator.scala:401)
>>         at scala.collection.Iterator$class.foreach(Iterator.scala:772)
>>         at scala.collection.Iterator$$anon$19.foreach(Iterator.scala:399)
>>         at
>> scala.collection.IterableViewLike$Transformed$class.foreach(IterableViewLike.scala:41)
>>         at
>> scala.collection.IterableViewLike$$anon$3.foreach(IterableViewLike.scala:80)
>>         at
>> org.dbpedia.extraction.dump.extract.Extraction$.main(Extraction.scala:29)
>>         at
>> org.dbpedia.extraction.dump.extract.Extraction.main(Extraction.scala)
>>         ... 6 more
>> [INFO]
>> ------------------------------------------------------------------------
>> [ERROR] BUILD ERROR
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO] wrap: org.apache.commons.exec.ExecuteException: Process exited
>> with an error: 240(Exit value: 240)
>>
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO] For more information, run Maven with the -e switch
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO] Total time: 4 seconds
>> [INFO] Finished at: Sat Oct 13 23:43:05 CEST 2012
>> [INFO] Final Memory: 21M/165M
>> [INFO]
>> ------------------------------------------------------------------------
>>
>> On Wed, Oct 10, 2012 at 11:57 AM, Dimitris Kontokostas
>> <[email protected]> wrote:
>> > Hi Labra,
>> >
>> >> In fact, I was able to download the armenian files running the
>> >> following instruction:
>> >>
>> >> ../run download config=download.minimal.properties
>> >>
>> >> However, when I try to run:
>> >>
>> >> ../run extract extraction.hy.default.properties
>> >
>> >
>> > Can you try ../run extraction extraction.hy.default.properties
>> >
>> >>
>> >>
>> >> The contents of the extraction.hy.default.properties are:
>> >>
>> >> # List of languages or article count ranges, e.g. 'en,de,fr' or
>> >> '10000-20000' or '10000-', or '@mappings'
>> >> languages=10000-
>> >
>> >
>> > you should change this to languages=hy for now
>> >
>> > Besides that, everything else seems ok...
>> >
>> > Best,
>> > Dimitris
>> >
>> >>
>> >>
>> >>
>> >> --
>> >> Saludos, Labra
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> Dbpedia-discussion mailing list
>> >> [email protected]
>> >> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>> >
>> >
>> >
>> >
>> > --
>> > Kontokostas Dimitris
>>
>>
>>
>> --
>> Saludos, Labra
>
>
>
>
> --
> Kontokostas Dimitris



-- 
Saludos, Labra

Attachment: download.hy.minimal.properties
Description: Binary data

Attachment: extraction.hy.default.properties
Description: Binary data
