Hi Jairo,

I saw that JC updated the configuration [1], but you should check out the
'dump' branch to get it; it is not yet merged into master.
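
A minimal sketch of those steps, assuming a local clone of the
extraction-framework whose remote is named "origin" (the function name and the
clone path are only illustrative, not part of the framework):

```shell
#!/bin/sh
# Sketch: switch a local extraction-framework clone to the 'dump' branch and
# install it into the local Maven repository.
# Assumptions: the remote is called "origin"; the function name and the
# clone path passed as $1 are hypothetical.
update_extraction_framework() {
    repo_dir="$1"
    (
        # Run in a subshell so the caller's working directory is unchanged.
        cd "$repo_dir" || exit 1
        git fetch origin || exit 1     # pick up the latest remote branches
        git checkout dump || exit 1    # branch carrying the updated configuration
        mvn clean install              # install the framework into the local Maven repo
    )
}

# Usage (path is hypothetical):
# update_extraction_framework "$HOME/src/extraction-framework"
```

As Max noted earlier in the thread, Spotlight then needs to be rebuilt with
mvn clean install as well before re-running the indexing.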

Cheers,
Dimitris


[1]
https://github.com/dbpedia/extraction-framework/commit/f5f146f5c5b08c8087fa1a6cdf134a77f0f1c972


On Tue, Jun 4, 2013 at 9:05 PM, Jairo Francisco de Souza <
[email protected]> wrote:

> Hi Max and Pablo,
>     Thanks for your help.
>
>     I checked the parser and the Module namespace is not configured for
> the Portuguese language [1, line 225].
>     However, it isn't clear to me how to configure a new namespace and how to
> run GenerateWikiConfig.scala, since we shouldn't modify the
> Namespaces.scala file directly. Can you point me to a template or a wiki page
> with instructions?
>
> Best,
> Jairo
>
> [1]
> https://github.com/dbpedia/extraction-framework/blob/master/core/src/main/scala/org/dbpedia/extraction/wikiparser/impl/wikipedia/Namespaces.scala#L225
>
>
> On Tue, Jun 4, 2013 at 12:14 PM, Pablo N. Mendes <[email protected]> wrote:
>
>> Hi Gabriel,
>> Max has given you some pretty clear and direct pointers to where you
>> should start looking. If you have already tried a bunch of things and still
>> cannot find the problem, please describe what you have tried, ask clear and
>> direct questions, and we can try to support you to the best of our
>> availability.
>>
>> Cheers,
>> Pablo
>>
>>
>> On Tue, Jun 4, 2013 at 1:34 AM, Gabriel Oliveira <[email protected]> wrote:
>>
>>> Hello Max,
>>>
>>> I'm working with Portuguese. I'm not sure if I understand what's going
>>> on and how to solve it.
>>>
>>> Cheers,
>>> Gabriel Oliveira
>>>
>>>
>>> 2013/5/29 Max Jakob <[email protected]>
>>>
>>>> Hi Gabriel, CCing dbpedia-developers list,
>>>>
>>>> this looks like a problem with the DBpedia parser, so it is not
>>>> directly a Spotlight problem. It's related to the (fairly new?) Module
>>>> namespace in Wikipedia [1] that is not handled by the parser for all
>>>> languages yet [2]. I assume you are not working with English, French
>>>> or Hungarian dumps. For all other languages, the Module namespace is
>>>> not configured yet. Which language are you working with?
>>>> If you understand what's going on, you can add the appropriate
>>>> configuration yourself and send a pull request on GitHub to the
>>>> extraction-framework repo [3]. Otherwise, the developer community
>>>> might be able to help you.
>>>> After this is corrected, install the extraction-framework in your
>>>> local Maven repo by running mvn clean install. Afterwards, do the same
>>>> for Spotlight again. Finally, re-attempt to run the indexing.
>>>>
>>>> Cheers,
>>>> Max
>>>>
>>>> [1] http://en.wikipedia.org/wiki/Wikipedia:Namespace
>>>> [2]
>>>> https://github.com/dbpedia/extraction-framework/blob/master/core/src/main/scala/org/dbpedia/extraction/wikiparser/impl/wikipedia/Namespaces.scala
>>>> [3] https://github.com/dbpedia/extraction-framework
>>>>
>>>>
>>>> On Wed, May 29, 2013 at 10:27 PM, Gabriel Oliveira <
>>>> [email protected]> wrote:
>>>> > Hello Max,
>>>> >
>>>> > I did as you told me and I have managed to fix some problems. I am still
>>>> > learning how to use Maven and IntelliJ, so I missed a few details and it
>>>> > took me a while to realize that, but now I have made some progress.
>>>> >
>>>> > Unlike the previous attempts, though, it has now run for almost an hour
>>>> > and has saved about 10800000 occurrences. However, after these occurrences
>>>> > are saved an exception is thrown, and now I believe it is not my fault
>>>> > anymore.
>>>> > The output is as follows:
>>>> >
>>>> >  INFO 2013-05-29 16:55:58,441 main [AllOccurrenceSource$] - Processed 1300000 Wikipedia definition pages (average 9.73 links per page)
>>>> >  INFO 2013-05-29 16:56:23,092 main [FileOccurrenceSource$] -   saved 10600000 occurrences
>>>> >  INFO 2013-05-29 16:57:10,162 main [FileOccurrenceSource$] -   saved 10700000 occurrences
>>>> >  INFO 2013-05-29 16:58:00,311 main [FileOccurrenceSource$] -   saved 10800000 occurrences
>>>> > java.lang.reflect.InvocationTargetException
>>>> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> >     at java.lang.reflect.Method.invoke(Method.java:601)
>>>> >     at scala_maven_executions.MainHelper.runMain(MainHelper.java:164)
>>>> >     at scala_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
>>>> > Caused by: java.util.NoSuchElementException: key not found: 828
>>>> >     at scala.collection.MapLike$class.default(MapLike.scala:225)
>>>> >     at scala.collection.immutable.HashMap.default(HashMap.scala:38)
>>>> >     at scala.collection.MapLike$class.apply(MapLike.scala:135)
>>>> >     at scala.collection.immutable.HashMap.apply(HashMap.scala:38)
>>>> >     at org.dbpedia.extraction.sources.WikipediaDumpParser.readPage(WikipediaDumpParser.java:218)
>>>> >     at org.dbpedia.extraction.sources.WikipediaDumpParser.readPages(WikipediaDumpParser.java:179)
>>>> >     at org.dbpedia.extraction.sources.WikipediaDumpParser.readDump(WikipediaDumpParser.java:137)
>>>> >     at org.dbpedia.extraction.sources.WikipediaDumpParser.run(WikipediaDumpParser.java:108)
>>>> >     at org.dbpedia.extraction.sources.XMLReaderSource.foreach(XMLSource.scala:57)
>>>> >     at org.dbpedia.spotlight.io.AllOccurrenceSource$AllOccurrenceSource.foreach(AllOccurrenceSource.scala:80)
>>>> >     at org.dbpedia.spotlight.filter.Filter$FilteredOccs.foreach(Filter.scala:58)
>>>> >     at org.dbpedia.spotlight.filter.Filter$FilteredOccs.foreach(Filter.scala:58)
>>>> >     at org.dbpedia.spotlight.filter.Filter$FilteredOccs.foreach(Filter.scala:58)
>>>> >     at org.dbpedia.spotlight.io.FileOccurrenceSource$.writeToFile(FileOccurrenceSource.scala:57)
>>>> >     at org.dbpedia.spotlight.lucene.index.ExtractOccsFromWikipedia$.main(ExtractOccsFromWikipedia.scala:82)
>>>> >     at org.dbpedia.spotlight.lucene.index.ExtractOccsFromWikipedia.main(ExtractOccsFromWikipedia.scala)
>>>> >     ... 6 more
>>>> > [INFO] ------------------------------------------------------------------------
>>>> > [INFO] BUILD FAILURE
>>>> > [INFO] ------------------------------------------------------------------------
>>>> > [INFO] Total time: 56:48.220s
>>>> > [INFO] Finished at: Wed May 29 16:58:17 BRT 2013
>>>> > [INFO] Final Memory: 11M/216M
>>>> > [INFO] ------------------------------------------------------------------------
>>>> >
>>>> > [ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.1.0:run (default-cli) on project index: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 240(Exit value: 240) -> [Help 1]
>>>> > [ERROR]
>>>> > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
>>>> > [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>>>> > [ERROR]
>>>> > [ERROR] For more information about the errors and possible solutions, please read the following articles:
>>>> > [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
>>>> >
>>>> > I will run it again with the -X switch and will send you the full output
>>>> > as soon as it finishes. Probably about an hour from now.
>>>> >
>>>> > I really appreciate your support.
>>>> >
>>>> > Cheers,
>>>> > Gabriel Oliveira
>>>>
>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> How ServiceNow helps IT people transform IT departments:
>>> 1. A cloud service to automate IT design, transition and operations
>>> 2. Dashboards that offer high-level views of enterprise services
>>> 3. A single system of record for all IT processes
>>> http://p.sf.net/sfu/servicenow-d2d-j
>>>
>>> _______________________________________________
>>> Dbpedia-developers mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/dbpedia-developers
>>>
>>>
>>
>>
>> --
>>
>> Pablo N. Mendes
>> http://pablomendes.com
>>
>>
>>
>>
>
>
>
>


-- 
Kontokostas Dimitris