Hi Dimitris.
Thank you.
I checked the pt_namespace and it seems OK. I'll use it.
Cheers,
Jairo
On Tue, Jun 4, 2013 at 3:20 PM, Dimitris Kontokostas <[email protected]> wrote:
> Hi Jairo,
>
> I saw that JC updated the configuration [1], but you need to check out the
> 'dump' branch to get it; it is not yet merged into master.
>
> Cheers,
> Dimitris
>
>
> [1]
> https://github.com/dbpedia/extraction-framework/commit/f5f146f5c5b08c8087fa1a6cdf134a77f0f1c972
>
>
> On Tue, Jun 4, 2013 at 9:05 PM, Jairo Francisco de Souza <
> [email protected]> wrote:
>
>> Hi Max and Pablo,
>> Thanks for your help.
>>
>> I checked the parser and the Module namespace is not configured for
>> Portuguese [1, line 225].
>> However, it isn't clear to me how to configure a new namespace or how to
>> run GenerateWikiConfig.scala, since we shouldn't modify the
>> Namespaces.scala file directly. Can you point me to a template or a wiki page
>> with instructions?
>>
>> Best,
>> Jairo
>>
>> [1]
>> https://github.com/dbpedia/extraction-framework/blob/master/core/src/main/scala/org/dbpedia/extraction/wikiparser/impl/wikipedia/Namespaces.scala#L225
>>
>>
>> On Tue, Jun 4, 2013 at 12:14 PM, Pablo N. Mendes
>> <[email protected]> wrote:
>>
>>> Hi Gabriel,
>>> Max has given you some pretty clear and direct pointers to where you
>>> should start looking. If you have already tried a bunch of things and still
>>> cannot find the problem, please describe what you have tried, ask clear and
>>> direct questions, and we can try to support you to the best of our
>>> availability.
>>>
>>> Cheers,
>>> Pablo
>>>
>>>
>>> On Tue, Jun 4, 2013 at 1:34 AM, Gabriel Oliveira
>>> <[email protected]> wrote:
>>>
>>>> Hello Max,
>>>>
>>>> I'm working with Portuguese. I'm not sure if I understand what's going
>>>> on and how to solve it.
>>>>
>>>> Cheers,
>>>> Gabriel Oliveira
>>>>
>>>>
>>>> 2013/5/29 Max Jakob <[email protected]>
>>>>
>>>>> Hi Gabriel, CCing dbpedia-developers list,
>>>>>
>>>>> this looks like a problem with the DBpedia parser, so it is not
>>>>> directly a Spotlight problem. It's related to the (fairly new?) Module
>>>>> namespace in Wikipedia [1] that is not handled by the parser for all
>>>>> languages yet [2]. I assume you are not working with English, French
>>>>> or Hungarian dumps. For all other languages, the Module namespace is
>>>>> not configured yet. Which language are you working with?
>>>>> If you understand what's going on, you can add the appropriate
>>>>> configuration yourself and send a pull request on GitHub to the
>>>>> extraction-framework repo [3]. Otherwise, the developer community
>>>>> might be able to help you.
>>>>> After this is corrected, install the extraction-framework in your
>>>>> local Maven repo by running mvn clean install. Afterwards, do the same
>>>>> for Spotlight again. Finally, re-attempt to run the indexing.
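
To illustrate the `key not found: 828` error from the stack trace quoted further down, here is a minimal, self-contained Java sketch. It is not the extraction-framework code: the class `NamespaceLookupSketch`, its helper methods, and the table entries are hypothetical stand-ins for the per-language tables in Namespaces.scala [2]. It only shows why a strict lookup of namespace code 828 (Module) throws until the language's table contains an entry for it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

public class NamespaceLookupSketch {

    // Hypothetical per-language table: namespace code -> namespace name.
    // The entries are placeholders, not the real generated configuration.
    static Map<Integer, String> namespaces(boolean withModule) {
        Map<Integer, String> names = new HashMap<>();
        names.put(0, "");            // main (article) namespace
        names.put(14, "Categoria");  // category namespace
        if (withModule) {
            names.put(828, "Module"); // placeholder name for the Module namespace
        }
        return names;
    }

    // Strict lookup that fails on an unknown code, mimicking the behaviour
    // of a Scala Map's apply() seen in the stack trace.
    static String lookup(Map<Integer, String> names, int code) {
        String name = names.get(code);
        if (name == null) {
            throw new NoSuchElementException("key not found: " + code);
        }
        return name;
    }

    public static void main(String[] args) {
        try {
            lookup(namespaces(false), 828);
        } catch (NoSuchElementException e) {
            // Same symptom as in the reported run.
            System.out.println("without Module entry: " + e.getMessage());
        }
        System.out.println("with Module entry: " + lookup(namespaces(true), 828));
    }
}
```

In this sketch the dump page carrying code 828 is only readable once the table has the matching entry, which is why the fix belongs in the namespace configuration rather than in Spotlight.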
>>>>>
>>>>> Cheers,
>>>>> Max
>>>>>
>>>>> [1] http://en.wikipedia.org/wiki/Wikipedia:Namespace
>>>>> [2]
>>>>> https://github.com/dbpedia/extraction-framework/blob/master/core/src/main/scala/org/dbpedia/extraction/wikiparser/impl/wikipedia/Namespaces.scala
>>>>> [3] https://github.com/dbpedia/extraction-framework
>>>>>
>>>>>
>>>>> On Wed, May 29, 2013 at 10:27 PM, Gabriel Oliveira <
>>>>> [email protected]> wrote:
>>>>> > Hello Max,
>>>>> >
>>>>> > I did as you told me and I have managed to fix some problems. I am still
>>>>> > learning how to use Maven and IntelliJ, so I missed a few details and
>>>>> > it took me a while to realize that, but now I have made some
>>>>> > progress.
>>>>> >
>>>>> > Unlike the last attempts, though, it has now run for almost an hour
>>>>> > and has saved about 10800000 occurrences. However, after these occurrences
>>>>> > are saved an exception is thrown, and now I believe it is not my fault
>>>>> > anymore.
>>>>> > The output is as follows:
>>>>> >
>>>>> > INFO 2013-05-29 16:55:58,441 main [AllOccurrenceSource$] - Processed 1300000 Wikipedia definition pages (average 9.73 links per page)
>>>>> > INFO 2013-05-29 16:56:23,092 main [FileOccurrenceSource$] - saved 10600000 occurrences
>>>>> > INFO 2013-05-29 16:57:10,162 main [FileOccurrenceSource$] - saved 10700000 occurrences
>>>>> > INFO 2013-05-29 16:58:00,311 main [FileOccurrenceSource$] - saved 10800000 occurrences
>>>>> > java.lang.reflect.InvocationTargetException
>>>>> >
>>>>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>> > at java.lang.reflect.Method.invoke(Method.java:601)
>>>>> > at scala_maven_executions.MainHelper.runMain(MainHelper.java:164)
>>>>> > at scala_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
>>>>> > Caused by: java.util.NoSuchElementException: key not found: 828
>>>>> > at scala.collection.MapLike$class.default(MapLike.scala:225)
>>>>> > at scala.collection.immutable.HashMap.default(HashMap.scala:38)
>>>>> > at scala.collection.MapLike$class.apply(MapLike.scala:135)
>>>>> > at scala.collection.immutable.HashMap.apply(HashMap.scala:38)
>>>>> > at org.dbpedia.extraction.sources.WikipediaDumpParser.readPage(WikipediaDumpParser.java:218)
>>>>> > at org.dbpedia.extraction.sources.WikipediaDumpParser.readPages(WikipediaDumpParser.java:179)
>>>>> > at org.dbpedia.extraction.sources.WikipediaDumpParser.readDump(WikipediaDumpParser.java:137)
>>>>> > at org.dbpedia.extraction.sources.WikipediaDumpParser.run(WikipediaDumpParser.java:108)
>>>>> > at org.dbpedia.extraction.sources.XMLReaderSource.foreach(XMLSource.scala:57)
>>>>> > at org.dbpedia.spotlight.io.AllOccurrenceSource$AllOccurrenceSource.foreach(AllOccurrenceSource.scala:80)
>>>>> > at org.dbpedia.spotlight.filter.Filter$FilteredOccs.foreach(Filter.scala:58)
>>>>> > at org.dbpedia.spotlight.filter.Filter$FilteredOccs.foreach(Filter.scala:58)
>>>>> > at org.dbpedia.spotlight.filter.Filter$FilteredOccs.foreach(Filter.scala:58)
>>>>> > at org.dbpedia.spotlight.io.FileOccurrenceSource$.writeToFile(FileOccurrenceSource.scala:57)
>>>>> > at org.dbpedia.spotlight.lucene.index.ExtractOccsFromWikipedia$.main(ExtractOccsFromWikipedia.scala:82)
>>>>> > at org.dbpedia.spotlight.lucene.index.ExtractOccsFromWikipedia.main(ExtractOccsFromWikipedia.scala)
>>>>> > ... 6 more
>>>>> > [INFO] ------------------------------------------------------------------------
>>>>> > [INFO] BUILD FAILURE
>>>>> > [INFO] ------------------------------------------------------------------------
>>>>> > [INFO] Total time: 56:48.220s
>>>>> > [INFO] Finished at: Wed May 29 16:58:17 BRT 2013
>>>>> > [INFO] Final Memory: 11M/216M
>>>>> > [INFO] ------------------------------------------------------------------------
>>>>> >
>>>>> > [ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.1.0:run (default-cli) on project index: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 240(Exit value: 240) -> [Help 1]
>>>>> > [ERROR]
>>>>> > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
>>>>> > [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>>>>> >
>>>>> > [ERROR]
>>>>> > [ERROR] For more information about the errors and possible solutions, please read the following articles:
>>>>> > [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
>>>>> >
>>>>> > I will run it again with the -X switch and will send you the full output
>>>>> > as soon as it finishes. Probably about an hour from now.
>>>>> >
>>>>> > I really appreciate your support.
>>>>> >
>>>>> > Cheers,
>>>>> > Gabriel Oliveira
>>>>>
>>>>
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> How ServiceNow helps IT people transform IT departments:
>>>> 1. A cloud service to automate IT design, transition and operations
>>>> 2. Dashboards that offer high-level views of enterprise services
>>>> 3. A single system of record for all IT processes
>>>> http://p.sf.net/sfu/servicenow-d2d-j
>>>>
>>>> _______________________________________________
>>>> Dbpedia-developers mailing list
>>>> [email protected]
>>>> https://lists.sourceforge.net/lists/listinfo/dbpedia-developers
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Pablo N. Mendes
>>> http://pablomendes.com
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>
> --
> Kontokostas Dimitris
>