Hey Pablo, Pedro,

I have a Semantic Web class at my Dutch university, for which I will need
to do a project. So I may help out with this and use the opportunity to
straighten out the DB-backed indexing process a bit.

I already wrote a mail to Pedro.

Best,
Jo

On Wed, Sep 19, 2012 at 5:08 PM, Pablo N. Mendes <[email protected]> wrote:

> Hi Pedro,
> Whenever you get all the DBpedia datasets you need, it's time to run
> DBpedia Spotlight indexing.
>
> You will be glad to know that we've been working on a step-by-step guide
> to build DBpedia Spotlight for other languages. Check this out:
>
> https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Internationalization
>
> You should also coordinate with Dimitris Kontokostas, who has so far been
> my contact for the Dutch DBpedia and whom I have been including in the
> DBpedia Spotlight i18n thread. Perhaps you can help each other out.
>
> We hope to have a much improved (more automated) indexing process in the
> next couple of days, so keep in touch.
> Please join dbp-spotlight-users for questions about DBpedia Spotlight.
> https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users
>
> Cheers,
> Pablo
>
>
> On Wed, Sep 19, 2012 at 4:45 PM, Max Jakob <[email protected]> wrote:
>
>> Hi,
>>
>> On Wed, Sep 19, 2012 at 3:46 PM, Pedro Debevere <[email protected]>
>> wrote:
>> > I’m interested in creating a Dutch port of DBpedia Spotlight. In order to do
>> > this, I need a disambiguation data set for Dutch. This data set is currently
>> > not available for download. However, based on some messages posted here [1],
>> > I suspect that the latest version of the extraction framework supports this.
>> > Is this correct?
>>
>> Generally yes, if all names of disambiguation templates are specified
>> in [4]. Please also note that there seems to be an issue with multiple
>> names for disambiguation page titles in Dutch. See the TODO in [5].
>>
>>
>> > As a workaround I downloaded and unpacked the nl-pages-articles.xml file myself
>>
>> On your first attempt, it looks like something went wrong during the
>> download, so downloading and unpacking the dump yourself was a good idea.
>>
>>
>> > Message: expected <mediawiki> with namespace
>> > [http://www.mediawiki.org/xml/export-0.6/], found
>> > [http://www.mediawiki.org/xml/export-0.7/]
>>
>> Wikipedia seems to have changed its export format version from 0.6 to
>> 0.7. The DBpedia parser should still be able to parse the dump,
>> assuming the changes mentioned in [6]. You can try to switch to the
>> dump branch (currently the stable one) and change the line in [7] to
>>
>>   private final String _namespace = "http://www.mediawiki.org/xml/export-0.7/";
>>
>> and try again. (Run  mvn clean install  in the project root first.)
>>
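Before patching the parser, it can help to confirm which export namespace a
dump actually declares. The following is only a minimal sketch, not code from
the DBpedia extraction framework: the class name DumpNamespaceCheck, its
methods, and the set of accepted versions are assumptions made for
illustration. It reads the root element with the standard StAX API and checks
its namespace against that set.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of the DBpedia extraction framework.
public class DumpNamespaceCheck {

    // Assumption: the export versions we are willing to accept.
    private static final Set<String> ACCEPTED = new HashSet<>(Arrays.asList(
            "http://www.mediawiki.org/xml/export-0.6/",
            "http://www.mediawiki.org/xml/export-0.7/"));

    /** Returns the namespace URI of the root element (null if it has none). */
    public static String rootNamespace(Reader xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        try {
            // Skip comments, whitespace, etc. until the first start tag.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getNamespaceURI();
                }
            }
            throw new XMLStreamException("no root element found");
        } finally {
            reader.close();
        }
    }

    /** True if the namespace is one of the versions listed in ACCEPTED. */
    public static boolean isAccepted(String namespace) {
        return ACCEPTED.contains(namespace);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the first bytes of a real dump file.
        String sample = "<mediawiki xmlns=\"http://www.mediawiki.org/xml/export-0.7/\"></mediawiki>";
        String ns = rootNamespace(new StringReader(sample));
        System.out.println(ns + " accepted: " + isAccepted(ns));
    }
}
```

For a real dump, a java.io.FileReader on the unpacked nl-pages-articles.xml
could be passed in place of the StringReader; only the start of the file is
read before the method returns.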
>>
>> Cheers,
>> Max
>>
>> [4]
>> http://dbpedia.hg.sourceforge.net/hgweb/dbpedia/extraction_framework/file/2a322c5c6692/core/src/main/scala/org/dbpedia/extraction/wikiparser/impl/wikipedia/Disambiguation.scala#l165
>> [5]
>> http://dbpedia.hg.sourceforge.net/hgweb/dbpedia/extraction_framework/file/2a322c5c6692/core/src/main/scala/org/dbpedia/extraction/config/mappings/DisambiguationExtractorConfig.scala#l16
>> [6] http://www.mediawiki.org/xml/export-0.7.xsd
>> [7]
>> http://dbpedia.hg.sourceforge.net/hgweb/dbpedia/extraction_framework/file/2a322c5c6692/core/src/main/java/org/dbpedia/extraction/sources/WikipediaDumpParser.java#l74
>>
>>
>> ------------------------------------------------------------------------------
>> Live Security Virtual Conference
>> Exclusive live event will cover all the ways today's security and
>> threat landscape has changed and how IT managers can respond. Discussions
>> will include endpoint security, mobile security and the latest in malware
>> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
>> _______________________________________________
>> Dbpedia-discussion mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>>
>
>
>
> --
> ---
> Pablo N. Mendes
> http://pablomendes.com
> Events: http://wole2012.eurecom.fr
>
_______________________________________________
Dbp-spotlight-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users
