Hello Alex,

Alex wrote:
> Jens (or anyone else who might be able to help),
> 
> Sorry to bother you again... I was still wondering if you could give me
> some recommendations for creating the Navbox template extractor, as
> explained in my previous posts. I am quite keen to write such an
> extractor, but suspect it will be a significant amount of work and would
> like to know how to approach the task.

Sorry for the very long delay. Here are the steps you need to take (it
is not hard, but please ask me if I forget to mention an intermediate step):

* check out the latest DBpedia source code from Subversion [1]
* create a file $YourExtractor in /extraction/extractors, which
  implements the Extractor interface
  - the MediaWiki markup of an article page is passed to your extractor,
    so the implementation usually involves pattern matching etc.
  - the result is a set of RDF triples
* to test your extractor run /extraction/extract_test.php
  - before you do this, you'll have to specify article and extractor
    name in the file
  - the article will automatically be downloaded and your extractor will
    be executed within the framework [2]
  - check whether the returned RDF triples are what you expect
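
To make the pattern-matching step more concrete, here is a minimal sketch
of the idea: find the Navbox template in the article markup, collect the
wiki links inside it, and emit one triple per link. It is written in
Python purely as an illustration and does not implement the PHP Extractor
interface. The function names, the predicate URI and the sample markup
are all assumptions made up for this example, not part of the DBpedia
code base.

```python
import re

# Hypothetical constants for illustration only; the predicate URI is an assumption.
RESOURCE_NS = "http://dbpedia.org/resource/"
NAVBOX_PROPERTY = "http://dbpedia.org/property/navboxLink"

# Matches [[Target]], [[Target|label]] and [[Target#section|label]];
# group 1 is the link target without section or label.
LINK_RE = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def find_template(markup, name):
    """Return the body of the first {{name ...}} template, or None.

    Braces are counted so that nested templates inside the body do not
    end the match too early.
    """
    start = re.search(r"\{\{\s*" + re.escape(name) + r"\b", markup, re.IGNORECASE)
    if start is None:
        return None
    depth = 0
    i = start.start()
    while i < len(markup) - 1:
        pair = markup[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            i += 2
            if depth == 0:
                return markup[start.end():i - 2]
        else:
            i += 1
    return None  # unbalanced braces


def extract_navbox_triples(page_title, markup):
    """Return (subject, predicate, object) tuples for links in the Navbox."""
    body = find_template(markup, "Navbox")
    if body is None:
        return []
    subject = RESOURCE_NS + page_title.replace(" ", "_")
    triples = []
    for match in LINK_RE.finditer(body):
        target = match.group(1).strip()
        # Skip empty targets and namespaced links such as File: or Category:.
        if not target or ":" in target:
            continue
        triple = (subject, NAVBOX_PROPERTY, RESOURCE_NS + target.replace(" ", "_"))
        if triple not in triples:
            triples.append(triple)
    return triples


SAMPLE = """{{Navbox
|name = Example
|list1 = [[Alpha]] {{dot}} [[Beta Gamma|beta]] {{dot}} [[File:X.png]]
}}
Text with an [[Outside]] link."""

for triple in extract_navbox_triples("Some Page", SAMPLE):
    print(triple)
```

A real extractor would also need to handle the many Navbox variants and
redirects, so treat the above only as a starting point for the matching
logic.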

The PHP commands can/should be executed on the command line. If you want
to commit your extractor to DBpedia, please make sure to test it on a
sufficient number of articles first [3], then send me a message and I
can grant you the rights to commit it to SVN.

Thanks a lot for your interest in contributing to DBpedia!

Kind regards,

Jens

[1] http://sourceforge.net/svn/?group_id=190976
[2] http://wiki.dbpedia.org/Documentation
[3] As an alternative option for more complete testing, you can also
download the latest Wikipedia dumps via /importwiki/import.php and then
run your extractor on all articles using /extraction/extract_dataset.php.

-- 
Dipl. Inf. Jens Lehmann
Department of Computer Science, University of Leipzig
Homepage: http://www.jens-lehmann.org
GPG Key: http://jens-lehmann.org/jens_lehmann.asc


