Hi,

On Tue, Jul 19, 2011 at 03:19, Tommy Chheng <[email protected]> wrote:
> I'm trying to use the WikiParser to determine the category list of a
> wikipedia page.

You can use org.dbpedia.extraction.mappings.ArticleCategoriesExtractor
[1] for this task. It extracts triples with dc:subject as predicate.


> The category tags are represented as TextNode objects but when I print out
> the toWikiText, I get an empty string. Should categories be "TextNodes" and
> if so, what's the correct way to extract the category name from the wikipage?

The category tags are actually InternalLinkNodes, not TextNodes. That may
be what went wrong in the code you provided.
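
To illustrate the idea, here is a rough sketch of walking a parsed page and
picking out the links that point into the Category namespace. The Node,
TextNode, InternalLinkNode and PageNode classes below are simplified
stand-ins written for this example, not the framework's real classes (those
live in org.dbpedia.extraction.wikiparser and carry more fields), and
categoryNames is a hypothetical helper. Treat it as an outline of the
approach rather than code to paste in.

```scala
// Simplified stand-ins for the parser's node classes (hypothetical;
// the real framework classes have different constructors and more fields).
sealed trait Node { def children: List[Node] }

case class TextNode(text: String) extends Node {
  val children: List[Node] = Nil
}

// In this sketch the link target is just a namespace name plus a title.
case class InternalLinkNode(namespace: String,
                            title: String,
                            children: List[Node] = Nil) extends Node

case class PageNode(children: List[Node]) extends Node

object CategorySketch {
  // Recursively collect the titles of all internal links whose target
  // is in the "Category" namespace; every other node is searched via
  // its children.
  def categoryNames(node: Node): List[String] = node match {
    case InternalLinkNode("Category", title, _) => List(title)
    case other => other.children.flatMap(categoryNames)
  }

  def main(args: Array[String]): Unit = {
    val page = PageNode(List(
      TextNode("Some article text. "),
      InternalLinkNode("Main", "Berlin", List(TextNode("Berlin"))),
      InternalLinkNode("Category", "Capitals in Europe"),
      InternalLinkNode("Category", "Cities in Germany")
    ))
    categoryNames(page).foreach(println)
  }
}
```

The point is only that the category name comes from the link's target, not
from any text child, which would explain an empty string when the node is
treated as plain text.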


Cheers,
Max

[1] http://dbpedia.hg.sourceforge.net/hgweb/dbpedia/extraction_framework/file/3ea1a79638a1/core/src/main/scala/org/dbpedia/extraction/mappings/ArticleCategoriesExtractor.scala

_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion