Hi Kasun,

Why don't you use the hierarchy specified in [1]?

Wikipedia categories are already organized using the skos:broader
property [2] (see [3]).
I think that should be enough to build a graph of category links.
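
As a rough illustration of that idea, here is a minimal sketch (not part of
the extraction framework) that reads N-Triples lines, keeps only the
skos:broader statements, and derives leaf categories. The function names
and the simplified regular expression are my own assumptions, and "leaf"
here means a category that has a broader category but is never the broader
category of another one.

```python
import re
from collections import defaultdict

# Property URI of skos:broader in the SKOS core vocabulary.
SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"

# Simplified N-Triples pattern: only matches triples whose subject,
# predicate and object are all URIs (enough for category links).
TRIPLE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.$")


def build_broader_graph(lines):
    """Map each category URI to the set of its broader (parent) categories."""
    broader = defaultdict(set)
    for line in lines:
        match = TRIPLE.match(line.strip())
        if match is None or match.group(2) != SKOS_BROADER:
            continue  # skip comments, labels, and other predicates
        child, _, parent = match.groups()
        broader[child].add(parent)
    return broader


def leaf_categories(broader):
    """Categories that never appear as the broader category of another one."""
    parents = set()
    for parent_set in broader.values():
        parents |= parent_set
    return set(broader) - parents
```

For the real dump, the lines could be streamed with
`bz2.open(path, "rt", encoding="utf-8")` instead of loading the whole file
into memory.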

Moreover, if I remember correctly, the category system is not a DAG, since
there are some cycles (at least in the Wikipedia version from which DBpedia
3.8 was extracted).
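
Whether a given category graph contains cycles can be checked directly. The
sketch below is one possible approach, an iterative depth-first search with
the usual three node states; the function name and the input shape (a dict
mapping each category to its broader categories) are assumptions for
illustration only.

```python
def find_cycle(graph):
    """Return one cycle as a list of nodes (first node repeated at the end),
    or None if the graph is acyclic.

    `graph` maps each node to an iterable of its successor nodes.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}
    for start in graph:
        if color.get(start, WHITE) != WHITE:
            continue
        color[start] = GREY
        path = [start]                       # nodes on the current DFS path
        stack = [iter(graph.get(start, ()))]  # one successor iterator per path node
        while stack:
            for nxt in stack[-1]:
                state = color.get(nxt, WHITE)
                if state == GREY:
                    # Back edge: nxt is already on the current path.
                    return path[path.index(nxt):] + [nxt]
                if state == WHITE:
                    color[nxt] = GREY
                    path.append(nxt)
                    stack.append(iter(graph.get(nxt, ())))
                    break
            else:
                # All successors explored: this node is finished.
                color[path.pop()] = BLACK
                stack.pop()
    return None
```

An iterative search is used instead of recursion because deep category
chains could otherwise exceed Python's recursion limit.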

Regards
Andrea

[1] http://wiki.dbpedia.org/Downloads38#categories-skos
[2] http://www.w3.org/2009/08/skos-reference/skos.html#broader
[3] http://downloads.dbpedia.org/preview.php?file=3.8_sl_en_sl_skos_categories_en.nt.bz2


2013/6/27 kasun perera <[email protected]>

> As discussed with Marco, these are the next tasks I will be working on.
>
> 1. Identification of leaf categories
> 2. Prominent leaves discovery
> 3. Pages clustering based on prominent leaves
>
> For task 1 above, I'm planning to use the Wikipedia category and
> category_links SQL tables available here:
> http://dumps.wikimedia.org/enwiki/20130604/
>
> The dump files are fairly large, about 20 MB and 1.2 GB respectively.
> I'm thinking of loading this data into a MySQL database and doing the
> processing there rather than processing the files in memory. Also, the
> number of leaf categories and prominent nodes would be large, so they
> would need to be pushed to MySQL tables.
>
> I want to know whether this code should be written under the
> extraction-framework code; if so, where should I plug it in?
> Or is it a good idea to write it separately and push it to a new repo?
> If I write it separately, can I use a language other than Scala?
>
>
> --
> Regards
>
> Kasun Perera
>
>
>
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by Windows:
>
> Build for Windows Store.
>
> http://p.sf.net/sfu/windows-dev2dev
> _______________________________________________
> Dbpedia-developers mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbpedia-developers
>
>