That should be a couple of CLI commands.
Select all predicates that are not subjects in the dataset file.
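
If a script is easier than shell one-liners, here is a minimal sketch of the same idea in Python. It rests on assumptions of mine, not on anything confirmed in this thread: that the dataset file is N-Triples, that the category hierarchy is expressed with skos:broader (subject = narrower category, object = broader category), and that a "leaf" is a category that occurs as a subject of such a triple but never as an object. The names leaf_categories and sample are made up for illustration.

```python
import re

# Matches one N-Triples statement whose three terms are all IRIs.
TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$')

# Assumption: the hierarchy is encoded with skos:broader.
BROADER = "http://www.w3.org/2004/02/skos/core#broader"


def leaf_categories(lines):
    """Return categories that have a parent but are never a parent themselves."""
    children, parents = set(), set()
    for line in lines:
        m = TRIPLE.match(line.strip())
        if m is None or m.group(2) != BROADER:
            continue  # skip comments, literal objects, and other predicates
        children.add(m.group(1))  # subject: the narrower category
        parents.add(m.group(3))   # object: the broader category
    return children - parents


# Tiny hand-made sample: B and C sit under A, and A sits under Root.
sample = [
    '# a comment line',
    '<http://dbpedia.org/resource/Category:A> <http://www.w3.org/2004/02/skos/core#broader> <http://dbpedia.org/resource/Category:Root> .',
    '<http://dbpedia.org/resource/Category:B> <http://www.w3.org/2004/02/skos/core#broader> <http://dbpedia.org/resource/Category:A> .',
    '<http://dbpedia.org/resource/Category:C> <http://www.w3.org/2004/02/skos/core#broader> <http://dbpedia.org/resource/Category:A> .',
    '<http://dbpedia.org/resource/Category:A> <http://www.w3.org/2004/02/skos/core#prefLabel> "A"@en .',
]

for category in sorted(leaf_categories(sample)):
    print(category)
```

For a real file, pass an open file handle instead of sample; the function reads it line by line and keeps only two sets of IRIs in memory, so a database may not be needed for this particular step.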
----
Sent from my mobile
On 28 Jun 2013 at 11:26 AM, "kasun perera" <
[email protected]> wrote:
> Hi Alessio
>
> On Thu, Jun 27, 2013 at 1:42 PM, Alessio Palmero Aprosio
> <[email protected]> wrote:
>
>> Dear Kasun,
>> I had to deal with the same problem some months ago,
>>
>
> Just curious how you stored the edge and vertex relationships
> when processing the categories.
> In-memory processing would be difficult since there is a huge number
> of edges and vertices, so I think it's good to store them in a database.
> I have heard about graph databases [1], but haven't worked with them. Did
> you use something like that, or a simple MySQL database?
>
> [1] http://en.wikipedia.org/wiki/Graph_database
>
>
>> and I managed to use the XML article file: you can intercept categories
>> using the "Category:" prefix, and you can infer father-son relation using
>> the <title> tag (if the <title> starts with "Category:", all the categories
>> for this page are possible ancestors).
>> The Wikipedia category taxonomy is quite a mess, so good luck!
>>
>> Alessio
>>
>>
>> On 27/06/13 05:24, kasun perera wrote:
>>
>> As discussed with Marco, these are the next tasks that I will be
>> working on.
>>
>> 1. Identification of leaf categories
>> 2. Prominent leaves discovery
>> 3. Pages clustering based on prominent leaves
>>
>> For task 1 above, I'm planning to use the Wikipedia category and
>> category_links SQL tables available here:
>> http://dumps.wikimedia.org/enwiki/20130604/
>>
>> The above dump files are somewhat large: 20 MB and 1.2 GB in size,
>> respectively.
>> I'm thinking of putting this data into a MySQL database and doing the
>> processing there rather than processing these files in memory. Also, the
>> number of leaf categories and prominent nodes would be large, so they
>> would need to be pushed to MySQL tables.
>>
>> I want to know whether this code should be written under the
>> extraction-framework code; if so, where should I plug it in?
>> Or would it be a good idea to write it separately and push it to a new
>> repo? If I write it separately, can I use a language other than Scala?
>>
>>
>> --
>> Regards
>>
>> Kasun Perera
>>
>>
>>
>> ------------------------------------------------------------------------------
>> This SF.net email is sponsored by Windows:
>>
>> Build for Windows Store.
>> http://p.sf.net/sfu/windows-dev2dev
>>
>>
>>
>> _______________________________________________
>> Dbpedia-developers mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/dbpedia-developers
>>
>>
>>
>
>
> --
> Regards
>
> Kasun Perera
>
>
>
>
>