Hi Neil,
On Thu, Mar 7, 2013 at 3:25 PM, Neil Ireson <[email protected]> wrote:
> Thanks for the replies,
>
> Unfortunately, it's not the subject==object bug. I had a look at the
> extraction code to see if I could find a fix, but it's written in a
> language I don't read. I have to say I think it's a shame that you chose
> Scala over Java, as there must be many more people who would contribute
> if it were written in a commonly used language such as Java. Anyway, I
> don't mean to rant, as you are doing an amazing job with DBpedia and I
> sincerely thank you for the hard work.
>
> In terms of contributing, I'm working on some code to clean up the skos
> category hierarchy; removing the cycles (not too hard) and multiple paths
> (quite a pain).
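
For the cycle part, a minimal sketch of one possible approach, assuming the skos:broader links have already been loaded as (narrower, broader) pairs. The function name find_back_edges is hypothetical and not part of the extraction framework. It reports the back edges found by a depth-first search; dropping those edges leaves an acyclic graph, but which edge gets reported depends on traversal order, so this is an arbitrary choice and not a heuristic for the "most appropriate" edge.

```python
def find_back_edges(edges):
    """Return edges whose removal leaves the directed graph acyclic.

    edges: iterable of (narrower, broader) pairs (hypothetical input format).
    The result depends on traversal order; it is not a quality heuristic.
    """
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = dict.fromkeys(graph, WHITE)
    back_edges = []

    for root in graph:
        if colour[root] != WHITE:
            continue
        colour[root] = GREY
        # Iterative DFS: each stack entry keeps its own child iterator,
        # so the scan resumes where it left off after a child is finished.
        stack = [(root, iter(graph[root]))]
        while stack:
            node, children = stack[-1]
            for child in children:
                if colour[child] == GREY:
                    # child is on the current path, so this edge closes a cycle
                    back_edges.append((node, child))
                elif colour[child] == WHITE:
                    colour[child] = GREY
                    stack.append((child, iter(graph[child])))
                    break
            else:
                # all children handled: node is finished
                colour[node] = BLACK
                stack.pop()
    return back_edges
```

A self-referential link (subject==object) shows up here as a back edge from a node to itself, so the same pass would also flag those.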
>
> I'll identify obvious errors (such as the self-referential
> subject==object) and possibly will use these to correct Wikipedia directly,
> and hence DBpedia (eventually).
>
> For the less obvious errors I'm trying to develop heuristics to select the
> most appropriate edge to remove.
>
> If all goes well this might end up as a paper with a proper evaluation but
> I was wondering if you'd also be interested in potentially incorporating
> the outcome into DBpedia, maybe providing a cleaned_skos_categories_en.nt?
>
We are always interested in new ways to improve the DBpedia Framework and
you are very welcome to contribute :)
Best,
Dimitris
>
> Neil
>
>
>
>
> ------------------------------
> From: [email protected]
> Date: Thu, 7 Mar 2013 13:20:36 +0200
> To: [email protected]
> CC: [email protected]
> Subject: Re: [Dbpedia-discussion] Duplicates in skos_categories_en.nt
>
>
> Yup, I got it wrong :)
> The titles are so similar that I mistook this for the subject==object
> bug.
>
> sorry!
> Dimitris
>
>
> On Thu, Mar 7, 2013 at 1:10 PM, Jona Christopher Sahnwaldt <[email protected]> wrote:
>
> Hi Neil, Dimitris,
>
> if I understand Neil correctly, he means that some triples are
> duplicated. For example, the triple
>
> <http://dbpedia.org/resource/Category:10th-century_Asian_people>
> <http://www.w3.org/2004/02/skos/core#broader>
> <http://dbpedia.org/resource/Category:10th_century_in_Asia> .
>
> appears twice in the file skos_categories_en.nt . Neil, is that correct?
>
> On the other hand, if I correctly understand the patch that Dimitris
> pointed to, it excludes triples where subject and object URI are
> identical, which is a different problem.
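
To make the difference between the two problems concrete, here is a rough sketch that checks for both in one pass. It assumes every line holds three <URI> terms without embedded spaces, as in the examples quoted in this thread; literals and blank nodes are not handled. The function names are hypothetical.

```python
from collections import Counter


def parse_triple(line):
    """Split a simple N-Triples line of three URI terms into (s, p, o).

    Assumption: terms contain no spaces; returns None for anything else.
    """
    parts = line.strip().rstrip(".").split()
    if len(parts) != 3:
        return None
    return tuple(parts)


def find_duplicates_and_self_refs(lines):
    """Return (duplicates, self_refs) for an iterable of N-Triples lines.

    duplicates: dict mapping each triple seen more than once to its count.
    self_refs:  list of distinct triples whose subject equals their object.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        triple = parse_triple(line)
        if triple is not None:
            counts[triple] += 1

    duplicates = {t: n for t, n in counts.items() if n > 1}
    self_refs = [t for t in counts if t[0] == t[2]]
    return duplicates, self_refs
```

Passing an open file handle for skos_categories_en.nt as `lines` should be enough to run it over the dump. A triple can be duplicated without being self-referential and vice versa, which is why a fix for one does not address the other.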
>
> Cheers,
> JC
>
>
> On Thu, Mar 7, 2013 at 9:47 AM, Dimitris Kontokostas
> <[email protected]> wrote:
> > Hi Neil,
> >
> > Thanks for the bug report. We already fixed that [1], but the fix will
> > only show up in the next release.
> >
> > Best,
> > Dimitris
> >
> > [1] https://github.com/dbpedia/extraction-framework/commit/2cb7d621b45cf07c1c59638e0c2cc3fc71fa0cbb
> >
> >
> > On Wed, Mar 6, 2013 at 11:30 PM, Neil Ireson <[email protected]> wrote:
> >>
> >> Hi all,
> >>
> >> I'm just doing some processing of the skos_categories_en.nt file and
> >> discovered there are a number of duplicate triples, 1937 in total.
> >>
> >> For example the following are the (lexicographically) first duplicated
> >> lines:
> >>
> >> <http://dbpedia.org/resource/Category:10th-century_Asian_people>
> >> <http://www.w3.org/2004/02/skos/core#broader>
> >> <http://dbpedia.org/resource/Category:10th_century_in_Asia> .
> >> <http://dbpedia.org/resource/Category:11th-century_Asian_people>
> >> <http://www.w3.org/2004/02/skos/core#broader>
> >> <http://dbpedia.org/resource/Category:11th_century_in_Asia> .
> >> <http://dbpedia.org/resource/Category:12th-century_Asian_people>
> >> <http://www.w3.org/2004/02/skos/core#broader>
> >> <http://dbpedia.org/resource/Category:12th_century_in_Asia> .
> >> <http://dbpedia.org/resource/Category:13th-century_Asian_people>
> >> <http://www.w3.org/2004/02/skos/core#broader>
> >> <http://dbpedia.org/resource/Category:13th_century_in_Asia> .
> >> <http://dbpedia.org/resource/Category:13th-century_Byzantine_people>
> >> <http://www.w3.org/2004/02/skos/core#broader>
> >> <http://dbpedia.org/resource/Category:Byzantine_people_by_century> .
> >> <http://dbpedia.org/resource/Category:13th-century_writers>
> >> <http://www.w3.org/2004/02/skos/core#broader>
> >> <http://dbpedia.org/resource/Category:13th-century_people_by_occupation> .
> >> <http://dbpedia.org/resource/Category:14th-century_Asian_people>
> >> <http://www.w3.org/2004/02/skos/core#broader>
> >> <http://dbpedia.org/resource/Category:14th_century_in_Asia> .
> >>
> >> Is this a bug, or to be expected?
> >>
> >> N
> >>
> >>
> >>
> >>
> ------------------------------------------------------------------------------
> >> Symantec Endpoint Protection 12 positioned as A LEADER in The Forrester
> >> Wave(TM): Endpoint Security, Q1 2013 and "remains a good choice" in the
> >> endpoint security space. For insight on selecting the right partner to
> >> tackle endpoint security challenges, access the full report.
> >> http://p.sf.net/sfu/symantec-dev2dev
> >> _______________________________________________
> >> Dbpedia-discussion mailing list
> >> [email protected]
> >> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
> >>
> >
> >
> >
> > --
> > Dimitris Kontokostas
> > Department of Computer Science, University of Leipzig
> > Research Group: http://aksw.org
> > Homepage:http://aksw.org/DimitrisKontokostas
> >
> >
> >
>
>
>
>
>
>
> --
> Dimitris Kontokostas
> Department of Computer Science, University of Leipzig
> Research Group: http://aksw.org
> Homepage:http://aksw.org/DimitrisKontokostas
>
>
>
>
>
--
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage:http://aksw.org/DimitrisKontokostas
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion