On 11/2/2011 1:38 PM, Dario wrote:
> Hello,
>
> for research purposes I'm interested in differentiating DBpedia entities
> into two types, those which are actual existing elements (i.e., things
> you can see and touch) and generalizations of elements (i.e., things
> which are abstractions of existing elements).
> Examples of the first ones could be: John Turturro, the Golden Gate
> Bridge, the Enola Gay bomber.
> Examples of generalizations could be: Football, the femur, Boeing B-29
> Superfortress bomber.
This is a great research topic, and it's also of considerable
commercial importance. If one were interested in converting DBpedia
facts to text or creating a user interface, it would be good to know
whether a topic is "abstract" or "concrete".
One option is to define classes (:SomethingThatCanHaveAMember) or
(:SomethingThatIsOnlyAnInstance); another is to define a numerical
abstract/concrete score. People tend to be very concrete, but we have
no idea who :D.B._Cooper was or what his fate was. :Captain_Kirk is
more abstract than :William_Shatner. When dealing with the more
difficult cases, a numeric score might be the best you can do.
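
To make the numeric-score idea a bit more tangible, here is a minimal
Python sketch. It is only one possible scoring rule, not something
DBpedia provides: the function name, the voting scheme, and the two
type sets are my own assumptions. It scores a topic by the fraction of
its recognized types that you have judged concrete.

```python
def concreteness_score(types, concrete_types, abstract_types):
    """Return a score in [0, 1], or None if no type is recognized.

    1.0 means every recognized type is in concrete_types,
    0.0 means every recognized type is in abstract_types.
    Both type sets are assumed to be hand-curated by you.
    """
    votes = []
    for t in types:
        if t in concrete_types:
            votes.append(1.0)
        elif t in abstract_types:
            votes.append(0.0)
        # Types in neither set carry no evidence and are skipped.
    if not votes:
        return None
    return sum(votes) / len(votes)


# Illustrative type names only; pick your own sets.
CONCRETE = {"dbo:Person", "dbo:Place"}
ABSTRACT = {"dbo:FictionalCharacter"}

score = concreteness_score(["dbo:Person", "dbo:FictionalCharacter"],
                           CONCRETE, ABSTRACT)
```

A topic with mixed evidence ends up somewhere between 0 and 1, which
is the kind of answer you want for the hard cases.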
> After reading some documentation on the DBpedia, including the latest
> article published, it looks to me that such difference is never made.
> Furthermore I wonder if it is even possible to make that difference
> based on the information available. Unfortunately, I do not know enough
> about the Wikipedia semantics to answer that.
>
> The only solution I can think of is manually tagging entities. That
> could be facilitated by grouping elements (e.g., every entity of class
> Person is an existing entity). However, other classes would require
> individual treatment.
>
> So my questions are these:
> -Is there a difference in DBpedia between existing entities and general
> entities?
> -Is there information available in Wikipedia to make such a distinction?
> -Based on the DBpedia, is there any other method beyond manual tagging
> to make that difference?
> -Of the DBpedia Ontology, which classes could be considered as holding
> existing entities? Person, Place, Planet, Work, ...?
>
> I know it is quite an abstract question, and not fully related to the
> technical aspects of DBpedia, but I think this is the place to ask.
>
I think the strategy of starting with types and then refining
the results is best. You could probably get a large majority of topics
properly typed, particularly if you use type information from
Freebase, which is more accurate and comprehensive than DBpedia types.
The hard ones are going to be the things that fall through the cracks in
the type system, like

http://dbpedia.org/page/Fire

but note that Freebase has 18 types for this topic, so you're not
without hope:

http://www.freebase.com/edit/topic/en/fire

Maybe it's a fair guess to say that "things that fall through the
cracks" are abstract.
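
The type-first strategy could be sketched roughly like this in Python.
Treat it as an illustration under assumptions: the assignment of
classes to the two sets is my guess, not an established mapping, and
the "dbo:" names are meant as DBpedia Ontology classes that you should
check against the ontology before relying on them.

```python
# Assumed, hand-picked mapping; extend and correct it for real use.
CONCRETE_TYPES = {"dbo:Person", "dbo:Place", "dbo:Planet"}
ABSTRACT_TYPES = {"dbo:Sport", "dbo:AnatomicalStructure"}


def classify_by_types(types, fallthrough="abstract"):
    """Label a topic "concrete" or "abstract" from its type list.

    Topics whose types match neither set "fall through the cracks"
    and get the fallthrough label (the guess above is "abstract").
    """
    types = set(types)
    if types & CONCRETE_TYPES:
        return "concrete"
    if types & ABSTRACT_TYPES:
        return "abstract"
    return fallthrough


label_person = classify_by_types(["dbo:Person", "dbo:Agent"])
label_untyped = classify_by_types([])
```

Checking the concrete set first is a design choice: it assumes that a
topic carrying both kinds of type is more likely a concrete thing.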
I say: try the obvious thing with types, then do some evaluation. If
you're not happy with it, maybe you'll think of another heuristic
(traditional knowledge engineering) or maybe you can train a machine
learning algorithm to make the distinction. Evaluate again and repeat
until you've got enough for a paper... or a product that's "good enough
to use".
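
The evaluation step can be as simple as comparing your labels with a
small hand-tagged sample. A minimal sketch follows; the gold labels are
Dario's own examples, while the predictions dict is made up here to
stand in for whatever heuristic or learned model you end up with.

```python
def evaluate(predict, gold):
    """Compare predict(topic) with hand-assigned gold labels.

    Returns (accuracy, errors), where errors is a list of
    (topic, predicted_label, gold_label) for every mismatch.
    """
    errors = []
    for topic, gold_label in gold.items():
        predicted = predict(topic)
        if predicted != gold_label:
            errors.append((topic, predicted, gold_label))
    accuracy = 1 - len(errors) / len(gold)
    return accuracy, errors


# Gold labels taken from the examples in the original question.
gold = {
    "John_Turturro": "concrete",
    "Golden_Gate_Bridge": "concrete",
    "Enola_Gay": "concrete",
    "Football": "abstract",
    "Femur": "abstract",
    "Boeing_B-29_Superfortress": "abstract",
}

# Hypothetical classifier output, with one deliberate mistake.
predictions = dict(gold, Enola_Gay="abstract")

accuracy, errors = evaluate(predictions.get, gold)
```

The error list matters more than the accuracy figure, since it tells
you which heuristic to try on the next round.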
I'd love to see a Turtle file published with these classifications
because I could use them.
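
For what it's worth, writing such a Turtle file takes only a few lines
of Python. In this sketch the "ex:" namespace and the ex:Concrete and
ex:Abstract classes are placeholders I invented, not an existing
vocabulary, and the topic names are assumed to be DBpedia local names
that are already safe to put inside an IRI.

```python
DBR = "http://dbpedia.org/resource/"
EX = "http://example.org/abstractness#"  # hypothetical namespace


def to_turtle(labels):
    """Serialize {topic_name: "concrete" | "abstract"} as Turtle.

    Full IRIs in angle brackets are used for subjects, so local
    names with dots or parentheses need no prefixed-name escaping.
    """
    lines = ["@prefix ex: <" + EX + "> .", ""]
    for name, label in sorted(labels.items()):
        cls = "ex:Concrete" if label == "concrete" else "ex:Abstract"
        lines.append("<" + DBR + name + "> a " + cls + " .")
    return "\n".join(lines) + "\n"


turtle_text = to_turtle({"Fire": "abstract",
                         "William_Shatner": "concrete"})
```

Writing turtle_text to a file gives you something that any Turtle
parser should be able to load.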
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion