On 11/2/2011 1:38 PM, Dario wrote:
> Hello,
>
> for research purposes I'm interested in differentiating DBpedia entities
> into two types, those which are actual existing elements (i.e., things
> you can see and touch) and generalizations of elements (i.e., things
> which are abstractions of existing elements).
> Examples of the first ones could be: John Turturro, the Golden Gate
> Bridge, the Enola Gay bomber.
> Examples of generalizations could be: Football, the femur, Boeing B-29
> Superfortress bomber.
     This is a great research topic, but it's also of considerable
commercial importance. If one were interested in converting DBpedia
facts to text or creating a user interface, it would be good to know
whether a topic is "abstract" or "concrete".

     One possibility is to define classes
(:SomethingThatCanHaveAMember) or (:SomethingThatIsOnlyAnInstance);
another is to define a numerical abstract/concrete score. People tend
to be very concrete, but we have no idea who :D.B._Cooper was or what
his fate was. :Captain_Kirk is more abstract than :William_Shatner.
When dealing with the more difficult cases, a numeric score might be
the best you can do.
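
To make the score idea concrete, here is a minimal Python sketch. The 0-to-1 scale, the 0.5 threshold, the `Abstractness` name and the two example scores are all my own assumptions for illustration; nothing like this exists in DBpedia.

```python
from dataclasses import dataclass


@dataclass
class Abstractness:
    """Hypothetical record pairing a topic with a numeric score."""
    topic: str
    score: float  # assumed scale: 0.0 = fully concrete, 1.0 = fully abstract

    def label(self, threshold: float = 0.5) -> str:
        # Collapse the score into one of two classes when a hard label is needed.
        return "abstract" if self.score >= threshold else "concrete"


# Made-up scores, chosen only so that the fictional character
# ranks as more abstract than the actor.
shatner = Abstractness(":William_Shatner", 0.05)
kirk = Abstractness(":Captain_Kirk", 0.60)

for item in (shatner, kirk):
    print(item.topic, item.score, item.label())
```

Keeping the score and deriving the class from it means the hard cases can sit near the threshold instead of being forced into one class.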
> After reading some documentation on the DBpedia, including the latest
> article published, it looks to me that such difference is never made.
> Furthermore I wonder if it is even possible to make that difference
> based on the information available. Unfortunately, I do not know enough
> about the Wikipedia semantics to answer that.
>
> The only solution I can think of is manually tagging entities. That
> could be facilitated by grouping elements (e.g., every entity of class
> Person is an existing entity). However, other classes would require
> individual treatment.
>
> So my questions are these:
> -Is there a difference in DBpedia between existing entities and general
> entities?
> -Is there information available in the Wikipedia to make such difference?
> -Based on the DBpedia, is there any other method beyond manual tagging
> to make that difference?
> -Of the DBpedia Ontology, which classes could be considered as holding
> existing entities? Person, Place, Planet, Work, ...?
>
> I know this is quite an abstract question, and not fully related to
> technical aspects of the DBpedia, but I think this is the place to ask.
>
     I think the strategy of starting with types and then refining
the results is best. You could probably get a large majority of topics
properly typed, particularly if you use type information from
Freebase, which is more accurate and comprehensive than DBpedia types.
The hard ones are going to be the things that fall through the cracks in
the type system, like

http://dbpedia.org/page/Fire

but note that Freebase has 18 types for this topic,  so you're not 
without hope.

http://www.freebase.com/edit/topic/en/fire

Maybe it's a fair guess to say that "things that fall through the 
cracks" are abstract.
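
A rough sketch of that type-first heuristic in Python, under stated assumptions: the set of "concrete" classes is just the list floated in the question (whether Work belongs there is debatable), the type lists are typed in by hand rather than fetched from DBpedia or Freebase, and `classify` is a hypothetical helper name.

```python
# Classes treated as holding concrete things. Taken from the question
# above; this is a guess to be refined, not an established list.
CONCRETE_TYPES = {"Person", "Place", "Planet", "Work"}


def classify(types):
    """Label a topic from its list of ontology class names."""
    if CONCRETE_TYPES & set(types):
        return "concrete"
    # Topics with no recognised type fall through the cracks,
    # and the guess here is that they are abstract.
    return "abstract"


# Hand-written type lists for illustration, not real query results.
topics = {
    "John_Turturro": ["Person", "Actor"],
    "Golden_Gate_Bridge": ["Place", "Bridge"],
    "Fire": [],
}

for name, types in topics.items():
    print(name, classify(types))
```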

I say:  try the obvious thing with types,  then do some evaluation.  If 
you're not happy with it,  maybe you'll think of another heuristic 
(traditional knowledge engineering) or maybe you can train a machine 
learning algorithm to make the distinction.  Evaluate again and repeat 
until you've got enough for a paper...  or a product that's "good enough 
to use".
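
The evaluation step could be as simple as the sketch below: compare the heuristic's labels with a small hand-labelled sample and list the disagreements. The function names are hypothetical, the gold labels follow the examples in the question, and the predicted labels are invented to show one error.

```python
def accuracy(predicted, gold):
    """Fraction of hand-labelled topics whose predicted label matches."""
    shared = [t for t in gold if t in predicted]
    if not shared:
        return 0.0
    return sum(predicted[t] == gold[t] for t in shared) / len(shared)


def errors(predicted, gold):
    """Topics where the prediction disagrees with the hand label."""
    return sorted(t for t in gold if t in predicted and predicted[t] != gold[t])


# Gold labels follow the examples given in the question.
gold = {
    "John_Turturro": "concrete",
    "Golden_Gate_Bridge": "concrete",
    "Football": "abstract",
    "Femur": "abstract",
}

# Invented predictions with one deliberate mistake.
predicted = dict(gold, Femur="concrete")

print(accuracy(predicted, gold))
print(errors(predicted, gold))
```

The error list is the useful part: it tells you which heuristic to try next.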

I'd love to see a Turtle file published with these classifications 
because I could use them.
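
For what it's worth, writing such a file needs very little code. The sketch below assumes a made-up namespace (`http://example.org/abstractness#`) and made-up class names `ex:Abstract` and `ex:Concrete`; a real release would want a properly published vocabulary.

```python
EX = "http://example.org/abstractness#"  # hypothetical namespace
DBR = "http://dbpedia.org/resource/"


def to_turtle(labels):
    """Serialise {resource name: 'abstract' | 'concrete'} as Turtle text."""
    lines = [f"@prefix ex: <{EX}> .", ""]
    for name, label in sorted(labels.items()):
        cls = "ex:Abstract" if label == "abstract" else "ex:Concrete"
        # Full IRIs avoid prefixed-name escaping issues for names
        # such as D.B._Cooper.
        lines.append(f"<{DBR}{name}> a {cls} .")
    return "\n".join(lines) + "\n"


print(to_turtle({"Fire": "abstract", "John_Turturro": "concrete"}))
```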




_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
