On 2/2/2011 4:10 PM, Lushan Han wrote:
> Hi,
>
> FYI, the class type dbpedia-owl:City is missing for capitals, for
> example, http://dbpedia.org/page/London.
> And the example on DBPedia website "Cities with more than 2 million
> habitants" therefore failed to give out capitals.
>
> Best regards,
> Lushan
>
     Here we go again.

     (1) There's a fundamental ontological problem here.  Technically 
London (like Tokyo) is not a city.  London is a metropolitan area that 
is composed of 32 boroughs (such as Westminster,  Kensington and 
Chelsea,  Hackney and Camden) plus the City of London.  The actual 
"City of London" is the financial district and is about a square mile 
in area.

     (2) DBpedia has poor recall for many common types such as human 
settlements and people.  The underlying issue is that it extracts type 
information from infoboxes,  which are used inconsistently.  There 
isn't a single "city infobox";  rather,  different infoboxes are used 
in different regions and different areas.  The signal is imperfect 
(many articles about people have no infobox at all) and the set of 
rules that DBpedia uses to extract types is also imperfect.  The flip 
side is that the precision of types in DBpedia is excellent:  I've 
found only a handful of cases where things were mistyped in a 
blatantly wrong way.
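
     To make the precision/recall distinction concrete,  here is a 
minimal sketch in Python.  The function name and the two sets are made 
up for illustration;  they are not real DBpedia data or measurements.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for a set of things typed as some class.

    predicted: things the dataset says are members of the class
    actual:    things that really are members of the class
    """
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical toy data, invented for this example only.
actual_cities = {"London", "Tokyo", "Paris", "Berlin"}
typed_as_city = {"Paris", "Berlin"}  # the capitals' types were missed

precision, recall = precision_recall(typed_as_city, actual_cities)
print(precision, recall)
```

     In this toy case everything typed as a city really is one 
(high precision),  but half of the real cities were never typed 
(low recall),  which is the pattern described in (2).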

----

     The answer to (1) in commonsense reasoning systems is to maintain 
"vernacular types" that reflect popular understandings.

      It's still tricky;  the classification of human settlements is 
difficult because there's no clear line between "city",  "town" and 
"village";  people in other language areas,  such as German (de),  have 
concepts that are similar but different,  such as "Stadt" and "Dorf".  
A vernacular type that would work in the en-zone is to say,  "anything 
that has town in its name is a :Town",  but a place that's called a 
"Town" in the U.S. could be a small city,  a village,  a rural area 
where 20-30% of people live in a few concentrated areas (the "Town" that 
I write a tax check to every year),  or a centerless suburban or 
posturban area like Derry, N.H.
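
      The naive rule above could be sketched roughly like this in 
Python.  The function name,  the rule table and the sample names are 
all hypothetical;  only the "town" rule comes from the paragraph above,  
and the "city" and "village" rules are an assumed extension of the same 
idea.

```python
# Keyword -> vernacular type.  Only the "town" rule is from the text;
# the other two rows are assumptions added for illustration.
VERNACULAR_RULES = [
    ("town", ":Town"),
    ("city", ":City"),
    ("village", ":Village"),
]


def vernacular_type(name):
    """Guess a vernacular type from a place name, or None if no rule fires.

    This is a plain substring match, so it says nothing about what the
    place is actually like on the ground.
    """
    lowered = name.lower()
    for keyword, type_name in VERNACULAR_RULES:
        if keyword in lowered:
            return type_name
    return None


print(vernacular_type("Town of Derry"))   # :Town
print(vernacular_type("City of London"))  # :City
print(vernacular_type("London"))          # None
```

      The rule fires on the name alone,  so the centerless suburban 
"Town" and a compact small city called a "Town" get the same label,  
and a name with no keyword in it,  like plain "London",  gets no type 
at all.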

      In New York State there are approximately 20 types of local 
government,  and the law for the establishment of local governments is 
different in all 50 states of the :United_States,  and different in the 
200 or so other countries that are out there.  One could imagine a very 
detailed data model that represents this very precisely,  but it would 
be a difficult model to work with and you'd still need some kind of 
vernacular layer to make it easier to work with.

     As for (2),  the easy thing to do is get your types from Freebase.  
Precision in Freebase is slightly worse than in DBpedia,  but recall is 
better by a factor of 2 or more for many types.  Freebase has used both 
machine learning and crowdsourcing techniques to produce a type system 
that's easy to work with.

_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
