On 2/2/2011 4:10 PM, Lushan Han wrote:
> Hi,
>
> FYI, the class type dbpedia-owl:City is missing for capitals, for
> example, http://dbpedia.org/page/London.
> And the example on DBPedia website "Cities with more than 2 million
> habitants" therefore failed to give out capitals.
>
> Best regards,
> Lushan
>
Here we go again.
(1) There's a fundamental ontological problem here. Technically
London (like Tokyo) is not a city. London is a metropolitan area that
is composed of 33 boroughs such as Westminster, Kensington, Hackney
and Camden. The actual "City of London" is the financial district and
is about a square mile in area.
(2) DBpedia has poor recall for many common types such as human
settlements and people. The underlying issue is that it extracts type
information from infoboxes, which are used inconsistently. There
isn't a single "city infobox"; rather, different infoboxes are used
for different regions and subject areas. The signal is imperfect
(many articles about people have no infobox at all) and the set of
rules that DBpedia uses to extract types is also imperfect. The flip
side is that the precision of types in DBpedia is excellent: I've
found only a handful of cases where things were mistyped in a
blatantly wrong way.
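
To make the recall problem concrete, here is a minimal Python sketch of
rule-based typing from infobox names. The rule table, the infobox
names, and the sample pages are all invented for illustration; they are
not DBpedia's actual extraction rules.

```python
# Hypothetical mapping from infobox template name to ontology class.
# These rules are illustrative only, not DBpedia's real mappings.
INFOBOX_RULES = {
    "Infobox city": "dbpedia-owl:City",
    "Infobox person": "dbpedia-owl:Person",
}


def extract_type(infobox):
    """Return the class for an infobox name, or None when no rule applies."""
    if infobox is None:
        return None
    return INFOBOX_RULES.get(infobox)


# Toy pages: (title, infobox used on the page, or None if there is none).
pages = [
    ("Exampleton", "Infobox city"),
    ("Sampleburg", "Infobox regional settlement"),  # no rule for this one
    ("Jane Doe", None),                             # page has no infobox
]

typed = {title: extract_type(box) for title, box in pages}
untyped = [title for title, cls in typed.items() if cls is None]
```

Every type the rules do assign is right (high precision), but any page
with an unmapped infobox, or with no infobox, ends up untyped (low
recall).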
----
The answer to (1) in commonsense reasoning systems is to maintain
"vernacular types" that reflect popular understandings.
It's still tricky; the classification of human settlements is
difficult because there's no clear line between "city", "town" and
"village"; people in other language areas, such as de, have concepts
that are similar but different, such as "Stadt" and "Dorf". A
vernacular type that would work in the en-zone is to say, "anything
that has town in its name is a :Town", but a place that's called a
"Town" in the U.S. could be a small city, a village, a rural area
where 20-30% of people live in a few concentrated areas (the "Town" that
I write a tax check to every year), or a centerless suburban or
posturban area like Derry, N.H.
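
The naive naming rule above could be sketched like this in Python. The
function name and the sample names are my own assumptions, and the
substring match is deliberately crude.

```python
def vernacular_type(name):
    """Naive rule: any name containing 'town' is typed :Town."""
    if "town" in name.lower():
        return ":Town"
    return None


# Sample names, chosen only to exercise the rule.
samples = ["Town of Derry", "Georgetown", "London"]
labels = {name: vernacular_type(name) for name in samples}
```

The rule hands out the same :Town label whether the place is a small
city, a village, or a mostly rural area, which is exactly the
limitation described above.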
In New York State there are approximately 20 types of local
government, and the law for the establishment of local governments
differs in each of the 50 states of the :United_States, and again in
the 200 or so other countries that are out there. One could imagine a
very detailed data model that represents this precisely, but it would
be difficult to work with, and you'd still need some kind of
vernacular layer on top to make it usable.
As for (2), the easy thing to do is get your types from Freebase.
Precision in Freebase is slightly worse than in DBpedia, but recall is
better by a factor of 2 or more for many types. Freebase has used both
machine learning and crowdsourcing techniques to produce a type system
that's easy to work with.
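
As a rough sketch of what the trade-off looks like, the Python below
compares two type sources against a hand-made gold set and then merges
them. All of the sets are toy data invented for this example; the
numbers say nothing about the real DBpedia or Freebase.

```python
def precision(found, gold):
    """Fraction of the found items that are actually correct."""
    return len(found & gold) / len(found)


def recall(found, gold):
    """Fraction of the correct items that were found."""
    return len(found & gold) / len(gold)


# Toy data: which entities each source types as a city.
gold_cities = {"A", "B", "C", "D"}
source_strict = {"A", "B"}           # stands in for a high-precision source
source_broad = {"A", "B", "C", "X"}  # stands in for a high-recall source

# Taking the union keeps everything either source asserts.
merged = source_strict | source_broad

scores = {
    "strict": (precision(source_strict, gold_cities), recall(source_strict, gold_cities)),
    "broad": (precision(source_broad, gold_cities), recall(source_broad, gold_cities)),
    "merged": (precision(merged, gold_cities), recall(merged, gold_cities)),
}
```

In this toy case the union inherits the broader source's recall along
with its one wrong assertion, so it is only as precise as the weaker
source.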
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion