On 2/3/11 10:03 AM, Paul Houle wrote:
>    On 2/2/2011 4:10 PM, Lushan Han wrote:
>> Hi,
>>
>> FYI, the class type dbpedia-owl:City is missing for capitals, for
>> example, http://dbpedia.org/page/London.
>> As a result, the example on the DBpedia website, "Cities with more than
>> 2 million inhabitants", fails to return capitals.
>>
>> Best regards,
>> Lushan
>>
>       Here we go again.
>
>       (1) There's a fundamental ontological problem here.  Technically
> London (like Tokyo) is not a city.  London is a metropolitan area that
> is composed of 33 boroughs such as Westminster, Kensington, Hackney
> and Camden.  The actual "City of London" is the financial district and
> is about a square mile in area.
>
>       (2) DBpedia has poor recall for many common types such as human
> settlements and people.  The underlying issue is that it extracts type
> information from infoboxes, which are used inconsistently.  There
> isn't a single "city infobox"; rather, different infoboxes are used
> in different regions and different subject areas.  The signal is
> imperfect (many people have no infoboxes at all) and the set of rules
> that DBpedia uses to extract types is also imperfect.  The flip side is
> that the precision of types in DBpedia is absolutely excellent: I've
> found only a handful of cases where things were mistyped in a
> blatantly wrong way.

So why don't you make a linkset that addresses these issues? You can 
tweak the DBpedia TBox or make your own. I can load it into a Named 
Graph distinct from the main DBpedia graph. Then it can be evaluated en 
route to becoming part of the main Graph, if you choose.
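
As a rough illustration only, such a linkset could be as small as a file of N-Triples type assertions. The sketch below assumes the two resource URIs shown (London and Tokyo, the examples from this thread) and uses a hypothetical helper name, type_triples; it is not an existing DBpedia tool. The Named Graph IRI would be chosen when the file is loaded, so it does not appear here.

```python
# Sketch: emit N-Triples that assert rdf:type dbpedia-owl:City for
# resources that lack it. The list of resources is an assumption made
# for illustration; a real linkset would be curated or derived.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DBO_CITY = "http://dbpedia.org/ontology/City"


def type_triples(resource_uris, class_uri=DBO_CITY):
    """Return one N-Triples line per distinct resource, sorted by URI."""
    return [
        f"<{uri}> <{RDF_TYPE}> <{class_uri}> ."
        for uri in sorted(set(resource_uris))
    ]


# Hypothetical input: capitals reported as missing the City type.
missing_capitals = [
    "http://dbpedia.org/resource/London",
    "http://dbpedia.org/resource/Tokyo",
]

linkset = type_triples(missing_capitals)
print("\n".join(linkset))
```

Keeping these assertions in their own file makes it easy to load them into a separate graph, review them, and drop them again without touching the main DBpedia data.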

I performed a similar exercise [1]  (which I hope becomes the norm) with 
@danbri a few days ago. This process is a nice stop-gap while Wikipedia 
evolves re. structured data.

> ----
>
>       The answer to (1) in commonsense reasoning systems is to maintain
> "vernacular types" that reflect popular understandings.
>
>        It's still tricky;  the classification of human settlements is
> difficult because there's no clear line between "city", "town" and
> "village";  people in other language areas, such as de, have concepts
> that are similar but different, such as "Stadt" and "Dorf".  A
> vernacular type that would work in the en-zone is to say, "anything
> that has town in its name is a :Town", but a place that's called a
> "Town" in the U.S. could be a small city, a village, a rural area
> where 20-30% of people live in a few concentrated areas (the "Town" that
> I write a tax check to every year), or a centerless suburban or
> posturban area like Derry, N.H.
>
>        In New York State there are approximately 20 types of local
> government,  and the law for the establishment of local governments is
> different in all 50 states of the :United_States,  and different in the
> 200 or so other countries that are out there.  One could imagine a very
> detailed data model that represents this very precisely,  but it would
> be a difficult model to work with and you'd still need some kind of
> vernacular layer to make it easier to work with.
>
>       As for (2) the easy thing to do is get your types from Freebase.
> Precision in Freebase is slightly worse than in DBpedia, but recall is
> better by a factor of 2 or more for many types.  Freebase has used both
> machine learning and crowdsourcing techniques to produce a type system
> that's easy to work with.

Yes, so make a linkbase for now as I suggested.

Links:

1. http://danbri.org/words/2011/02/01/658


Kingsley

> _______________________________________________
> Dbpedia-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>


-- 

Regards,

Kingsley Idehen 
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
