Jürgen Jakobitsch wrote:
> predication : in a couple of years, everything will be rdf - but vocabularies 
> will only be understood
>               in limited geographical areas - gone be the vision of global 
> communication.
>
> http://dbpedia.org/page/New_York_City -> dbpprop:latd (xsd:integer)
> http://dbpedia.org/page/Paris -> dbpprop:latLong -> 
> dbpedia:Paris/latLong/coord -> dbpprop:coordProperty (some xsd:integer - not 
> interpretable)
> http://dbpedia.org/page/Berlin -> dbpprop:latD -> (xsd:double)
> http://dbpedia.org/page/Oslo -> dbpprop:latDeg -> (xsd:integer)
> http://dbpedia.org/page/Babylon -> geo:lat -> (xsd:float)
>
>   
    This is just the beginning of the problems you face if you try to 
do serious geospatial reasoning with DBpedia data (or even just try to 
draw maps).

    Consider what a point coordinate means for New York City, as 
compared to a point coordinate for the Statue of Liberty. The Statue of 
Liberty fills a footprint on the ground that is about 10 m in radius. 
It's reasonable to pretend that it's a point if you're drawing a map of 
NYC. NYC itself has a ground footprint that is more like 10 km in 
radius. At best, you can represent it with a centroid or a point 
that's particularly significant (Google Maps, for instance, locates 
New York City at the 42nd and 7th intersection by the Port Authority Bus 
Terminal). The point for NYC is pretty much meaningless if you're 
drawing a map of the city, but it would be useful if you were drawing a 
map of the Northeastern US.
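
    To make the scale argument concrete, here is a rough sketch that asks 
whether a feature's footprint is small enough to be drawn as a single 
marker on a given map. The 10 m and 10 km radii are the figures above; 
the map extents, the 800-pixel map width and the 8-pixel marker radius 
are illustrative assumptions of mine, not measurements.

```python
def metres_per_pixel(map_extent_m: float, map_width_px: int) -> float:
    """Ground distance covered by one pixel of the rendered map."""
    return map_extent_m / map_width_px


def is_point_like(footprint_radius_m: float,
                  map_extent_m: float,
                  map_width_px: int = 800,         # assumed map width
                  marker_radius_px: float = 8.0    # assumed marker size
                  ) -> bool:
    """True if the footprint would be hidden under a single map marker."""
    radius_px = footprint_radius_m / metres_per_pixel(map_extent_m, map_width_px)
    return radius_px <= marker_radius_px


# Radii taken from the discussion above.
STATUE_RADIUS_M = 10.0
NYC_RADIUS_M = 10_000.0

# Assumed map extents (width of the area shown), for illustration only.
CITY_MAP_EXTENT_M = 50_000.0        # a map of NYC
REGION_MAP_EXTENT_M = 1_500_000.0   # a map of the Northeastern US

if __name__ == "__main__":
    print("statue on city map:", is_point_like(STATUE_RADIUS_M, CITY_MAP_EXTENT_M))
    print("NYC on city map:", is_point_like(NYC_RADIUS_M, CITY_MAP_EXTENT_M))
    print("NYC on regional map:", is_point_like(NYC_RADIUS_M, REGION_MAP_EXTENT_M))
```

    Under those assumptions the statue is point-like on the city map, 
NYC is not, and NYC becomes point-like again on the regional map.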

    On top of that, there are all kinds of errors in the coordinates that 
come from Wikipedia. Last time I looked, I found a South Carolina 
cotton plantation out in Canada's Hudson Bay in the "Wikipedia" layer of 
Google Maps. I don't know about DBpedia 3.3, but I do know that 
DBpedia 3.2 thinks that the Staten Island Railway is about a km east of 
where it really is, situating it in the middle of New York Harbor 
(running right under the Verrazano-Narrows Bridge).

    There are also places in Wikipedia (and in Freebase too) that have 
addresses but no coordinates. These can be run through a geocoder to 
attach coordinates.

    I've brought up two different types of issues:

(i) insufficient data types: "New York City" would be best represented 
as a region (a polygon or union of polygons) rather than as a point.
(ii) data quality: how accurate does a point claim to be? What do we 
do with the occasional point that's wildly wrong?
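
    For issue (ii), one simple check is to compare a claimed coordinate 
against a reference coordinate from a second source and flag the pair 
when they disagree by more than some tolerance. The sketch below uses the 
haversine great-circle distance on a spherical Earth; the function names 
and the tolerance are hypothetical choices for illustration, and where 
the reference point comes from is left open.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to guard against rounding pushing the value just above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def is_suspect(claimed: tuple, reference: tuple, tolerance_km: float) -> bool:
    """Flag a claimed (lat, lon) that lies further than tolerance_km
    from a reference (lat, lon) obtained from another source."""
    return haversine_km(*claimed, *reference) > tolerance_km


if __name__ == "__main__":
    # Illustrative values only: a point in the area of Hudson Bay checked
    # against a rough reference point in South Carolina.
    print(is_suspect((60.0, -85.0), (33.0, -80.0), tolerance_km=100.0))
```

    A check like this only catches the wildly wrong points; it says 
nothing about how accurate a point that passes actually is.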

    At this point in time,  it's not practical to build geospatial 
reasoning systems based on amnesiac mashups off SPARQL endpoints.

    A practical geospatial system based on linked data is going to need 
to reconcile inconsistent terminology,  clean up data,  and combine data 
from different sources.  For instance,  both Yahoo and the US Census can 
be used as a source for political area boundaries.  The problems that 
you're talking about are just the first ones that turn up on the journey ;-)
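
    As a first step towards reconciling the terminology, the latitude 
properties listed in the quoted message can be folded into one canonical 
value. This is a minimal sketch: the preference order, and the decision 
to treat these properties as interchangeable, are assumptions on my part. 
The integer-valued ones presumably hold only whole degrees, so the result 
may be coarse. The input is a plain dict standing in for whatever a query 
returned for one resource.

```python
from typing import Any, Mapping, Optional

# Predicates seen in the quoted message, in an assumed order of preference:
# geo:lat first, then the infobox-derived variants.
LATITUDE_ALIASES = ("geo:lat", "dbpprop:latD", "dbpprop:latd", "dbpprop:latDeg")


def normalize_latitude(props: Mapping[str, Any]) -> Optional[float]:
    """Return a latitude in degrees from the first usable alias, or None.

    Values that cannot be parsed as numbers, or that fall outside
    [-90, 90], are skipped rather than trusted.
    """
    for predicate in LATITUDE_ALIASES:
        if predicate not in props:
            continue
        try:
            value = float(props[predicate])
        except (TypeError, ValueError):
            continue
        if -90.0 <= value <= 90.0:
            return value
    return None


if __name__ == "__main__":
    # Made-up property values, for illustration only.
    print(normalize_latitude({"dbpprop:latDeg": 59}))
    print(normalize_latitude({"geo:lat": "32.5", "dbpprop:latd": 32}))
    print(normalize_latitude({"dbpprop:latd": 400}))
```

    The same pattern would apply to longitude, and a real system would 
also need to keep track of which source and which property each value 
came from.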



_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
