Re: [Dbpedia-discussion] Accuracy of coordinates in dbpedia/wikipedia & freebase
On 29/09/2010 10:10, Jens Lehmann wrote:
> You can check this by comparing the regular DBpedia and DBpedia Live:
>
> http://dbpedia.org/page/Oakville_Assembly
> http://dbpedia-live.openlinksw.com/page/Oakville_Assembly
>
> Indeed, the coordinates have changed, so there was probably an error in
> Wikipedia.

IMHO, comparing DBpedia with its live version is not the solution. Besides having to double-check each resource, a difference in the live version does not necessarily mean that the previous version was wrong: the error could be in the live version.

I remember that some time ago I was looking at the DBpedia Live page for the programming language PHP, and the abstract read: "PHP: Problèmes d'Hygiène Personnelle" [French for "personal hygiene problems"]. :-) Unfortunately, vandalism is quite widespread in Wikipedia.

Probably, at least nowadays, there are better datasets than DBpedia that offer more accurate geolocation, as you pointed out with LinkedGeoData.

Besides the problem of geographical coordinates, as I mentioned in my previous message, I think it is a problem to do "reasoning" over resources that belong to two different datasets, are linked through owl:sameAs, and carry inconsistent information. Reasoning over Linked Data (I mean, across more than a single dataset) is probably not yet completely mature.

--
Regards,
Roberto Mirizzi
PhD Student at Politecnico di Bari (Italy)
http://sisinflab.poliba.it/mirizzi

___
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
Re: [Dbpedia-discussion] Accuracy of coordinates in dbpedia/wikipedia & freebase
Hello Paul,

On 27.09.2010 22:42, Paul Houle wrote:
> My guess is that
> Wikipedia might have been wrong at one time, and has had it corrected.

You can check this by comparing the regular DBpedia and DBpedia Live:

http://dbpedia.org/page/Oakville_Assembly
http://dbpedia-live.openlinksw.com/page/Oakville_Assembly

Indeed, the coordinates have changed, so there was probably an error in Wikipedia.

> From my viewpoint, I'd like to make a map that doesn't have
> embarrassing errors in it... What's the best way to clean up this mess?

Consider using LinkedGeoData [1]. As a member of both projects (DBpedia and LinkedGeoData), I am quite sure that the latter has more accurate coordinates. LinkedGeoData also has DBpedia links, although they need to be (and will be) updated because of some changes in LinkedGeoData.

Note that for large objects (e.g. Russia and other countries) both LinkedGeoData/OpenStreetMap and DBpedia/Wikipedia contain a reference point, which is a good representative for the object. However, there is no strict mathematical definition of how to compute it, which means that the reference points in LinkedGeoData and DBpedia do not necessarily coincide.

Kind regards,

Jens

[1] http://linkedgeodata.org

--
Dipl. Inf. Jens Lehmann
Department of Computer Science, University of Leipzig
Homepage: http://www.jens-lehmann.org
GPG Key: http://jens-lehmann.org/jens_lehmann.asc
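[Editor's sketch] One way to make the "have the coordinates changed?" check above concrete is to measure how far apart two published points are. The snippet below is a minimal sketch, not something taken from DBpedia or LinkedGeoData: the function names and the tolerance parameter are assumptions made for illustration, and it uses the standard haversine formula on a spherical Earth.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 so rounding error can never push asin out of its domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def diverges(point_a, point_b, tolerance_km):
    """True when two (lat, lon) pairs are further apart than tolerance_km."""
    return haversine_km(*point_a, *point_b) > tolerance_km
```

Because reference points for large objects need not coincide, the tolerance would have to depend on the kind of object: a building might warrant well under a kilometre, while a country could reasonably differ by hundreds. Those thresholds are a judgment call that this sketch leaves to the caller.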
Re: [Dbpedia-discussion] Accuracy of coordinates in dbpedia/wikipedia & freebase
On 9/28/10 3:43 AM, Roberto Mirizzi wrote:
> On 28/09/2010 00:00, Kingsley Idehen wrote:
>> You have two data spaces: DBpedia and Freebase, you should make a third
>> -- yours, which I think you have via ookaboo.
>>
>> Place the fixed (cleansed) data in your ookaboo data space, connect the
>> coreferenced entities using an "owl:sameAs" relation, scope queries that
>> are accuracy sensitive to your ookaboo data space. Use inference rules
>> for union expansion across DBpedia and Freebase via "owl:sameAs", when
>> data quality requirements are low and data expanse requirements high.
>>
>> That's how you clean up the mess and potentially get compensated for
>> doing so, in the process :-)
>>
> I think I'm missing something with this approach. If you link, with
> owl:sameAs, a resource with e.g. the corresponding one in DBpedia, and
> the two resources have two different coordinates, it would lead to an
> inconsistency due to the semantics of owl:sameAs, wouldn't it?
> After all, overcoming this problem is one of the reasons for the Okkam
> Project [1].
> So, how do you plan to deal with this issue?
>
> [1] http://www.okkam.org/okkam-more

No, you have a 3rd data space, and in that data space you fix the coordinates. You simply use owl:sameAs to keep links to the other data spaces, which provide access to a broader pool of data, modulo questionable geo coordinates etc. The owner of the third data space is providing specific value, i.e., accurate coordinates.

I think the single canonical URI misconception is what's clouding the obvious re. what I am suggesting :-)

There are going to be lots of data spaces on the Web that focus on the quality aspects of Linked Data. The owners of these data spaces will also discover intriguing business models in this area.

--
Regards,
Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
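[Editor's sketch] The "third data space" idea above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the DataSpace class, the coordinates function, and the accuracy_sensitive flag are hypothetical names invented here, and a real deployment would more likely scope SPARQL queries to named graphs than use Python dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class DataSpace:
    """A named collection of coordinates keyed by resource URI (hypothetical model)."""
    name: str
    coords: dict = field(default_factory=dict)  # URI -> (lat, lon)


def coordinates(uri, curated, same_as, sources, accuracy_sensitive=True):
    """Return a list of (data space name, (lat, lon)) pairs for a curated URI.

    accuracy_sensitive=True  -> answer only from the curated data space.
    accuracy_sensitive=False -> also follow owl:sameAs-style links into the
                                other data spaces (union expansion).
    """
    results = []
    if uri in curated.coords:
        results.append((curated.name, curated.coords[uri]))
    if accuracy_sensitive:
        return results
    # same_as maps a curated URI to its co-referent URIs in other data spaces.
    for other_uri in same_as.get(uri, []):
        for space in sources:
            if other_uri in space.coords:
                results.append((space.name, space.coords[other_uri]))
    return results
```

The design point is that the links are kept but not trusted equally: a caller that needs accurate points never sees the values from the linked data spaces, while a caller that wants breadth gets every value together with the name of the data space it came from.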
Re: [Dbpedia-discussion] Accuracy of coordinates in dbpedia/wikipedia & freebase
On 28/09/2010 00:00, Kingsley Idehen wrote:
> You have two data spaces: DBpedia and Freebase, you should make a third
> -- yours, which I think you have via ookaboo.
>
> Place the fixed (cleansed) data in your ookaboo data space, connect the
> coreferenced entities using an "owl:sameAs" relation, scope queries that
> are accuracy sensitive to your ookaboo data space. Use inference rules
> for union expansion across DBpedia and Freebase via "owl:sameAs", when
> data quality requirements are low and data expanse requirements high.
>
> That's how you clean up the mess and potentially get compensated for
> doing so, in the process :-)
>

I think I'm missing something with this approach. If you link a resource, via owl:sameAs, to e.g. the corresponding one in DBpedia, and the two resources have different coordinates, it would lead to an inconsistency due to the semantics of owl:sameAs, wouldn't it?

After all, overcoming this problem is one of the reasons for the Okkam Project [1].

So, how do you plan to deal with this issue?

[1] http://www.okkam.org/okkam-more

--
Regards,
Roberto Mirizzi
PhD Student at Politecnico di Bari (Italy)
http://sisinflab.poliba.it/mirizzi
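[Editor's sketch] The concern above can at least be made measurable: since owl:sameAs is an equivalence relation, linked resources fall into groups, and each group can be checked for more than one distinct coordinate. The following is a minimal sketch with hypothetical function names; it uses a small union-find structure and exact equality of points. Whether such a group is a formal inconsistency depends on whether the coordinate properties are declared functional, which the sketch does not assume.

```python
from collections import defaultdict


def find(parent, x):
    """Return the representative of x's group, creating a singleton if needed."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
        x = parent[x]
    return x


def conflicting_groups(same_as_pairs, coords):
    """List the distinct points of every sameAs group that has more than one.

    same_as_pairs: iterable of (uri_a, uri_b) links.
    coords: dict mapping URI -> (lat, lon).
    """
    parent = {}
    for a, b in same_as_pairs:
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two groups
    points_by_group = defaultdict(set)
    for uri, point in coords.items():
        points_by_group[find(parent, uri)].add(point)
    # Exact comparison only; a real check would allow a distance tolerance.
    return [sorted(points) for points in points_by_group.values() if len(points) > 1]
```

A result such as two different points for one group does not say which one is right; it only identifies where the linked datasets disagree and a decision is needed.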
Re: [Dbpedia-discussion] Accuracy of coordinates in dbpedia/wikipedia & freebase
On 9/27/10 4:42 PM, Paul Houle wrote:
> I've recently put up a site that uses coordinate information from
> Freebase and DBpedia, and I'm starting to think about how to clean up
> certain data quality problems I'm encountering, for instance, see:
>
> http://ookaboo.com/o/pictures/topic/209440/Oakville_Assembly
>
> In this particular case, I've only got data from DBpedia, which drops
> the point a few hundred km from where it really is... It's obvious that
> this is a bad one because it's right in the middle of Lake Erie.
> Freebase doesn't have any coordinate for this thing (seems to me that it
> should), and at the moment, Wikipedia has the right coordinates (at
> least on Google Maps I see a big factory building). My guess is that
> Wikipedia might have been wrong at one time, and has had it corrected.
> It's also possible that the conversion wasn't done right in DBpedia,
> since coordinates are represented differently in a few hundred different
> infoboxes.
>
> It seems to me that both the number of points and the quality of points
> in Wikipedia have been improving dramatically over the last two years...
> About a year ago I plotted the points for Staten Island Railroad
> stations and found that the railroad was displaced a few km east and ran
> right under the middle of the Tappan Zee Bridge... Now it's much better.
>
> I can find examples where:
>
> (a) DBpedia is right and Freebase is wrong (for instance, a town in
> continental Europe gets its longitude sign flipped and ends up with the
> wrecked ships west of the UK -- maybe here the point got fixed in
> Wikipedia but not in Freebase)
> (b) DBpedia is wrong and Freebase is right
> (c) a point is missing from DBpedia but is in Freebase (I see a lot of
> these in Switzerland), and
> (d) a point is missing from Freebase but is in DBpedia
>
> An analysis of this is tricky because there are a lot of things where
> the coordinates are iffy: the location of 'Russia' could vary within a
> few thousand kilometers, 'Tompkins County' could vary by ten or so
> kilometers, etc.
>
> Looking at a handful of points that have diverged, I get the impression
> that Freebase is more accurate than DBpedia, but that I get better
> results just looking at the coordinates on the human interface of
> Wikipedia -- currently, it seems like a scan of the current coordinates
> in Wikipedia (however Wikipedia extracts them from the infoboxes)
> benefits the most from the human labor being done to fix points and also
> avoids errors & missed points from other people's extraction pipelines.
>
> From my viewpoint, I'd like to make a map that doesn't have
> embarrassing errors in it... What's the best way to clean up this mess?

You have two data spaces: DBpedia and Freebase, you should make a third -- yours, which I think you have via ookaboo.

Place the fixed (cleansed) data in your ookaboo data space, connect the coreferenced entities using an "owl:sameAs" relation, and scope queries that are accuracy sensitive to your ookaboo data space. Use inference rules for union expansion across DBpedia and Freebase via "owl:sameAs" when data quality requirements are low and data expanse requirements are high.

That's how you clean up the mess and potentially get compensated for doing so, in the process :-)

Kingsley

--
Regards,
Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
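[Editor's sketch] The cases (a) to (d) listed in the quoted message suggest a first triage pass before any manual cleanup. The snippet below is a rough sketch with hypothetical names and a hypothetical default tolerance: it labels a pair of optional points as missing, agreeing, sign-flipped, or disagreeing. It compares raw degrees, which is crude, and it cannot tell which source is right; separating (a) from (b) still needs a trusted reference such as the current Wikipedia coordinates.

```python
def classify(dbpedia, freebase, tolerance_deg=0.01):
    """Label a pair of optional (lat, lon) points from the two sources.

    Either argument may be None when the source has no point.
    tolerance_deg is an illustrative threshold in degrees, not a recommendation.
    """
    if dbpedia is None and freebase is None:
        return "missing in both"
    if dbpedia is None:
        return "missing in dbpedia"      # case (c)
    if freebase is None:
        return "missing in freebase"     # case (d)
    dlat = abs(dbpedia[0] - freebase[0])
    dlon = abs(dbpedia[1] - freebase[1])
    if dlat <= tolerance_deg and dlon <= tolerance_deg:
        return "agree"
    # Same latitude and longitudes that cancel out: one side flipped the sign.
    if dlat <= tolerance_deg and abs(dbpedia[1] + freebase[1]) <= tolerance_deg:
        return "longitude sign flipped"
    if dlon <= tolerance_deg and abs(dbpedia[0] + freebase[0]) <= tolerance_deg:
        return "latitude sign flipped"
    return "disagree"                    # case (a) or (b); needs a reference
```

Points that come back as sign-flipped are good candidates for automatic repair once a reference confirms the hemisphere, whereas "disagree" results for large objects such as countries may simply reflect different reference points rather than an error.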