Re: [Dbpedia-discussion] Links to Geonames
I see. Is there a way to access the SVN directly, as opposed to downloading http://downloads.dbpedia.org/3.5/links/geonames_links.nt.bz2 ? I am not certain there is a better file, but if there is, I would just like to know where I can access it.

Thanks much,
Carlo

On Mon, Apr 19, 2010 at 10:45 PM, Jens Lehmann lehm...@informatik.uni-leipzig.de wrote:
> Hello,
>
> Carlo Brooks wrote:
>> I'm a little perplexed re links to geonames: http://wiki.dbpedia.org/Downloads35#linkstogeonames
>> I downloaded http://downloads.dbpedia.org/3.5/links/geonames_links.nt.bz2 and see it has only 86,547 records; this is a very small subset of Wikipedia. I joined Geonames and Wikipedia beaches on name and lat/long and got + 30 matches. Why would these not be included?
>
> For some link datasets in DBpedia, there is no proper update mechanism included in the DBpedia SVN repository. In such cases, the link data sets are copied from the previous release. For Geonames, this means that the links you see were not recently updated (and can be as old as one or two years).
>
> The best way to improve the situation, in case up-to-date links are important for you, is to add a script (or a SILK file etc.) that can efficiently compute the links between the two data sets to the SVN repository, so that it can be run regularly.
>
> Kind regards,
> Jens
> --
> Dipl. Inf. Jens Lehmann
> Department of Computer Science, University of Leipzig
> Homepage: http://www.jens-lehmann.org
> GPG Key: http://jens-lehmann.org/jens_lehmann.asc

--
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
Re: [Dbpedia-discussion] Links to Geonames
Hello,

Carlo Brooks wrote:
> I see. Is there a way to access the SVN directly, as opposed to downloading http://downloads.dbpedia.org/3.5/links/geonames_links.nt.bz2 ? I am not certain there is a better file, but if there is, I would just like to know where I can access it.

SVN can be accessed as follows: http://sourceforge.net/scm/?type=svn&group_id=190976

However, the SVN contains the extraction framework (and not the data sets generated by it), so you won't find another Geonames link file there.

Kind regards,
Jens
--
Dipl. Inf. Jens Lehmann
Department of Computer Science, University of Leipzig
Homepage: http://www.jens-lehmann.org
GPG Key: http://jens-lehmann.org/jens_lehmann.asc
Re: [Dbpedia-discussion] Links to Geonames
On Tue, Apr 20, 2010 at 1:45 AM, Jens Lehmann lehm...@informatik.uni-leipzig.de wrote:
> For some link datasets in DBpedia, there is no proper update mechanism included in the DBpedia SVN repository. In such cases, the link data sets are copied from the previous release. For Geonames, this means that the links you see were not recently updated (and can be as old as one or two years).

Is there a list someplace of who is responsible for each of these link sets and when they were last updated?

I think I remember reading somewhere that the Freebase links were in a similar situation. Also, the last time the links were done they were made to the GUID form of the Freebase identifier, which I'm not sure is the best target (conversely, Freebase generates DBpedia for *every* Wikipedia article name, including redirects and misspellings, which doesn't seem right either).

Tom
Re: [Dbpedia-discussion] Links to Geonames
Hello,

Tom Morris wrote:
> On Tue, Apr 20, 2010 at 1:45 AM, Jens Lehmann lehm...@informatik.uni-leipzig.de wrote:
>> For some link datasets in DBpedia, there is no proper update mechanism included in the DBpedia SVN repository. In such cases, the link data sets are copied from the previous release. For Geonames, this means that the links you see were not recently updated (and can be as old as one or two years).
>
> Is there a list someplace of who is responsible for each of these link sets and when they were last updated?

If you go to the download page and click on a data set, you get some information (or scroll to the bottom of the page): http://wiki.dbpedia.org/Downloads35

To see whether data sets have changed compared to previous releases, you can go to http://downloads.dbpedia.org/ and compare different releases.

Please note that within the last year the extraction framework was rewritten and the live extraction was implemented. It's difficult to improve all aspects of DBpedia within a short timeframe, and most interlinking data sets were never designed for long-term maintenance, but rather as one-time efforts. (Anyone is invited to contribute mapping code to DBpedia, of course, to improve the situation.)

Kind regards,
Jens
--
Dipl. Inf. Jens Lehmann
Department of Computer Science, University of Leipzig
Homepage: http://www.jens-lehmann.org
GPG Key: http://jens-lehmann.org/jens_lehmann.asc
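A note on the "compare different releases" step: one rough way to check whether a link set was actually regenerated or simply copied over is to diff the triples of two downloaded releases. The sketch below is only an illustration under stated assumptions: the function names and the file paths passed to them are hypothetical, and it assumes each file is line-oriented N-Triples (optionally bz2-compressed) small enough to hold in memory.

```python
import bz2


def read_triples(path):
    """Return the set of non-empty, non-comment lines of a .nt or .nt.bz2 file."""
    # Assumption: one triple per line, as in N-Triples dumps.
    opener = bz2.open if path.endswith(".bz2") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        stripped = (line.strip() for line in handle)
        return {line for line in stripped if line and not line.startswith("#")}


def compare_releases(old_path, new_path):
    """Summarise how a link set changed between two locally downloaded releases."""
    old = read_triples(old_path)
    new = read_triples(new_path)
    return {
        "old_count": len(old),
        "new_count": len(new),
        "added": len(new - old),    # triples only in the newer release
        "removed": len(old - new),  # triples only in the older release
    }
```

If both "added" and "removed" come back as 0, the two files contain the same triples, which would be consistent with the link set having been copied from the previous release.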
Re: [Dbpedia-discussion] Links to Geonames
On Tue, Apr 20, 2010 at 3:22 PM, Jens Lehmann lehm...@informatik.uni-leipzig.de wrote:
> Hello,
>
> Tom Morris wrote:
>> On Tue, Apr 20, 2010 at 1:45 AM, Jens Lehmann lehm...@informatik.uni-leipzig.de wrote:
>>> For some link datasets in DBpedia, there is no proper update mechanism included in the DBpedia SVN repository. In such cases, the link data sets are copied from the previous release. For Geonames, this means that the links you see were not recently updated (and can be as old as one or two years).
>>
>> Is there a list someplace of who is responsible for each of these link sets and when they were last updated?
>
> If you go to the download page and click on a data set, you get some information (or scroll to the bottom of the page): http://wiki.dbpedia.org/Downloads35

Thanks. I'd seen that. I was hoping for something more along the lines of an email address or a person's name. The entries in question are:

Links to Freebase - Links between DBpedia and Freebase. Update mechanism: unclear/copy over from previous release.

Links to Geonames - Links between geographic places in DBpedia and data about them in the Geonames database. Provided by the Geonames people. Update mechanism: unclear/copy over from previous release.

> Please note that within the last year the extraction framework was rewritten and the live extraction was implemented. It's difficult to improve all aspects of DBpedia within a short timeframe, and most interlinking data sets were never designed for long-term maintenance, but rather as one-time efforts. (Anyone is invited to contribute mapping code to DBpedia, of course, to improve the situation.)

I'm willing to help out with that, but it would seem like the people who did the original mappings are likely to have knowledge, and perhaps even code, from their previous efforts which would be highly applicable to the task. Is all that knowledge really lost forever?

Tom
Re: [Dbpedia-discussion] Links to Geonames
Tom Morris wrote:
> On Tue, Apr 20, 2010 at 1:45 AM, Jens Lehmann lehm...@informatik.uni-leipzig.de wrote:
>> For some link datasets in DBpedia, there is no proper update mechanism included in the DBpedia SVN repository. In such cases, the link data sets are copied from the previous release. For Geonames, this means that the links you see were not recently updated (and can be as old as one or two years).
>
> Is there a list someplace of who is responsible for each of these link sets and when they were last updated?
>
> I think I remember reading somewhere that the Freebase links were in a similar situation. Also, the last time the links were done they were made to the GUID form of the Freebase identifier, which I'm not sure is the best target (conversely, Freebase generates DBpedia for *every* Wikipedia article name, including redirects and misspellings, which doesn't seem right either).
>
> Tom

Both Geonames and DBpedia data exist in one location at: http://lod.openlinksw.com/ (/sparql or /isparql or /fct). Does that endpoint deliver what you seek?

--
Regards,
Kingsley Idehen
President & CEO, OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
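For anyone who wants to try the /sparql endpoint mentioned above, a query for the links of a single resource could be prepared roughly as follows. This is a minimal sketch, not a description of how that service is meant to be used: it assumes the links are stored as owl:sameAs triples, that the endpoint follows the standard SPARQL protocol (a `query` parameter on a GET request), and the helper names are made up for illustration. The request is only built here, not sent.

```python
import urllib.parse
import urllib.request

# Endpoint taken from the message above; whether it is reachable is not checked here.
ENDPOINT = "http://lod.openlinksw.com/sparql"


def build_sameas_query(resource_uri, limit=50):
    """Return a SPARQL query listing the owl:sameAs targets of one resource."""
    # Assumption: interlinks are expressed with owl:sameAs.
    return (
        "PREFIX owl: <http://www.w3.org/2002/07/owl#>\n"
        f"SELECT ?other WHERE {{ <{resource_uri}> owl:sameAs ?other }} LIMIT {int(limit)}"
    )


def build_request(query, endpoint=ENDPOINT):
    """Prepare (but do not send) a SPARQL-protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
```

Passing the prepared request to `urllib.request.urlopen` would send it; the result would then need to be filtered for Geonames URIs on the client side, or the query extended with a FILTER.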
[Dbpedia-discussion] Links to Geonames
I'm a little perplexed re links to geonames: http://wiki.dbpedia.org/Downloads35#linkstogeonames

I downloaded http://downloads.dbpedia.org/3.5/links/geonames_links.nt.bz2 and see it has only 86,547 records; this is a very small subset of Wikipedia. I joined Geonames and Wikipedia beaches on name and lat/long and got + 30 matches. Why would these not be included? Am I missing something?

Thanks for any help!
Carlo
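The join on name and lat/long described above could be sketched along the following lines. This is an illustration only, not the procedure used to produce the published link file: the (uri, name, lat, lon) record layout, the function names, the 1 km distance threshold, and the choice of owl:sameAs as the link predicate are all assumptions made for the example.

```python
import math
from collections import defaultdict


def normalize(name):
    """Lowercase and collapse whitespace so trivially different names compare equal."""
    return " ".join(name.lower().split())


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def link_places(dbpedia_places, geonames_places, max_km=1.0):
    """Yield N-Triples link lines for places with equal names lying within max_km.

    Both inputs are iterables of (uri, name, lat, lon) tuples; this layout is
    hypothetical and not an actual DBpedia or Geonames dump format.
    """
    # Index one side by normalized name so each lookup avoids a full scan.
    by_name = defaultdict(list)
    for uri, name, lat, lon in geonames_places:
        by_name[normalize(name)].append((uri, lat, lon))

    for uri, name, lat, lon in dbpedia_places:
        for g_uri, g_lat, g_lon in by_name.get(normalize(name), []):
            if haversine_km(lat, lon, g_lat, g_lon) <= max_km:
                yield f"<{uri}> <http://www.w3.org/2002/07/owl#sameAs> <{g_uri}> ."
```

Requiring both a name match and spatial proximity keeps identically named places in different locations from being linked, at the cost of missing places whose names are spelled differently in the two sources.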