Re: [Dbpedia-discussion] Links to Geonames

2010-04-20 Thread Carlo Brooks
I see.  Is there a way to access the svn directly as opposed to downloading
http://downloads.dbpedia.org/3.5/links/geonames_links.nt.bz2 ?

I am not certain there is a better file, but if there is I would just like
to know where I can access it...

Thanks much,


Carlo


On Mon, Apr 19, 2010 at 10:45 PM, Jens Lehmann 
lehm...@informatik.uni-leipzig.de wrote:


 Hello,

 Carlo Brooks wrote:
  I'm a little perplexed re links to geonames:
  http://wiki.dbpedia.org/Downloads35#linkstogeonames
 
  I downloaded
  http://downloads.dbpedia.org/3.5/links/geonames_links.nt.bz2 and see it
  has only 86,547 records; this is a very small subset of Wikipedia.
 
  I joined Geonames and Wikipedia beaches on name and lat/long and got
  30+ matches.  Why would these not be included?

 For some link datasets in DBpedia, there is no proper update mechanism
 included in the DBpedia SVN repository. In such cases, the link data
 sets are copied from the previous release. For Geonames, this means that
 the links you see were not recently updated (and can be as old as one or
 two years).

 The best way to improve the situation, in case up-to-date links are
 important for you, is to add to the SVN repository a script (or a SILK
 file etc.) that efficiently computes the links between the two data
 sets, so that it can be run regularly.
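
A rough sketch of what such a linking script could look like, assuming
places from both data sets are already loaded as (uri, name, lat, lon)
tuples. The function names, the 1 km threshold and the owl:sameAs output
format are illustrative assumptions, not part of the DBpedia framework:

```python
import math
from collections import defaultdict

# Assumption: links are emitted as owl:sameAs N-Triples statements.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def normalize(name):
    """Lower-case a place name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def link_places(dbpedia_places, geonames_places, max_km=1.0):
    """Link places whose normalized names match and whose coordinates
    lie within max_km of each other (max_km is a hypothetical default)."""
    # Index Geonames entries by name so each DBpedia place is compared
    # only against same-named candidates, not the whole data set.
    by_name = defaultdict(list)
    for uri, name, lat, lon in geonames_places:
        by_name[normalize(name)].append((uri, lat, lon))

    links = []
    for uri, name, lat, lon in dbpedia_places:
        for g_uri, g_lat, g_lon in by_name.get(normalize(name), []):
            if haversine_km(lat, lon, g_lat, g_lon) <= max_km:
                links.append(f"<{uri}> <{OWL_SAME_AS}> <{g_uri}> .")
    return links
```

Exact name matching will miss spelling variants and alternate names, so
a real script would probably need fuzzier matching and some manual
checking of the results.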

 Kind regards,

 Jens

 --
 Dipl. Inf. Jens Lehmann
 Department of Computer Science, University of Leipzig
 Homepage: http://www.jens-lehmann.org
 GPG Key: http://jens-lehmann.org/jens_lehmann.asc



 ___
 Dbpedia-discussion mailing list
 Dbpedia-discussion@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion



Re: [Dbpedia-discussion] Links to Geonames

2010-04-20 Thread Jens Lehmann

Hello,

Carlo Brooks wrote:
 I see.  Is there a way to access the svn directly as opposed to 
 downloading http://downloads.dbpedia.org/3.5/links/geonames_links.nt.bz2 ?
 
 I am not certain there is a better file, but if there is I would just 
 like to know where I can access it...

SVN can be accessed as follows:
http://sourceforge.net/scm/?type=svn&group_id=190976

However, the SVN contains the extraction framework (and not the data 
sets generated by it), so you won't find another Geonames link file there.

Kind regards,

Jens

-- 
Dipl. Inf. Jens Lehmann
Department of Computer Science, University of Leipzig
Homepage: http://www.jens-lehmann.org
GPG Key: http://jens-lehmann.org/jens_lehmann.asc




Re: [Dbpedia-discussion] Links to Geonames

2010-04-20 Thread Tom Morris
On Tue, Apr 20, 2010 at 1:45 AM, Jens Lehmann
lehm...@informatik.uni-leipzig.de wrote:

 For some link datasets in DBpedia, there is no proper update mechanism
 included in the DBpedia SVN repository. In such cases, the link data
 sets are copied from the previous release. For Geonames, this means that
 the links you see were not recently updated (and can be as old as one or
 two years).

Is there a list someplace of who is responsible for each of these link
sets and when they were last updated?

I think I remember reading somewhere that the Freebase links were in a
similar situation.  Also, the last time the links were done they were
made to the GUID form of the Freebase identifier, which I'm not sure
is the best target (conversely, Freebase generates DBpedia links for *every*
Wikipedia article name, including redirects and misspellings, which
doesn't seem right either).

Tom



Re: [Dbpedia-discussion] Links to Geonames

2010-04-20 Thread Jens Lehmann

Hello,

Tom Morris wrote:
 On Tue, Apr 20, 2010 at 1:45 AM, Jens Lehmann
 lehm...@informatik.uni-leipzig.de wrote:
 
 For some link datasets in DBpedia, there is no proper update mechanism
 included in the DBpedia SVN repository. In such cases, the link data
 sets are copied from the previous release. For Geonames, this means that
 the links you see were not recently updated (and can be as old as one or
 two years).
 
 Is there a list someplace of who is responsible for each of these link
 sets and when they were last updated?

If you go to the download page and click on a data set, you get some 
information (or scroll to the bottom of the page):
http://wiki.dbpedia.org/Downloads35

To see whether data sets have changed compared to previous releases, you 
can go to http://downloads.dbpedia.org/ and compare different releases.
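
One way to do that comparison, sketched under the assumption that the
same link file has been downloaded from two releases as
bzip2-compressed N-Triples (the function names here are made up for
illustration):

```python
import bz2


def read_triples(path):
    """Return the set of non-empty, non-comment lines in an .nt.bz2 file."""
    triples = set()
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                triples.add(stripped)
    return triples


def compare_releases(old_path, new_path):
    """Count statements added, removed and unchanged between two dumps."""
    old = read_triples(old_path)
    new = read_triples(new_path)
    return {
        "added": len(new - old),
        "removed": len(old - new),
        "unchanged": len(old & new),
    }
```

If "added" and "removed" both come out as zero, the link set was most
likely copied over unchanged from the earlier release. Both files are
held in memory as sets, which should be fine for link sets of this size
but is not meant for the largest dumps.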

Please note that within the last year the extraction framework was 
rewritten and the live extraction was implemented. It's difficult to 
improve all aspects of DBpedia within a short timeframe, and most 
interlinking data sets were never designed for long-term maintenance 
but were rather one-time efforts. (Anyone is invited to contribute 
mapping code to DBpedia, of course, to improve the situation.)

Kind regards,

Jens

-- 
Dipl. Inf. Jens Lehmann
Department of Computer Science, University of Leipzig
Homepage: http://www.jens-lehmann.org
GPG Key: http://jens-lehmann.org/jens_lehmann.asc




Re: [Dbpedia-discussion] Links to Geonames

2010-04-20 Thread Tom Morris
On Tue, Apr 20, 2010 at 3:22 PM, Jens Lehmann
lehm...@informatik.uni-leipzig.de wrote:

 Hello,

 Tom Morris wrote:
 On Tue, Apr 20, 2010 at 1:45 AM, Jens Lehmann
 lehm...@informatik.uni-leipzig.de wrote:

 For some link datasets in DBpedia, there is no proper update mechanism
 included in the DBpedia SVN repository. In such cases, the link data
 sets are copied from the previous release. For Geonames, this means that
 the links you see were not recently updated (and can be as old as one or
 two years).

 Is there a list someplace of who is responsible for each of these link
 sets and when they were last updated?

 If you go to the download page and click on a data set, you get some
 information (or scroll to the bottom of the page):
 http://wiki.dbpedia.org/Downloads35

Thanks. I'd seen that.  I was hoping for something more along the
lines of an email address or a person's name.
The entries in question are:

Links to Freebase - Links between DBpedia and Freebase. Update
mechanism: unclear/copy over from previous release.

Links to Geonames - Links between geographic places in DBpedia and
data about them in the Geonames database. Provided by the Geonames
people. Update mechanism: unclear/copy over from previous release.

 Please note that within the last year the extraction framework was
 rewritten and the live extraction was implemented. It's difficult to
 improve all aspects of DBpedia within a short timeframe and most
 interlinking data sets were never designed for long term maintenance,
 but rather one time efforts. (Anyone is invited to contribute mapping
 code to DBpedia, of course, to improve the situation.)

I'm willing to help out with that, but it would seem like the people
who did the original mappings are likely to have knowledge, and
perhaps even code, from their previous efforts which would be highly
applicable to the task.  Is all that knowledge really lost forever?

Tom



Re: [Dbpedia-discussion] Links to Geonames

2010-04-20 Thread Kingsley Idehen
Tom Morris wrote:
 On Tue, Apr 20, 2010 at 1:45 AM, Jens Lehmann
 lehm...@informatik.uni-leipzig.de wrote:

   
 For some link datasets in DBpedia, there is no proper update mechanism
 included in the DBpedia SVN repository. In such cases, the link data
 sets are copied from the previous release. For Geonames, this means that
 the links you see were not recently updated (and can be as old as one or
 two years).
 

 Is there a list someplace of who is responsible for each of these link
 sets and when they were last updated?

 I think I remember reading somewhere that the Freebase links were in a
 similar situation.  Also, the last time the links were done they were
 made to the GUID form of the Freebase identifier, which I'm not sure
 is the best target (conversely, Freebase generates DBpedia links for *every*
 Wikipedia article name, including redirects and misspellings, which
 doesn't seem right either).

 Tom


   
Both,

Geonames and DBpedia data exist in one location at: 
http://lod.openlinksw.com/ (/sparql or /isparql or /fct).

Does that endpoint deliver what you seek?


-- 

Regards,

Kingsley Idehen   
President & CEO 
OpenLink Software 
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen 








[Dbpedia-discussion] Links to Geonames

2010-04-19 Thread Carlo Brooks
I'm a little perplexed re links to geonames:
http://wiki.dbpedia.org/Downloads35#linkstogeonames

I downloaded http://downloads.dbpedia.org/3.5/links/geonames_links.nt.bz2
and see it has only 86,547 records; this is a very small subset of
Wikipedia.
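
For anyone who wants to check the record count themselves, here is a
minimal sketch that counts the statements in the downloaded file without
unpacking it first. It assumes one triple per line (as in N-Triples);
the function name is just for illustration:

```python
import bz2


def count_triples(path):
    """Count non-empty, non-comment lines in a bzip2-compressed
    N-Triples file, reading it as a stream."""
    count = 0
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                count += 1
    return count
```

Calling count_triples("geonames_links.nt.bz2") on the 3.5 download
should give the number of link statements in that file.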

I joined Geonames and Wikipedia beaches on name and lat/long and got 30+
matches.  Why would these not be included?

Am I missing something?

Thanks for any help!

Carlo