I highly recommend Chapter 6 of the Linked Data book, which details different design approaches for Linked Data applications - section 6.3 (http://linkeddatabook.com/editions/1.0/#htoc84) summarises the approaches as:

1. Crawling Pattern
2. On-the-fly dereferencing pattern
3. Query federation pattern

Generally my view would be that (1) and (2) are viable approaches for different applications, but that (3) is generally a bad idea (having been through federated search before!)

Owen

Owen Stephens
Owen Stephens Consulting
Web: http://www.ostephens.com
Email: o...@ostephens.com
Telephone: 0121 288 6936

> On 26 Feb 2015, at 14:40, Eric Lease Morgan <emor...@nd.edu> wrote:
>
> On Feb 25, 2015, at 2:48 PM, Esmé Cowles <escow...@ticklefish.org> wrote:
>
>>> In the non-techie library world, linked data is being talked about (perhaps
>>> only in listserv traffic) as if the data (bibliographic data, for instance)
>>> will reside on remote sites (as a SPARQL endpoint??? We don't know the
>>> technical implications of that), and be displayed by <your local
>>> catalog/the centralized inter-national catalog> by calling data from that
>>> remote site. But the original question was how the data on those remote
>>> sites would be <access points> - how can I start my search by searching for
>>> that remote content? I assume there has to be a database implementation
>>> that visits that data and pre-indexes it for it to be searchable, and
>>> therefore the index has to be local (or global a la Google or OCLC or its
>>> bibliographic-linked-data equivalent).
>>
>> I think there are several options for how this works, and different
>> applications may take different approaches. The most basic approach would
>> be to just include the URIs in your local system and retrieve them any time
>> you wanted to work with them. But the performance of that would be
>> terrible, and your application would stop working if it couldn't retrieve
>> the URIs.
>>
>> So there are lots of different approaches (which could be combined):
>>
>> - Retrieve the URIs the first time, and then cache them locally.
>> - Download an entire data dump of the remote vocabulary and host it locally.
>> - Add text fields in parallel to the URIs, so you at least have a label for
>>   it.
>> - Index the data in Solr, Elasticsearch, etc. and use that most of the time,
>>   esp. for read-only operations.
>
>
> Yes, exactly. I believe Esmé has articulated the possible solutions well.
> escowles++ —ELM
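
For anyone who wants to see the first and third options from the quoted list side by side, here is a rough sketch in Python: dereference a URI the first time it is needed, cache the result locally, and fall back to a text label stored alongside the URI if the remote site cannot be reached. The names CachedResolver, resolve, fallback_label and http_fetch are made up for illustration and do not come from any particular catalogue system; the Accept header assumes the remote site offers Turtle via content negotiation, which may not hold for a given vocabulary.

```python
from typing import Callable, Dict, Optional
from urllib.request import Request, urlopen


class CachedResolver:
    """Dereference a URI once, then serve later lookups from a local cache."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        # Hypothetical hook: any function that maps a URI to document text.
        self._fetch = fetch
        self._cache: Dict[str, str] = {}

    def resolve(self, uri: str, fallback_label: Optional[str] = None) -> Optional[str]:
        # Serve from the local cache when the URI has been seen before.
        if uri in self._cache:
            return self._cache[uri]
        try:
            document = self._fetch(uri)
        except OSError:
            # Remote site unreachable (urllib's URLError is an OSError):
            # fall back to the label stored in parallel with the URI.
            return fallback_label
        self._cache[uri] = document
        return document


def http_fetch(uri: str) -> str:
    """One possible fetcher: ask for Turtle via HTTP content negotiation."""
    request = Request(uri, headers={"Accept": "text/turtle"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

In use it would look something like resolver = CachedResolver(http_fetch) followed by resolver.resolve(some_uri, fallback_label="label held in the local record"). This in-memory cache only covers the on-the-fly dereferencing case; a real system would more likely persist the cache or index the fetched data in Solr or Elasticsearch, as suggested in the quoted message.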