Thanks to you both.

On Tue, Jul 5, 2011 at 4:35 PM, Markus Jelsma <[email protected]> wrote:
> Hi,
>
> About geographical search: Solr will do this for you. It is built in for
> 3.x+ and available through third-party plugins for 1.4.x. Both provide
> different features. In Solr you'd not base similarity on geographical data
> but instead use spatial data to boost, or filter, textually similar
> documents.
>
> This keeps text similarity intact and offers spatial features on top.
>
> You'll get more feedback on the Solr list indeed :)
>
> Cheers
>
> > Thanks for this Markus, it had occurred to me that DIH was a very
> > plausible solution to progress with. I think you have just confirmed it,
> > given the flexibility it offers amongst other attributes.
> >
> > I'm looking at creating a context-aware web application which would use
> > geographical search to obtain results based on location. This is required
> > as the data will contain (amongst others) fields with integer values which
> > vary depending upon a building location cost index. Similarity is directly
> > linked through a geographical location factor. I wanted to have the data
> > stored within the n number of distributed RDBs available in a cloud
> > environment which could be searched, as opposed to the non-trivial task of
> > searching across a fragmented, distributed number of DBs.
> >
> > As you mention, it does make more sense to save documents in a doc (or
> > column) oriented DB.
> >
> > Essentially, using the DIH tool would remove the requirement for Nutch?
> >
> > I think to progress with this, I'm best moving the thread to Solr-user@ if
> > I have further questions.
> >
> > Thank you
> >
> > On Tue, Jul 5, 2011 at 3:53 PM, Markus Jelsma <[email protected]> wrote:
> > > Hi Lewis,
> > >
> > > It sounds to me you'd be better off using Solr's very advanced
> > > DataImportHandler [1]. It can (delta) import data from your RDBMSs and
> > > offers much flexibility on how to transform entities.
> > >
> > > Besides crawling you also mention you'd like to push results (of what?)
> > > to another structured data store. But why would you want that? Handling,
> > > processing and serving search results is done by Solr (and ES in the
> > > future) and since our entities are flat (just a document) it makes more
> > > sense to me to save documents in a document (or column) oriented DB.
> > >
> > > [1]: http://wiki.apache.org/solr/DataImportHandler
> > >
> > > Cheers,
> > >
> > > > Hi,
> > > >
> > > > I'm curious to hear if anyone has information for configuring Nutch to
> > > > crawl an RDB such as MySQL. In my hypothetical example there are N
> > > > number of databases residing in various distributed geographical
> > > > locations. To make a worst-case scenario, say that they are NOT all
> > > > the same type, and I wish to use Nutch trunk 2.0 to push the results
> > > > to some other structured data store which I can then connect to in
> > > > order to serve search results.
> > > >
> > > > Does anyone have any information such as an overview of database
> > > > crawling and serving using Nutch? I have been unsuccessful obtaining
> > > > info on the Web, as query results are ambiguous and usually refer to
> > > > crawldb or linkdb.
> > > >
> > > > If I can get this it would be a real nice entry for inclusion in our
> > > > wiki.
> > > >
> > > > Thanks for any suggestions or info.

-- *Lewis*
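
For anyone picking this thread up later, the filter-versus-boost distinction Markus describes can be sketched as two sets of Solr query parameters. This is only a rough sketch under assumptions not stated in the thread: a Solr 3.x core with a spatial (LatLonType) field, here given the hypothetical name `location`, and a dismax handler whose `qf` is already configured. The helper name `spatial_params` and the sample coordinates are made up for illustration, and the `recip(...)` constants are just example values, not tuned ones.

```python
from urllib.parse import urlencode


def spatial_params(text, lat, lon, radius_km, field="location", boost=False):
    """Build Solr query parameters for a text query with a spatial component.

    boost=False: restrict results to a radius with the {!geofilt} filter query.
    boost=True:  keep all text matches but boost nearer documents via geodist().
    The field name is a hypothetical example, not something from the thread.
    """
    params = {
        "q": text,               # text similarity stays in the main query
        "sfield": field,         # spatial field used by geofilt/geodist
        "pt": f"{lat},{lon}",    # centre point as "lat,lon"
        "d": str(radius_km),     # distance in kilometres
    }
    if boost:
        # Assumes a dismax handler with qf configured on the Solr side.
        params["defType"] = "dismax"
        params["bf"] = "recip(geodist(),2,200,20)"
    else:
        params["fq"] = "{!geofilt}"
    return params


if __name__ == "__main__":
    # Hypothetical query: documents about "office" within 10 km of a point.
    print(urlencode(spatial_params("office", 55.86, -4.25, 10)))
    print(urlencode(spatial_params("office", 55.86, -4.25, 10, boost=True)))
```

Either way the main `q` carries the text match, which is the point made above: spatial data sits on top of text similarity rather than replacing it.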

