On Tue, Dec 7, 2010 at 9:12 AM, Mark <markjl...@gmail.com> wrote:
[...]
> What I'm trying to do is extract some (presumably) structured information
> from non-uniform data (e.g., prices from a Nutch crawl) that needs to show
> in search queries, and I've come up against a wall.
>
> I've been unable to figure out the best place to begin.
>
> I had a look through the Solr wiki and did a search via Lucid's search
> tool, and I'm guessing this is handled at index time through my schema?
> But I've also seen dismax being thrown around as a possible solution, and
> this has confused me.
>
> Basically, if you guys could point me in the right direction for resources
> (even as much as saying, "you need X, it's over there") that would be a
> huge help.
[...]
Sorry, the above is a little unclear, at least to me. The basic steps in
running Solr are:

* Installing, configuring, and getting Solr running.

* Indexing data (and also updating and deleting documents): The best way to
  do this depends on where your data are coming from. Since you mention
  Nutch, it already integrates with Solr, although by default in a manner
  that dumps the entire content from a crawl into a single Solr field. You
  will probably need to write a custom Nutch parser plugin in order to
  extract a subset of the content, e.g., prices. Please see
  http://wiki.apache.org/nutch/RunningNutchAndSolr

* Searching through Solr.

A good way of getting started is by going through the Solr tutorial:
http://lucene.apache.org/solr/tutorial.html . The Solr Wiki is also fairly
extensive: http://wiki.apache.org/solr/FrontPage . Finally, searching
Google for "solr getting started" turns up many likely-looking links.

Regards,
Gora
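To make the index-then-search flow above a bit more concrete, here is a
minimal Python sketch of what the two HTTP payloads look like: an add
command for Solr's JSON update handler, and a dismax query URL. The field
names ("id", "title", "price") and the localhost URL are assumptions for
illustration; they would have to match your actual schema and deployment.
The code only builds the payloads, it does not talk to a running Solr.

```python
import json
from urllib.parse import urlencode

# Hypothetical document, with a price you extracted at parse time.
# Assumes your schema.xml defines "id", "title", and "price" fields.
doc = {"id": "product-42", "title": "Widget", "price": 19.99}

# Body for an add command, POSTed to Solr's JSON update handler
# with Content-Type: application/json.
update_body = json.dumps({"add": {"doc": doc}})

# A dismax query: defType selects the dismax parser, qf lists the
# fields to search, and fq applies a filter query on the price field.
params = urlencode({
    "q": "widget",
    "defType": "dismax",
    "qf": "title",
    "fq": "price:[10 TO 50]",
    "wt": "json",
})
query_url = "http://localhost:8983/solr/select?" + params
```

The point of the split is that dismax answers the querying half of your
question (which fields to search, and how to weight them), while getting
the price into its own field is an indexing-time job for your Nutch parser
plugin and your schema.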