OK. This is a very basic question, so please bear with me. I see where the Velocity templates are, and I have looked at the documentation and get the idea of how to write them.
It looks to me as if Solr just brings back the URLs. What I want to do is get the actual documents in the answer set, simplify their HTML by removing all the JavaScript, ads, etc., and append them into a single document. Now ... does Nutch already have the documents? Can I get them from its db? Or do I have to fetch the documents again with something like wget?

Fred

On Fri, Sep 23, 2011 at 16:02, Erik Hatcher <erik.hatc...@gmail.com> wrote:

> conf/velocity by default. See Solr's example configuration.
>
> Erik
>
> On Sep 23, 2011, at 12:37, Fred Zimmerman <w...@nimblebooks.com> wrote:
>
> > ok, answered my own question, found velocity rw in solrconfig.xml. next
> > question:
> >
> > where does velocity look for its templates?
> >
> > On Fri, Sep 23, 2011 at 11:57, Fred Zimmerman <w...@nimblebooks.com> wrote:
> >
> >> This seems to be out of date. I am running Solr 3.4
> >>
> >> * the file structure of apachehome/contrib is different and I don't see
> >>   velocity anywhere underneath
> >> * the page referenced below only talks about Solr 1.4 and 4.0
> >>
> >> ?
> >>
> >> On Thu, Sep 22, 2011 at 19:51, Markus Jelsma <markus.jel...@openindex.io> wrote:
> >>
> >>> Hi,
> >>>
> >>> Solr supports the Velocity template engine and has very good support.
> >>> Ideal for generating properly formatted output from the search engine.
> >>> There's a clustering example and it's easy to format documents indexed
> >>> by Nutch.
> >>>
> >>> http://wiki.apache.org/solr/VelocityResponseWriter
> >>>
> >>> Cheers
> >>>
> >>>> Hi,
> >>>>
> >>>> I would like to take the HTML documents that are the result of a Solr
> >>>> search and combine them into a single HTML document that combines the
> >>>> body text of each individual document. What is a good strategy for this?
> >>>> I am crawling with Nutch and using Carrot2 for clustering.
> >>>>
> >>>> Fred
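
For the "simplify and append" step asked about at the top of this message, here is a minimal sketch of what it might look like. It assumes the raw HTML of each hit is already in hand as a string (whether that comes from Nutch's stored content or from a re-fetch is exactly the open question above), and it uses only the Python standard library. The names SKIP_TAGS, BodyTextExtractor, extract_text and combine are made up for this illustration. It drops script/style-type elements and keeps the remaining text; it makes no attempt to recognize ads that are ordinary markup.

```python
from html import escape
from html.parser import HTMLParser

# Elements whose contents are discarded entirely (an assumption for this sketch).
SKIP_TAGS = {"script", "style", "noscript", "iframe", "title"}


class BodyTextExtractor(HTMLParser):
    """Collects text while ignoring everything inside SKIP_TAGS."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0   # > 0 while inside a skipped element
        self.chunks = []      # whitespace-normalized pieces of kept text

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            text = " ".join(data.split())
            if text:
                self.chunks.append(text)


def extract_text(raw_html):
    """Return the kept text of one HTML document as a single string."""
    parser = BodyTextExtractor()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.chunks)


def combine(docs):
    """docs: iterable of (url, raw_html) pairs. Returns one HTML string
    with one <div> per source document, in the order given."""
    sections = []
    for url, raw_html in docs:
        sections.append(
            "<div>\n<h2>{}</h2>\n<p>{}</p>\n</div>".format(
                escape(url), escape(extract_text(raw_html))
            )
        )
    return "<html><body>\n{}\n</body></html>".format("\n".join(sections))
```

The URLs returned by the search would be paired with their HTML and passed to combine() in result order. Because text pieces are joined with spaces, a word split across inline tags would come out with a space in it; a real implementation would want a proper HTML cleaner.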