Is there a drastic difference between this and TagSoup, which is already included in Solr?

On Fri, Jan 14, 2011 at 6:57 AM, arnaud gaudinat <arnaud.gaudi...@gmail.com> wrote:

> Hello,
>
> I would like to use BoilerPipe (a very good program which cleans the HTML
> content of surplus "clutter").
> I saw that BoilerPipe is inside Tika 0.8 and so should be accessible from
> Solr, am I right?
>
> How can I activate BoilerPipe in Solr? Do I need to change solrconfig.xml
> (with org.apache.solr.handler.extraction.ExtractingRequestHandler)?
>
> Or do I need to modify some code inside Solr?
>
> I saw something like TikaCLI -F in the Tika forum (
> http://www.lucidimagination.com/search/document/242ce3a17f30f466/boilerpipe_integration).
> Is that the right way?
>
> Thanks in advance,
>
> Arno.