Hi,

It should not be too hard, but note that the current SolrContentHandler builds up the document from SAX events. You could pass a BoilerpipeContentHandler((ContentHandler) parsingHandler, BoilerpipeExtractor) to the parser in ExtractingDocumentLoader.java, so that the events go through Boilerpipe before they reach the Solr handler. That should work.
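
To illustrate the wrapping idea, here is a minimal sketch that uses only the JDK's SAX classes, so it runs without Tika or Boilerpipe on the classpath. All class names in it are hypothetical stand-ins: CollectingHandler plays the role of SolrContentHandler, and SkippingHandler plays the role of BoilerpipeContentHandler. The real Boilerpipe handler buffers the document and applies an extractor instead of skipping a fixed tag, so treat this only as a picture of how the events flow.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class HandlerWrappingSketch {

    /** Hypothetical stand-in for SolrContentHandler: builds text from SAX events. */
    static class CollectingHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /**
     * Hypothetical stand-in for BoilerpipeContentHandler: wraps a delegate and
     * forwards every event except those inside a <nav> element.
     */
    static class SkippingHandler extends DefaultHandler {
        private final ContentHandler delegate;
        private int skipDepth = 0; // > 0 while inside a skipped subtree

        SkippingHandler(ContentHandler delegate) {
            this.delegate = delegate;
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            if (skipDepth > 0 || "nav".equals(qName)) {
                skipDepth++;
                return;
            }
            delegate.startElement(uri, local, qName, atts);
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if (skipDepth > 0) {
                skipDepth--;
                return;
            }
            delegate.endElement(uri, local, qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            if (skipDepth == 0) {
                delegate.characters(ch, start, length);
            }
        }
    }

    /** Parses the input with the wrapped handler and returns what reached the delegate. */
    static String extract(String xhtml) {
        CollectingHandler downstream = new CollectingHandler();
        // With Tika and Boilerpipe available, the analogous line would presumably be:
        //   new BoilerpipeContentHandler(parsingHandler, ArticleExtractor.INSTANCE)
        ContentHandler wrapped = new SkippingHandler(downstream);
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(wrapped);
            reader.parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return downstream.text.toString().trim();
    }
}
```

Calling HandlerWrappingSketch.extract on a small page with a nav element and a paragraph returns only the paragraph text, because the parser talks to the wrapper and the wrapper decides what the downstream handler sees. The change in ExtractingDocumentLoader.java would have the same shape: hand the parser the wrapping handler instead of the Solr handler itself.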
Markus

-----Original message-----
> From: Lance Norskog <goks...@gmail.com>
> Sent: Thu 06-Sep-2012 05:51
> To: solr-user@lucene.apache.org
> Subject: Is Boilerpipe usable through Solr ExtractingUpdateHandler or the DIH?
>
> Tika integrated Boilerpipe a few releases back. Is it possible to invoke it
> when using the ExtractingUpdateHandler (simple Tika) or the
> DataImportHandler?
>
> http://code.google.com/p/boilerpipe/