Hi there, I am a new Nutch user. I am using Nutch to crawl and then send the crawl data to Solr. I have a question about the bin/nutch solrindex command. Which Tika libraries are used for indexing: the Tika libraries in Nutch, or does Nutch leave the indexing to Solr, so that Solr's Tika libraries are used? I think I read somewhere that Nutch focuses on crawling and parsing and lets Solr do the indexing, so I would expect Solr's libraries to be used.
Specifically, I am having problems extracting tags, e.g. <H1>, from PDF files using the Nutch/Solr combination. I would expect the extract-contrib module defined in schema.xml to be used. Thanks in advance, Madhvi

