Hi everyone!

Currently I'm submitting crawled pages and files to Solr using "nutch
solrindex http://localhost:8983/solr/ ..." and it works. The problem is
that I need to extract metadata from PDF and MP3 files. I can do this by
submitting the documents _manually_ to Solr with curl
(http://localhost:8983/solr/update/extract).
Does anyone know how to configure Nutch to do this?
Maybe by chaining update processors in Solr (solrconfig.xml)?
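
For reference, the manual submission I mean looks roughly like the sketch
below. It is only an illustration: the helper names, the "doc1" id and the
"attr_" prefix for unknown fields are my own assumptions, not anything Nutch
provides. Only the /update/extract URL comes from my setup.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Extract endpoint mentioned above (Solr Cell / ExtractingRequestHandler).
SOLR_EXTRACT_URL = "http://localhost:8983/solr/update/extract"


def build_extract_url(doc_id, base=SOLR_EXTRACT_URL, commit=True):
    """Build the request URL for one document (hypothetical helper)."""
    params = {
        "literal.id": doc_id,  # sets the unique key of the indexed document
        "uprefix": "attr_",    # assumption: schema has an attr_* dynamic field
    }
    if commit:
        params["commit"] = "true"
    return base + "?" + urlencode(params)


def submit_file(path, doc_id, content_type):
    """POST the raw file bytes to the extract handler (hypothetical helper)."""
    with open(path, "rb") as fh:
        body = fh.read()
    req = Request(
        build_extract_url(doc_id),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urlopen(req) as resp:
        return resp.status
```

For example, submit_file("song.mp3", "doc1", "audio/mpeg") would send one file
to a running Solr. What I'm missing is a way to make Nutch do the equivalent
for every crawled PDF and MP3.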


Thanks,
