Nutch => Solr Cell (Extract Metadata)

Felipe Barriga Richards Sun, 22 May 2011 16:19:27 -0700

Hi everyone!

Currently I'm submitting crawled pages and files to Solr using "nutch
solrindex http://localhost:8983/solr/ ..." and it works. The problem is
that I need to extract metadata from PDF and MP3 files. To do this I can
submit the documents _manually_ to Solr using curl
(http://localhost:8983/solr/update/extract).
Does anyone know how to configure Nutch to do this?
Maybe by chaining update processors in Solr (solrconfig.xml)?
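
For reference, the manual submission to /update/extract can be sketched
in Python roughly as below. This is only an illustration of the manual
step, not a Nutch configuration. The helper names (build_extract_url,
post_file), the document id and the file name are made up for the
example, and it assumes the extract handler is enabled in solrconfig.xml
and that the schema's unique key field is "id".

```python
import mimetypes
import urllib.parse
import urllib.request

# Endpoint taken from the message above; adjust host/port for your setup.
SOLR_EXTRACT_URL = "http://localhost:8983/solr/update/extract"


def build_extract_url(doc_id, base_url=SOLR_EXTRACT_URL, commit=True):
    """Build the extract URL (hypothetical helper).

    literal.id sets the document id field; this assumes the schema's
    unique key is named "id".
    """
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return base_url + "?" + urllib.parse.urlencode(params)


def post_file(path, doc_id):
    """Send one file as the raw POST body and return the HTTP status."""
    # Guess the MIME type from the file name, with a generic fallback.
    content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    with open(path, "rb") as fh:
        body = fh.read()
    request = urllib.request.Request(
        build_extract_url(doc_id),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Example call (placeholder file name, needs a running Solr):
# post_file("song.mp3", "song-1")
```

Doing this by hand for every crawled file is what I would like to avoid.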



Thanks,
