Hey James,
Just wondering if you ever had a chance to try out Hadoop with Solr? I would
appreciate any information or directions you could give.
I am particularly interested in indexing using a MapReduce job.
Cheers,
-Ali
Hi,
We currently have a master-slave setup for Solr with two slave servers. We
are using SolrJ (stream-update-solr-server) to index on the master, which
takes 6 hours for around 15 million documents.
I would like to explore Hadoop, in particular for running the indexing job
with a MapReduce approach.
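For illustration only, here is a minimal sketch of how an indexing job could be split into map and reduce steps. It is plain Python that simulates the two phases in one process, with no Hadoop or Solr involved. The reducer count, the batch size, and every name in it (`map_doc`, `reduce_partition`, `run_job`, `send_batch`) are assumptions made for the example; `send_batch` is a hypothetical stand-in for whatever client call would post a batch of documents to the Solr master.

```python
import zlib
from collections import defaultdict

NUM_REDUCERS = 2   # assumption: number of parallel reduce tasks
BATCH_SIZE = 3     # assumption: documents sent per update request


def map_doc(doc):
    """Map step: emit (partition, doc), keyed by a stable hash of the document id."""
    partition = zlib.crc32(doc["id"].encode("utf-8")) % NUM_REDUCERS
    return partition, doc


def reduce_partition(partition, docs, send_batch):
    """Reduce step: split one partition's documents into batches and send each one."""
    sent = 0
    for start in range(0, len(docs), BATCH_SIZE):
        batch = docs[start:start + BATCH_SIZE]
        send_batch(partition, batch)  # hypothetical hook, e.g. a Solr update call
        sent += len(batch)
    return sent


def run_job(docs, send_batch):
    """Simulate the shuffle: group mapped documents by partition, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        partition, mapped = map_doc(doc)
        grouped[partition].append(mapped)
    return {p: reduce_partition(p, ds, send_batch) for p, ds in sorted(grouped.items())}


if __name__ == "__main__":
    sample = [{"id": f"doc-{i}", "title": f"Title {i}"} for i in range(10)]
    totals = run_job(sample, lambda p, batch: print(f"reducer {p}: {len(batch)} docs"))
    print("documents sent per reducer:", totals)
```

In a real job the map input would come from the document store and each reducer would hold its own client connection; the sketch only shows where the batching and the update calls would sit.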
Thanks Marc,
Well, I have an HBase storage architecture and a Solr master-slave setup with
two slave servers.
Would this patch work with my setup? Do I need sharding in place? And what
tasks would run in the map and reduce phases?
I was thinking something like:
At Map: read documents as
Thanks guys.
I will try this with some test documents, fingers crossed.
And by the way, I got the minTokenLen parameter from one of the thread
replies (from Erik).
Cheerz,
Ali
Hey Andrew,
Just wondering if you ever managed to run TextProfileSignature based
deduplication. I would appreciate it if you could send me the code fragment
for it from solrconfig.
I currently have something like this, but I am not sure if I am doing it right:
updateRequestProcessorChain