Well, I think I finally figured out how to get SolrEntityProcessor to work, but there are still some issues. I had to add a library path to solrconfig.xml, but the cores are now coming up, and I am able to manually run a data import that does seem to index all of the documents on the remote SolrCloud. I ran into the issue described here, where I got version conflicts:
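For reference, the library path I mention is a <lib> directive in solrconfig.xml that loads the DataImportHandler jars; the dir value below is just the common layout and may differ in your install:

```xml
<!-- Load the DataImportHandler jars so SolrEntityProcessor is available;
     adjust dir to wherever the dist/ directory sits in your install -->
<lib dir="${solr.install.dir:../../../..}/dist/"
     regex="solr-dataimporthandler-.*\.jar" />
```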
http://lucene.472066.n3.nabble.com/Version-conflict-during-data-import-from-another-Solr-instance-into-clean-Solr-td4046937.html

I used the suggestion of adding fl="*,old_version:_version_" to the entity config line in data-config.xml. This seems to be working, but I don't know whether it will cause a problem later.

When I do a manual data import, I get the correct number of documents from the source SolrCloud (the total number of docs added up between both shards is 6357 in this test case):

Indexing completed. Added/Updated: 6,357 documents. Deleted 0 documents. (Duration: 22s)
Requests: 0 (0/s), Fetched: 6,357 (289/s), Skipped: 0, Processed: 6,357

However, when I check the number of docs indexed for each shard in the core admin UI on the destination SolrCloud, the numbers are way off and a lot less than 6357. There's nothing in the logs to indicate collisions or dropped documents. What could account for the disparity?

I assume that down the road I will need to configure multiple collections/cores on the failover cluster, one for each DC it is replicating from, but how would you create multiple collections when using ZooKeeper? How do you upload multiple sets of config files, one per collection, and keep them separate?

--
View this message in context: http://lucene.472066.n3.nabble.com/Replicating-Between-Solr-Clouds-tp4121196p4121737.html
Sent from the Solr - User mailing list archive at Nabble.com.
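For context, here is roughly what my entity config looks like after applying the fl workaround from that thread; the url, query, and names are placeholders for my actual setup, not values from the thread:

```xml
<dataConfig>
  <document>
    <!-- SolrEntityProcessor pulls documents from the remote Solr.
         Renaming _version_ to old_version in fl lets the destination
         cluster assign its own _version_ instead of rejecting updates. -->
    <entity name="remote"
            processor="SolrEntityProcessor"
            url="http://source-host:8983/solr/collection1"
            query="*:*"
            fl="*,old_version:_version_"/>
  </document>
</dataConfig>
```

Note that this assumes the destination schema can accept an old_version field (e.g. via a dynamic field), since the renamed value is still stored in each imported document.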
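One way to cross-check the core admin numbers is to query each core directly with distrib=false (which counts only that core's index) and compare against the distributed count for the whole collection; host and core names below are made up:

```
# per-core count, no distributed fan-out
curl 'http://dest-host:8983/solr/collection1_shard1_replica1/select?q=*:*&rows=0&distrib=false'

# whole-collection (distributed) count
curl 'http://dest-host:8983/solr/collection1/select?q=*:*&rows=0'
```

If the per-core totals sum to less than the distributed count reports, that would point at where the documents are going missing.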
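On the multiple-collections question, my understanding (worth verifying) is that each config set is uploaded to ZooKeeper under its own name with zkcli's upconfig command, and each collection is then created against its config name via the Collections API; paths, shard counts, and names below are placeholders:

```
# upload one config set per DC, each under a distinct confname
zkcli.sh -zkhost zk1:2181 -cmd upconfig \
    -confdir /path/to/dc1_conf -confname dc1_conf
zkcli.sh -zkhost zk1:2181 -cmd upconfig \
    -confdir /path/to/dc2_conf -confname dc2_conf

# create a collection bound to each config set
curl 'http://dest-host:8983/solr/admin/collections?action=CREATE&name=dc1&numShards=2&collection.configName=dc1_conf'
curl 'http://dest-host:8983/solr/admin/collections?action=CREATE&name=dc2&numShards=2&collection.configName=dc2_conf'
```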