Hi,

I work on a Python Solr client library, SolrClient <http://solrclient.readthedocs.io/en/latest/>, and it has a reindexing helper module that you can use if you are on Solr 4.9+. I use it all the time and I think it works pretty well. You can re-index all documents from one collection into another collection, or dump them to the filesystem as JSON. It also supports parallel execution and can run independently on each shard. There is also a way to resume if your job dies halfway through, provided your existing schema has a good date field and a unique id.
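To give a rough idea of why the date field and unique id matter for resuming: one way to do it is to compare per-day document counts between the source and the destination and only re-copy the days that differ. Because the unique id stays the same, re-sending a day overwrites documents instead of duplicating them. The snippet below is a simplified sketch of that idea, not the library's actual code; the function name and the count dictionaries are made up for illustration, and in practice the counts would come from something like a date range facet on each collection.

```python
from typing import Dict, List


def days_to_reindex(source_counts: Dict[str, int],
                    dest_counts: Dict[str, int]) -> List[str]:
    """Return the days whose document counts differ between source and destination."""
    mismatched = []
    for day in sorted(source_counts):
        # A day missing from the destination counts as zero documents there.
        if dest_counts.get(day, 0) != source_counts[day]:
            mismatched.append(day)
    return mismatched


# Hypothetical per-day counts, e.g. gathered from a date range facet.
source = {"2016-08-01": 1200, "2016-08-02": 950, "2016-08-03": 1100}
dest = {"2016-08-01": 1200, "2016-08-02": 400}

# Only the days that did not finish copying are listed for another pass.
print(days_to_reindex(source, dest))
```

Here the first day is already complete, so only the second and third days would need to be copied again.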
You can read the documentation here:
http://solrclient.readthedocs.io/en/latest/Reindexer.html

The code is pretty short and is here:
https://github.com/moonlitesolutions/SolrClient/blob/master/SolrClient/helpers/reindexer.py

Here is a sample:

from SolrClient import SolrClient
from SolrClient.helpers import Reindexer

# Source and destination can be the same Solr instance or different ones.
r = Reindexer(SolrClient('http://source_solr:8983/solr'),
              SolrClient('http://destination_solr:8983/solr'),
              source_coll='source_collection',
              dest_coll='destination-collection')
# Copy all documents from the source collection into the destination.
r.reindex()

On Tue, Aug 9, 2016 at 9:56 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 8/9/2016 1:48 AM, bharath.mvkumar wrote:
> > What would be the best way to re-index the data in the SOLR cloud? We
> > have around 65 million data and we are planning to change the schema
> > by changing the unique key type from long to string. How long does it
> > take to re-index 65 million documents in SOLR and can you please
> > suggest how to do that?
>
> There is no magic bullet. And there's no way for anybody but you to
> determine how long it's going to take. There are people who have
> achieved over 50K inserts per second, and others who have difficulty
> reaching 1000 per second. Many factors affect indexing speed, including
> the size of your documents, the complexity of your analysis, the
> capabilities of your hardware, and how many threads/processes you are
> using at the same time when you index.
>
> Here's some more detailed info about reindexing, but it's probably not
> what you wanted to hear:
>
> https://wiki.apache.org/solr/HowToReindex
>
> Thanks,
> Shawn
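
For what it's worth, the two rates Shawn mentions at least give a rough bracket for 65 million documents. This is only back-of-the-envelope arithmetic assuming a constant indexing rate, which real jobs rarely sustain; the function name is just for illustration.

```python
def reindex_hours(num_docs: int, docs_per_second: float) -> float:
    """Hours needed to index num_docs at a steady docs_per_second rate."""
    return num_docs / docs_per_second / 3600


NUM_DOCS = 65_000_000

# The slow and fast ends of the range quoted in the thread.
for rate in (1_000, 50_000):
    print(f"{rate:>6} docs/s -> {reindex_hours(NUM_DOCS, rate):.1f} hours")
```

So somewhere between about 22 minutes and about 18 hours, depending on where your setup lands in that range. The only way to know is to time a sample of your own documents.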