On 12/20/2016 3:43 AM, Vellaimary C wrote:
> My organization is using SOLR for search handling. As we need to
> index more volume of documents like 300 millions, we have moved to
> SOLR 5.5.1.
> To speed up the import, which takes more than three weeks now atleast
> to 1 week we need parallel data import handler triggered.
> Can anyone help me to implement multithreading in dataimport handler.

If it were easy to achieve this, it would have already been done. DIH
actually used to have a parameter for the number of threads, but it
didn't work, so it was removed. Implementing multi-threaded support is
*NOT* a trivial undertaking. If you figure out how to do it, we welcome
patches.

The best option is for you to write your own indexing application that
pulls data from the original source and uses multiple threads to index
the data in parallel.

Achieving this with DIH requires that you create multiple handlers, each
with a different URL path as its name, and start imports on all of them
so that they run at the same time. Use "clean=false" so that the imports
won't wipe the index when they start. Each handler needs to process a
different part of the data in the source system.

FYI, your question belongs on the solr-user mailing list. This list is
for discussions around the development of Lucene/Solr itself.

Thanks,
Shawn
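
As a rough illustration of the multiple-handler DIH approach described above, here is a minimal Python sketch that splits a numeric ID range into slices and builds one full-import request per handler, each with clean=false. The Solr URL, the core name, the handler paths (/dataimport1 and so on), and the min_id/max_id request parameters are all assumptions made for the example; they are not from the original message. The sketch also assumes each handler's DIH config restricts its query using those parameters, for example via ${dataimporter.request.min_id}.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical values: adjust to your own Solr install and solrconfig.xml.
SOLR_CORE_URL = "http://localhost:8983/solr/mycore"
HANDLERS = ["/dataimport1", "/dataimport2", "/dataimport3", "/dataimport4"]


def split_range(first_id, last_id, parts):
    """Split the inclusive range [first_id, last_id] into up to `parts`
    contiguous, non-overlapping slices of nearly equal size."""
    total = last_id - first_id + 1
    base, extra = divmod(total, parts)
    slices = []
    start = first_id
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more parts than IDs: skip empty slices
        slices.append((start, start + size - 1))
        start += size
    return slices


def import_url(handler, min_id, max_id):
    """Build a DIH full-import URL for one handler and one slice.

    clean=false keeps the import from wiping the index when it starts.
    min_id and max_id are hypothetical custom request parameters.
    """
    params = {
        "command": "full-import",
        "clean": "false",
        "commit": "true",
        "min_id": min_id,
        "max_id": max_id,
    }
    return SOLR_CORE_URL + handler + "?" + urlencode(params)


def start_all(first_id, last_id):
    """Send one import request per handler and return the HTTP statuses.

    The requests are sent one after another; by default DIH runs an
    import in the background, so the imports overlap once started.
    """
    slices = split_range(first_id, last_id, len(HANDLERS))
    statuses = []
    for handler, (lo, hi) in zip(HANDLERS, slices):
        with urlopen(import_url(handler, lo, hi)) as response:
            statuses.append(response.status)
    return statuses
```

Calling start_all(1, 300000000) would send four requests, each covering 75 million IDs. Progress for each import can then be checked with command=status on the same handler path.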
