We have around 5 million items in our index, and each item has a description stored in a separate physical database. These descriptions vary in size, and most are quite large. Currently we index only the items, not their descriptions, and a full import takes around 4 hours. Ideally we want to index both the items and their descriptions, but after some quick profiling I determined that a full import would take in excess of 24 hours.
- How would I profile the indexing process to determine whether the bottleneck is Solr or our database?
- In either case, how would one speed up this process? Is there a way to run parallel import processes and then merge the results at the end? Could some form of distributed computing be used?

Any ideas would be appreciated. Thanks
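
To make the first question concrete, here is a minimal sketch of the kind of measurement meant: time the database fetch and the Solr submission separately for each batch, then compare the totals. Everything named here is an assumption for illustration only. `profile_import`, `fake_fetch`, and `fake_index` are hypothetical, and the two stubs merely stand in for the real description lookup and the real Solr update call; neither is part of Solr or of our import code.

```python
import time
from typing import Callable, Dict, Iterable, List


def profile_import(
    batches: Iterable[List[int]],
    fetch: Callable[[List[int]], List[dict]],
    index: Callable[[List[dict]], None],
) -> Dict[str, float]:
    """Accumulate wall-clock time spent fetching vs. indexing, per stage."""
    totals = {"fetch_s": 0.0, "index_s": 0.0, "docs": 0.0}
    for ids in batches:
        t0 = time.perf_counter()
        docs = fetch(ids)   # hypothetical: read descriptions from the database
        t1 = time.perf_counter()
        index(docs)         # hypothetical: send the documents to Solr
        t2 = time.perf_counter()
        totals["fetch_s"] += t1 - t0
        totals["index_s"] += t2 - t1
        totals["docs"] += len(docs)
    return totals


def fake_fetch(ids: List[int]) -> List[dict]:
    """Stub standing in for the description database (assumed to be slow)."""
    time.sleep(0.05)
    return [{"id": i, "description": "..."} for i in ids]


def fake_index(docs: List[dict]) -> None:
    """Stub standing in for the Solr update request (does no real work)."""
    return None


if __name__ == "__main__":
    result = profile_import([[1, 2], [3]], fake_fetch, fake_index)
    slower = "fetch" if result["fetch_s"] > result["index_s"] else "index"
    print(f"fetch={result['fetch_s']:.3f}s index={result['index_s']:.3f}s "
          f"slower stage: {slower}")
```

Running the same loop over a sample of a few thousand real items, with the stubs replaced by the actual database query and the actual update request, would show which of the two stages dominates before any parallelization is attempted.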