I have a lot of data coming into SolrCloud, and we create collections
dynamically once a size threshold is reached. Currently, to keep search
responses fast, a new collection is created after every 100M docs (about 300G
in HDFS). Over time the SolrCloud cluster (CDH Solr 4.10.3) reaches 150 to 200
collections, so I am trying to merge 10 collections at a time into a single
shard of a new collection, using the merge API in SolrJ. But the merge index
API is running very slowly: it takes almost 5 minutes to merge one collection
into a single shard.

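    // arr is passed as the source index directories; no source cores are given
    CoreAdminResponse mergeIndexes =
        CoreAdminRequest.mergeIndexes(destShard, arr, new String[0], secClient);
    LOGGER.debug("merge response - {}", mergeIndexes);
    Thread.sleep(1000L);
    // commit so the merged documents become visible in the destination
    secClient.commit(destCollection);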
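
For context, the grouping of source collections into batches of 10 can be
sketched roughly as below. This is only an illustration under assumptions: the
class name MergeBatches, the partition helper, and the collection_N names are
hypothetical and are not taken from our actual code. Each resulting batch would
then be merged into one new collection with the mergeIndexes call shown above.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeBatches {

    // Split the source collection names into consecutive groups of at most batchSize.
    static List<List<String>> partition(List<String> names, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < names.size(); i += batchSize) {
            int end = Math.min(i + batchSize, names.size());
            // Copy the sub-list so each batch is independent of the input list.
            batches.add(new ArrayList<>(names.subList(i, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Hypothetical collection names; the real naming scheme is not given here.
        List<String> names = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            names.add("collection_" + i);
        }
        List<List<String>> batches = partition(names, 10);
        for (int i = 0; i < batches.size(); i++) {
            System.out.println("merged_" + i + " <- "
                    + batches.get(i).size() + " source collections");
        }
    }
}
```

With 150 to 200 source collections and a batch size of 10, this grouping gives
15 to 20 merged collections.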

Is there a way to make the merge faster? I have already tried optimizing and
committing the collection before running the merge.

Also, any information on how the merge runs in the background (does it copy
the entire index folder?) would be useful.

Regards
Avinash Patil
Software Engineer
Securonix