We currently have a SolrCloud cluster containing two collections, which we toggle between for querying and indexing. When we bulk index to our “offline” collection, query performance on the “online” collection suffers somewhat, and when segment merges occur it becomes downright abysmal. We have adjusted several settings that affect flushing and/or merging and have tried increasing the IOPS capacity of our volumes, without much success. The best recommendation seems to be simply to have enough RAM on each node for the index to fit into memory (plus whatever additional memory indexing requires). If that isn’t feasible, there seems to be no way around the fact that flushes and merges can consume I/O resources needed for responding to queries. We are currently experimenting with throttling flushes and merges using the maxWriteMBPerSec* settings, which seems to help when they are set to fairly low values. Does anyone have other recommendations for optimizing SolrCloud to handle both heavy indexing and heavy querying?
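
For context, the throttling mentioned above would look roughly like the sketch below in the directoryFactory section of solrconfig.xml. This assumes a Solr 4.x release, where the caching directory factories read the maxWriteMBPerSecFlush and maxWriteMBPerSecMerge options as double values; later versions may not support them. The factory class and the numbers are illustrative placeholders, not tuned recommendations.

```xml
<!-- solrconfig.xml fragment (sketch; assumes Solr 4.x rate-limit options) -->
<directoryFactory name="DirectoryFactory" class="solr.NRTCachingDirectoryFactory">
  <!-- Cap write throughput (MB/s) when flushing new segments; value is a placeholder -->
  <double name="maxWriteMBPerSecFlush">20.0</double>
  <!-- Cap write throughput (MB/s) for segment merges; value is a placeholder -->
  <double name="maxWriteMBPerSecMerge">10.0</double>
</directoryFactory>
```

Lower limits leave more I/O headroom for queries but make merges take longer, so segments can pile up during sustained bulk indexing.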
Thanks, Will