Has anyone mitigated the potentially large IO impact of backing up a large collection, or of backups in general? If the collection is large enough, there could well be many shards on one host, and backing them all up could saturate the IO. I wonder if there should be a rate limit or some other mechanism.
Not the same thing, but I know that at the segment level, merges are rate limited -- ConcurrentMergeScheduler doesn't quite let you set the rate but adjusts it automatically (the "ioThrottle" boolean).

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley
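
To make the rate-limit idea from the question concrete, here is a minimal sketch of a throttled file copy using only the JDK. It is not how Solr's backup is implemented; the class name `ThrottledCopy`, its methods, and the `mbPerSec` parameter are all hypothetical and chosen for illustration. The approach is simple pacing: after each chunk, compare the bytes written so far against the time they should have taken at the target rate, and sleep off the difference.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of Solr or Lucene.
public class ThrottledCopy {

    /** Nanoseconds that transferring `bytes` should take at `mbPerSec`. */
    static long targetNanos(long bytes, double mbPerSec) {
        double seconds = (bytes / 1024.0 / 1024.0) / mbPerSec;
        return (long) (seconds * 1_000_000_000L);
    }

    /** Copies `in` to `out`, pausing as needed to stay at or below `mbPerSec`. */
    static long copy(InputStream in, OutputStream out, double mbPerSec)
            throws IOException, InterruptedException {
        byte[] buf = new byte[64 * 1024];
        long start = System.nanoTime();
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
            // How far ahead of the allowed schedule are we?
            long aheadNanos = targetNanos(total, mbPerSec) - (System.nanoTime() - start);
            if (aheadNanos > 0) {
                Thread.sleep(aheadNanos / 1_000_000L, (int) (aheadNanos % 1_000_000L));
            }
        }
        return total; // bytes copied
    }
}
```

With many shards on one host, a per-copy limit like this only helps if the limits are coordinated, for example by dividing a per-host budget among the concurrent copies.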