Looking deeper, it's entirely possible my experience is out of date; we're running a Solr ~5.2.1 installation, and I'm 100% sure that in 5.2.1 a split shard command completely blocks overseer. Even OVERSEERSTATUS times out while a split shard is happening.
Perhaps this was fixed as part of SOLR-7855? I don't grok all the new code, but it looks like as of 5.4 there's some support for overseer doing more things concurrently.

On Mon, Jan 25, 2016 at 4:57 PM, Anshum Gupta <[email protected]> wrote:

> Hi Scott,
>
> Shard splitting shouldn't block unrelated tasks. Here's the current
> definition of 'unrelated': anything that involves a different collection.
> Right now, the Overseer only processes one collection specific task at a
> time, however, you should certainly be able to split shards from other
> collections. It's a bug if it doesn't work that way.
>
> There is logic to check for mutual exclusion so that race conditions don't
> come back to bite us e.g. if I send in add replica, shard split, delete
> replica, AND/OR delete shard request for the same collection, we might run
> into issues.
>
>
> On Mon, Jan 25, 2016 at 1:02 PM, Scott Blum <[email protected]> wrote:
>
>> Hi dev,
>>
>> I searched around on this but couldn't find any related JIRA tickets or
>> work, although perhaps I missed it.
>>
>> We've run into a major scaling problem in the shard splitting operation.
>> The entire shard split is a single operation in overseer, and blocks any
>> other queue items from executing while the shard split happens. Shard
>> splits can take on the order of many minutes to complete, during this time
>> no other overseer ops (including status updates) can occur. Additionally,
>> this means you can only run a single shard split operation at a time,
>> across an entire deployment.
>>
>> Is anyone already working on this? If not, I'm planning on working on it
>> myself, because we have to solve this scaling issue one way or another.
>> I'd love to get guidance from someone knowledgeable, both to make it more
>> solid, and also hopefully so it could be upstreamed.
>>
>> Thanks!
>> Scott
>>
>>
>
>
> --
> Anshum Gupta
>
