Hi Scott,

Shard splitting shouldn't block unrelated tasks. The current definition
of 'unrelated' is anything that involves a different collection.
Right now, the Overseer processes only one collection-specific task at a
time; however, you should certainly be able to split shards from other
collections. It's a bug if it doesn't work that way.

There is logic to check for mutual exclusion so that race conditions don't
come back to bite us. For example, if I send in add replica, shard split,
delete replica, and/or delete shard requests for the same collection, we
might run into issues if they were allowed to run concurrently.
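
To make the idea concrete, here is a rough sketch of per-collection mutual
exclusion. This is not the actual Overseer code; the class and method names
(CollectionTaskGate, tryAcquire, release) are made up for illustration. It
only shows the intended behaviour: a second task for the same collection has
to wait, while a task for a different collection can proceed.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only; names are assumptions, not Solr's real classes.
final class CollectionTaskGate {
    // Names of collections that currently have a task in flight.
    private final Set<String> busy = ConcurrentHashMap.newKeySet();

    /** Returns true if no other task is running for this collection. */
    boolean tryAcquire(String collection) {
        // Set.add is atomic here and returns false if the name is already present.
        return busy.add(collection);
    }

    /** Marks the collection's task as finished so the next one can start. */
    void release(String collection) {
        busy.remove(collection);
    }
}
```

With something like this, a shard split on collection1 would hold the gate
for collection1 only, so a split on collection2 could start right away, and
an add replica request for collection1 would be deferred until release is
called.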


On Mon, Jan 25, 2016 at 1:02 PM, Scott Blum <[email protected]> wrote:

> Hi dev,
>
> I searched around on this but couldn't find any related JIRA tickets or
> work, although perhaps I missed it.
>
> We've run into a major scaling problem in the shard splitting operation.
> The entire shard split is a single operation in overseer, and blocks any
> other queue items from executing while the shard split happens.  Shard
> splits can take on the order of many minutes to complete, during this time
> no other overseer ops (including status updates) can occur.  Additionally,
> this means you can only run a single shard split operation at a time,
> across an entire deployment.
>
> Is anyone already working on this?  If not, I'm planning on working on it
> myself, because we have to solve this scaling issue one way or another.
> I'd love to get guidance from someone knowledgeable, both to make it more
> solid, and also hopefully so it could be upstreamed.
>
> Thanks!
> Scott
>
>


-- 
Anshum Gupta
