I've been thinking (which is always a dangerous thing) about data 
locality lately.  

        If we look at file systems, there is this idea of 'reserved space'.  
This space exists for a variety of reasons, including to reduce fragmentation 
on busy file systems.  It lets the file system driver make smarter decisions 
about block placement, which helps overall throughput.

        At LinkedIn, we're about to build a new grid with a few hundred nodes.  
I'm beginning to wonder if it wouldn't make sense to 'hold back' some task 
slots from use with this same concept in mind.  Take a grid that is full:  all 
of the task slots are in use.  When a task ends, the scheduler has to decide 
which queued task gets the freed slot.  If we assume a fairly FIFO view of the 
world (default scheduler, capacity, maybe fair share?), it pulls the next task 
off the queue and pushes it into that slot.  If only one task slot is free, 
locality doesn't enter into the picture at all.  In essence, we've fragmented 
our execution.
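
        To make that concrete, here's a toy sketch (Python, nothing to do with 
the actual Hadoop scheduler code; task and node names are invented) of what a 
locality-blind FIFO pick looks like when only one slot is open:

    from collections import deque

    def assign_fifo(freed_node, queue):
        # Hand the single free slot to whatever is at the head of the queue,
        # whether or not its input blocks live on the node that just freed up.
        task = queue.popleft()
        is_local = freed_node in task["input_block_nodes"]
        return task["id"], is_local

    queue = deque([
        {"id": "t1", "input_block_nodes": {"node07", "node12"}},
        {"id": "t2", "input_block_nodes": {"node03"}},
    ])

    # node03 frees a slot; t1 takes it even though t2's data sits right there.
    print(assign_fifo("node03", queue))   # ('t1', False)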

        If we were to leave even one slot 'always' free (and therefore give up 
the execution speed of that one slot), the scheduler could potentially make 
sure the task is host- or rack-local.  If it can't, no loss--it wouldn't have 
been local anyway.  Obviously, reserving more slots as 'always' free increases 
our likelihood of being local.  It just comes down to how much of a tradeoff 
is worth making.
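
        Here's the same toy model with a slot held back (again, purely an 
illustration; the rack map and names are made up):  with more than one free 
node to choose from, the next task can usually land where its data already is.

    RACK = {"node03": "rackA", "node07": "rackA", "node12": "rackB"}

    def place(task, free_nodes):
        # Prefer a free node that holds the task's blocks, then one in the
        # same rack, then anything at all.
        data_nodes = task["input_block_nodes"]
        for node in free_nodes:
            if node in data_nodes:
                return node, "node-local"
        data_racks = {RACK[n] for n in data_nodes if n in RACK}
        for node in free_nodes:
            if RACK.get(node) in data_racks:
                return node, "rack-local"
        return free_nodes[0], "remote"

    task = {"id": "t2", "input_block_nodes": {"node03"}}

    print(place(task, ["node12"]))             # full grid: ('node12', 'remote')
    print(place(task, ["node12", "node03"]))   # slot held back: ('node03', 'node-local')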

        I guess the real question comes down to how much of an impact data 
locality really has.  I know that in the case of the bigger grids at Yahoo!, 
the ops team suspected (but never did the homework to verify) that our grids 
and their usage were so massive that data locality rarely happened, especially 
for "popular" data.  I can't help but wonder if the situation would have been 
better if we had kept x% (say .005%?) of the grid free, based on the 
speculation above.
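
        For a rough sense of scale (numbers invented purely for illustration): 
on a 400-node grid with 8 task slots per node, that's 3,200 slots, so .005% 
works out to less than a single slot, 0.5% to 16 slots, and 5% to 160.  The 
knob goes from 'essentially free' to 'a real chunk of capacity' pretty quickly.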

        Thoughts?
