Scenario: I run a large number of jobs on a small cluster (about 10-15 nodes), and all of the jobs use the same resources (input files). Each time a client asks for a block, the NameNode returns the copy on the nearest DataNode first, so all of the clients end up performing read/write operations on the same block.
Is there any other NameNode scheduling policy that takes this block-level overhead into account?
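
To make the concern concrete, here is a toy simulation in Python. It is not HDFS code: the distance function, the node and rack names, and the `pick_load_aware` policy are all hypothetical and exist only for illustration. It assumes nine jobs run on the same node and read one block that has three replicas.

```python
from collections import Counter

# Hypothetical topology: node name -> rack name (made up for this sketch).
RACK_OF = {"n1": "r1", "n2": "r1", "n3": "r2"}

def distance(client, datanode):
    """Toy network distance: 0 = same node, 2 = same rack, 4 = other rack."""
    if client == datanode:
        return 0
    if RACK_OF[client] == RACK_OF[datanode]:
        return 2
    return 4

def pick_nearest(client, replicas, load):
    """Always choose the closest replica, ignoring how busy it is."""
    return min(replicas, key=lambda dn: (distance(client, dn), dn))

def pick_load_aware(client, replicas, load):
    """Hypothetical alternative: least-loaded replica, distance as tie-break."""
    return min(replicas, key=lambda dn: (load[dn], distance(client, dn), dn))

def simulate(policy, clients, replicas):
    """Return how many clients each DataNode ends up serving."""
    load = Counter()
    for client in clients:
        load[policy(client, replicas, load)] += 1
    return dict(load)

clients = ["n1"] * 9            # nine jobs, all running on node n1
replicas = ["n1", "n2", "n3"]   # DataNodes holding a copy of the block

nearest_load = simulate(pick_nearest, clients, replicas)
aware_load = simulate(pick_load_aware, clients, replicas)
print("nearest first:", nearest_load)
print("load aware:   ", aware_load)
```

With the nearest-first rule all nine jobs land on `n1`, while the hypothetical load-aware rule spreads them three per DataNode. That second behaviour is the kind of scheduling I am asking about.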