[ 
https://issues.apache.org/jira/browse/HDFS-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058496#comment-14058496
 ] 

Rafal Wojdyla commented on HDFS-6621:
-------------------------------------

We have also experienced this problem with the Balancer.
The problem in general is that the balancer prematurely finishes an iteration 
once noPendingBlockIteration >= 5.
I was about to create a JIRA ticket for this, but then I noticed this one.
The solutions we have applied are to:
 1. set noPendingBlockIteration = 0 when pendingBlock != null, exactly the way you 
did
 2. notify only on the source object when a block transfer finishes 

Problem/Solution 1 is well described above.
Problem/Solution 2:

In org/apache/hadoop/hdfs/server/balancer/Balancer:
{code}
private void dispatch() {
     ...
     synchronized (Balancer.this) {
         Balancer.this.notifyAll();
     }
}
{code}
This notifies all scheduling threads, even the ones that are waiting while 
all 5 of their transfer threads are still occupied.
When such an occupied task wakes up, it tries to get the next block to move, but 
because all 5 transfer threads are occupied 
it gets null as the next block to move. That increments 
noPendingBlockIteration, and we are back in problem 1.

The solution is to notify only the threads waiting on the source object and 
reset the PendingBlockMove object afterwards.
Should I provide a patch in this ticket, or create a separate ticket?
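
To make the idea in solution 2 concrete, here is a minimal sketch of waiting and notifying on a per-source monitor instead of on Balancer.this. It is not the actual patch: SourceSketch, acquireSlot, releaseSlot and MAX_CONCURRENT_MOVES are hypothetical names invented for illustration, and the real Balancer also has to consider targets and proxy nodes, which are left out here.

```java
// Hypothetical stand-in for the balancer's per-source state.
// This is NOT the real Balancer.Source class; all names are assumptions.
final class SourceSketch {
    // Mirrors the "5 transfer threads" per source mentioned in the comment.
    static final int MAX_CONCURRENT_MOVES = 5;

    private int activeMoves = 0;

    /**
     * Waits on this source's own monitor until a transfer slot is free,
     * then takes it. Returns false if no slot became free in time.
     */
    synchronized boolean acquireSlot(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (activeMoves >= MAX_CONCURRENT_MOVES) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false; // still fully occupied
            }
            try {
                wait(remaining); // woken only by releaseSlot() on this source
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        activeMoves++;
        return true;
    }

    /** Called when a block transfer from this source finishes. */
    synchronized void releaseSlot() {
        activeMoves--;
        notifyAll(); // wakes threads waiting on this source, not every scheduler
    }

    synchronized int activeMoves() {
        return activeMoves;
    }
}
```

With this shape, a scheduling thread whose source is fully occupied is not woken by transfers finishing on other sources, so it does not come back with a null block and bump noPendingBlockIteration for no reason.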


> Hadoop Balancer prematurely exits iterations
> --------------------------------------------
>
>                 Key: HDFS-6621
>                 URL: https://issues.apache.org/jira/browse/HDFS-6621
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer
>    Affects Versions: 2.2.0, 2.4.0
>         Environment: Red Hat Enterprise Linux Server release 5.8 with Hadoop 
> 2.4.0
>            Reporter: Benjamin Bowman
>              Labels: balancer
>         Attachments: HDFS-6621.patch
>
>
> I have been having an issue with the balancing being too slow.  The issue was 
> not with the speed with which blocks were moved, but rather the balancer 
> would prematurely exit out of its balancing iterations.  It would move ~10 
> blocks or 100 MB then exit the current iteration (in which it said it was 
> planning on moving about 10 GB). 
> I looked in the Balancer.java code and believe I found and solved the issue.  
> In the dispatchBlocks() function there is a variable, 
> "noPendingBlockIteration", which counts the number of iterations in which a 
> pending block to move cannot be found.  Once this number gets to 5, the 
> balancer exits the overall balancing iteration.  I believe the desired 
> functionality is 5 consecutive no pending block iterations - however this 
> variable is never reset to 0 upon block moves.  So once this number reaches 5 
> - even if there have been thousands of blocks moved in between these no 
> pending block iterations  - the overall balancing iteration will prematurely 
> end.  
> The fix I applied was to set noPendingBlockIteration = 0 when a pending block 
> is found and scheduled.  In this way, my iterations do not prematurely exit 
> unless there are 5 consecutive no pending block iterations.   Below is a copy 
> of my dispatchBlocks() function with the change I made.
>     private void dispatchBlocks() {
>       long startTime = Time.now();
>       long scheduledSize = getScheduledSize();
>       this.blocksToReceive = 2*scheduledSize;
>       boolean isTimeUp = false;
>       int noPendingBlockIteration = 0;
>       while(!isTimeUp && getScheduledSize()>0 &&
>           (!srcBlockList.isEmpty() || blocksToReceive>0)) {
>         PendingBlockMove pendingBlock = chooseNextBlockToMove();
>         if (pendingBlock != null) {
>           noPendingBlockIteration = 0;
>           // move the block
>           pendingBlock.scheduleBlockMove();
>           continue;
>         }
>         /* Since we can not schedule any block to move,
>          * filter any moved blocks from the source block list and
>          * check if we should fetch more blocks from the namenode
>          */
>         filterMovedBlocks(); // filter already moved blocks
>         if (shouldFetchMoreBlocks()) {
>           // fetch new blocks
>           try {
>             blocksToReceive -= getBlockList();
>             continue;
>           } catch (IOException e) {
>             LOG.warn("Exception while getting block list", e);
>             return;
>           }
>         } else {
>           // source node cannot find a pendingBlockToMove, iteration +1
>           noPendingBlockIteration++;
>           // in case no blocks can be moved for source node's task,
>           // jump out of while-loop after 5 iterations.
>           if (noPendingBlockIteration >= MAX_NO_PENDING_BLOCK_ITERATIONS) {
>             setScheduledSize(0);
>           }
>         }
>         // check if time is up or not
>         if (Time.now()-startTime > MAX_ITERATION_TIME) {
>           isTimeUp = true;
>           continue;
>         }
>         /* Now we can not schedule any block to move and there are
>          * no new blocks added to the source block list, so we wait.
>          */
>         try {
>           synchronized(Balancer.this) {
>             Balancer.this.wait(1000);  // wait for targets/sources to be idle
>           }
>         } catch (InterruptedException ignored) {
>         }
>       }
>     }
>   }
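
The effect of the reset in the quoted dispatchBlocks() can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions: NoPendingCounterSketch and scheduledBeforeExit are hypothetical names, and the boolean array stands in for successive chooseNextBlockToMove() results (true means a block was found). It does not call any Hadoop code.

```java
// Hypothetical simulation of the counter logic; not part of Balancer.java.
final class NoPendingCounterSketch {
    static final int MAX_NO_PENDING_BLOCK_ITERATIONS = 5;

    /**
     * Returns how many blocks get scheduled before the loop gives up.
     * found[i] == true models chooseNextBlockToMove() returning a block.
     * resetOnSuccess == true models the fix (counter reset on each move).
     */
    static int scheduledBeforeExit(boolean[] found, boolean resetOnSuccess) {
        int noPendingBlockIteration = 0;
        int scheduled = 0;
        for (boolean hasBlock : found) {
            if (hasBlock) {
                if (resetOnSuccess) {
                    noPendingBlockIteration = 0; // the one-line fix
                }
                scheduled++;
                continue;
            }
            noPendingBlockIteration++;
            if (noPendingBlockIteration >= MAX_NO_PENDING_BLOCK_ITERATIONS) {
                break; // corresponds to setScheduledSize(0) ending the loop
            }
        }
        return scheduled;
    }

    public static void main(String[] args) {
        // Pattern: two successful picks, then one miss, repeated 10 times.
        boolean[] found = new boolean[30];
        for (int i = 0; i < found.length; i++) {
            found[i] = (i % 3 != 2);
        }
        System.out.println(scheduledBeforeExit(found, false)); // prints 10
        System.out.println(scheduledBeforeExit(found, true));  // prints 20
    }
}
```

Without the reset the loop stops at the fifth miss overall, even though the misses are never consecutive. With the reset all 20 available blocks are scheduled.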



