[ 
https://issues.apache.org/jira/browse/CARBONDATA-16?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15351936#comment-15351936
 ] 

Mohammad Shahid Khan commented on CARBONDATA-16:
------------------------------------------------

Scenario:
cluster size: 10 nodes
number of blocks: 10
number of executors: 3
dfs.replication: 3
As per the current implementation:
Number of blocks per node = number of blocks / cluster size, i.e. 10/10 = 1 block per node.
The block distribution is therefore based on all the nodes in the cluster, not on
the nodes that are actually running an executor.
Because of this, all the remote blocks end up on a single node.
Example: only three executors, ex1, ex2 and ex3, are active.
But Carbon allocates one block to each node based on data locality,
even if no executor is running on that node.
Due to this, the remote blocks are later all allocated to a single executor.
ex1 - has 1 NODE_LOCAL, 7 RACK_LOCAL
ex2 - has 1 NODE_LOCAL
ex3 - has 1 NODE_LOCAL
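
The arithmetic above can be reproduced with a small sketch. This is not CarbonData code: the function names, the node names and the replica layout (block i stored on nodes i, i+1 and i+2) are assumptions made only for illustration. It compares a per-node quota computed over all cluster nodes with one computed over only the nodes that run an executor.

```python
import math

def assign_by_locality(replicas, candidate_nodes):
    """Give each block to the first replica node that still has quota left."""
    quota = math.ceil(len(replicas) / len(candidate_nodes))
    plan = {node: [] for node in candidate_nodes}
    unplaced = []
    for block, nodes in replicas.items():
        target = next((n for n in nodes if n in plan and len(plan[n]) < quota), None)
        if target is None:
            unplaced.append(block)
        else:
            plan[target].append(block)
    return quota, plan, unplaced

def summarize(plan, unplaced, executors):
    """Count local blocks per executor and blocks that can only be read remotely."""
    local = {e: len(plan.get(e, [])) for e in executors}
    remote = len(unplaced) + sum(len(b) for n, b in plan.items() if n not in executors)
    return local, remote

# Hypothetical layout: 10 nodes, 10 blocks, replication factor 3,
# block i stored on nodes i, i+1, i+2 (wrapping around).
nodes = [f"node{i}" for i in range(10)]
executors = nodes[:3]  # stand-ins for ex1, ex2 and ex3
replicas = {f"block{i}": [nodes[(i + k) % 10] for k in range(3)] for i in range(10)}

# Quota over the whole cluster, as described in the comment.
quota_all, plan_all, unplaced_all = assign_by_locality(replicas, nodes)
local_all, remote_all = summarize(plan_all, unplaced_all, executors)

# Quota over executor nodes only (an assumed alternative, not the actual patch).
quota_exec, plan_exec, unplaced_exec = assign_by_locality(replicas, executors)
local_exec, remote_exec = summarize(plan_exec, unplaced_exec, executors)

print(quota_all, local_all, remote_all)
print(quota_exec, local_exec, remote_exec)
```

With the quota taken over all 10 nodes, each executor keeps a single local block and 7 blocks are left as remote, which matches the counts reported above. With the quota taken over the 3 executor nodes, the quota becomes 4 blocks per node, fewer blocks are left as remote under this layout, and the unused quota on each executor leaves room to spread them instead of sending them all to one executor.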

> BLOCK distribution in query is not correct in query when number of executors 
> are less than the cluster size.
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-16
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-16
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Mohammad Shahid Khan
>            Assignee: Mohammad Shahid Khan
>            Priority: Critical
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
