[ https://issues.apache.org/jira/browse/CARBONDATA-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15427727#comment-15427727 ]

ASF GitHub Bot commented on CARBONDATA-153:
-------------------------------------------

GitHub user mohammadshahidkhan opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/77

    CARBONDATA-153 Record count is not matching while loading the data when one 
data node went down in HA setup

    
    
    Record count is not matching while loading the data when one data node went 
down in HA setup
    
    As per the previous implementation:
    Scenario:
    No. of running executors = 3
    No. of data nodes = 2
    Total unique blocks = 96
    The number of blocks per node therefore works out to 96 / 3 = 32.
    While assigning blocks, only 66 of the 96 blocks get allocated, spread among the two executors that hold data.
    The third executor does not get any blocks, since only node-local allocation is considered (see the sketch below).
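
    The gap can be seen with a minimal, self-contained sketch (this is not the actual CarbonLoaderUtil code; the class name, the executor names and the even/odd replica placement are illustrative assumptions):

    import java.util.*;

    public class NodeLocalOnlySketch {
        public static void main(String[] args) {
            List<String> activeNodes = Arrays.asList("executor-1", "executor-2", "executor-3");
            int totalBlocks = 96;

            // Previous implementation: the per-node quota is derived from the executor count.
            int blocksPerNode = (int) Math.ceil((double) totalBlocks / activeNodes.size()); // 32

            // Only two data nodes are alive, so every block has a local replica on
            // executor-1 or executor-2; executor-3 hosts no local replica at all.
            Map<String, List<Integer>> nodeToLocalBlocks = new HashMap<>();
            for (int block = 0; block < totalBlocks; block++) {
                String localNode = (block % 2 == 0) ? "executor-1" : "executor-2";
                nodeToLocalBlocks.computeIfAbsent(localNode, k -> new ArrayList<>()).add(block);
            }

            // Node-local-only assignment: each node takes at most blocksPerNode of its own
            // blocks, and anything above the quota is simply left unassigned.
            Map<String, List<Integer>> assignment = new HashMap<>();
            for (Map.Entry<String, List<Integer>> entry : nodeToLocalBlocks.entrySet()) {
                List<Integer> local = entry.getValue();
                assignment.put(entry.getKey(),
                    new ArrayList<>(local.subList(0, Math.min(blocksPerNode, local.size()))));
            }

            int assigned = assignment.values().stream().mapToInt(List::size).sum();
            System.out.println("blocksPerNode     = " + blocksPerNode);                    // 32
            System.out.println("blocks assigned   = " + assigned + " of " + totalBlocks);  // 64 of 96
            System.out.println("executor-3 blocks = "
                + assignment.getOrDefault("executor-3", Collections.emptyList()).size());  // 0
        }
    }

    In this simplified model 64 of the 96 blocks are assigned (the 66 observed above depends on how the replicas are actually spread); the blocks above the quota are never assigned, which is consistent with the record-count mismatch reported in the issue.
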
    Solution:
    // so now we have a map of node vs blocks. allocate the block as per the order
    createOutputMap(nodeBlocksMap, blocksPerNode, uniqueBlocks, nodeAndBlockMapping, activeNodes);

    After the node-block mapping is done, the remaining activeNodes are added to nodeBlocksMap inside assignLeftOverBlocks, so that assignLeftOverBlocks can take care of assigning the remaining blocks.
    // if any blocks remain then assign them to nodes in round robin.
    assignLeftOverBlocks(nodeBlocksMap, uniqueBlocks, blocksPerNode);
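
    Below is a hedged sketch of this fix (not the exact CarbonLoaderUtil code; the method signature, the quota top-up pass and the data built in main are assumptions for illustration). The key point is that every active node is seeded into nodeBlocksMap before the left-over pass, so executors without any local replica still receive blocks and none of the 96 blocks is left unassigned:

    import java.util.*;

    public class LeftOverAssignmentSketch {

        // Seed every active node into the map, top nodes up to the per-node quota,
        // then hand out anything still remaining in round-robin order.
        static void assignLeftOverBlocks(Map<String, List<Integer>> nodeBlocksMap,
                                         Set<Integer> leftOverBlocks,
                                         List<String> activeNodes,
                                         int blocksPerNode) {
            // The change described above: nodes without any node-local block still get an entry.
            for (String node : activeNodes) {
                nodeBlocksMap.computeIfAbsent(node, k -> new ArrayList<>());
            }
            Iterator<Integer> blocks = leftOverBlocks.iterator();
            // First pass: fill nodes that are still below the per-node quota.
            for (String node : activeNodes) {
                List<Integer> assigned = nodeBlocksMap.get(node);
                while (blocks.hasNext() && assigned.size() < blocksPerNode) {
                    assigned.add(blocks.next());
                }
            }
            // Second pass: if any blocks remain, assign them to nodes in round robin.
            while (blocks.hasNext()) {
                for (String node : activeNodes) {
                    if (!blocks.hasNext()) {
                        break;
                    }
                    nodeBlocksMap.get(node).add(blocks.next());
                }
            }
        }

        public static void main(String[] args) {
            List<String> activeNodes = Arrays.asList("executor-1", "executor-2", "executor-3");
            int blocksPerNode = 32;

            // State after the node-local pass in the scenario above: executor-1 and executor-2
            // are full, executor-3 is absent from the map, and 32 blocks are still unassigned.
            Map<String, List<Integer>> nodeBlocksMap = new HashMap<>();
            Set<Integer> leftOverBlocks = new LinkedHashSet<>();
            for (int block = 0; block < 96; block++) {
                if (block < 64) {
                    String node = (block % 2 == 0) ? "executor-1" : "executor-2";
                    nodeBlocksMap.computeIfAbsent(node, k -> new ArrayList<>()).add(block);
                } else {
                    leftOverBlocks.add(block);
                }
            }

            assignLeftOverBlocks(nodeBlocksMap, leftOverBlocks, activeNodes, blocksPerNode);

            // Every block is now assigned: 32 + 32 + 32 = 96, so the loaded record count matches.
            nodeBlocksMap.forEach((node, blocks) ->
                System.out.println(node + " -> " + blocks.size() + " blocks"));
        }
    }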

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mohammadshahidkhan/incubator-carbondata fixed_block_distribution_when_active_node_grt_node_nodehaving_data

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/77.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #77
    
----
commit e0fe3491d1750d586efde882d3a9882d325954a1
Author: mohammadshahidkhan <[email protected]>
Date:   2016-08-09T05:17:02Z

    CARBONDATA-153 Record count is not matching while loading the data when one 
data node went down in HA setup

----


> Record count is not matching while loading the data when one data node went 
> down in HA setup
> --------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-153
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-153
>             Project: CarbonData
>          Issue Type: Bug
>         Environment: SUSE11SP3, standalone application with Spark 1.6.2 and 
> Hadoop 2.7.2 version
>            Reporter: Krishna Reddy
>            Assignee: Mohammad Shahid Khan
>
> Record count is not matching while loading the data when one data node went 
> down in HA setup
> 1. Set up HA
> 2. Kill one data node
> 3. Load a CSV containing 1,000,000 records
> 4. Verify the record count
> Actual Result: The record count does not match the number of records in the source CSV.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
