[
https://issues.apache.org/jira/browse/CARBONDATA-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15427727#comment-15427727
]
ASF GitHub Bot commented on CARBONDATA-153:
-------------------------------------------
GitHub user mohammadshahidkhan opened a pull request:
https://github.com/apache/incubator-carbondata/pull/77
CARBONDATA-153 Record count is not matching while loading the data when one
data node went down in HA setup
Record count is not matching while loading the data when one data node went
down in HA setup
Scenario (previous implementation):
No. of running executors = 3
No. of data nodes = 2
Total unique blocks = 96
As per the previous implementation, the no. of blocks per node = 32.
While assigning blocks, only 66 blocks got allocated among the two executors.
The third executor did not get any blocks, since only node-local allocation was considered (see the sketch below).
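For illustration, a minimal Java sketch of the arithmetic in the scenario above. The class and variable names are hypothetical and this is not CarbonData code; the ceiling division is an assumption about how the per-node cap is derived (96 / 3 divides evenly to 32 either way).

public class BlocksPerNodeExample {
  public static void main(String[] args) {
    int totalUniqueBlocks = 96;
    int runningExecutors = 3;

    // per-node cap as described above: 96 / 3 = 32
    // (ceiling division shown so the cap also holds when the counts do not divide evenly)
    int blocksPerNode =
        (totalUniqueBlocks + runningExecutors - 1) / runningExecutors;
    System.out.println("blocksPerNode = " + blocksPerNode); // prints 32

    // With node-local allocation only, blocks can be placed just on the two
    // nodes that actually hold data; the third executor receives nothing and
    // part of the 96 blocks is never assigned, so the loaded record count
    // falls short.
  }
}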
Solution:
// so now we have a map of node vs blocks. allocate the block as per the order
createOutputMap(nodeBlocksMap, blocksPerNode, uniqueBlocks, nodeAndBlockMapping, activeNodes);
After the node-block mapping is done, the remaining active nodes are also added to nodeBlocksMap, so that assignLeftOverBlocks can take care of assigning the remaining blocks.
// if any blocks remain then assign them to nodes in round robin.
assignLeftOverBlocks(nodeBlocksMap, uniqueBlocks, blocksPerNode);
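For illustration, here is a simplified, self-contained Java sketch of the round-robin leftover step described above. The method signature, map types and node/block names are hypothetical; this is not the actual CarbonLoaderUtil code, only the idea of adding the remaining active nodes to the map and assigning leftover blocks in round-robin order.

import java.util.*;

public class LeftOverAssignmentSketch {

  // Hypothetical simplified version of the leftover step: every active node is
  // given an entry in the node-to-blocks map, then the blocks that were not
  // placed node-locally are handed out in round-robin order so that no block
  // is dropped even when some active nodes hold no local data.
  static void assignLeftOverBlocks(Map<String, List<String>> nodeBlocksMap,
                                   Collection<String> leftOverBlocks,
                                   List<String> activeNodes) {
    for (String node : activeNodes) {
      nodeBlocksMap.putIfAbsent(node, new ArrayList<>());
    }
    Iterator<String> nodeCycle = activeNodes.iterator();
    for (String block : leftOverBlocks) {
      if (!nodeCycle.hasNext()) {
        nodeCycle = activeNodes.iterator();
      }
      nodeBlocksMap.get(nodeCycle.next()).add(block);
    }
  }

  public static void main(String[] args) {
    // blocks already assigned node-locally on the two data nodes
    Map<String, List<String>> nodeBlocksMap = new HashMap<>();
    nodeBlocksMap.put("node1", new ArrayList<>(Arrays.asList("b1", "b2")));
    nodeBlocksMap.put("node2", new ArrayList<>(Arrays.asList("b3")));

    // blocks left over after the node-local pass, and all active executors
    List<String> leftOver = Arrays.asList("b4", "b5", "b6");
    List<String> activeNodes = Arrays.asList("node1", "node2", "node3");

    assignLeftOverBlocks(nodeBlocksMap, leftOver, activeNodes);
    System.out.println(nodeBlocksMap);
    // e.g. {node1=[b1, b2, b4], node2=[b3, b5], node3=[b6]}
  }
}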
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mohammadshahidkhan/incubator-carbondata
fixed_block_distribution_when_active_node_grt_node_nodehaving_data
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-carbondata/pull/77.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #77
----
commit e0fe3491d1750d586efde882d3a9882d325954a1
Author: mohammadshahidkhan <[email protected]>
Date: 2016-08-09T05:17:02Z
CARBONDATA-153 Record count is not matching while loading the data when one
data node went down in HA setup
----
> Record count is not matching while loading the data when one data node went
> down in HA setup
> --------------------------------------------------------------------------------------------
>
> Key: CARBONDATA-153
> URL: https://issues.apache.org/jira/browse/CARBONDATA-153
> Project: CarbonData
> Issue Type: Bug
> Environment: SUSE11SP3, standalone application with Spark 1.6.2 and
> Hadoop 2.7.2 version
> Reporter: Krishna Reddy
> Assignee: Mohammad Shahid Khan
>
> Record count is not matching while loading the data when one data node went
> down in HA setup
> 1. Set up HA.
> 2. Kill one data node.
> 3. Load data from a CSV with 1000000 records.
> 4. Verify the record count.
> Actual Result: The record count does not match the actual number of records.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)