RaidNode should monitor and fix blocks that violate RAID block placement
-------------------------------------------------------------------------
Key: MAPREDUCE-2275
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2275
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: contrib/raid
Reporter: Ramkumar Vadali
Assignee: Ramkumar Vadali
When files are RAIDed, it is important to keep the blocks in each RAID stripe
and the corresponding parity blocks on as many different machines as possible.
This minimizes the probability of data loss when data nodes fail.
BlockPlacementPolicyRaid ensures that parity blocks are not located on the same
machines as the source blocks. Source block placement, however, is not
controlled directly in this manner. Instead, source blocks are created using
the default policy. After a source file is RAIDed, its replication is
increased and then decreased. BlockPlacementPolicyRaid then tries to keep the
source blocks well-placed when the excess replicas are deleted. This does not
guarantee correct block placement for RAID.
Also, if blocks are moved around by the balancer, the block placement could be
violated.
We need periodic monitoring of block placement of RAIDed files and the
corresponding parity blocks.
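
As a rough illustration of what one monitoring pass might check, the sketch
below flags any machine that holds more than one block of the same stripe. It
is not RaidNode code and uses no Hadoop API: the function name
find_colocated_blocks, the block ids, and the host names are all hypothetical,
and the stripe is assumed to be available as a plain mapping from block id to
the hosts storing it.

```python
from collections import defaultdict
from typing import Dict, List


def find_colocated_blocks(stripe: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return {host: [block ids]} for hosts holding more than one stripe block.

    `stripe` maps each block id (source or parity) to the hosts that store a
    replica of it. This layout is an assumption made for illustration.
    """
    blocks_by_host = defaultdict(list)
    for block_id, hosts in stripe.items():
        for host in set(hosts):  # ignore duplicate host entries for one block
            blocks_by_host[host].append(block_id)
    # A host with two or more blocks of one stripe violates the placement goal.
    return {
        host: sorted(blocks)
        for host, blocks in blocks_by_host.items()
        if len(blocks) > 1
    }


# Hypothetical stripe: three source blocks and one parity block.
stripe = {
    "src-0": ["dn1"],
    "src-1": ["dn2"],
    "src-2": ["dn1"],  # shares dn1 with src-0, so dn1 is reported
    "parity-0": ["dn3"],
}
violations = find_colocated_blocks(stripe)
```

A periodic monitor could run a check like this over every RAIDed file and hand
the reported blocks to whatever mechanism relocates them; how the relocation
is done is outside this sketch.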