[
https://issues.apache.org/jira/browse/HDFS-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651112#comment-14651112
]
Rakesh R commented on HDFS-8220:
--------------------------------
Thanks [~zhz] for the reminder. IMHO, we could split this case into two parts:
# At the client side, validate the number of datanodes (i.e., the returned block locations). If {{number of datanodes < NUM_DATA_BLOCKS}}, then throw an {{IOException}} indicating the client failed to get enough datanodes for the data blocks.
# At the NN server side, support clusters which don't have enough datanodes to satisfy the configured schema's number of (data + parity) nodes. This can happen when there are not enough live datanodes, or simply in a small cluster. One idea is to modify {{PlacementPolicyEC}} to return duplicate datanodes, so that a single datanode may occur more than once in a blockGroup. For example, a 5-node cluster is mapped to the {{RS-6-3}} schema as datablocks (DN1, DN2, DN3, DN4, DN5, DN1) and parityblocks (DN2, DN3, DN4). In {{PlacementPolicyEC}} we should carefully define the maximum number of occurrences of a single datanode in a {{blockGroup}}; by default, one datanode could appear at most twice in a {{blockGroup}}.
If you agree, I will address 1) here in this jira, and will raise a separate jira for 2) to brainstorm further.
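The two parts above could be sketched roughly as follows. This is only an illustrative sketch, not the actual HDFS-7285 code: the class, method names ({{validateBlockLocations}}, {{chooseTargets}}), and constants are hypothetical, and datanodes are represented as plain strings for brevity.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the two proposed checks; names are illustrative,
// not the real StripedDataStreamer / PlacementPolicyEC APIs.
public class BlockGroupPlacementSketch {
    static final int NUM_DATA_BLOCKS = 6;   // RS-6-3: data blocks
    static final int NUM_PARITY_BLOCKS = 3; // RS-6-3: parity blocks
    static final int MAX_OCCURRENCES = 2;   // cap per datanode in a blockGroup

    // Part 1: client-side validation of the returned block locations.
    static void validateBlockLocations(List<String> locations) throws IOException {
        if (locations.size() < NUM_DATA_BLOCKS) {
            throw new IOException("Failed to get " + NUM_DATA_BLOCKS
                + " datanodes for the data blocks, got " + locations.size());
        }
    }

    // Part 2: NN-side placement that may reuse a datanode, but at most
    // MAX_OCCURRENCES times within one blockGroup.
    static List<String> chooseTargets(List<String> liveNodes) {
        int needed = NUM_DATA_BLOCKS + NUM_PARITY_BLOCKS;
        if (liveNodes.size() * MAX_OCCURRENCES < needed) {
            throw new IllegalStateException(
                "Cluster too small even with datanode reuse");
        }
        List<String> targets = new ArrayList<>();
        for (int i = 0; i < needed; i++) {
            // Round-robin keeps each node's occurrence count <= MAX_OCCURRENCES.
            targets.add(liveNodes.get(i % liveNodes.size()));
        }
        return targets;
    }

    public static void main(String[] args) throws IOException {
        // The 5-node example from the comment above.
        List<String> cluster = List.of("DN1", "DN2", "DN3", "DN4", "DN5");
        List<String> targets = chooseTargets(cluster);
        // 9 locations are returned (with repeats), so the client check passes.
        validateBlockLocations(targets);
        System.out.println(targets);
    }
}
```

With 5 live nodes, the round-robin yields datablocks (DN1, DN2, DN3, DN4, DN5, DN1) and parityblocks (DN2, DN3, DN4), matching the example in the comment, and no node exceeds the occurrence cap of two.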
> Erasure Coding: StripedDataStreamer fails to handle the blocklocations which
> doesn't satisfy BlockGroupSize
> -----------------------------------------------------------------------------------------------------------
>
> Key: HDFS-8220
> URL: https://issues.apache.org/jira/browse/HDFS-8220
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Rakesh R
> Assignee: Rakesh R
> Attachments: HDFS-8220-001.patch, HDFS-8220-002.patch,
> HDFS-8220-003.patch, HDFS-8220-004.patch, HDFS-8220-HDFS-7285.005.patch,
> HDFS-8220-HDFS-7285.006.patch, HDFS-8220-HDFS-7285.007.patch,
> HDFS-8220-HDFS-7285.007.patch, HDFS-8220-HDFS-7285.008.patch
>
>
> During write operations, {{StripedDataStreamer#locateFollowingBlock}} fails to
> validate the available datanodes against the {{BlockGroupSize}}. Please see
> the following exception for more details:
> {code}
> 2015-04-22 14:56:11,313 WARN hdfs.DFSClient (DataStreamer.java:run(538)) -
> DataStreamer Exception
> java.lang.NullPointerException
> at
> java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
> at
> org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
> at
> org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
> at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
> at
> org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
> 2015-04-22 14:56:11,313 INFO hdfs.MiniDFSCluster
> (MiniDFSCluster.java:shutdown(1718)) - Shutting down the Mini HDFS Cluster
> 2015-04-22 14:56:11,313 ERROR hdfs.DFSClient
> (DFSClient.java:closeAllFilesBeingWritten(608)) - Failed to close inode 16387
> java.io.IOException: DataStreamer Exception:
> at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:544)
> at
> org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
> Caused by: java.lang.NullPointerException
> at
> java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
> at
> org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
> at
> org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
> at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
> ... 1 more
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)