[ https://issues.apache.org/jira/browse/HDFS-13642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499852#comment-16499852 ]
SammiChen commented on HDFS-13642:
----------------------------------
Some comments:
1. *private static final int BLOCK_SIZE = 1 << 20; // 16k*
Change the comment from 16k to 1MB; see the corrected line below.
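Since 1 << 20 = 1048576 bytes, the declaration would read:
{quote}
private static final int BLOCK_SIZE = 1 << 20; // 1MB
{quote}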
2. {quote}
if (!shouldReplicate) {
  final ErasureCodingPolicy ecPolicy = FSDirErasureCodingOp
      .getErasureCodingPolicy(this, ecPolicyName, iip);
  if (ecPolicy != null && (!ecPolicy.isReplicationPolicy())) {
    if (blockSize < ecPolicy.getCellSize()) {
      throw new IOException("Specified block size " + blockSize
          + " is less than the cell" + " size (" + ecPolicy.getCellSize()
          + ") of the erasure coding policy on this file.");
    }
  }
}
{quote}
When creating a normal 3-replica file, the {{shouldReplicate}} value is
false; it is only true when the user sets {{CreateFlag.SHOULD_REPLICATE}}
explicitly when calling the create API. One suggestion is to add the block
size vs. cell size comparison as the else branch of the following check (a
rough sketch follows the quoted block):
{quote}
if (shouldReplicate ||
    (org.apache.commons.lang.StringUtils.isEmpty(ecPolicyName) &&
     !FSDirErasureCodingOp.hasErasureCodingPolicy(this, iip))) {
  blockManager.verifyReplication(src, replication, clientMachine);
}
{quote}
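For illustration only, combining the two quoted snippets could look roughly
like this (all names are taken from the snippets above; the exact shape
depends on the surrounding FSNamesystem code):
{quote}
if (shouldReplicate ||
    (org.apache.commons.lang.StringUtils.isEmpty(ecPolicyName) &&
     !FSDirErasureCodingOp.hasErasureCodingPolicy(this, iip))) {
  // Pure replication: verify the replication factor as before.
  blockManager.verifyReplication(src, replication, clientMachine);
} else {
  // EC path: reject block sizes smaller than the policy's cell size.
  final ErasureCodingPolicy ecPolicy = FSDirErasureCodingOp
      .getErasureCodingPolicy(this, ecPolicyName, iip);
  if (ecPolicy != null && !ecPolicy.isReplicationPolicy()
      && blockSize < ecPolicy.getCellSize()) {
    throw new IOException("Specified block size " + blockSize
        + " is less than the cell size (" + ecPolicy.getCellSize()
        + ") of the erasure coding policy on this file.");
  }
}
{quote}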
Thanks for working on it, [~xiaochen].
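For reference, the numbers in the quoted description below are consistent
with such a check being needed:
{noformat}
cell size (RS-3-2-1024k): 1024 * 1024 = 1048576 bytes
requested block size:     349696 bytes (smaller than one cell)
failing packet:           offsetInBlock 349696 + length 512 = 350208
                          > BlockSize 349696, hence the IOException
{noformat}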
> Creating a file with block size smaller than EC policy's cell size should
> throw
> -------------------------------------------------------------------------------
>
> Key: HDFS-13642
> URL: https://issues.apache.org/jira/browse/HDFS-13642
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: erasure-coding
> Affects Versions: 3.0.0
> Reporter: Xiao Chen
> Assignee: Xiao Chen
> Priority: Major
> Attachments: HDFS-13642.01.patch, HDFS-13642.02.patch, editsStored
>
>
> The following command causes an exception:
> {noformat}
> hadoop fs -Ddfs.block.size=349696 -put -f lineitem_sixblocks.parquet /test-warehouse/tmp123ec
> {noformat}
> {noformat}
> 18/05/25 16:00:59 WARN hdfs.DataStreamer: DataStreamer Exception
> java.io.IOException: BlockSize 349696 < lastByteOffsetInBlock, #0: blk_-9223372036854574256_14634, packet seqno: 7 offsetInBlock: 349696 lastPacketInBlock: false lastByteOffsetInBlock: 350208
>     at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:729)
>     at org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:46)
> 18/05/25 16:00:59 WARN hdfs.DFSOutputStream: Failed: offset=4096, length=512, DFSStripedOutputStream:#0: failed, blk_-9223372036854574256_14634
> java.io.IOException: BlockSize 349696 < lastByteOffsetInBlock, #0: blk_-9223372036854574256_14634, packet seqno: 7 offsetInBlock: 349696 lastPacketInBlock: false lastByteOffsetInBlock: 350208
>     at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:729)
>     at org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:46)
> {noformat}
> Then the streamer is confused and hangs.
> The local file is under 6 MB, and the HDFS file has an RS-3-2-1024k EC policy.
>
> Credit to [~tarasbob] for reporting this issue.