[
https://issues.apache.org/jira/browse/HDFS-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628041#comment-13628041
]
Suresh Srinivas commented on HDFS-4305:
---------------------------------------
bq. We had a user who accidentally set the block size to something small in his
big job.
Out of curiosity what was the size of the block?
bq. The blocks were allocated and completed so frequently that the namenode
couldn't keep up
This is very interesting. I would have thought that, since the namenode waits
for the first block to be reported by a datanode before allocating the third
block, block allocation would be slow enough not to overwhelm the namenode.
Perhaps the user's job was also creating a lot of files.
I am reluctant about this change, since an incompatible change like this will
affect QA and integration tests. Thinking about it, I am okay if the minimum
block size defaults to 0 and an admin sets a minimum block size in production
deployments - either as a best practice, or after an incident where they are
impacted by a job choosing a very small block size.
> Add a configurable limit on number of blocks per file, and min block size
> -------------------------------------------------------------------------
>
> Key: HDFS-4305
> URL: https://issues.apache.org/jira/browse/HDFS-4305
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 1.0.4, 3.0.0, 2.0.2-alpha
> Reporter: Todd Lipcon
> Assignee: Andrew Wang
> Priority: Minor
> Attachments: hdfs-4305-1.patch
>
>
> We recently had an issue where a user set the block size very very low and
> managed to create a single file with hundreds of thousands of blocks. This
> caused problems with the edit log since the OP_ADD op was so large
> (HDFS-4304). I imagine it could also cause efficiency issues in the NN. To
> prevent users from making such mistakes, we should:
> - introduce a configurable minimum block size, below which requests are
> rejected
> - introduce a configurable maximum number of blocks per file, above which
> requests to add another block are rejected (with a suitably high default as
> to not prevent legitimate large files)
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira