[
https://issues.apache.org/jira/browse/HDFS-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628154#comment-13628154
]
Todd Lipcon commented on HDFS-4305:
-----------------------------------
bq. I am reluctant about this change, since it is an incompatible change that
will affect QA tests and integration tests. Thinking about it, I am okay if the
min block size defaults to 0 and an admin sets up a minimum block size in a
production deployment. Either the admin sets it up as a best practice, or after
an incident where they get impacted by a job choosing a very small block size.
Why push the effort on users to configure it to a reasonable 'best practice'
level, instead of pushing it on QA (Hadoop experts) to set it to 0 in test
clusters where they want to abuse the block size to stress test the NN?
I think we have too many defaults where we say "any reasonable production
cluster should not leave this at the default". Sure, it helps vendors pimp our
management tools (which would ship with the best practice defaults overriding
whatever's in hdfs-default.xml), but I don't think it's in the best interest of
the project if the setup instructions involve changing a bunch of things away
from unsafe defaults.
Similarly, waiting until _after_ a production incident to make the user go and
learn about a better default is not very user-centric.
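To be concrete about how small the QA-side burden would be, here is a minimal
sketch (not code from the attached patch; the property name is the one proposed
here and could change before commit) of a test that opts back into tiny blocks
against a MiniDFSCluster:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class TinyBlockStressExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new HdfsConfiguration();
    // Property name as proposed in this patch; may change before it is committed.
    conf.setLong("dfs.namenode.fs-limits.min-block-size", 0L);
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
    try {
      FileSystem fs = cluster.getFileSystem();
      // ... create files with deliberately tiny block sizes to stress the NN ...
    } finally {
      cluster.shutdown();
    }
  }
}
{code}
That is a one-line override in the handful of tests that actually need tiny
blocks, versus asking every production admin to discover and set a safe minimum.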
> Add a configurable limit on number of blocks per file, and min block size
> -------------------------------------------------------------------------
>
> Key: HDFS-4305
> URL: https://issues.apache.org/jira/browse/HDFS-4305
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 1.0.4, 3.0.0, 2.0.2-alpha
> Reporter: Todd Lipcon
> Assignee: Andrew Wang
> Priority: Minor
> Attachments: hdfs-4305-1.patch
>
>
> We recently had an issue where a user set the block size very, very low and
> managed to create a single file with hundreds of thousands of blocks. This
> caused problems with the edit log since the OP_ADD op was so large
> (HDFS-4304). I imagine it could also cause efficiency issues in the NN. To
> prevent users from making such mistakes, we should:
> - introduce a configurable minimum block size, below which requests are
> rejected
> - introduce a configurable maximum number of blocks per file, above which
> requests to add another block are rejected (with a suitably high default so
> as not to prevent legitimate large files)
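For context, the failure mode in the description is easy to hit from any client,
because the block size is chosen per file at create time. A rough illustration
(paths and sizes are made up; this is not code from the attached patch):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ManyBlocksExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    long tinyBlock = 4096L; // 4 KB blocks: a 1 GB file fans out into ~262,144 blocks
    FSDataOutputStream out = fs.create(
        new Path("/tmp/many-blocks.dat"),
        true,        // overwrite
        4096,        // I/O buffer size
        (short) 1,   // replication
        tinyBlock);  // client-chosen block size; the proposed minimum would reject values this small
    // ... write a large amount of data; the resulting block list is what makes
    // the OP_ADD edit log record so large (HDFS-4304) ...
    out.close();
  }
}
{code}
The proposed minimum block size and per-file block limit are meant to stop
exactly this pattern before it reaches the edit log.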
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira