[
https://issues.apache.org/jira/browse/HDFS-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628154#comment-13628154
]
Todd Lipcon commented on HDFS-4305:
-----------------------------------
bq. I am reluctant about this change, since it is an incompatible change that
will affect QA tests and integration tests. Thinking about it, I am okay if the
min block size defaults to 0 and an admin sets up a minimum block size in a
production deployment. Either the admin sets it up as a best practice, or after
an incident where they get impacted by a job choosing a very small block size.
Why push the effort on users to configure it to a reasonable 'best practice'
level, instead of pushing it on QA (Hadoop experts) to set it to 0 in test
clusters where they want to abuse the block size to stress test the NN?
I think we have too many defaults where we say "any reasonable production
cluster should not leave this at the default". Sure, it helps vendors pimp our
management tools (which would ship with the best practice defaults overriding
whatever's in hdfs-default.xml), but I don't think it's in the best interest of
the project if the setup instructions involve changing a bunch of things away
from unsafe defaults.
Similarly, waiting until _after_ a production incident to make the user go and
learn about a better default is not very user-centric.
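To be concrete about how small the QA-side burden would be, here is a minimal
sketch (not code from the attached patch; the property name is the one proposed
here and could change before commit) of a test that opts back into tiny blocks
against a MiniDFSCluster:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class TinyBlockStressExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new HdfsConfiguration();
    // Property name as proposed in this patch; may change before it is committed.
    conf.setLong("dfs.namenode.fs-limits.min-block-size", 0L);
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
    try {
      FileSystem fs = cluster.getFileSystem();
      // ... create files with deliberately tiny block sizes to stress the NN ...
    } finally {
      cluster.shutdown();
    }
  }
}
{code}
That is a one-line override in the handful of tests that actually need tiny
blocks, versus asking every production admin to discover and set a safe minimum.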
> Add a configurable limit on number of blocks per file, and min block size
> -------------------------------------------------------------------------
>
> Key: HDFS-4305
> URL: https://issues.apache.org/jira/browse/HDFS-4305
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 1.0.4, 3.0.0, 2.0.2-alpha
> Reporter: Todd Lipcon
> Assignee: Andrew Wang
> Priority: Minor
> Attachments: hdfs-4305-1.patch
>
>
> We recently had an issue where a user set the block size very, very low and
> managed to create a single file with hundreds of thousands of blocks. This
> caused problems with the edit log since the OP_ADD op was so large
> (HDFS-4304). I imagine it could also cause efficiency issues in the NN. To
> prevent users from making such mistakes, we should:
> - introduce a configurable minimum block size, below which requests are
> rejected
> - introduce a configurable maximum number of blocks per file, above which
> requests to add another block are rejected (with a suitably high default so
> as not to prevent legitimate large files)
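For context, the failure mode in the description is easy to hit from any client,
because the block size is chosen per file at create time. A rough illustration
(paths and sizes are made up; this is not code from the attached patch):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ManyBlocksExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    long tinyBlock = 4096L; // 4 KB blocks: a 1 GB file fans out into ~262,144 blocks
    FSDataOutputStream out = fs.create(
        new Path("/tmp/many-blocks.dat"),
        true,        // overwrite
        4096,        // I/O buffer size
        (short) 1,   // replication
        tinyBlock);  // client-chosen block size; the proposed minimum would reject values this small
    // ... write a large amount of data; the resulting block list is what makes
    // the OP_ADD edit log record so large (HDFS-4304) ...
    out.close();
  }
}
{code}
The proposed minimum block size and per-file block limit are meant to stop
exactly this pattern before it reaches the edit log.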
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira