[ 
https://issues.apache.org/jira/browse/HDFS-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628041#comment-13628041
 ] 

Suresh Srinivas commented on HDFS-4305:
---------------------------------------

bq. We had a user who accidentally set the block size to something small in his 
big job. 
Out of curiosity what was the size of the block?

bq. The blocks were allocated and completed so frequently that the namenode 
couldn't keep up
This is very interesting. I would have thought that, since the namenode waits 
for the first block to be reported from a datanode before allocating the next 
one, block allocation would be slow enough not to overwhelm the namenode. 
Perhaps the user's job was also creating a lot of files.

I am reluctant about this change, since an incompatible change will affect QA 
tests and integration tests. Thinking about it, I am okay if the min block 
size defaults to 0 and an admin sets a minimum block size in production 
deployments, either as a best practice or after an incident where a job 
choosing a very small block size impacted them.

                
> Add a configurable limit on number of blocks per file, and min block size
> -------------------------------------------------------------------------
>
>                 Key: HDFS-4305
>                 URL: https://issues.apache.org/jira/browse/HDFS-4305
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 1.0.4, 3.0.0, 2.0.2-alpha
>            Reporter: Todd Lipcon
>            Assignee: Andrew Wang
>            Priority: Minor
>         Attachments: hdfs-4305-1.patch
>
>
> We recently had an issue where a user set the block size very very low and 
> managed to create a single file with hundreds of thousands of blocks. This 
> caused problems with the edit log since the OP_ADD op was so large 
> (HDFS-4304). I imagine it could also cause efficiency issues in the NN. To 
> prevent users from making such mistakes, we should:
> - introduce a configurable minimum block size, below which requests are 
> rejected
> - introduce a configurable maximum number of blocks per file, above which 
> requests to add another block are rejected (with a suitably high default as 
> to not prevent legitimate large files)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
