[
https://issues.apache.org/jira/browse/HADOOP-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578500#action_12578500
]
Allen Wittenauer commented on HADOOP-2991:
------------------------------------------
Someone asked me privately how changing from dfs = capacity - reserved to dfs =
fixed value impacts MapReduce output, its storage usage, etc.
The quick answer is that it doesn't, but that's not a very complete answer. :)
I view the world this way:
I can limit HDFS if something like 2150 and what we've talked about here are
implemented. I can equally limit MR output by implementing file system quotas
for the user(s)/group(s) that run the JT/TT/tasks at the file system level.
[And remember, everyone: you do not want HDFS and the tasks running as the same
user!] This guarantees that both HDFS and MR can be fenced in and not take the
blame for eating all the drive space. Any file-system-full condition will be
the fault of either how the system was configured (admins and rope can be a
dangerous combination, but a necessary one) or some less-than-polite process.
This type of system would actually work much better than, say, partitioning
specific drives, since it gives the admin some flexibility to reconfigure
based upon workload.
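To make the two accounting schemes concrete, here is a small sketch (plain
Python, not Hadoop code; the function names are mine) contrasting "capacity
minus reserved" with a fixed DFS budget, using the numbers from Pete's example
quoted further down (100 GB disk, 50 GB DFS, 1 GB free, 1 GB reserved):

```python
# Illustrative sketch only -- these are not Hadoop APIs, just the arithmetic
# behind the two schemes being discussed.

def available_reserved_scheme(disk_free, dfs_used, capacity, reserved):
    """Old scheme: DFS may grow until (capacity - reserved) is consumed, but
    'capacity' also counts non-DFS usage, so the reserve is not truly honored."""
    remaining = capacity - dfs_used - reserved
    return max(0, min(remaining, disk_free))

def available_fixed_scheme(disk_free, dfs_used, dfs_limit):
    """Proposed scheme: give DFS a fixed byte budget it can never exceed,
    regardless of what other users of the disk do."""
    return max(0, min(dfs_limit - dfs_used, disk_free))

GB = 1024 ** 3
# Old scheme: remaining = 100 - 50 - 1 = 49 GB, but only 1 GB is actually free,
# so DFS happily keeps writing into the last gigabyte.
print(available_reserved_scheme(1 * GB, 50 * GB, 100 * GB, 1 * GB) // GB)  # → 1
# Fixed scheme: DFS has used its whole 50 GB budget, so it reports full.
print(available_fixed_scheme(1 * GB, 50 * GB, 50 * GB) // GB)              # → 0
```

The fixed-budget variant is what makes the fencing argument above work: DFS is
capped by its own limit, and everything else is capped by OS-level quotas.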
pete: as to a plug-in... later. My JIRA came first. :p
> dfs.du.reserved not honored in 0.15/16 (regression from 0.14+patch for 2549)
> ----------------------------------------------------------------------------
>
> Key: HADOOP-2991
> URL: https://issues.apache.org/jira/browse/HADOOP-2991
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.15.0, 0.15.1, 0.15.2, 0.15.3, 0.16.0
> Reporter: Joydeep Sen Sarma
> Priority: Critical
>
> changes for https://issues.apache.org/jira/browse/HADOOP-1463
> have caused a regression. Earlier:
> - we could set dfs.du.reserved to 1G and be *sure* that 1G would not be used.
> Now this is no longer true. I am quoting Pete Wyckoff's example:
> <example>
> Let's look at an example: a 100 GB disk, with /usr using 45 GB and DFS using
> 50 GB.
> df -kh shows:
> Capacity = 100 GB
> Available = 1 GB (remember, ~4 GB is chopped out for metadata and stuff)
> Used = 95 GB
> remaining = 100 GB - 50 GB - 1 GB = 49 GB
> min(remaining, available) = 1 GB
> 98% of which is apparently usable for DFS.
> So we're at the limit, but are free to use 98% of the remaining 1 GB.
> </example>
> this is broken. Based on the discussion on 1463, it seems like the notion of
> 'capacity' as the first field of 'df' is problematic. For example, here's
> what our df output looks like:
> Filesystem Size Used Avail Use% Mounted on
> /dev/sda3 130G 123G 49M 100% /
> As you can see, 'Size' is a misnomer: that much space is not available.
> Rather, the actual usable space is 123G + 49M ~ 123G. (I'm not entirely sure
> what the discrepancy is due to, but have heard it may be space reserved for
> file system metadata.) Because of this discrepancy, we end up in a situation
> where the file system is out of space.
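The Size/Used/Avail mismatch quoted above is consistent with ext2/ext3's
superuser-reserved block pool (5% of the filesystem by default, tunable via
tune2fs -m), though the reporter only speculates about the cause. A
back-of-the-envelope check in Python, using the quoted df numbers:

```python
# df reported: Size = 130G, Used = 123G, Avail = 49M. If Size != Used + Avail,
# the gap is space the kernel hides from non-root users (on ext2/3, the
# reserved-blocks pool). Rough arithmetic, assuming the cause above:
size_gb = 130.0
used_gb = 123.0
avail_gb = 49 / 1024  # 49M expressed in GB

hidden_gb = size_gb - used_gb - avail_gb
hidden_pct = 100 * hidden_gb / size_gb
print(f"hidden: {hidden_gb:.1f} GB ({hidden_pct:.1f}% of Size)")
```

The hidden space comes out to roughly 7 GB, about 5% of the filesystem, which
matches the ext default and would explain why 'Size' overstates what DFS can
actually use.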
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.