[jira] [Comment Edited] (YARN-3943) Use separate threshold configurations for disk-full detection and disk-not-full detection.

Jason Lowe (JIRA) Wed, 07 Oct 2015 14:37:44 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947641#comment-14947641
 ]


Jason Lowe edited comment on YARN-3943 at 10/7/15 9:36 PM:
-----------------------------------------------------------

Thanks for the patch, [~zxu]!

I think this patch causes backwards compatibility problems for users who have 
increased the max-disk-utilization-per-disk-percentage from the default.  For 
example, if someone changed it to 100 and only wants disks to be treated full 
if they actually fill up completely then this patch will silently make those 
disks unusable once they fill until they free up at least 10% of their space.  
That's not going to be good if the cluster runs the disks over 90% most of the 
time, as it will effectively take the disk out of rotation for a long, long 
time.  A more compatible change would be to _not_ set a value in yarn-default 
for this property.  If the property is set then we use it, but if we get a null 
for it then we know it isn't specified and we should just use the same 
threshold for low as for high.

Shouldn't we be capping the max of diskUtilizationPercentageCutoffLow based on 
diskUtilizationPercentageCutoffHigh rather than 100.0?

Nit: I think using Math.max / Math.min could make the range capping more 
readable (certainly less redundant with the identifiers and constants), but 
it's not must-fix.

Nit: "Use" should be "Using" in the following log message.  "Use" implies we 
are giving a directive requiring the user to do it, while "Using" states the 
code is doing it for them.
{code}
+      if (lowUsableSpacePercentagePerDisk > highUsableSpacePercentagePerDisk) {
+        LOG.warn("Use " + YarnConfiguration.
+            NM_MAX_PER_DISK_UTILIZATION_PERCENTAGE + " as " +
+            YarnConfiguration.NM_WM_LOW_PER_DISK_UTILIZATION_PERCENTAGE +
+            ", because " + YarnConfiguration.
+            NM_WM_LOW_PER_DISK_UTILIZATION_PERCENTAGE +
+            " is not configured properly.");
{code}



was (Author: jlowe):
Thanks for the patch, [~zxu]!

I think this patch causes backwards compatibility problems for users who have 
increased the max-disk-utilization-per-disk-percentage from the default.  For 
example, if someone changed it to 100 and only wants disks to be treated full 
if they actually fill up completely then this patch will silently make those 
disks unusable once they fill until they free up at least 10% of their space.  
That's not going to be good if the cluster runs the disks over 90% most of the 
time, as it will effectively take the disk out of rotation for a long, long 
time.  A more compatible change would be to _not_ set a value in yarn-default 
for this property.  If the property is set then we use it, but if we get a null 
for it then we know it isn't specified and we shouldn't just use the same 
threshold for low as for high.

Shouldn't we be capping the max of diskUtilizationPercentageCutoffLow based on 
diskUtilizationPercentageCutoffHigh rather than 100.0?

Nit: I think using Math.max / Math.min could make the range capping more 
readable (certainly less redundant with the identifiers and constants), but 
it's not must-fix.

Nit: "Use" should be "Using" in the following log message.  "Use" implies we 
are giving a directive requiring the user to do it, while "Using" states the 
code is doing it for them.
{code}
+      if (lowUsableSpacePercentagePerDisk > highUsableSpacePercentagePerDisk) {
+        LOG.warn("Use " + YarnConfiguration.
+            NM_MAX_PER_DISK_UTILIZATION_PERCENTAGE + " as " +
+            YarnConfiguration.NM_WM_LOW_PER_DISK_UTILIZATION_PERCENTAGE +
+            ", because " + YarnConfiguration.
+            NM_WM_LOW_PER_DISK_UTILIZATION_PERCENTAGE +
+            " is not configured properly.");
{code}


> Use separate threshold configurations for disk-full detection and 
> disk-not-full detection.
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-3943
>                 URL: https://issues.apache.org/jira/browse/YARN-3943
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>            Priority: Critical
>         Attachments: YARN-3943.000.patch, YARN-3943.001.patch
>
>
> Use separate threshold configurations to check when disks become full and 
> when disks become good. Currently the configuration 
> "yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage"
>  and "yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb" are 
> used to check both when disks become full and when disks become good. It will 
> be better to use two configurations: one is used when disks become full from 
> not-full and the other one is used when disks become not-full from full. So 
> we can avoid oscillating frequently.
> For example: we can set the one for disk-full detection higher than the one 
> for disk-not-full detection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (YARN-3943) Use separate threshold configurations for disk-full detection and disk-not-full detection.

Reply via email to