[
https://issues.apache.org/jira/browse/HDFS-10713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15453359#comment-15453359
]
Mingliang Liu edited comment on HDFS-10713 at 8/31/16 9:42 PM:
---------------------------------------------------------------
One concern: should we also dump information about the longest lock-holding
interval seen during the suppressed period, including the hold time and the
holder's thread stack? That would reveal more useful information. An extreme
example is two threads (t1 and t2) holding the write lock in sequence:
*t1-1s, t2-100s, t1-1s*. In the current implementation the t2 information is
lost, even though it is the more interesting one.
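To make the suggestion concrete, here is a minimal sketch of a throttled reporter that also remembers the longest hold seen while warnings were suppressed. It is not the patch's code: the class {{ThrottledLockReporter}}, the method {{onUnlock}}, and the message wording are all hypothetical, and the caller is assumed to pass in the holder description (which could include a captured stack trace) and the current time.

```java
/**
 * Hypothetical sketch, not the HDFS-10713 patch: throttles write-lock
 * warnings and keeps the longest hold observed since the last report.
 */
final class ThrottledLockReporter {
    private final long reportingThresholdMs;   // holds longer than this are reportable
    private final long suppressIntervalMs;     // minimum gap between two reports

    private boolean hasReported = false;
    private long lastReportMs = 0L;
    private int numSuppressed = 0;
    private long longestHeldMs = 0L;
    private String longestHolder = null;

    ThrottledLockReporter(long reportingThresholdMs, long suppressIntervalMs) {
        this.reportingThresholdMs = reportingThresholdMs;
        this.suppressIntervalMs = suppressIntervalMs;
    }

    /**
     * Called when the write lock is released.
     *
     * @param holder description of the releasing thread (assumed to be
     *               supplied by the caller, possibly with a stack trace)
     * @param heldMs how long the lock was held
     * @param nowMs  current monotonic time in milliseconds
     * @return the warning text to log, or null if nothing should be logged
     */
    String onUnlock(String holder, long heldMs, long nowMs) {
        if (heldMs <= reportingThresholdMs) {
            return null; // short hold, not interesting
        }
        // Track the longest over-threshold hold since the last report,
        // whether or not this particular warning ends up suppressed.
        if (heldMs > longestHeldMs) {
            longestHeldMs = heldMs;
            longestHolder = holder;
        }
        if (hasReported && nowMs - lastReportMs < suppressIntervalMs) {
            numSuppressed++;
            return null; // still inside the suppression window
        }
        String report = "Write lock held for " + heldMs + " ms by " + holder
            + "\n\tNumber of suppressed write-lock reports: " + numSuppressed
            + "\n\tLongest write-lock held interval: " + longestHeldMs
            + " ms by " + longestHolder;
        // Start a new suppression window.
        hasReported = true;
        lastReportMs = nowMs;
        numSuppressed = 0;
        longestHeldMs = 0L;
        longestHolder = null;
        return report;
    }
}
```

With a sequence like the one above (t1 reported, t2 suppressed, t1 reported again after the window expires), the second report would still name t2 as the longest holder, so that information is not lost.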
{code:title=DFSConfigKeys.java}
// Threshold for how long the write lock warnings must be suppressed
public static final String DFS_LOCK_SUPPRESS_WARNING_INTERVAL_MS_KEY =
    "dfs.lock.suppress.warning.interval.ms";
public static final long DFS_LOCK_SUPPRESS_WARNING_INTERVAL_MS_DEFAULT =
    120000L;
{code}
And
{code:title=hdfs-default.xml}
<property>
  <name>dfs.lock.suppress.warning.interval.ms</name>
  <value>1000</value>
  <description>The interval between reporting lock warnings.
  </description>
</property>
{code}
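These two defaults disagree. I believe the default value of the config key
{{dfs.lock.suppress.warning.interval.ms}} should be 2 minutes (as in
{{DFSConfigKeys.java}}), not 1 second (as in {{hdfs-default.xml}})?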
Minor comments:
# In line 1571, the message {{"Number of suppressed write-lock reports: " +
numSuppressedWarnings);}} should be preceded by a "\n" or "\t".
# Let's declare both fields final: {{private final long
writeLockReportingThreshold;}} and {{private final long
writeLockSuppressWarningInterval;}}.
As follow-up work, [~jingzhao] also suggests we look into the feasibility of
exposing this information through nntop metrics.
> Throttle FsNameSystem lock warnings
> -----------------------------------
>
> Key: HDFS-10713
> URL: https://issues.apache.org/jira/browse/HDFS-10713
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: logging, namenode
> Reporter: Arpit Agarwal
> Assignee: Hanisha Koneru
> Attachments: HDFS-10713.000.patch, HDFS-10713.001.patch,
> HDFS-10713.002.patch
>
>
> The NameNode logs a message if the FSNamesystem write lock is held by a
> thread for over 1 second. These messages can be throttled to at most one
> per x minutes to avoid potentially filling up NN logs. We can also log the
> number of suppressed notices since the last log message.