[ 
https://issues.apache.org/jira/browse/HDFS-10713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15453359#comment-15453359
 ] 

Mingliang Liu edited comment on HDFS-10713 at 8/31/16 9:42 PM:
---------------------------------------------------------------

One concern: should we also dump the longest lock-holding interval seen during 
the suppressed interval, together with the holder's thread stack? That would 
reveal more useful information. As an extreme example, suppose two threads 
(t1 and t2) hold the write lock in sequence: *t1-1s, t2-100s, t1-1s*. In the 
current implementation the t2 information is lost, although it is the more 
interesting one.
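
To make the suggestion concrete, here is a minimal sketch of what I have in mind. It is not the patch's code: the class {{WriteLockReportThrottle}}, the method {{onUnlock}}, and the message wording are all hypothetical, and the timestamps are passed in by the caller so the logic is easy to exercise. While reports are suppressed it remembers the longest hold and its stack, and includes them in the next report.

```java
// Hypothetical sketch only; names and message format are assumptions, not the HDFS-10713 patch.
final class WriteLockReportThrottle {
    private final long reportingThresholdMs;
    private final long suppressIntervalMs;

    private boolean reportedOnce;
    private long lastReportMs;
    private int numSuppressed;
    private long longestSuppressedHoldMs;
    private String longestSuppressedStack = "";

    WriteLockReportThrottle(long reportingThresholdMs, long suppressIntervalMs) {
        this.reportingThresholdMs = reportingThresholdMs;
        this.suppressIntervalMs = suppressIntervalMs;
    }

    /**
     * Called when the write lock is released.
     * Returns the warning text to log, or null if nothing should be logged.
     */
    synchronized String onUnlock(long nowMs, long heldMs, String holderStack) {
        if (heldMs < reportingThresholdMs) {
            return null; // hold was short enough, nothing to report
        }
        if (reportedOnce && nowMs - lastReportMs < suppressIntervalMs) {
            // Suppressed: count it, but keep the longest hold and its stack.
            numSuppressed++;
            if (heldMs > longestSuppressedHoldMs) {
                longestSuppressedHoldMs = heldMs;
                longestSuppressedStack = holderStack;
            }
            return null;
        }
        StringBuilder sb = new StringBuilder();
        sb.append("FSNamesystem write lock held for ").append(heldMs)
          .append(" ms via\n").append(holderStack);
        sb.append("\n\tNumber of suppressed write-lock reports: ").append(numSuppressed);
        if (numSuppressed > 0) {
            sb.append("\n\tLongest suppressed write-lock hold: ")
              .append(longestSuppressedHoldMs).append(" ms via\n")
              .append(longestSuppressedStack);
        }
        // Start a new suppression window.
        reportedOnce = true;
        lastReportMs = nowMs;
        numSuppressed = 0;
        longestSuppressedHoldMs = 0;
        longestSuppressedStack = "";
        return sb.toString();
    }

    public static void main(String[] args) {
        WriteLockReportThrottle t = new WriteLockReportThrottle(1000L, 120000L);
        System.out.println(t.onUnlock(1500L, 1500L, "t1-stack"));     // reported
        System.out.println(t.onUnlock(101500L, 100000L, "t2-stack")); // suppressed
        System.out.println(t.onUnlock(103000L, 1500L, "t1-stack"));   // suppressed
        System.out.println(t.onUnlock(200000L, 2000L, "t1-stack"));   // reported, mentions t2
    }
}
```

With something along these lines the 100s hold by t2 still shows up in the next report even though its own warning was suppressed.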

{code:title=DFSConfigKeys.java}
  // Threshold for how long the write lock warnings must be suppressed
  public static final String DFS_LOCK_SUPPRESS_WARNING_INTERVAL_MS_KEY =
      "dfs.lock.suppress.warning.interval.ms";
  public static final long DFS_LOCK_SUPPRESS_WARNING_INTERVAL_MS_DEFAULT =
      120000L;
{code}
And
{code:title=hdfs-default.xml}
<property>
  <name>dfs.lock.suppress.warning.interval.ms</name>
  <value>1000</value>
  <description>The interval between reporting lock warnings.
  </description>
</property>
{code}
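The two defaults disagree: the Java constant is 120000 ms (2 minutes) while 
hdfs-default.xml sets 1000 ms (1 second). I believe the intended default for 
{{dfs.lock.suppress.warning.interval.ms}} is 2 minutes, not 1 second?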
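
The mismatch matters because, as far as I understand, a value present in a loaded default resource such as hdfs-default.xml wins over the Java-side default passed to the lookup, so the 2-minute constant would never take effect. A minimal stand-in that uses {{java.util.Properties}} instead of Hadoop's {{Configuration}} (the class {{DefaultMismatchDemo}} and the method {{resolve}} are hypothetical) illustrates the effect:

```java
import java.util.Properties;

// Hypothetical stand-in: java.util.Properties plays the role of the loaded configuration.
final class DefaultMismatchDemo {
    static final String KEY = "dfs.lock.suppress.warning.interval.ms";
    static final long JAVA_DEFAULT_MS = 120000L; // value of the Java constant in the patch

    /** Returns the configured value if present, otherwise the Java-side default. */
    static long resolve(Properties loaded) {
        String raw = loaded.getProperty(KEY);
        return raw == null ? JAVA_DEFAULT_MS : Long.parseLong(raw.trim());
    }

    public static void main(String[] args) {
        Properties fromDefaultXml = new Properties();
        fromDefaultXml.setProperty(KEY, "1000"); // value shipped in hdfs-default.xml in the patch
        System.out.println(resolve(fromDefaultXml));   // the XML value is used
        System.out.println(resolve(new Properties())); // the Java default is used only if the key is absent
    }
}
```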

Minor comments:
# In line 1571, the message {{"Number of suppressed write-lock reports: " + numSuppressedWarnings}} should have a "\n" or "\t" before it.
# Let's declare both fields as final: {{private final long writeLockReportingThreshold;}} and {{private final long writeLockSuppressWarningInterval;}}.

As follow-up work, [~jingzhao] also suggests we look into the feasibility of 
exposing this information in nntop metrics.


> Throttle FsNameSystem lock warnings
> -----------------------------------
>
>                 Key: HDFS-10713
>                 URL: https://issues.apache.org/jira/browse/HDFS-10713
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: logging, namenode
>            Reporter: Arpit Agarwal
>            Assignee: Hanisha Koneru
>         Attachments: HDFS-10713.000.patch, HDFS-10713.001.patch, 
> HDFS-10713.002.patch
>
>
> The NameNode logs a message if the FSNamesystem write lock is held by a 
> thread for over 1 second. These messages can be throttled to at most one 
> per x minutes to avoid potentially filling up NN logs. We can also log the 
> number of suppressed notices since the last log message.



