[
https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Haibin Huang updated HDFS-14783:
--------------------------------
Description:
SlowPeersReport is generated from the SampleStat between two DataNodes (dn),
so it can appear in the NameNode's (nn) JMX like this:
{code:java}
"SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
{code}
In each period, MutableRollingAverages calls rollOverAvgs(), which generates a
SumAndCount object based on the SampleStat and stores it in a
LinkedBlockingDeque<SumAndCount>. The deque is then used to generate the
SlowPeersReport, and old members of the deque are not removed until the queue
is full. However, if dn1 doesn't send any packets to dn2 during the last
36*300_000 ms (3 hours), the deque fills up with copies of an old member,
because the numbers of the last SampleStat never change. I think these old
SampleStats should be considered expired and ignored when generating a new
SlowPeersReport.
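The idea of ignoring expired entries can be sketched roughly as below. This is only an illustration under stated assumptions, not the attached patch and not the real Hadoop classes: ExpiringRollingWindow, the snapshotTimeMs field, and the validityMs parameter are hypothetical names, and the SumAndCount here is a simplified stand-in for the one used by MutableRollingAverages.

```java
import java.util.concurrent.LinkedBlockingDeque;

/** Illustrative sketch only; these are not the actual Hadoop classes. */
final class ExpiringRollingWindow {

    /** One per-period snapshot, tagged with the time its samples last changed. */
    static final class SumAndCount {
        final double sum;
        final long count;
        final long snapshotTimeMs; // hypothetical field, not taken from the source

        SumAndCount(double sum, long count, long snapshotTimeMs) {
            this.sum = sum;
            this.count = count;
            this.snapshotTimeMs = snapshotTimeMs;
        }
    }

    private final LinkedBlockingDeque<SumAndCount> window;
    private final long validityMs; // hypothetical expiry threshold

    ExpiringRollingWindow(int numWindows, long validityMs) {
        this.window = new LinkedBlockingDeque<>(numWindows);
        this.validityMs = validityMs;
    }

    /** Appends a snapshot; the oldest one is evicted only when the deque is full. */
    void rollOver(SumAndCount snapshot) {
        if (!window.offerLast(snapshot)) {
            window.pollFirst();
            window.offerLast(snapshot);
        }
    }

    int size() {
        return window.size();
    }

    /** Averages only the snapshots that are still valid at nowMs; NaN if none are. */
    double average(long nowMs) {
        double totalSum = 0;
        long totalCount = 0;
        for (SumAndCount s : window) {
            if (nowMs - s.snapshotTimeMs > validityMs) {
                continue; // expired: dn1 has not sent anything to dn2 recently
            }
            totalSum += s.sum;
            totalCount += s.count;
        }
        return totalCount == 0 ? Double.NaN : totalSum / totalCount;
    }
}
```

With 36 windows and a validity of 36*300_000 ms, a deque filled only with copies of an unchanged snapshot yields NaN once that snapshot is older than the validity period, so dn2 would no longer be reported as slow on the basis of stale data.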
was:
SlowPeersReport is generated from the SampleStat between two DataNodes (dn),
so it can appear in the NameNode's (nn) JMX like this:
{code:java}
"SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
{code}
In each period, MutableRollingAverages calls rollOverAvgs(), which generates a
SumAndCount object based on the SampleStat and stores it in a
LinkedBlockingDeque<SumAndCount>. The deque is then used to generate the
SlowPeersReport, and old members of the deque are not removed until the queue
is full. However, if dn1 doesn't send any packets to dn2 during the last
36*300_000 ms, the deque fills up with copies of an old member, because the
numbers of the last SampleStat never change. I think this old SampleStat should
be considered expired and ignored when
the SampleStat is stored in a LinkedBlockingDeque<SumAndCount>, it won't be
removed until the queue is full and a newer one is generated. Therefore, if
dn1 doesn't send any packets to dn2 for a long time, the old SampleStat stays
in the queue and is still used to calculate slow peers. I think these old
SampleStats should be considered expired and ignored when generating a new
SlowPeersReport.
> Expired SampleStat should ignore when generating SlowPeersReport
> ----------------------------------------------------------------
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch,
> HDFS-14783-003.patch, HDFS-14783-004.patch, HDFS-14783-005.patch
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)