Allen,
Do we have any solid evidence that the HDFS unit tests going through
the roof is due to a serious memory leak in HDFS? Normally, I don't expect
memory leaks to be identified in our UTs - mostly, it (the test JVM dying) is
just because of test or deployment issues.
Unless there is concrete evidence, my concern about a serious memory leak in
HDFS on 2.8 is relatively low, given that some companies (Yahoo, Alibaba, etc.)
have deployed 2.8 in large production environments for months. Non-serious
memory leaks (like forgetting to close a stream in a non-critical path, etc.)
and other non-critical bugs always happen here and there, and we have to live
with them.
Thanks,
Junping
________________________________________
From: Allen Wittenauer <[email protected]>
Sent: Tuesday, October 24, 2017 8:27 AM
To: Hadoop Common
Cc: Hdfs-dev; [email protected]; [email protected]
Subject: Re: Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
> On Oct 23, 2017, at 12:50 PM, Allen Wittenauer <[email protected]>
> wrote:
>
>
>
> With no other information or access to go on, my current hunch is that one of
> the HDFS unit tests is ballooning in memory size. The easiest way to kill a
> Linux machine is to eat all of the RAM, thanks to overcommit, and that’s what
> this “feels” like.
>
> Someone should verify if 2.8.2 has the same issues before a release goes out …
FWIW, I ran 2.8.2 last night and it has the same problems.
Also: the node didn’t die! Looking through the workspace (which the next
run will destroy), two sets of logs stand out:
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/ws/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
and
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/ws/sourcedir/hadoop-hdfs-project/hadoop-hdfs/
It looks like my hunch is correct: RAM usage in the HDFS unit tests is
going through the roof. It’s also interesting how MANY log files there are.
Is surefire not picking up that jobs are dying? Maybe not if memory is getting
tight.
Anyway, at this point, branch-2.8 and higher are probably fubar’d.
Additionally, I’ve filed YETUS-561 so that Yetus-controlled Docker containers
can have their RAM limits set in order to prevent more nodes going catatonic.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]