Just curious, Junping: what would "solid evidence" look like? Is the supposition here that the memory leak is within HDFS test code rather than library runtime code? How would such a distinction be shown?

On Tue, Oct 24, 2017 at 4:06 PM, Junping Du <j...@hortonworks.com> wrote:

> Allen,
> Do we have any solid evidence to show the HDFS unit tests going
> through the roof are due to serious memory leak by HDFS? Normally, I don't
> expect memory leak are identified in our UTs - mostly, it (test jvm gone)
> is just because of test or deployment issues.
> Unless there is concrete evidence, my concern on seriously memory
> leak for HDFS on 2.8 is relatively low given some companies (Yahoo,
> Alibaba, etc.) have deployed 2.8 on large production environment for
> months. Non-serious memory leak (like forgetting to close stream in
> non-critical path, etc.) and other non-critical bugs always happens here
> and there that we have to live with.
>
> Thanks,
>
> Junping
>
> ________________________________________
> From: Allen Wittenauer <a...@effectivemachines.com>
> Sent: Tuesday, October 24, 2017 8:27 AM
> To: Hadoop Common
> Cc: Hdfs-dev; mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
> Subject: Re: Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
>
> > On Oct 23, 2017, at 12:50 PM, Allen Wittenauer
> > <a...@effectivemachines.com> wrote:
> >
> > With no other information or access to go on, my current hunch is that
> > one of the HDFS unit tests is ballooning in memory size. The easiest way
> > to kill a Linux machine is to eat all of the RAM, thanks to overcommit and
> > that’s what this “feels” like.
> >
> > Someone should verify if 2.8.2 has the same issues before a release goes
> > out …
>
> FWIW, I ran 2.8.2 last night and it has the same problems.
>
> Also: the node didn’t die! Looking through the workspace (so the
> next run will destroy them), two sets of logs stand out:
>
> https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/ws/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
>
> and
>
> https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/ws/sourcedir/hadoop-hdfs-project/hadoop-hdfs/
>
> It looks like my hunch is correct: RAM in the HDFS unit tests are
> going through the roof. It’s also interesting how MANY log files there
> are. Is surefire not picking up that jobs are dying? Maybe not if memory
> is getting tight.
>
> Anyway, at the point, branch-2.8 and higher are probably fubar’d.
> Additionally, I’ve filed YETUS-561 so that Yetus-controlled Docker
> containers can have their RAM limits set in order to prevent more nodes
> going catatonic.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org

-- busbey