Sean/Junping- Ignoring the epistemology, it's a problem. Let's figure out what's causing memory to balloon and then we can work out the appropriate remedy.
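For instance (just a sketch; the pid and file names below are placeholders, and these are only the stock JDK diagnostics Junping refers to below), watching the surefire forks while the HDFS tests run should show where the memory is going:

    jps -lv                                          # find the forked test JVM's pid (placeholder <pid> below)
    jstack <pid> > threads.txt                       # what the test is doing
    jmap -histo:live <pid> | head -40                # top heap consumers by class
    jmap -dump:live,format=b,file=heap.hprof <pid>   # full heap dump for offline analysis

Running the forks with -XX:+HeapDumpOnOutOfMemoryError and GC logging (-verbose:gc -Xloggc:gc.log) would also leave artifacts behind even if the JVMs die.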
Is this reproducible outside the CI environment? To Junping's point, would YETUS-561 provide more detailed information to aid debugging? -C

On Tue, Oct 24, 2017 at 2:50 PM, Junping Du <j...@hortonworks.com> wrote:
> In general, the "solid evidence" of a memory leak comes from analysis of heap dumps, jstack, GC logs, etc. In many cases, we can locate/conclude which piece of code is leaking memory from that analysis.
>
> Unfortunately, I cannot find any such conclusion in the previous comments, and they don't even say which daemons/components of HDFS consume unexpectedly high memory. Doesn't sound like a solid bug report to me.
>
> Thanks,
>
> Junping
>
> ________________________________
> From: Sean Busbey <bus...@cloudera.com>
> Sent: Tuesday, October 24, 2017 2:20 PM
> To: Junping Du
> Cc: Allen Wittenauer; Hadoop Common; Hdfs-dev; mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
> Subject: Re: Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
>
> Just curious, Junping: what would "solid evidence" look like? Is the supposition here that the memory leak is within HDFS test code rather than library runtime code? How would such a distinction be shown?
>
> On Tue, Oct 24, 2017 at 4:06 PM, Junping Du <j...@hortonworks.com> wrote:
> Allen,
> Do we have any solid evidence to show that the HDFS unit tests going through the roof are due to a serious memory leak in HDFS? Normally, I don't expect memory leaks to be identified in our UTs - mostly, it (the test JVM going away) is just a test or deployment issue.
> Unless there is concrete evidence, my concern about a serious memory leak in HDFS 2.8 is relatively low, given that some companies (Yahoo, Alibaba, etc.) have deployed 2.8 in large production environments for months. Non-serious memory leaks (like forgetting to close a stream on a non-critical path, etc.) and other non-critical bugs always happen here and there; we have to live with them.
>
> Thanks,
>
> Junping
>
> ________________________________________
> From: Allen Wittenauer <a...@effectivemachines.com>
> Sent: Tuesday, October 24, 2017 8:27 AM
> To: Hadoop Common
> Cc: Hdfs-dev; mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
> Subject: Re: Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
>
>> On Oct 23, 2017, at 12:50 PM, Allen Wittenauer <a...@effectivemachines.com> wrote:
>>
>> With no other information or access to go on, my current hunch is that one of the HDFS unit tests is ballooning in memory size. The easiest way to kill a Linux machine is to eat all of the RAM, thanks to overcommit, and that's what this "feels" like.
>>
>> Someone should verify if 2.8.2 has the same issues before a release goes out ...
>
> FWIW, I ran 2.8.2 last night and it has the same problems.
>
> Also: the node didn't die! Looking through the workspace (so the next run will destroy them), two sets of logs stand out:
>
> https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/ws/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
>
> and
>
> https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/ws/sourcedir/hadoop-hdfs-project/hadoop-hdfs/
>
> It looks like my hunch is correct: RAM usage in the HDFS unit tests is going through the roof. It's also interesting how MANY log files there are. Is surefire not picking up that jobs are dying?
> Maybe not if memory is getting tight.
>
> Anyway, at this point, branch-2.8 and higher are probably fubar'd. Additionally, I've filed YETUS-561 so that Yetus-controlled Docker containers can have their RAM limits set, in order to prevent more nodes from going catatonic.
>
> --
> busbey
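Re: the RAM limits Allen mentions above for YETUS-561: capping the build containers should just be a matter of passing resource limits when Docker launches them. A rough sketch (the numbers are made up, and the exact wiring is whatever YETUS-561 ends up doing):

    docker run --memory=16g --memory-swap=16g ...   # placeholder limits; remaining run args elided

With a hard cap like that, a runaway test fork gets OOM-killed inside the container instead of driving the whole node into overcommit death.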