Hi Chris MacKenzie,

How about trying the following steps to identify the cause of your problem?

1. Set both yarn.nodemanager.pmem-check-enabled and
yarn.nodemanager.vmem-check-enabled to false
2. Set yarn.nodemanager.pmem-check-enabled to true (vmem check still off)
3. Set yarn.nodemanager.vmem-check-enabled to true and
yarn.nodemanager.vmem-pmem-ratio to a large value (e.g. 100)
4. Set yarn.nodemanager.vmem-check-enabled to true and
yarn.nodemanager.vmem-pmem-ratio to the expected value (e.g. 2.1 or so)

If the job still fails in step 1, the cause is probably a JVM configuration
problem or something unrelated to the YARN memory checks. If it fails in
step 2, the cause is a shortage of physical memory. If it passes steps 2 and 3
but fails in step 4, the virtual memory limit (container size multiplied by
yarn.nodemanager.vmem-pmem-ratio) is what is being hit.
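
For step 1, a minimal yarn-site.xml sketch (standard Hadoop 2.x property
names; restart the NodeManagers after each change) would be:

  <!-- diagnosis only: disable both memory checks -->
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>

For steps 3 and 4, set the relevant check back to true and add
yarn.nodemanager.vmem-pmem-ratio with the value 100 and then 2.1.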

Thanks,
- Tsuyoshi


On Fri, Jul 18, 2014 at 6:52 PM, Chris MacKenzie
<stu...@chrismackenziephotography.co.uk> wrote:
> Hi Guys,
>
> Thanks very much for getting back to me.
>
>
> Thanks Chris - the idea of splitting the data is a great suggestion.
> Yes Wangda, I was restarting after changing the configs.
>
> I’ve been checking the relationship between what I thought was in my
> config files and what Hadoop thought was in them.
>
> With:
>
> // Print out Config file settings for testing.
> // (Assumes java.util.Map.Entry is imported and conf is the job's
> // org.apache.hadoop.conf.Configuration, which is iterable over its entries.)
> for (Entry<String, String> entry : conf) {
>     System.out.printf("%s=%s\n", entry.getKey(), entry.getValue());
> }
>
>
>
> There were anomalies ;0(
>
> Now that my Hadoop reflects the values that are in my config files, I
> just get the message “Killed” without any explanation.
>
>
> Unfortunately, whereas I had been applying changes incrementally and
> testing, this time I applied all the changes at once.
>
> I’m now slowly backing out the changes I made to see where the behaviour
> starts to reflect what I expect.
>
> Regards,
>
> Chris MacKenzie
> telephone: 0131 332 6967
> email: stu...@chrismackenziephotography.co.uk
> corporate: www.chrismackenziephotography.co.uk
>
>
>
>
>
>
> From:  Chris Mawata <chris.maw...@gmail.com>
> Reply-To:  <user@hadoop.apache.org>
> Date:  Thursday, 17 July 2014 16:15
> To:  <user@hadoop.apache.org>
> Subject:  Re: Configuration set up questions - Container killed on
> request. Exit code is 143
>
>
> Another thing to try is smaller input splits, if your data can be broken up
> into smaller files that can be independently processed. That way you get
> more, but smaller, map tasks. You could also use more, but smaller,
> reducers. The many files will tax your NameNode more, but you might get to
> use all your cores.
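>
> For illustration only, a rough sketch of the knobs involved (the values are
> made-up examples, not recommendations tuned for your data):
>
>   <!-- per-job configuration or mapred-site.xml -->
>   <property>
>     <name>mapreduce.input.fileinputformat.split.maxsize</name>
>     <!-- 32 MB splits: more, smaller map tasks -->
>     <value>33554432</value>
>   </property>
>   <property>
>     <name>mapreduce.job.reduces</name>
>     <!-- more, smaller reduce tasks -->
>     <value>64</value>
>   </property>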
> On Jul 17, 2014 9:07 AM, "Chris MacKenzie"
> <stu...@chrismackenziephotography.co.uk> wrote:
>
> Hi Chris,
>
> Thanks for getting back to me. I will set that value to 10
>
> I have just tried this.
> https://support.gopivotal.com/hc/en-us/articles/201462036-Mapreduce-YARN-Memory-Parameters
>
> Setting both mapreduce.map.memory.mb and mapreduce.reduce.memory.mb. Though
> after setting them I didn’t get the expected change.
>
> The output was still “2.1 GB of 2.1 GB virtual memory used. Killing
> container”.
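>
> For what it is worth, the 2.1 GB cap in that message looks like the 1 GB
> container size multiplied by the default yarn.nodemanager.vmem-pmem-ratio
> of 2.1, so if the new values had been picked up the cap should have moved
> as well. A rough mapred-site.xml sketch, with illustrative values and the
> heap (-Xmx) kept below the container size:
>
>   <property>
>     <name>mapreduce.map.memory.mb</name>
>     <value>2048</value>
>   </property>
>   <property>
>     <name>mapreduce.map.java.opts</name>
>     <value>-Xmx1638m</value>
>   </property>
>   <property>
>     <name>mapreduce.reduce.memory.mb</name>
>     <value>2048</value>
>   </property>
>   <property>
>     <name>mapreduce.reduce.java.opts</name>
>     <value>-Xmx1638m</value>
>   </property>
>
> These are job-level settings, so the job has to be resubmitted with the new
> configuration for them to take effect.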
>
>
> Regards,
>
> Chris MacKenzie
> telephone: 0131 332 6967
> email: stu...@chrismackenziephotography.co.uk
> corporate: www.chrismackenziephotography.co.uk
>
>
>
>
>
>
> From:  Chris Mawata <chris.maw...@gmail.com>
> Reply-To:  <user@hadoop.apache.org>
> Date:  Thursday, 17 July 2014 13:36
> To:  Chris MacKenzie <stu...@chrismackenziephotography.co.uk>
> Cc:  <user@hadoop.apache.org>
> Subject:  Re: Configuration set up questions - Container killed on
> request. Exit code is 143
>
>
> Hi Chris MacKenzie,
>
> I have a feeling (I am not familiar with the kind of work you are doing)
> that your application is memory intensive. 8 cores per node and only 12 GB
> is tight. Try bumping up yarn.nodemanager.vmem-pmem-ratio.
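>
> Something along these lines in yarn-site.xml on the NodeManagers, followed
> by a NodeManager restart (the value 5 is only an example):
>
>   <property>
>     <name>yarn.nodemanager.vmem-pmem-ratio</name>
>     <value>5</value>
>   </property>
>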
> Chris Mawata
>
>
>
>
> On Wed, Jul 16, 2014 at 11:37 PM, Chris MacKenzie
> <stu...@chrismackenziephotography.co.uk> wrote:
>
> Hi,
>
> Thanks Chris Mawata
> I’m working through this myself, but wondered if anyone could point me in
> the right direction.
>
> I have attached my configs.
>
>
> I’m using Hadoop 2.4.1.
>
> My system is:
> A 32-node cluster
> 8 processors per machine
> 12 GB RAM per node
> 890 GB of available disk space per node
>
> This is my current error:
>
> mapreduce.Job (Job.java:printTaskEvents(1441)) - Task Id :
> attempt_1405538067846_0006_r_000000_1, Status : FAILED
> Container [pid=25848,containerID=container_1405538067846_0006_01_000004]
> is running beyond virtual memory limits. Current usage: 439.0 MB of 1 GB
> physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing
> container.
> Dump of the process-tree for container_1405538067846_0006_01_000004 :
>         |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>         |- 25853 25848 25848 25848 (java) 2262 193 2268090368 112050
> /usr/java/latest//bin/java -Djava.net.preferIPv4Stack=true
> -Dhadoop.metrics.log.level=WARN -Xmx768m
> -Djava.io.tmpdir=/tmp/hadoop-cm469/nm-local-dir/usercache/cm469/appcache/ap
> plication_1405538067846_0006/container_1405538067846_0006_01_000004/tmp
> -Dlog4j.configuration=container-log4j.properties
> -Dyarn.app.container.log.dir=/scratch/extra/cm469/hadoop-2.4.1/logs/userlog
> s/application_1405538067846_0006/container_1405538067846_0006_01_000004
> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
> org.apache.hadoop.mapred.YarnChild 137.195.143.103 59056
> attempt_1405538067846_0006_r_000000_1 4
>         |- 25848 25423 25848 25848 (bash) 0 0 108613632 333 /bin/bash -c
> /usr/java/latest//bin/java -Djava.net.preferIPv4Stack=true
> -Dhadoop.metrics.log.level=WARN  -Xmx768m
> -Djava.io.tmpdir=/tmp/hadoop-cm469/nm-local-dir/usercache/cm469/appcache/ap
> plication_1405538067846_0006/container_1405538067846_0006_01_000004/tmp
> -Dlog4j.configuration=container-log4j.properties
> -Dyarn.app.container.log.dir=/scratch/extra/cm469/hadoop-2.4.1/logs/userlog
> s/application_1405538067846_0006/container_1405538067846_0006_01_000004
> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
> org.apache.hadoop.mapred.YarnChild 137.195.143.103 59056
> attempt_1405538067846_0006_r_000000_1 4
> 1>/scratch/extra/cm469/hadoop-2.4.1/logs/userlogs/application_1405538067846
> _0006/container_1405538067846_0006_01_000004/stdout
> 2>/scratch/extra/cm469/hadoop-2.4.1/logs/userlogs/application_1405538067846
> _0006/container_1405538067846_0006_01_000004/stderr
>
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
>
>
>
>
>
>
> Regards,
>
> Chris MacKenzie
> telephone: 0131 332 6967
> email: stu...@chrismackenziephotography.co.uk
> corporate: www.chrismackenziephotography.co.uk
>
>
>
>
>
>
> From:  Chris Mawata <chris.maw...@gmail.com>
> Reply-To:  <user@hadoop.apache.org>
> Date:  Thursday, 17 July 2014 02:10
> To:  <user@hadoop.apache.org>
> Subject:  Re: Can someone shed some light on this ? - java.io.IOException:
> Spill failed
>
>
> I would post the configuration files -- easier for someone to spot
> something wrong than to imagine what configuration would get you to that
> stacktrace. The part
> Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could
> not find any valid local directory for
> attempt_1405523201400_0006_m_000000_0_spill_8.out
>
> would suggest you might not have hadoop.tmp.dir set (?)
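>
> If it is unset, something like this in core-site.xml would pin it down (the
> path is only an example; it should point at a directory with plenty of free
> space on every node):
>
>   <property>
>     <name>hadoop.tmp.dir</name>
>     <value>/scratch/extra/hadoop-tmp</value>
>   </property>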
>
>
>
> On Wed, Jul 16, 2014 at 1:02 PM, Chris MacKenzie
> <stu...@chrismackenziephotography.co.uk> wrote:
>
> Hi,
>
> Is this a coding or a setup issue ?
>
> I’m using Hadoop 2.4.1.
> My program is doing a concordance on 500,000 sequences of 400 chars.
> My cluster setup is 32 data nodes and two masters.
>
> The exact error is:
> Error: java.io.IOException: Spill failed
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.checkSpillException(MapTask.java:1535)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1062)
>         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
>         at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>         at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>         at par.gene.align.v3.concordance.ConcordanceMapper.map(ConcordanceMapper.java:96)
>         at par.gene.align.v3.concordance.ConcordanceMapper.map(ConcordanceMapper.java:1)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for attempt_1405523201400_0006_m_000000_0_spill_8.out
>         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:402)
>         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
>         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
>         at org.apache.hadoop.mapred.YarnOutputFiles.getSpillFileForWrite(YarnOutputFiles.java:159)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1566)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$900(MapTask.java:853)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1505)
>
> Regards,
>
> Chris
>



-- 
- Tsuyoshi
