[
https://issues.apache.org/jira/browse/MAPREDUCE-6343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14515851#comment-14515851
]
Ray Chiang commented on MAPREDUCE-6343:
---------------------------------------
I ran into this a few weeks ago as well. My notes on the issue:
- When either of the properties:
mapreduce.map.java.opts.max.heap
mapreduce.reduce.java.opts.max.heap
is set to a value greater than Integer.MAX_VALUE (or less than
Integer.MIN_VALUE, technically), the AM log shows the following exception
during TaskAttemptImpl instantiation:
2015-04-27 16:24:39,938 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
java.lang.NumberFormatException: For input string: "3221225472"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:495)
    at java.lang.Integer.parseInt(Integer.java:527)
    at org.apache.hadoop.mapred.JobConf.parseMaximumHeapSizeMB(JobConf.java:2105)
    at org.apache.hadoop.mapred.JobConf.getMemoryRequired(JobConf.java:2151)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.getMemoryRequired(TaskAttemptImpl.java:563)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.<init>(TaskAttemptImpl.java:542)
    at org.apache.hadoop.mapred.MapTaskAttemptImpl.<init>(MapTaskAttemptImpl.java:47)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.MapTaskImpl.createAttempt(MapTaskImpl.java:62)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.addAttempt(TaskImpl.java:610)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.addAndScheduleAttempt(TaskImpl.java:597)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.access$1300(TaskImpl.java:101)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl$InitialScheduleTransition.transition(TaskImpl.java:887)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl$InitialScheduleTransition.transition(TaskImpl.java:882)
    at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:648)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:100)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:1299)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:1293)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
    at java.lang.Thread.run(Thread.java:745)
Note that the exception occurs during the constructor of TaskAttemptImpl.
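The failure can be reproduced outside Hadoop entirely: Integer.parseInt rejects any string above Integer.MAX_VALUE (2147483647), which is exactly what parseMaximumHeapSizeMB hits when the heap size arrives in bytes. A minimal standalone illustration (not the actual JobConf code):

```java
// Minimal reproduction: 3 GiB in bytes overflows int, so Integer.parseInt
// throws NumberFormatException, while Long.parseLong handles it fine.
public class ParseRepro {
    public static void main(String[] args) {
        String threeGiBInBytes = "3221225472"; // 3 * 1024^3 > Integer.MAX_VALUE

        boolean intParseFailed = false;
        try {
            Integer.parseInt(threeGiBInBytes);
        } catch (NumberFormatException e) {
            intParseFailed = true;
        }
        System.out.println("Integer.parseInt failed: " + intParseFailed);

        // The same string parses without trouble as a long.
        long bytes = Long.parseLong(threeGiBInBytes);
        System.out.println("As long: " + bytes);
    }
}
```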
- Modifying methods and members to change int to long isn't really viable
unless trunk is ready to change the ResourceProto definition for memory from
int32 to int64 in yarn_protos.proto. That would be an incompatible change,
and I'm not sure how strongly people feel about changing proto definitions.
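For context, the incompatible change mentioned above would amount to widening the memory field of ResourceProto, roughly like this (a hypothetical fragment for illustration, not the actual yarn_protos.proto contents):

```proto
message ResourceProto {
  optional int64 memory = 1;  // currently int32; widening breaks wire compat
}
```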
- Capping the value by catching the NumberFormatException might be reasonable.
Setting the value to some "uninitialized" value (like -1) had some side effects
that needed to be checked for elsewhere in the code.
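The capping approach could look roughly like the sketch below. This is illustrative only; the method name parseHeapBytesCapped and the surrounding logic are invented here, not the actual JobConf code:

```java
// Sketch of the capping idea: if Integer.parseInt overflows, re-parse as a
// long and clamp to the int range instead of propagating the exception.
public class HeapSizeCap {
    static int parseHeapBytesCapped(String value) {
        try {
            return Integer.parseInt(value);
        } catch (NumberFormatException e) {
            // The string may simply exceed int range; retry as a long.
            // (A genuinely malformed string rethrows from Long.parseLong,
            // preserving the original failure mode.)
            long asLong = Long.parseLong(value);
            if (asLong > Integer.MAX_VALUE) {
                return Integer.MAX_VALUE;
            }
            if (asLong < Integer.MIN_VALUE) {
                return Integer.MIN_VALUE;
            }
            return (int) asLong;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseHeapBytesCapped("3221225472")); // capped
        System.out.println(parseHeapBytesCapped("1024"));       // unchanged
    }
}
```

Clamping sidesteps the -1 "uninitialized" sentinel and its side effects, at the cost of silently treating very large byte values as Integer.MAX_VALUE.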
- It's fairly easy to pass such a value to JobConf in the unit test and get the
exception.
Does anyone have suggestions for how best to fix this?
> JobConf.parseMaximumHeapSizeMB() fails to parse value greater than 2GB
> expressed in bytes
> -----------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-6343
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6343
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Hao Xia
>
> It currently tries to parse the value as an integer, which blows up whenever
> the value is greater than 2GB and expressed in bytes.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)