[ https://issues.apache.org/jira/browse/MAPREDUCE-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861913#comment-13861913 ]
Gera Shegalov commented on MAPREDUCE-5705:
------------------------------------------

The output buffer is implemented as a single byte array, kvbuffer. The Java Language Specification defines arrays as being accessed with [non-negative integer index values|http://docs.oracle.com/javase/specs/jls/se7/html/jls-10.html], hence Integer.MAX_VALUE (2^31 - 1) is the maximum size. To work with a larger buffer, one would need either a multi-array structure or a wider primitive index type such as long with special encoding.

> mapreduce.task.io.sort.mb hardcoded cap at 2047
> -----------------------------------------------
>
>                 Key: MAPREDUCE-5705
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5705
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>         Environment: Multinode Dell XD720 cluster Centos6 running HDP2
>            Reporter: Joseph Niemiec
>            Assignee: Joseph Niemiec
>
> mapreduce.task.io.sort.mb is hardcoded to not allow values larger than 2047.
> If you enter a value larger than this, the map tasks will always crash at this
> line:
> https://github.com/apache/hadoop-mapreduce/blob/HDFS-641/src/java/org/apache/hadoop/mapred/MapTask.java?source=cc#L746
> The nodes at the dev site have over 380 GB of RAM each; we are not able to make
> the best use of large mappers (15 GB mappers) because of the hardcoded buffer
> max. Is there a reason this value has been hardcoded?
> --
> Also validated on my dev VM. Indeed, setting io.sort.mb to 2047 works but 2048
> fails.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
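To make the two points above concrete, here is a minimal sketch (not Hadoop's actual code; the ChunkedBuffer class and its chunk layout are hypothetical): the main method shows why 2047 MB is the last value whose byte count fits in a signed 32-bit int, and the class shows the multi-array approach, where a long index is split into a chunk number and an in-chunk offset so the total capacity can exceed Integer.MAX_VALUE.

```java
public class ChunkedBuffer {
    final int chunkBits;    // each chunk holds 2^chunkBits bytes
    final byte[][] chunks;  // multi-array structure addressed by a long index

    ChunkedBuffer(long capacity, int chunkBits) {
        this.chunkBits = chunkBits;
        long chunkSize = 1L << chunkBits;
        int n = (int) ((capacity + chunkSize - 1) >>> chunkBits);
        chunks = new byte[n][];
        long remaining = capacity;
        for (int i = 0; i < n; i++) {
            chunks[i] = new byte[(int) Math.min(remaining, chunkSize)];
            remaining -= chunkSize;
        }
    }

    // High bits of the long index pick the chunk, low bits the offset.
    void put(long index, byte b) {
        chunks[(int) (index >>> chunkBits)][(int) (index & ((1L << chunkBits) - 1))] = b;
    }

    byte get(long index) {
        return chunks[(int) (index >>> chunkBits)][(int) (index & ((1L << chunkBits) - 1))];
    }

    public static void main(String[] args) {
        // The cap itself: 2047 MB in bytes still fits in a signed 32-bit int,
        // but 2048 MB is exactly 2^31, one past Integer.MAX_VALUE.
        System.out.println(2047L * 1024 * 1024 <= Integer.MAX_VALUE); // true
        System.out.println(2048L * 1024 * 1024 <= Integer.MAX_VALUE); // false

        // Tiny chunks (16 bytes) just to exercise the long-index math:
        // index 37 lands in chunk 2 (37 >>> 4), offset 5 (37 & 15).
        ChunkedBuffer buf = new ChunkedBuffer(40, 4);
        buf.put(37, (byte) 42);
        System.out.println(buf.get(37)); // 42
    }
}
```

With a real chunk size of, say, 2^30 bytes, the same indexing would address buffers well beyond the 2047 MB limit at the cost of one extra array dereference per access.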