[jira] [Commented] (MAPREDUCE-5705) mapreduce.task.io.sort.mb hardcoded cap at 2047
[ https://issues.apache.org/jira/browse/MAPREDUCE-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15225221#comment-15225221 ]

Karthik Kambatla commented on MAPREDUCE-5705:
---------------------------------------------

Isn't this a duplicate of MAPREDUCE-5028?

> mapreduce.task.io.sort.mb hardcoded cap at 2047
> -----------------------------------------------
>
>                 Key: MAPREDUCE-5705
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5705
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>         Environment: Multinode Dell XD720 cluster Centos6 running HDP2
>            Reporter: Joseph Niemiec
>
> mapreduce.task.io.sort.mb is hardcoded to not allow values larger than 2047.
> If you enter a value larger than this, the map tasks will always crash at this
> line -
> https://github.com/apache/hadoop-mapreduce/blob/HDFS-641/src/java/org/apache/hadoop/mapred/MapTask.java?source=cc#L746
> The nodes at the dev site have over 380 GB of RAM each; we are not able to make
> the best use of large mappers (15GB mappers) because of the hardcoded buffer
> max. Is there a reason this value has been hardcoded?
> --
> Also validated on my dev VM. Indeed, setting io.sort.mb to 2047 works but 2048
> fails.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5705) mapreduce.task.io.sort.mb hardcoded cap at 2047
[ https://issues.apache.org/jira/browse/MAPREDUCE-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224234#comment-15224234 ]

Junping Du commented on MAPREDUCE-5705:
---------------------------------------

MAPREDUCE-2308 is a very old JIRA from the MRv1 era. Let's reopen this and fix it in 2.x.
[jira] [Commented] (MAPREDUCE-5705) mapreduce.task.io.sort.mb hardcoded cap at 2047
[ https://issues.apache.org/jira/browse/MAPREDUCE-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862483#comment-13862483 ]

Harsh J commented on MAPREDUCE-5705:
------------------------------------

Correction: the right JIRA for the map-side limitation is MAPREDUCE-2308.
[jira] [Commented] (MAPREDUCE-5705) mapreduce.task.io.sort.mb hardcoded cap at 2047
[ https://issues.apache.org/jira/browse/MAPREDUCE-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861913#comment-13861913 ]

Gera Shegalov commented on MAPREDUCE-5705:
------------------------------------------

The output buffer is implemented using a single byte array, kvbuffer. The Java Language Spec defines arrays as being accessed using [non-negative integer index values|http://docs.oracle.com/javase/specs/jls/se7/html/jls-10.html], hence Integer.MAX_VALUE (2^31 - 1) is the max size. In order to work with a larger buffer, one would need either a multi-array structure or a wider primitive type like long with special encoding.

> mapreduce.task.io.sort.mb hardcoded cap at 2047
> -----------------------------------------------
>
>                 Key: MAPREDUCE-5705
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5705
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>         Environment: Multinode Dell XD720 cluster Centos6 running HDP2
>            Reporter: Joseph Niemiec
>            Assignee: Joseph Niemiec
>
> mapreduce.task.io.sort.mb is hardcoded to not allow values larger than 2047.
> If you enter a value larger than this, the map tasks will always crash at this
> line -
> https://github.com/apache/hadoop-mapreduce/blob/HDFS-641/src/java/org/apache/hadoop/mapred/MapTask.java?source=cc#L746
> The nodes at the dev site have over 380 GB of RAM each; we are not able to make
> the best use of large mappers (15GB mappers) because of the hardcoded buffer
> max. Is there a reason this value has been hardcoded?
> --
> Also validated on my dev VM. Indeed, setting io.sort.mb to 2047 works but 2048
> fails.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
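
The arithmetic behind Gera Shegalov's comment can be sketched in a few lines of Java. This is a minimal illustration and not code taken from Hadoop: the class SortBufferLimit, its helper methods, and the ChunkedBuffer type are hypothetical names introduced here. The first part shows why 2047 MB still fits in an int-indexed byte array while 2048 MB does not; the second part is one possible shape of the "multi-array structure" mentioned in the comment, addressed by a long offset.

```java
/** Hypothetical illustration of the int-index limit; not code from Hadoop. */
final class SortBufferLimit {

    /** MB-to-bytes conversion in int arithmetic; wraps negative once sortMb reaches 2048. */
    static int toBytesInt(int sortMb) {
        return sortMb << 20;
    }

    /** True if a buffer of sortMb megabytes can be backed by a single byte[]. */
    static boolean fitsInOneArray(int sortMb) {
        long bytes = ((long) sortMb) << 20; // widen first so the shift cannot overflow
        return sortMb >= 0 && bytes <= Integer.MAX_VALUE;
    }

    /** Minimal multi-array buffer: several byte[] chunks addressed by one long offset. */
    static final class ChunkedBuffer {
        private final int chunkSize;
        private final long capacity;
        private final byte[][] chunks;

        ChunkedBuffer(long capacity, int chunkSize) {
            if (capacity < 0 || chunkSize <= 0) {
                throw new IllegalArgumentException("capacity must be >= 0 and chunkSize > 0");
            }
            this.capacity = capacity;
            this.chunkSize = chunkSize;
            // Round up so a partial final chunk is still allocated.
            int count = Math.toIntExact((capacity + chunkSize - 1) / chunkSize);
            this.chunks = new byte[count][];
            long remaining = capacity;
            for (int i = 0; i < count; i++) {
                chunks[i] = new byte[(int) Math.min((long) chunkSize, remaining)];
                remaining -= chunks[i].length;
            }
        }

        byte get(long pos) {
            check(pos);
            return chunks[(int) (pos / chunkSize)][(int) (pos % chunkSize)];
        }

        void put(long pos, byte value) {
            check(pos);
            chunks[(int) (pos / chunkSize)][(int) (pos % chunkSize)] = value;
        }

        private void check(long pos) {
            if (pos < 0 || pos >= capacity) {
                throw new IndexOutOfBoundsException("position " + pos);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(toBytesInt(2047));     // 2146435072, below Integer.MAX_VALUE
        System.out.println(toBytesInt(2048));     // -2147483648, the int has wrapped
        System.out.println(fitsInOneArray(2047)); // true
        System.out.println(fitsInOneArray(2048)); // false
    }
}
```

The chunked variant trades one extra division and one extra array lookup per access for a capacity that is no longer bounded by Integer.MAX_VALUE. Whether that cost is acceptable for a sort buffer is a separate question that this sketch does not address.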