[ https://issues.apache.org/jira/browse/TEZ-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15545549#comment-15545549 ]
Jonathan Eagles commented on TEZ-3452:
--------------------------------------

[~hitesh], [~rajesh.balamohan], [~bikassaha], [~sseth]. The total source output size calculation is now only approximate, which could cause some confusion. Let me know if you would prefer that I use the exact method and fall back to the approximation only if overflow is detected.

> Auto-reduce parallelism calculation can overflow with large inputs
> ------------------------------------------------------------------
>
>                 Key: TEZ-3452
>                 URL: https://issues.apache.org/jira/browse/TEZ-3452
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-3452.1.patch, TEZ-3452.2.patch
>
>
> Overflow can occur when numTasks is high (say 45000), outputSize is high (say 311TB), and slow start is set to 1.0.
> {code:title=ShuffleVertexManager}
>     for (Map.Entry<String, SourceVertexInfo> vInfo : getBipartiteInfo()) {
>       SourceVertexInfo srcInfo = vInfo.getValue();
>       if (srcInfo.numTasks > 0 && srcInfo.numVMEventsReceived > 0) {
>         // this assumes that 1 vmEvent is received per completed task - TEZ-2961
>         expectedTotalSourceTasksOutputSize +=
>             (srcInfo.numTasks * srcInfo.outputSize) / srcInfo.numVMEventsReceived;
>       }
>     }
> {code}
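
For illustration, the overflow and the "exact method with fallback" idea can be sketched roughly as below. This is not the code from the attached patches: the class name ExpectedOutputSizeSketch and both method names are hypothetical, the inputs are assumed to be non-negative longs, and 311TB is taken to mean 311 * 2^40 bytes.

```java
class ExpectedOutputSizeSketch {

    // Same arithmetic shape as the snippet in the issue description:
    // the product is computed in 64-bit arithmetic and can wrap around.
    static long naive(long numTasks, long outputSize, long numVMEventsReceived) {
        return (numTasks * outputSize) / numVMEventsReceived;
    }

    // Hypothetical alternative: stay exact while the product fits in a long,
    // and fall back to an approximate double computation only on overflow.
    static long exactWithFallback(long numTasks, long outputSize, long numVMEventsReceived) {
        try {
            // Math.multiplyExact throws ArithmeticException if the product overflows a long.
            return Math.multiplyExact(numTasks, outputSize) / numVMEventsReceived;
        } catch (ArithmeticException overflow) {
            // Approximate path: double has a far larger range but only 53 bits of precision.
            // A double-to-long cast saturates at Long.MAX_VALUE instead of wrapping.
            return (long) ((double) numTasks * (double) outputSize / (double) numVMEventsReceived);
        }
    }

    public static void main(String[] args) {
        long outputSize = 311L << 40; // assumed: 311 TB expressed in bytes
        // Slow start at 1.0 means every task has reported, so events == tasks.
        System.out.println("naive:    " + naive(45000, outputSize, 45000));
        System.out.println("fallback: " + exactWithFallback(45000, outputSize, 45000));
    }
}
```

With the values from the description, 45000 * 311 * 2^40 is about 1.5e19, which exceeds Long.MAX_VALUE (about 9.2e18), so the naive form wraps to a negative number while the guarded form still returns a usable size.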