[ 
https://issues.apache.org/jira/browse/TEZ-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16684523#comment-16684523
 ] 

Jaume M commented on TEZ-3831:
------------------------------

[~jeagles] I'm looking into a bug in which Tez hangs [in this take()| 
https://github.com/apache/tez/blob/efc73318342e40dac30ec321119c6536b67c0a64/tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/shuffle/impl/ShuffleManager.java#L985].
 This is happening when {{InputAttemptIdentifier.fetchTypeInfo == 
SPILL_INFO.FINAL_MERGE_ENABLED}}.

The poison pill wouldn't be added if the event is not done 
[here|https://github.com/apache/tez/blob/efc73318342e40dac30ec321119c6536b67c0a64/tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/shuffle/impl/ShuffleManager.java#L863].
 Could this lead to the take() blocking indefinitely?
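
To make the concern concrete, here is a minimal sketch of the general poison-pill pattern on a blocking queue. It is not the ShuffleManager code: the class, the method names, and the {{allDone}} flag are hypothetical stand-ins for "the event is done", and a timed poll() replaces take() so the example terminates instead of hanging.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class PoisonPillSketch {
    // Hypothetical sentinel marking "no more inputs will arrive".
    static final String POISON_PILL = "__END_OF_INPUT__";

    // Producer side: enqueue the inputs, then add the pill only when
    // allDone is true (an assumed stand-in for "the event is done").
    static BlockingQueue<String> produce(int inputs, boolean allDone) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < inputs; i++) {
            queue.add("input-" + i);
        }
        if (allDone) {
            queue.add(POISON_PILL);
        }
        return queue;
    }

    // Consumer side: returns the number of inputs consumed before the pill,
    // or -1 if the queue ran dry without a pill. With take() instead of a
    // timed poll(), the -1 case would be a thread blocked indefinitely.
    static int consume(BlockingQueue<String> queue, long timeoutMillis) {
        int consumed = 0;
        try {
            while (true) {
                String item = queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
                if (item == null) {
                    return -1; // no pill ever arrived
                }
                if (item.equals(POISON_PILL)) {
                    return consumed;
                }
                consumed++;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(consume(produce(3, true), 100));  // pill present
        System.out.println(consume(produce(3, false), 100)); // pill missing
    }
}
```

In the second call the consumer drains the three inputs and then waits for an item that never comes, which is the situation I suspect the real take() could end up in.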

> Reduce Unordered memory needed for storing empty completed events
> -----------------------------------------------------------------
>
>                 Key: TEZ-3831
>                 URL: https://issues.apache.org/jira/browse/TEZ-3831
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>            Priority: Major
>             Fix For: 0.9.1
>
>         Attachments: Screen Shot 2017-09-13 at 4.55.11 PM.png, 
> TEZ-3831.001-addendum.patch, TEZ-3831.001.patch
>
>
> The completedInputs blocking queue is used to store inputs for the 
> UnorderedKVReader to consume. With auto-reduce parallelism enabled and nearly 
> all empty inputs, the reader can't prune the empty events from the blocking 
> queue fast enough to keep up. In my scenario, an OOM occurred. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
