[
https://issues.apache.org/jira/browse/TEZ-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393468#comment-14393468
]
Siddharth Seth edited comment on TEZ-2237 at 4/2/15 9:30 PM:
-------------------------------------------------------------
OK. Here's what is happening.
OrderedPartitionedKVOutput does not generate any events if the Output is not
started. The downstream consumer, however, expects those events to arrive before
it can complete. Since none are generated, the events never arrive and the
consuming tasks get stuck.
That, I think, is the core bug, and it needs to be addressed in Tez as part of
this jira.
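To make the stuck-consumer accounting concrete, here is a minimal, purely
illustrative sketch (plain Java, not Tez code; the CountDownLatch simply stands
in for the consumer's count of expected per-source events) of why a producer
that emits zero events leaves the consumer waiting indefinitely:

{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class EventAccountingSketch {
  public static void main(String[] args) throws InterruptedException {
    // The consumer expects one data-movement event per upstream physical output.
    int upstreamOutputs = 3;
    CountDownLatch expectedEvents = new CountDownLatch(upstreamOutputs);

    // Outputs that were started emit their event when they are closed.
    expectedEvents.countDown(); // output 0: started, event emitted on close()
    expectedEvents.countDown(); // output 1: started, event emitted on close()
    // output 2: never started, so its close() emits nothing -> no countDown()

    // The consumer waits for all expected events before it can complete.
    // With one producer silent, the wait never finishes and the task looks stuck.
    boolean allEventsArrived = expectedEvents.await(5, TimeUnit.SECONDS);
    System.out.println("All events arrived: " + allEventsArrived); // prints false
  }
}
{code}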
On a related note, we should try moving away from forcing certain events, and
see if there's a mechanism to generate (and time) an "all events generated"
message. I'll create a jira for this.
Look for "Attempting to close output {VertexName} before it was started" in the
logs.
There are two vertices where this happens: E7274DEFA42F4EFD9D950071DD0A9021 and
DAD5EEB9962B4A65B1394B186692F7BB.
I'm assuming there's some logic in Cascading which only starts an Output if
there's data to be written to it. A temporary mitigation, until the zero-events
case is fixed, would be to force-start all Outputs irrespective of whether
there's any output to be written, as in the sketch below.
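In Tez API terms, the mitigation might look roughly like the following sketch of
a LogicalIOProcessor that starts every Output before doing any work.
ForceStartAllOutputsProcessor is an illustrative name, not something that exists
in Tez or Cascading; the real change would sit wherever Cascading decides
whether to start an Output.

{code:java}
import java.util.List;
import java.util.Map;

import org.apache.tez.runtime.api.AbstractLogicalIOProcessor;
import org.apache.tez.runtime.api.Event;
import org.apache.tez.runtime.api.LogicalInput;
import org.apache.tez.runtime.api.LogicalOutput;
import org.apache.tez.runtime.api.ProcessorContext;

public class ForceStartAllOutputsProcessor extends AbstractLogicalIOProcessor {

  public ForceStartAllOutputsProcessor(ProcessorContext context) {
    super(context);
  }

  @Override
  public void run(Map<String, LogicalInput> inputs,
                  Map<String, LogicalOutput> outputs) throws Exception {
    // Mitigation: start every Output up front, even if nothing will be written
    // to it, so that its close() still produces the events the downstream
    // consumer is waiting for.
    for (LogicalOutput output : outputs.values()) {
      output.start();
    }
    // ... actual processing; Outputs with no data are simply never written to.
  }

  @Override
  public void initialize() throws Exception {
  }

  @Override
  public void handleEvents(List<Event> processorEvents) {
  }

  @Override
  public void close() throws Exception {
  }
}
{code}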
Unrelated: io.sort.mb is set to 100 MB. You may want to increase this to reduce
the number of spills (I didn't verify whether there are any; this can be checked
via the ADDITIONAL_SPILLS counter).
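As a rough sketch of that tuning suggestion, assuming the setting that ends up
at 100 MB here is the Tez runtime sort buffer (tez.runtime.io.sort.mb;
Cascading/Scalding may well set it through its own configuration layer):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.tez.runtime.library.api.TezRuntimeConfiguration;

public class SortBufferConfigSketch {
  // Returns a copy of the given configuration with a larger sort buffer.
  public static Configuration withLargerSortBuffer(Configuration base) {
    Configuration conf = new Configuration(base);
    // 256 MB is only an example value; check the ADDITIONAL_SPILLS counter
    // first to confirm that spills are actually happening.
    conf.setInt(TezRuntimeConfiguration.TEZ_RUNTIME_IO_SORT_MB, 256);
    return conf;
  }
}
{code}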
> Complex DAG freezes and fails (was BufferTooSmallException raised in
> UnorderedPartitionedKVWriter then DAG lingers)
> -------------------------------------------------------------------------------------------------------------------
>
> Key: TEZ-2237
> URL: https://issues.apache.org/jira/browse/TEZ-2237
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.6.0
> Environment: Debian Linux "jessie"
> OpenJDK Runtime Environment (build 1.8.0_40-internal-b27)
> OpenJDK 64-Bit Server VM (build 25.40-b25, mixed mode)
> 7 * Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz, 16/24 GB RAM per node, 1*system
> disk + 4*1 or 2 TiB HDD for HDFS & local (on-prem, dedicated hardware)
> Scalding 0.13.1 modified with https://github.com/twitter/scalding/pull/1220
> to run Cascading 3.0.0-wip-90 with TEZ 0.6.0
> Reporter: Cyrille Chépélov
> Attachments: all_stacks.lst, alloc_mem.png, alloc_vcores.png,
> application_1427324000018_1444.yarn-logs.red.txt.gz,
> application_1427324000018_1908.red.txt.bz2,
> appmaster____syslog_dag_1427282048097_0215_1.red.txt.gz,
> appmaster____syslog_dag_1427282048097_0237_1.red.txt.gz,
> gc_count_MRAppMaster.png, mem_free.png, ordered-grouped-kv-input-traces.diff,
> start_containers.png, stop_containers.png,
> syslog_attempt_1427282048097_0215_1_21_000014_0.red.txt.gz,
> syslog_attempt_1427282048097_0237_1_70_000028_0.red.txt.gz, yarn_rm_flips.png
>
>
> On a specific DAG with many vertices (actually part of a larger meta-DAG),
> after about an hour of processing, several BufferTooSmallExceptions are raised
> in UnorderedPartitionedKVWriter (about one every two or three spills).
> Once these exceptions are raised, the DAG remains indefinitely "active",
> tying up memory and CPU resources as far as YARN is concerned, while little
> if any actual processing takes place.
> It seems two separate issues are at hand:
> 1. BufferTooSmallExceptions are raised even though the actual keys and values
> are never bigger than 24 and 1024 bytes respectively; the buffers actually
> allocated do seem small (around a couple of megabytes were allotted whereas
> 100 MiB were requested).
> 2. When BufferTooSmallExceptions are raised, the DAG fails to stop (stop
> requests appear to be sent 7 hours after the BTSEs are raised, but 9 hours
> after these stop requests the DAG was still lingering, with all containers
> present and tying up memory and CPU allocations).
> The emergence of the BTSEs prevents the Cascade from completing, which in turn
> prevents validating the results against the traditional MR1-based results.
> Until the DAG concludes, it keeps the cluster queue unavailable.