Hi Roger,

" but it only spawns one container and still hangs after bootstrap"
    -- This is probably because your local machine does not have enough
resources for the second container. I checked your log file, and each
container is about 4GB.
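
If memory is indeed the limit, one thing to try locally is lowering the
per-container memory in the job config. This is only a sketch: the 1024
value is an illustrative assumption and is not taken from your job, so pick
whatever your tasks can actually run with.

```properties
# Hypothetical local-test settings; the memory value is an assumption.
# Two containers, as in the original laptop run.
yarn.container.count=2
# Request less memory per container so both can be scheduled on one machine.
yarn.container.memory.mb=1024
```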

"When I run it on our YARN cluster with a single container, it works
correctly.  When I tried it with 5 containers, it gets hung after consuming
the bootstrap topic."
   -- Have you figured it out? I have looked at your log and also the
code. My suspicion is that a null envelope is somehow blocking the
process. If you can paste the trace-level log, it will be more helpful,
because many of the log statements in the chooser are at trace level.
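
A minimal sketch of how trace logging could be turned on for just the
chooser code, assuming the job uses a log4j 1.x properties-style
configuration and that the relevant classes live under the package names
below (both are assumptions on my side). If the job ships a log4j.xml
instead, the equivalent would be logger elements with the same names and
level.

```properties
# Assumed logger names; adjust if the classes live elsewhere in your build.
# Trace output from the message choosers (including the bootstrapping one).
log4j.logger.org.apache.samza.system.chooser=TRACE
# Trace output from the consumer loop that feeds the chooser.
log4j.logger.org.apache.samza.system.SystemConsumers=TRACE
```

Limiting TRACE to these loggers keeps the output small enough to paste.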

Thanks,

Fang, Yan
yanfang...@gmail.com

On Thu, Jun 18, 2015 at 5:20 PM, Roger Hoover <roger.hoo...@gmail.com>
wrote:

> I need some help.  I have a job which bootstraps one stream and then is
> supposed to read from two.  When I run it on our YARN cluster with a single
> container, it works correctly.  When I tried it with 5 containers, it gets
> hung after consuming the bootstrap topic.  I ran it with the grid script on
> my laptop (Mac OS X) with yarn.container.count=2 but it only spawns one
> container and still hangs after bootstrap.
>
> Debug logs are here: http://pastebin.com/af3KPvju
>
> I looked at JMX metrics and see:
> - Task Metrics - no value for kafka offset of non-bootstrapped stream
> - SystemConsumerMetrics
>   - choose null keeps incrementing
>   - ssps-needed-by-chooser 1
>   - unprocessed-messages 62k
> - Bootstrapping Chooser
>   - lagging partitions 4
>   - lagging-batch-streams - 4
>   - batch-resets - 0
>
> Has anyone seen this or can offer ideas of how to better debug it?
>
> I'm using Samza 0.9.0 and YARN 2.4.0.
>
> Thanks!
>
> Roger
>
