Re: Samza Job Slow to Restart

2017-10-24 Thread Liu Bo
Met the same problem before and resolved it with Yi's help, xD

On 20 October 2017 at 06:10, Yi Pan wrote:
> Awesome that you have figured it out! Just a general notice: any logcompact
> topic used in Samza may see this slow-down if the Kafka log cleaner thread
> dies, which

Re: Samza Job Slow to Restart

2017-10-19 Thread Yi Pan
Awesome that you have figured it out! Just a general notice: any log-compacted topic used in Samza may see this slow-down if the Kafka log cleaner thread dies. That includes the checkpoint, coordinator stream, and changelog topics.

Best!

-Yi

On Thu, Oct 19, 2017 at 12:14 PM, XiaoChuan Yu
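To illustrate why a dead log cleaner matters, here is a minimal sketch of what compaction achieves, assuming a simplified model in which the log is a list of (key, value) records and compaction keeps only the newest record per key. The function name `compact` and the task keys are hypothetical, and this is not Kafka's actual cleaner implementation.

```python
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str]  # (key, value), e.g. (task name, checkpointed offset)


def compact(log: Iterable[Record]) -> List[Record]:
    """Keep only the most recent record for each key, preserving log order."""
    records = list(log)
    latest_offset: Dict[str, int] = {}
    for offset, (key, _value) in enumerate(records):
        latest_offset[key] = offset  # later records overwrite earlier ones
    return [
        record
        for offset, record in enumerate(records)
        if latest_offset[record[0]] == offset
    ]


# Hypothetical checkpoint log: two tasks, each checkpointed 1000 times.
uncompacted = [(f"task-{i % 2}", str(i)) for i in range(2000)]
compacted = compact(uncompacted)
print(len(uncompacted), len(compacted))
```

A consumer that reads the topic from offset 0 at startup has to scan every retained record, so without compaction the amount read grows with the number of checkpoints ever written rather than with the number of tasks.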

Re: Samza Job Slow to Restart

2017-10-19 Thread XiaoChuan Yu
Hi,

We were finally able to find out why the job takes so long to start. Network IO was higher than normal during job startup, so we checked the size of the checkpoint topic on disk: it was ~21GB. We then restarted the Kafka node that was the leader for the checkpoint topic, the topic disk
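The on-disk size check described above can be sketched as follows. Kafka stores each partition in a directory named `<topic>-<partition>` under a broker log directory, so summing the file sizes in those directories gives the topic's footprint on that broker. The function name, log directory path, and topic name below are placeholders, not values from this thread.

```python
import os
import re


def topic_size_bytes(log_dir: str, topic: str) -> int:
    """Sum the sizes of all files under <log_dir>/<topic>-<partition> directories."""
    partition_dir = re.compile(re.escape(topic) + r"-\d+")
    total = 0
    for entry in os.scandir(log_dir):
        if entry.is_dir() and partition_dir.fullmatch(entry.name):
            for root, _dirs, files in os.walk(entry.path):
                for name in files:
                    total += os.path.getsize(os.path.join(root, name))
    return total


if __name__ == "__main__":
    # Placeholder values: substitute the broker's log.dirs entry and the
    # job's actual checkpoint topic name.
    size = topic_size_bytes("/path/to/kafka-logs", "my-checkpoint-topic")
    print(f"{size / 1024 ** 3:.2f} GiB")
```

A compacted checkpoint topic is normally small, so a result in the gigabyte range on the leader is a hint that compaction has stopped.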

Re: Samza Job Slow to Restart

2017-09-23 Thread XiaoChuan Yu
>> How long does it take?
It took around 10 minutes from "Got offset 0 for topic ... " to init() being called on the Task.

>> Have you measured which parts of the start up sequence take the most time?
>> - is it checkpoint restoration, or restore of local state?
Should be checkpoint restoration.
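One way to measure a startup phase like this is to subtract the timestamps of the two log lines that bracket it. The sketch below assumes the `YYYY-MM-DD HH:MM:SS.mmm` timestamp format that appears in the container log excerpt in this thread; the function name is hypothetical and the end timestamp in the usage example is made up for illustration.

```python
from datetime import datetime, timedelta

# Timestamp layout assumed from the container logs, e.g. "2017-09-20 03:21:02.060".
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed(start_ts: str, end_ts: str) -> timedelta:
    """Return the time between two log timestamps given as strings."""
    start = datetime.strptime(start_ts, LOG_TS_FORMAT)
    end = datetime.strptime(end_ts, LOG_TS_FORMAT)
    return end - start


# The start value is the timestamp from the log excerpt; the end value is hypothetical.
gap = elapsed("2017-09-20 03:21:02.060", "2017-09-20 03:31:02.060")
print(gap.total_seconds() / 60, "minutes")
```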

Re: Samza Job Slow to Restart

2017-09-20 Thread Jagadish Venkatraman
Hi Xiaochuan,

>> What does that loop do exactly?
Most of what the run-loop does is documented in https://samza.apache.org/learn/documentation/0.9/container/event-loop.html

>> We are running into a problem where it seems to take a very long time to restart a Samza job.
Some follow-up questions,

Samza Job Slow to Restart

2017-09-20 Thread XiaoChuan Yu
Hi,

We are running into a problem where it seems to take a very long time to restart a Samza job. We are using Samza 0.9.1 at the moment. From the logs for a particular container it looks like it has something to do with reading checkpoints from Kafka:

2017-09-20 03:21:02.060 INFO