Re: Errors and hung job on broker shutdown

2015-05-12 Thread Dan
ide when I see this happen. First a series of communication errors along the lines of the following which I think were due to a broker bouncing or timing out: WARN

Re: Errors and hung job on broker shutdown

2015-05-12 Thread Guozhang Wang
with correlation id 331092 on topic-partition -16, retrying (4 attempts left). Error: NOT_LEADER_FOR_PARTITION 2015-04-15 13:54:13,890 (org.apache.kafka.clients.producer.internals.Sender) But then the c

Re: Errors and hung job on broker shutdown

2015-05-11 Thread Dan
internals.Sender) But then the client dies with: java: target/snappy-1.1.1/snappy.cc:423: char* snappy::internal::CompressFragment(const char*, size_t, char*, snappy::uint16*, int): Assertion `0 == memcmp(base,

Re: Errors and hung job on broker shutdown

2015-05-11 Thread Guozhang Wang
: target/snappy-1.1.1/snappy.cc:423: char* snappy::internal::CompressFragment(const char*, size_t, char*, snappy::uint16*, int): Assertion `0 == memcmp(base, candidate, matched)' failed. I'll try and get some better traces and po

Re: Errors and hung job on broker shutdown

2015-05-08 Thread Dan
. I'll try and get some better traces and post over on the kafka list. But it'll be after Strata this week. Cheers Garry -----Original Message----- From: Guozhang Wang [mailto:wangg...@gmail.com] Sent: 04 May 2015 00:38 To: dev@s

RE: Errors and hung job on broker shutdown

2015-05-04 Thread Garry Turkington
led. I'll try and get some better traces and post over on the kafka list. But it'll be after Strata this week. Cheers Garry -----Original Message----- From: Guozhang Wang [mailto:wangg...@gmail.com] Sent: 04 May 2015 00:38 To: dev@samza.apache.org Subject: Re: Errors and hung job on br

Re: Errors and hung job on broker shutdown

2015-05-03 Thread Guozhang Wang
t: 01 May 2015 23:57 To: dev@samza.apache.org Subject: Re: Errors and hung job on broker shutdown Hmm, it seems your snappy-compressed data is corrupted and hence keeps getting rejected by the broker, which keeps the producer blocked on close(). Not sure how this happens as

RE: Errors and hung job on broker shutdown

2015-05-03 Thread Garry Turkington
nt: 01 May 2015 23:57 To: dev@samza.apache.org Subject: Re: Errors and hung job on broker shutdown Hmm, it seems your snappy-compressed data is corrupted and hence keeps getting rejected by the broker, which keeps the producer blocked on close(). Not sure how this happens as I have not seen this

Re: Errors and hung job on broker shutdown

2015-05-02 Thread Guozhang Wang
Hmm, it seems your snappy-compressed data is corrupted and hence keeps getting rejected by the broker, which keeps the producer blocked on close(). Not sure how this happens as I have not seen this error ever before (I wrote the new Kafka producer's compression module myself, and have run it with va

Re: Errors and hung job on broker shutdown

2015-04-29 Thread Roger Hoover
Guozhang and Yan, Thank you both for your responses. I tried a lot of combinations and I think I've determined that it's new producer + snappy that causes the issue. It never happens with the old producer and it never happens with lz4 or no compression. It only happens when a broker gets restar

Re: Errors and hung job on broker shutdown

2015-04-29 Thread Guozhang Wang
And just to answer your first question: SIGTERM with controlled.shutdown.enable=true should be OK for bouncing the broker. Guozhang On Wed, Apr 29, 2015 at 7:36 PM, Guozhang Wang wrote: Roger, I believe Samza 0.9.0 already uses the Java producer. Java producer's close() call will try to flus

Re: Errors and hung job on broker shutdown

2015-04-29 Thread Guozhang Wang
Roger, I believe Samza 0.9.0 already uses the Java producer. Java producer's close() call will try to flush all buffered data to the brokers before completing the call. However, if some buffered data's destination partition leader is not known, the producer will block on refreshing the metadata a

Re: Errors and hung job on broker shutdown

2015-04-29 Thread Yan Fang
Not sure about the Kafka side. From the Samza side, from your description ( "does not exit nor does it make any progress" ), I think the code is stuck in producer.close

Re: Errors and hung job on broker shutdown

2015-04-28 Thread Roger Hoover
At error level logging, this was the only entry in the Samza log: 2015-04-28 14:28:25 KafkaSystemProducer [ERROR] task[Partition 2] ssp[kafka,svc.call.w_deploy.c7tH4YaiTQyBEwAAhQzRXw,2] offset[9129395] Unable to send message from TaskName-Partition 1 to system kafka Here is the log from the Kafka

Re: Errors and hung job on broker shutdown

2015-04-28 Thread Yi Pan
Roger, could you paste the full log from Samza container? If you can figure out which Kafka broker the message was sent to, it would be helpful if we get the log from the broker as well. On Tue, Apr 28, 2015 at 3:31 PM, Roger Hoover wrote: Hi, I need some help figuring out what's going on.

Errors and hung job on broker shutdown

2015-04-28 Thread Roger Hoover
Hi, I need some help figuring out what's going on. I'm running Kafka 0.8.2.1 and Samza 0.9.0 on YARN. All the topics have replication factor of 2. I'm bouncing the Kafka broker using SIGTERM (with controlled.shutdown.enable=true). I see the Samza job log this message and then hang (does not ex