[jira] [Commented] (KAFKA-3769) KStream job spending 60% of time writing metrics

2016-05-31 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15309288#comment-15309288
 ] 

Jay Kreps commented on KAFKA-3769:
--

Nice catch.

> KStream job spending 60% of time writing metrics
> 
>
> Key: KAFKA-3769
> URL: https://issues.apache.org/jira/browse/KAFKA-3769
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Greg Fodor
>Assignee: Guozhang Wang
>Priority: Critical
>
> I've been profiling a complex streams job, and found two major hotspots when 
> writing metrics, which take up about 60% of the CPU time of the job. (!) A PR 
> is attached.
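A common cause of this kind of hotspot is resolving the metric sensor on every record instead of caching it once. A hypothetical plain-Java sketch of the pattern (a HashMap stands in for a real metrics registry; none of these names are Kafka's actual API):

```java
import java.util.HashMap;
import java.util.Map;

public class MetricsHotPath {
    // Hypothetical registry: metric name -> running count.
    // Stands in for a real metrics library; not Kafka's Metrics class.
    static final Map<String, long[]> REGISTRY = new HashMap<>();

    // Slow pattern: look up (and possibly create) the sensor on every record.
    // A per-record map lookup is exactly the kind of cost profiling exposes.
    static void recordSlow(String name) {
        REGISTRY.computeIfAbsent(name, k -> new long[1])[0]++;
    }

    // Fast pattern: resolve the sensor once, outside the hot loop,
    // then update the cached reference per record.
    static long[] sensor(String name) {
        return REGISTRY.computeIfAbsent(name, k -> new long[1]);
    }
}
```

The per-record difference is small, but multiplied across every processor node and every record it can dominate a profile, as the report above suggests.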



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-2894) WorkerSinkTask doesn't handle rewinding offsets on rebalance

2016-05-31 Thread Liquan Pei (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liquan Pei reassigned KAFKA-2894:
-

Assignee: Liquan Pei  (was: Ewen Cheslack-Postava)

> WorkerSinkTask doesn't handle rewinding offsets on rebalance
> 
>
> Key: KAFKA-2894
> URL: https://issues.apache.org/jira/browse/KAFKA-2894
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.9.0.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Liquan Pei
>
> rewind() is only invoked at the beginning of each poll(). This means that if 
> a rebalance occurs in the poll, it's feasible to get data that doesn't match 
> a request to change offsets during the rebalance. I think the consumer will 
> hold on to consumer data across the rebalance if it is reassigned the same 
> offset, so there may already be data ready to be delivered. Additionally we 
> may already have data in an incomplete messageBatch that should be discarded 
> when the rewind is requested.
> While connectors that care about this (i.e. ones that manage their own 
> offsets) can handle this correctly by tracking the offsets they're expecting 
> to see, it's a hassle, error prone, and pretty unintuitive.
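The fix described above amounts to honoring offset rewinds wherever they are requested and discarding any buffered-but-undelivered data at that point, not only at the top of poll(). A hypothetical plain-Java sketch of that pattern (simplified types; not Connect's actual WorkerSinkTask code):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class RewindSketch {
    // partition -> offset the task asked to seek to (e.g. from a rebalance listener)
    final Map<Integer, Long> requestedOffsets = new HashMap<>();
    // records read from the consumer but not yet delivered to the sink task
    final Deque<String> messageBatch = new ArrayDeque<>();
    long position = 0;

    void requestRewind(int partition, long offset) {
        requestedOffsets.put(partition, offset);
    }

    // Apply pending rewinds and drop stale buffered data. In the scenario
    // described above, this would also need to run when a rebalance occurs
    // inside poll(), not just before it.
    void applyRewinds() {
        if (requestedOffsets.isEmpty()) return;
        messageBatch.clear();              // discard data read before the seek
        for (long offset : requestedOffsets.values())
            position = offset;             // stands in for consumer.seek(tp, offset)
        requestedOffsets.clear();
    }
}
```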





[jira] [Commented] (KAFKA-3773) SocketServer inflightResponses collection leaks memory on client disconnect

2016-05-31 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308919#comment-15308919
 ] 

ASF GitHub Bot commented on KAFKA-3773:
---

Github user hachikuji closed the pull request at:

https://github.com/apache/kafka/pull/1452


> SocketServer inflightResponses collection leaks memory on client disconnect
> ---
>
> Key: KAFKA-3773
> URL: https://issues.apache.org/jira/browse/KAFKA-3773
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.0, 0.9.0.1
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
> Fix For: 0.9.0.2
>
>
> Prior to 0.10, we did not clean the {{inflightResponses}} collection 
> maintained by each processor when a client disconnected. This causes a 
> quickly growing memory leak since that collection maintains a reference to 
> the pending response, which can be quite large in the case of fetches. This 
> was fixed in KAFKA-3489 for 0.10, but we should ensure that 0.9 is patched as 
> well.
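The leak pattern is straightforward: a per-connection map keeps a reference to the pending response after the connection is gone. A minimal sketch of the fix, with hypothetical names (not the actual SocketServer code):

```java
import java.util.HashMap;
import java.util.Map;

public class InflightSketch {
    // connectionId -> serialized response bytes awaiting send completion.
    // For a fetch response this buffer can be very large.
    final Map<String, byte[]> inflightResponses = new HashMap<>();

    void queueResponse(String connectionId, byte[] response) {
        inflightResponses.put(connectionId, response);
    }

    // The fix: remove the pending entry when the client disconnects, so the
    // response buffer becomes garbage-collectable instead of leaking.
    void onDisconnect(String connectionId) {
        inflightResponses.remove(connectionId);
    }
}
```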





[GitHub] kafka pull request: KAFKA-3773: Remove from inflightResponses on client disc...

2016-05-31 Thread hachikuji
Github user hachikuji closed the pull request at:

https://github.com/apache/kafka/pull/1452


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-3773) SocketServer inflightResponses collection leaks memory on client disconnect

2016-05-31 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-3773.

Resolution: Fixed
  Reviewer: Ismael Juma

> SocketServer inflightResponses collection leaks memory on client disconnect
> ---
>
> Key: KAFKA-3773
> URL: https://issues.apache.org/jira/browse/KAFKA-3773
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.0, 0.9.0.1
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
> Fix For: 0.9.0.2
>
>
> Prior to 0.10, we did not clean the {{inflightResponses}} collection 
> maintained by each processor when a client disconnected. This causes a 
> quickly growing memory leak since that collection maintains a reference to 
> the pending response, which can be quite large in the case of fetches. This 
> was fixed in KAFKA-3489 for 0.10, but we should ensure that 0.9 is patched as 
> well.





[jira] [Updated] (KAFKA-3773) SocketServer inflightResponses collection leaks memory on client disconnect

2016-05-31 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3773:
---
Fix Version/s: 0.9.0.2

> SocketServer inflightResponses collection leaks memory on client disconnect
> ---
>
> Key: KAFKA-3773
> URL: https://issues.apache.org/jira/browse/KAFKA-3773
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.0, 0.9.0.1
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
> Fix For: 0.9.0.2
>
>
> Prior to 0.10, we did not clean the {{inflightResponses}} collection 
> maintained by each processor when a client disconnected. This causes a 
> quickly growing memory leak since that collection maintains a reference to 
> the pending response, which can be quite large in the case of fetches. This 
> was fixed in KAFKA-3489 for 0.10, but we should ensure that 0.9 is patched as 
> well.





Re: kafka consumer group is rebalancing

2016-05-31 Thread Liquan Pei
Hi Sunny,

It would be helpful to know the processing logic after the records are
returned by poll(). How do you make sure that the polling is happening
within session timeout? Did you try to reduce the number of messages
returned by setting max.poll.records to a smaller value?
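Liquan's suggestions map to a handful of consumer configs. A sketch of the tuning (the property keys are real Kafka consumer configs; the values are illustrative placeholders to tune against your actual per-record processing time, not recommendations):

```java
import java.util.Properties;

public class ConsumerTuning {
    public static Properties tunedProps() {
        Properties props = new Properties();
        // Must exceed the worst-case time between consecutive poll() calls,
        // or the coordinator will evict the consumer and rebalance the group.
        props.put("session.timeout.ms", "30000");
        // Typically set to roughly a third of the session timeout.
        props.put("heartbeat.interval.ms", "10000");
        // Cap the work handed back per poll() so the loop returns sooner
        // (available from the 0.10 consumer onward).
        props.put("max.poll.records", "100");
        return props;
    }
}
```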

Thanks,
Liquan

On Mon, May 30, 2016 at 7:30 AM, Sunny Gupta 
wrote:

> Hi,
> I am using Kafka 0.9 and the new Java consumer. I am polling inside a loop. I
> am getting a CommitFailedException because of a group rebalance when the code
> tries to execute consumer.commitSync. Please note, I am setting
> session.timeout.ms to 3 and heartbeat.interval.ms to 1 on the consumer, and
> polling definitely happens within 3. Can anyone help me out? Please let me
> know if any information is needed.
> I am using 3 node kafka cluster.
> Thanks,
> Sunny
>



-- 
Liquan Pei
Software Engineer, Confluent Inc


[GitHub] kafka pull request: KAFKA-3773: Remove from inflightResponses on client disc...

2016-05-31 Thread hachikuji
Github user hachikuji closed the pull request at:

https://github.com/apache/kafka/pull/1452




[jira] [Updated] (KAFKA-3769) KStream job spending 60% of time writing metrics

2016-05-31 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3769:
-
Reviewer: Guozhang Wang

> KStream job spending 60% of time writing metrics
> 
>
> Key: KAFKA-3769
> URL: https://issues.apache.org/jira/browse/KAFKA-3769
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Greg Fodor
>Assignee: Guozhang Wang
>Priority: Critical
>
> I've been profiling a complex streams job, and found two major hotspots when 
> writing metrics, which take up about 60% of the CPU time of the job. (!) A PR 
> is attached.





[jira] [Updated] (KAFKA-3715) Higher granularity streams metrics

2016-05-31 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3715:
-
Reviewer: Eno Thereska

> Higher granularity streams metrics 
> ---
>
> Key: KAFKA-3715
> URL: https://issues.apache.org/jira/browse/KAFKA-3715
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Jeff Klukas
>Assignee: aarti gupta
>Priority: Minor
>  Labels: api, newbie
> Fix For: 0.10.1.0
>
>
> Originally proposed by [~guozhang] in 
> https://github.com/apache/kafka/pull/1362#issuecomment-218326690
> We can consider adding metrics for process / punctuate / commit rate at the 
> granularity of each processor node in addition to the global rate mentioned 
> above. This is very helpful in debugging.
> We can consider adding rate / total cumulated metrics for context.forward 
> indicating how many records were forwarded downstream from this processor 
> node as well. This is helpful in debugging.
> We can consider adding metrics for each stream partition's timestamp. This is 
> helpful in debugging.
> Besides the latency metrics, we can also add throughput metrics in terms of 
> source records consumed.





[jira] [Commented] (KAFKA-3773) SocketServer inflightResponses collection leaks memory on client disconnect

2016-05-31 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308718#comment-15308718
 ] 

ASF GitHub Bot commented on KAFKA-3773:
---

GitHub user hachikuji reopened a pull request:

https://github.com/apache/kafka/pull/1452

KAFKA-3773: Remove from inflightResponses on client disconnect to prevent 
memory leak



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka hotfix-processor-memleak

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1452.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1452






> SocketServer inflightResponses collection leaks memory on client disconnect
> ---
>
> Key: KAFKA-3773
> URL: https://issues.apache.org/jira/browse/KAFKA-3773
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.0, 0.9.0.1
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>
> Prior to 0.10, we did not clean the {{inflightResponses}} collection 
> maintained by each processor when a client disconnected. This causes a 
> quickly growing memory leak since that collection maintains a reference to 
> the pending response, which can be quite large in the case of fetches. This 
> was fixed in KAFKA-3489 for 0.10, but we should ensure that 0.9 is patched as 
> well.





[jira] [Created] (KAFKA-3773) SocketServer inflightResponses collection leaks memory on client disconnect

2016-05-31 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-3773:
--

 Summary: SocketServer inflightResponses collection leaks memory on 
client disconnect
 Key: KAFKA-3773
 URL: https://issues.apache.org/jira/browse/KAFKA-3773
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.9.0.1, 0.9.0.0
Reporter: Jason Gustafson
Assignee: Jason Gustafson


Prior to 0.10, we did not clean the {{inflightResponses}} collection maintained 
by each processor when a client disconnected. This causes a quickly growing 
memory leak since that collection maintains a reference to the pending 
response, which can be quite large in the case of fetches. This was fixed in 
KAFKA-3489 for 0.10, but we should ensure that 0.9 is patched as well.





[jira] [Commented] (KAFKA-3773) SocketServer inflightResponses collection leaks memory on client disconnect

2016-05-31 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308717#comment-15308717
 ] 

ASF GitHub Bot commented on KAFKA-3773:
---

Github user hachikuji closed the pull request at:

https://github.com/apache/kafka/pull/1452


> SocketServer inflightResponses collection leaks memory on client disconnect
> ---
>
> Key: KAFKA-3773
> URL: https://issues.apache.org/jira/browse/KAFKA-3773
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.0, 0.9.0.1
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>
> Prior to 0.10, we did not clean the {{inflightResponses}} collection 
> maintained by each processor when a client disconnected. This causes a 
> quickly growing memory leak since that collection maintains a reference to 
> the pending response, which can be quite large in the case of fetches. This 
> was fixed in KAFKA-3489 for 0.10, but we should ensure that 0.9 is patched as 
> well.





Re: [DISCUSS] KIP-62: Allow consumer to send heartbeats from a background thread

2016-05-31 Thread Jason Gustafson
Hey Onur,

Thanks for the detailed response. I think the problem of controlling
rebalance times is the main (known) gap in the proposal as it stands.

This burden goes away if you loosen the liveness property by having a
> required rebalance time and optional processing time where rebalance
> happens in the background thread as stated in the KIP.


Just to clarify, the current KIP only allows rebalances to complete in the
foreground. What I suggested above in reply to Grant was that we could add
a separate rebalance timeout setting, the behavior I had in mind was to let
the consumer fall out of the group if the timeout is reached while the
consumer is still processing. I was specifically trying to avoid moving the
rebalance to the background thread since this significantly increases the
complexity of the implementation. We'd also have to think about
compatibility a bit more. For example, what are the implications of having
the rebalance listener execute in a separate thread?

Putting that issue aside, I think we need to convince ourselves that a
separate rebalance timeout is really necessary since every new timeout adds
some conceptual noise which all users will see. My thought in this KIP was
that users who didn't want the burden of tuning the process timeout could
use a relatively large value without a major impact because group
rebalances themselves will typically be infrequent. The main concern is for
users who have highly variant processing times and want to ensure a tight
bound on rebalance times (even if it means having to discard some
processing that cannot be completed before the rebalance finishes). These
users will be left trying to tune process.timeout.ms and max.poll.records,
which is basically the same position they are currently in. The problem is
I don't know how common this case is, so I'm not sure how it weighs against
the cost of having an additional timeout that needs to be explained. We can
always add the rebalance timeout later, but it will be tough to remove
once it's there. All the same, I'm not that keen on another iteration of
this problem, so if we believe this use case is common enough, then maybe
we should add it now.
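For context, the decoupling under discussion — heartbeats sent from a dedicated thread at a fixed interval, independent of how long processing blocks the polling thread — can be sketched with plain JDK scheduling. This is an illustration of the threading pattern only, with made-up names, not the KIP's actual implementation:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class HeartbeatSketch {
    final AtomicInteger heartbeatsSent = new AtomicInteger();
    final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // Fire at heartbeat.interval.ms regardless of how long record
    // processing keeps the poll loop busy. The incrementAndGet call
    // stands in for sending a real heartbeat request.
    void start(long intervalMs) {
        scheduler.scheduleAtFixedRate(
                heartbeatsSent::incrementAndGet,
                0, intervalMs, TimeUnit.MILLISECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

With this shape, session.timeout.ms only has to cover heartbeat delivery, while the separate processing timeout covers the poll loop — which is exactly why the two can be tuned independently.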

Thanks,
Jason


On Sat, May 28, 2016 at 3:10 AM, Onur Karaman 
wrote:

> Thanks for the KIP writeup, Jason.
>
> Before anything else, I just wanted to point out that it's worth mentioning
> the "heartbeat.interval.ms" consumer config in the KIP for completeness.
> Today this config only starts to kick in if poll is called frequently
> enough. A separate heartbeat thread should make this config behave more
> like what people would expect: a separate thread sending heartbeats at the
> configured interval.
>
> With this KIP, the relevant configs become:
> "max.poll.records" - already exists
> "session.timeout.ms" - already exists
> "heartbeat.interval.ms" - already exists
> "process.timeout.ms" - new
>
> After reading the KIP several times, I think it would be helpful to be more
> explicit in the desired outcome. Is it trying to make faster
> best/average/worst case rebalance times? Is it trying to make the clients
> need less configuration tuning?
>
> Also it seems that brokers probably still want to enforce minimum and
> maximum rebalance timeouts just as with the minimum and maximum session
> timeouts so DelayedJoins don't stay in purgatory indefinitely. So we'd add
> new "group.min.rebalance.timeout.ms" and "group.max.rebalance.timeout.ms"
> broker configs which again might need to be brought up in the KIP. Let's
> say we add these bounds. A side-effect of having broker-side bounds on
> rebalance timeouts in combination with Java clients that make process
> timeouts the same as rebalance timeouts is that the broker effectively
> dictates the max processing time allowed between poll calls. This gotcha
> exists right now with today's broker-side bounds on session timeouts. So
> I'm not really convinced that the proposal gets rid of this complication
> mentioned in the KIP.
>
> I think the main question to ask is: does the KIP actually make a
> difference?
>
> It looks like this KIP improves rebalance times specifically when the
> client currently has processing times large enough to force larger session
> timeouts and heartbeat intervals to not be honored. Separating session
> timeouts from processing time means clients can keep their "
> session.timeout.ms" low so the coordinator can quickly detect process
> failure, and honoring a low "heartbeat.interval.ms" on the separate
> heartbeat thread means clients will be quickly notified of group membership
> and subscription changes - all without placing difficult expectations on
> processing time. But even so, rebalancing through the calling thread means
> the slowest processing client in the group will still be the rate limiting
> step when looking at rebalance times.
>
> From a usability perspective, the burden still seems like it will be tuning
> the processing time to keep the 

[jira] [Updated] (KAFKA-3748) Add consumer-property to console tools consumer (similar to --producer-property)

2016-05-31 Thread Bharat Viswanadham (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated KAFKA-3748:
--
Fix Version/s: (was: 0.9.0.1)
   (was: 0.9.0.0)

> Add consumer-property to console tools consumer (similar to 
> --producer-property)
> 
>
> Key: KAFKA-3748
> URL: https://issues.apache.org/jira/browse/KAFKA-3748
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>  Labels: newbie
>
> Add --consumer-property to the console consumer.
> Creating this task from the comment given in KAFKA-3567.





[jira] [Commented] (KAFKA-3111) java.lang.ArithmeticException: / by zero in ConsumerPerformance

2016-05-31 Thread Vahid Hashemian (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308547#comment-15308547
 ] 

Vahid Hashemian commented on KAFKA-3111:


[~junrao] Could you please take a look at the PR for this JIRA if you still 
think the JIRA is worth a fix? After running some tests I believe the fix 
provided still improves the consumer performance test output. Thanks for your 
feedback in advance.

> java.lang.ArithmeticException: / by zero in ConsumerPerformance
> ---
>
> Key: KAFKA-3111
> URL: https://issues.apache.org/jira/browse/KAFKA-3111
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Vahid Hashemian
>  Labels: newbie
>
> Saw the following error. If there are lots of unconsumed messages, between 
> two iterations of the consumption, the timestamp may not have changed.
> kafka-consumer-perf-test --zookeeper localhost:2181 --topic test --group 
> test-group --threads 1 --show-detailed-stats --reporting-interval 5000
> 2016-01-13 09:12:43:905, 0, 1048576, 35.2856, 238.4186, 37, 250.
> 2016-01-13 09:12:43:916, 0, 1048576, 35.7624, 47.6837, 375000, 50.
> java.lang.ArithmeticException: / by zero
> at 
> kafka.tools.ConsumerPerformance$ConsumerPerfThread.printMessage(ConsumerPerformance.scala:189)
> at 
> kafka.tools.ConsumerPerformance$ConsumerPerfThread.run(ConsumerPerformance.scala:164)
> 2016-01-13 09:12:43:918, 0, 1048576, 36.2393, 0.7117, 38, 7000.
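The root cause above is an elapsed-time interval of zero between two reporting iterations: with many unconsumed messages buffered, consecutive iterations can land on the same millisecond timestamp. A minimal guard, sketched in plain Java with hypothetical names (not the actual ConsumerPerformance code):

```java
public class RateSketch {
    // Compute MB/s over an interval, guarding against a zero elapsed time.
    // The real code divides by the elapsed interval directly, which throws
    // ArithmeticException when two iterations share a timestamp.
    static double mbPerSec(long bytes, long startMs, long endMs) {
        long elapsedMs = endMs - startMs;
        if (elapsedMs <= 0) return 0.0; // skip the interval rather than divide by zero
        return (bytes / (1024.0 * 1024.0)) / (elapsedMs / 1000.0);
    }
}
```

An alternative design is to skip printing the detailed stats line entirely for a zero-length interval, since a rate over zero elapsed time is not meaningful.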





[jira] [Commented] (KAFKA-3758) KStream job fails to recover after Kafka broker stopped

2016-05-31 Thread Greg Fodor (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308306#comment-15308306
 ] 

Greg Fodor commented on KAFKA-3758:
---

No, the kstream job was running across 2 servers, and the kafka cluster was a 3 
node cluster running on separate machines.

> KStream job fails to recover after Kafka broker stopped
> ---
>
> Key: KAFKA-3758
> URL: https://issues.apache.org/jira/browse/KAFKA-3758
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Greg Fodor
>Assignee: Guozhang Wang
> Attachments: muon.log.1.gz
>
>
> We've been doing some testing of a fairly complex KStreams job and under load 
> it seems the job fails to rebalance + recover if we shut down one of the 
> kafka brokers. The test we were running had a 3-node kafka cluster where each 
> topic had at least a replication factor of 2, and we terminated one of the 
> nodes.
> Attached is the full log, the root exception seems to be contention on the 
> lock on the state directory. The job continues to try to recover but throws 
> errors relating to locks over and over. Restarting the job itself resolves 
> the problem.
>  1702 org.apache.kafka.streams.errors.ProcessorStateException: Error while 
> creating the state manager
>  1703 at 
> org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:71)
>  1704 at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:86)
>  1705 at 
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:550)
>  1706 at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:577)
>  1707 at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$000(StreamThread.java:68)
>  1708 at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:123)
>  1709 at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:222)
>  1710 at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$1.onSuccess(AbstractCoordinator.java:232)
>  1711 at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$1.onSuccess(AbstractCoordinator.java:227)
>  1712 at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
>  1713 at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
>  1714 at 
> org.apache.kafka.clients.consumer.internals.RequestFuture$2.onSuccess(RequestFuture.java:182)
>  1715 at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
>  1716 at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
>  1717 at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:436)
>  1718 at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:422)
>  1719 at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:679)
>  1720 at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:658)
>  1721 at 
> org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167)
>  1722 at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
>  1723 at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
>  1724 at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:426)
>  1725 at 
> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:278)
>  1726 at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
>  1727 at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
>  1728 at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
>  1729 at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163)
>  1730 at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:243)
>  1731 at 
> 
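One way to make the recovery loop described above more robust is to retry the state-directory lock with backoff instead of failing task creation outright, since the previous owner may simply not have released it yet. A hypothetical plain-JDK sketch of that pattern (not Streams' actual state-directory locking code):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class StateDirLockSketch {
    // Try to take an exclusive OS-level lock on the state directory's lock
    // file, retrying with a fixed backoff instead of throwing immediately
    // when another thread or process still holds it.
    static FileLock lockWithRetry(Path lockFile, int attempts, long backoffMs)
            throws IOException, InterruptedException {
        FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        for (int i = 0; i < attempts; i++) {
            FileLock lock = channel.tryLock();
            if (lock != null) return lock;   // acquired: caller must release it
            Thread.sleep(backoffMs);         // still held elsewhere: back off and retry
        }
        channel.close();
        return null; // caller decides whether to fail the task or rebalance again
    }
}
```

Restarting the job "resolving" the problem is consistent with this picture: the restart releases whatever stale lock the old process still held.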

[jira] [Commented] (KAFKA-3764) Error processing append operation on partition

2016-05-31 Thread Martin Nowak (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308263#comment-15308263
 ] 

Martin Nowak commented on KAFKA-3764:
-

I didn't configure that, so it was the default (0.10.0).

> Error processing append operation on partition
> --
>
> Key: KAFKA-3764
> URL: https://issues.apache.org/jira/browse/KAFKA-3764
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Martin Nowak
>
> After updating Kafka from 0.9.0.1 to 0.10.0.0 I'm getting plenty of `Error 
> processing append operation on partition` errors. This happens with 
> ruby-kafka as producer and enabled snappy compression.
> {noformat}
> [2016-05-27 20:00:11,074] ERROR [Replica Manager on Broker 2]: Error 
> processing append operation on partition m2m-0 (kafka.server.ReplicaManager)
> kafka.common.KafkaException: 
> at 
> kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:159)
> at 
> kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:85)
> at 
> kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:64)
> at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:56)
> at 
> kafka.message.ByteBufferMessageSet$$anon$2.makeNextOuter(ByteBufferMessageSet.scala:357)
> at 
> kafka.message.ByteBufferMessageSet$$anon$2.makeNext(ByteBufferMessageSet.scala:369)
> at 
> kafka.message.ByteBufferMessageSet$$anon$2.makeNext(ByteBufferMessageSet.scala:324)
> at 
> kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:64)
> at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:56)
> at scala.collection.Iterator$class.foreach(Iterator.scala:893)
> at kafka.utils.IteratorTemplate.foreach(IteratorTemplate.scala:30)
> at 
> kafka.message.ByteBufferMessageSet.validateMessagesAndAssignOffsets(ByteBufferMessageSet.scala:427)
> at kafka.log.Log.liftedTree1$1(Log.scala:339)
> at kafka.log.Log.append(Log.scala:338)
> at kafka.cluster.Partition$$anonfun$11.apply(Partition.scala:443)
> at kafka.cluster.Partition$$anonfun$11.apply(Partition.scala:429)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:231)
> at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:237)
> at kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:429)
> at 
> kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:406)
> at 
> kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:392)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> at 
> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
> at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
> at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
> at 
> scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.AbstractTraversable.map(Traversable.scala:104)
> at 
> kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:392)
> at 
> kafka.server.ReplicaManager.appendMessages(ReplicaManager.scala:328)
> at kafka.server.KafkaApis.handleProducerRequest(KafkaApis.scala:405)
> at kafka.server.KafkaApis.handle(KafkaApis.scala:76)
> at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: failed to read chunk
> at 
> org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:433)
> at 
> org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:167)
> at java.io.DataInputStream.readFully(DataInputStream.java:195)
> at java.io.DataInputStream.readLong(DataInputStream.java:416)
> at 
> kafka.message.ByteBufferMessageSet$$anon$1.readMessageFromStream(ByteBufferMessageSet.scala:118)
> at 
> kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:153)
> {noformat}





Build failed in Jenkins: kafka-trunk-jdk8 #664

2016-05-31 Thread Apache Jenkins Server
See 

Changes:

[me] MINOR: Avoid trace logging computation in

--
[...truncated 3189 lines...]

kafka.log.LogTest > testOpenDeletesObsoleteFiles PASSED

kafka.log.LogTest > testSizeBasedLogRoll PASSED

kafka.log.LogTest > testTimeBasedLogRollJitter PASSED

kafka.log.LogTest > testParseTopicPartitionName PASSED

kafka.log.LogTest > testTruncateTo PASSED

kafka.log.LogTest > testCleanShutdownFile PASSED

kafka.log.LogSegmentTest > testRecoveryWithCorruptMessage PASSED

kafka.log.LogSegmentTest > testRecoveryFixesCorruptIndex PASSED

kafka.log.LogSegmentTest > testReadFromGap PASSED

kafka.log.LogSegmentTest > testTruncate PASSED

kafka.log.LogSegmentTest > testReadBeforeFirstOffset PASSED

kafka.log.LogSegmentTest > testCreateWithInitFileSizeAppendMessage PASSED

kafka.log.LogSegmentTest > testChangeFileSuffixes PASSED

kafka.log.LogSegmentTest > testMaxOffset PASSED

kafka.log.LogSegmentTest > testNextOffsetCalculation PASSED

kafka.log.LogSegmentTest > testReadOnEmptySegment PASSED

kafka.log.LogSegmentTest > testReadAfterLast PASSED

kafka.log.LogSegmentTest > testCreateWithInitFileSizeClearShutdown PASSED

kafka.log.LogSegmentTest > testTruncateFull PASSED

kafka.api.SaslSslEndToEndAuthorizationTest > testNoConsumeAcl PASSED

kafka.api.SaslSslEndToEndAuthorizationTest > testProduceConsume PASSED

kafka.api.SaslSslEndToEndAuthorizationTest > testNoProduceAcl PASSED

kafka.api.SaslSslEndToEndAuthorizationTest > testNoGroupAcl PASSED

kafka.api.SslProducerSendTest > testSendNonCompressedMessageWithCreateTime 
PASSED

kafka.api.SslProducerSendTest > testSendCompressedMessageWithLogAppendTime 
PASSED

kafka.api.SslProducerSendTest > testClose PASSED

kafka.api.SslProducerSendTest > testFlush PASSED

kafka.api.SslProducerSendTest > testSendToPartition PASSED

kafka.api.SslProducerSendTest > testSendOffset PASSED

kafka.api.SslProducerSendTest > testAutoCreateTopic PASSED

kafka.api.SslProducerSendTest > testSendWithInvalidCreateTime PASSED

kafka.api.SslProducerSendTest > testSendCompressedMessageWithCreateTime PASSED

kafka.api.SslProducerSendTest > testCloseWithZeroTimeoutFromCallerThread PASSED

kafka.api.SslProducerSendTest > testCloseWithZeroTimeoutFromSenderThread PASSED

kafka.api.SslProducerSendTest > testSendNonCompressedMessageWithLogApendTime 
PASSED

kafka.api.SslEndToEndAuthorizationTest > testNoConsumeAcl PASSED

kafka.api.SslEndToEndAuthorizationTest > testProduceConsume PASSED

kafka.api.SslEndToEndAuthorizationTest > testNoProduceAcl PASSED

kafka.api.SslEndToEndAuthorizationTest > testNoGroupAcl PASSED

kafka.api.QuotasTest > testProducerConsumerOverrideUnthrottled PASSED

kafka.api.QuotasTest > testThrottledProducerConsumer PASSED

kafka.api.PlaintextProducerSendTest > testSerializerConstructors PASSED

kafka.api.PlaintextProducerSendTest > testWrongSerializer PASSED

kafka.api.PlaintextProducerSendTest > 
testSendNonCompressedMessageWithCreateTime PASSED

kafka.api.PlaintextProducerSendTest > 
testSendCompressedMessageWithLogAppendTime PASSED

kafka.api.PlaintextProducerSendTest > testClose PASSED

kafka.api.PlaintextProducerSendTest > testFlush PASSED

kafka.api.PlaintextProducerSendTest > testSendToPartition PASSED

kafka.api.PlaintextProducerSendTest > testSendOffset PASSED

kafka.api.PlaintextProducerSendTest > testAutoCreateTopic PASSED

kafka.api.PlaintextProducerSendTest > testSendWithInvalidCreateTime PASSED

kafka.api.PlaintextProducerSendTest > testSendCompressedMessageWithCreateTime 
PASSED

kafka.api.PlaintextProducerSendTest > testCloseWithZeroTimeoutFromCallerThread 
PASSED

kafka.api.PlaintextProducerSendTest > testCloseWithZeroTimeoutFromSenderThread 
PASSED

kafka.api.PlaintextProducerSendTest > 
testSendNonCompressedMessageWithLogApendTime PASSED

kafka.api.SaslMultiMechanismConsumerTest > testMultipleBrokerMechanisms PASSED

kafka.api.SaslMultiMechanismConsumerTest > 
testPauseStateNotPreservedByRebalance PASSED

kafka.api.SaslMultiMechanismConsumerTest > testUnsubscribeTopic PASSED

kafka.api.SaslMultiMechanismConsumerTest > testListTopics PASSED

kafka.api.SaslMultiMechanismConsumerTest > testAutoCommitOnRebalance PASSED

kafka.api.SaslMultiMechanismConsumerTest > testSimpleConsumption PASSED

kafka.api.SaslMultiMechanismConsumerTest > testPartitionReassignmentCallback 
PASSED

kafka.api.SaslMultiMechanismConsumerTest > testCommitSpecifiedOffsets PASSED

kafka.api.AdminClientTest > testDescribeGroup PASSED

kafka.api.AdminClientTest > testDescribeConsumerGroup PASSED

kafka.api.AdminClientTest > testListGroups PASSED

kafka.api.AdminClientTest > testDescribeConsumerGroupForNonExistentGroup PASSED

kafka.api.ApiUtilsTest > testShortStringNonASCII PASSED

kafka.api.ApiUtilsTest > testShortStringASCII PASSED

kafka.api.PlaintextConsumerTest > testPartitionsForAutoCreate PASSED

kafka.api.PlaintextConsumerTest > testShrinkingTopicSubscriptions PASSED


[jira] [Commented] (KAFKA-3764) Error processing append operation on partition

2016-05-31 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308174#comment-15308174
 ] 

Ismael Juma commented on KAFKA-3764:


There could be a bug in the conversion, but we have a number of system tests 
and librdkafka and kafka-python have not run into issues yet. What's your 
setting for 
`log.message.format.version`? And did you change that on the broker at some 
point during the upgrade?

> Error processing append operation on partition
> --
>
> Key: KAFKA-3764
> URL: https://issues.apache.org/jira/browse/KAFKA-3764
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Martin Nowak
>
> After updating Kafka from 0.9.0.1 to 0.10.0.0 I'm getting plenty of `Error 
> processing append operation on partition` errors. This happens with 
> ruby-kafka as producer and enabled snappy compression.
> {noformat}
> [2016-05-27 20:00:11,074] ERROR [Replica Manager on Broker 2]: Error 
> processing append operation on partition m2m-0 (kafka.server.ReplicaManager)
> kafka.common.KafkaException: 
> at 
> kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:159)
> at 
> kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:85)
> at 
> kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:64)
> at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:56)
> at 
> kafka.message.ByteBufferMessageSet$$anon$2.makeNextOuter(ByteBufferMessageSet.scala:357)
> at 
> kafka.message.ByteBufferMessageSet$$anon$2.makeNext(ByteBufferMessageSet.scala:369)
> at 
> kafka.message.ByteBufferMessageSet$$anon$2.makeNext(ByteBufferMessageSet.scala:324)
> at 
> kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:64)
> at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:56)
> at scala.collection.Iterator$class.foreach(Iterator.scala:893)
> at kafka.utils.IteratorTemplate.foreach(IteratorTemplate.scala:30)
> at 
> kafka.message.ByteBufferMessageSet.validateMessagesAndAssignOffsets(ByteBufferMessageSet.scala:427)
> at kafka.log.Log.liftedTree1$1(Log.scala:339)
> at kafka.log.Log.append(Log.scala:338)
> at kafka.cluster.Partition$$anonfun$11.apply(Partition.scala:443)
> at kafka.cluster.Partition$$anonfun$11.apply(Partition.scala:429)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:231)
> at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:237)
> at kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:429)
> at 
> kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:406)
> at 
> kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:392)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> at 
> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
> at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
> at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
> at 
> scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.AbstractTraversable.map(Traversable.scala:104)
> at 
> kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:392)
> at 
> kafka.server.ReplicaManager.appendMessages(ReplicaManager.scala:328)
> at kafka.server.KafkaApis.handleProducerRequest(KafkaApis.scala:405)
> at kafka.server.KafkaApis.handle(KafkaApis.scala:76)
> at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: failed to read chunk
> at 
> org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:433)
> at 
> org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:167)
> at java.io.DataInputStream.readFully(DataInputStream.java:195)
> at java.io.DataInputStream.readLong(DataInputStream.java:416)
> at 
> kafka.message.ByteBufferMessageSet$$anon$1.readMessageFromStream(ByteBufferMessageSet.scala:118)
> at 
> kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:153)
> {noformat}





RE: [VOTE] KIP-58 - Make Log Compaction Point Configurable

2016-05-31 Thread Eric Wasserman
During the vote it was suggested the name of the property be changed from:

log.cleaner.compaction.lag.ms

to

log.cleaner.compaction.delay.ms

Note that this feature controls the portion of the head of the log (new
messages go on the head) that will be left un-compacted (i.e. in full detail)
in the event that the log undergoes a compaction. This property is not saying
*when* to do a compaction; rather, it limits how far a compaction may go into
the head. A compaction can still happen to the log, so we are not "delaying"
the compaction itself; rather, we are preventing the compaction from
proceeding all the way to the head of the log.
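The distinction above (the setting bounds how far a cleaning pass may reach into the head, not when cleaning runs) can be sketched with a small model. This is an illustrative toy, not Kafka's implementation; the `compaction_point` helper and the message tuples are invented for the example:

```python
def compaction_point(messages, lag_ms, now_ms):
    """Return the index of the first message that must stay un-compacted.

    `messages` is a list of (offset, timestamp_ms) pairs in offset order.
    Everything at or after the returned index is in the protected head:
    its timestamp is within `lag_ms` of `now_ms`, so a cleaning pass may
    compact only the messages before that index.
    """
    for i, (offset, ts) in enumerate(messages):
        if now_ms - ts < lag_ms:
            return i
    return len(messages)  # the whole log is old enough to compact

now = 1_000_000
log = [(0, now - 5000), (1, now - 3000), (2, now - 500)]
# With a 1s lag, only offsets 0 and 1 are eligible for compaction;
# offset 2 is still inside the protected head.
print(compaction_point(log, lag_ms=1000, now_ms=now))  # 2
```

With a larger lag the returned index moves toward 0, shrinking the compactable region without ever preventing the cleaner itself from running.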

One other consideration:
The original name ("lag") was proposed when non-temporal constraints (size,
number of messages) were also being considered. Those were deferred in the KIP
process. However, the new name ("delay") is (to my ear) purely temporal. In
the event that use cases for other forms of compaction-point configuration do
arise, naming them consistently may be trickier.

To James Cheng's concern:

I just thought of something: what happens to the value of "
> log.cleaner.delete.retention.ms"?
> Does it still have the same meaning as before? Does the timer start when
> log compaction happens
> (as it currently does), so in reality, tombstones will only be removed
> from the log some time
> after (log.cleaner.min.compaction.lag.ms + log.cleaner.delete.retention.ms
> )?


Nothing about tombstone handling changes. The tombstone timer starts when
the tombstone is created (i.e. during compaction). As long as one interprets
"log.cleaner.delete.retention.ms" as the minimum time-to-live of the
tombstones, the meaning of that property doesn't change. The new feature only
indirectly makes tombstone removal later, by virtue of delaying tombstone
creation.
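James's sum above (log.cleaner.min.compaction.lag.ms + log.cleaner.delete.retention.ms) can be written out as a lower bound on tombstone lifetime. A minimal sketch under that interpretation; the `earliest_tombstone_removal` helper is hypothetical:

```python
def earliest_tombstone_removal(key_write_ms, min_compaction_lag_ms,
                               delete_retention_ms):
    """Lower bound on when a tombstone for a key written at `key_write_ms`
    can disappear from the log.

    The earliest a compaction may produce the tombstone is once the record
    has aged past the compaction lag; the tombstone then lives at least
    `delete_retention_ms` from that point.
    """
    earliest_compaction_ms = key_write_ms + min_compaction_lag_ms
    return earliest_compaction_ms + delete_retention_ms

# A key deleted at t=0 with a 24h lag and 24h delete retention cannot
# lose its tombstone before t=48h.
DAY_MS = 24 * 60 * 60 * 1000
print(earliest_tombstone_removal(0, DAY_MS, DAY_MS))  # 172800000
```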


[GitHub] kafka pull request: HOTFIX: remove from inflightResponses on client disconne...

2016-05-31 Thread hachikuji
GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/1452

HOTFIX: remove from inflightResponses on client disconnect to prevent 
memory leak



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka hotfix-processor-memleak

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1452.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1452


commit 75c02b1507495561fc5fb96a6b342d2f08c0e0c6
Author: Jason Gustafson 
Date:   2016-05-27T00:40:33Z

HOTFIX: remove from inflightResponses on client disconnect




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


RE: [VOTE] KIP-58 - Make Log Compaction Point Configurable

2016-05-31 Thread Eric Wasserman
Thanks everyone for voting.
The vote passed with 5 binding +1's (Gwen, Jay, Ewen, Ismael, Jun)
and with 9 non-binding +1's (Becket, James, Tom, Manikumar, Ben, Grant, Joel, Eric)

The discussion should move to KAFKA-1981


Build failed in Jenkins: kafka-trunk-jdk7 #1328

2016-05-31 Thread Apache Jenkins Server
See 

Changes:

[me] MINOR: Avoid trace logging computation in

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H11 (docker Ubuntu ubuntu yahoo-not-h2) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/kafka.git # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/kafka.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/trunk^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/trunk^{commit} # timeout=10
Checking out Revision fed3f1f8890b219e4247fd9de1305ad18679ff99 
(refs/remotes/origin/trunk)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f fed3f1f8890b219e4247fd9de1305ad18679ff99
 > git rev-list 3fd9be49ac35adaca401f58552b3ffa68f8d4eaa # timeout=10
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK_1_7U51_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk-1.7u51
[kafka-trunk-jdk7] $ /bin/bash -xe /tmp/hudson9196532778355904514.sh
+ 
/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2/bin/gradle
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
http://gradle.org/docs/2.4-rc-2/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:downloadWrapper

BUILD SUCCESSFUL

Total time: 20.314 secs
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK_1_7U51_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk-1.7u51
[kafka-trunk-jdk7] $ /bin/bash -xe /tmp/hudson1290362621481760814.sh
+ export GRADLE_OPTS=-Xmx1024m
+ GRADLE_OPTS=-Xmx1024m
+ ./gradlew -Dorg.gradle.project.maxParallelForks=1 clean jarAll testAll
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/2.13/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
Build file ': 
line 231
useAnt has been deprecated and is scheduled to be removed in Gradle 3.0. The 
Ant-Based Scala compiler is deprecated, please see 
https://docs.gradle.org/current/userguide/scala_plugin.html.
:clean UP-TO-DATE
:clients:clean UP-TO-DATE
:connect:clean UP-TO-DATE
:core:clean
:examples:clean UP-TO-DATE
:log4j-appender:clean UP-TO-DATE
:streams:clean UP-TO-DATE
:tools:clean UP-TO-DATE
:connect:api:clean UP-TO-DATE
:connect:file:clean UP-TO-DATE
:connect:json:clean UP-TO-DATE
:connect:runtime:clean UP-TO-DATE
:streams:examples:clean UP-TO-DATE
:jar_core_2_10
Building project 'core' with Scala version 2.10.6
:kafka-trunk-jdk7:clients:compileJava
:jar_core_2_10 FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Failed to capture snapshot of input files for task 'compileJava' during 
up-to-date check.
> Could not add entry 
> '/home/jenkins/.gradle/caches/modules-2/files-2.1/net.jpountz.lz4/lz4/1.3.0/c708bb2590c0652a642236ef45d9f99ff842a2ce/lz4-1.3.0.jar'
>  to cache fileHashes.bin 
> (

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug 
option to get more log output.

BUILD FAILED

Total time: 20.196 secs
Build step 'Execute shell' marked build as failure
Recording test results
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK_1_7U51_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk-1.7u51
ERROR: Step 'Publish JUnit test result report' failed: No test report files
were found. Configuration error?
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK_1_7U51_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk-1.7u51


[jira] [Commented] (KAFKA-3758) KStream job fails to recover after Kafka broker stopped

2016-05-31 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308047#comment-15308047
 ] 

Guozhang Wang commented on KAFKA-3758:
--

I think there may be an issue with directory locking that results in both this
and KAFKA-3752.
I will investigate further and update these tickets later.

One question: are your Kafka Streams processes running on the same node as the
Kafka servers, i.e. if you shut down a Kafka broker, does that also kill a
running Kafka Streams process?


> KStream job fails to recover after Kafka broker stopped
> ---
>
> Key: KAFKA-3758
> URL: https://issues.apache.org/jira/browse/KAFKA-3758
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Greg Fodor
>Assignee: Guozhang Wang
> Attachments: muon.log.1.gz
>
>
> We've been doing some testing of a fairly complex KStreams job and under load 
> it seems the job fails to rebalance + recover if we shut down one of the 
> kafka brokers. The test we were running had a 3-node kafka cluster where each 
> topic had at least a replication factor of 2, and we terminated one of the 
> nodes.
> Attached is the full log, the root exception seems to be contention on the 
> lock on the state directory. The job continues to try to recover but throws 
> errors relating to locks over and over. Restarting the job itself resolves 
> the problem.
>  1702 org.apache.kafka.streams.errors.ProcessorStateException: Error while 
> creating the state manager
>  1703 at 
> org.apache.kafka.streams.processor.internals.AbstractTask.<init>(AbstractTask.java:71)
>  1704 at 
> org.apache.kafka.streams.processor.internals.StreamTask.<init>(StreamTask.java:86)
>  1705 at 
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:550)
>  1706 at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:577)
>  1707 at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$000(StreamThread.java:68)
>  1708 at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:123)
>  1709 at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:222)
>  1710 at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$1.onSuccess(AbstractCoordinator.java:232)
>  1711 at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$1.onSuccess(AbstractCoordinator.java:227)
>  1712 at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
>  1713 at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
>  1714 at 
> org.apache.kafka.clients.consumer.internals.RequestFuture$2.onSuccess(RequestFuture.java:182)
>  1715 at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
>  1716 at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
>  1717 at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:436)
>  1718 at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:422)
>  1719 at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:679)
>  1720 at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:658)
>  1721 at 
> org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167)
>  1722 at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
>  1723 at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
>  1724 at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:426)
>  1725 at 
> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:278)
>  1726 at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
>  1727 at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
>  1728 at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
>  1729 at 
> 

[GitHub] kafka pull request: MINOR: Avoid trace logging computation in `checkEnoughRe...

2016-05-31 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1449




[jira] [Commented] (KAFKA-3764) Error processing append operation on partition

2016-05-31 Thread Martin Nowak (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307964#comment-15307964
 ] 

Martin Nowak commented on KAFKA-3764:
-

The client driver is still on the 0.9.0 protocol, but from the upgrade guide I
read that brokers translate messages to/from older clients.
Once I had some 0.10.0 messages in the topic, b/c some of the produce requests
did work, a ruby-kafka consumer was receiving 0.10.0 messages. So maybe the
translation for older clients is somehow flaky?

> Error processing append operation on partition
> --
>
> Key: KAFKA-3764
> URL: https://issues.apache.org/jira/browse/KAFKA-3764
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Martin Nowak
>
> After updating Kafka from 0.9.0.1 to 0.10.0.0 I'm getting plenty of `Error 
> processing append operation on partition` errors. This happens with 
> ruby-kafka as producer and enabled snappy compression.





[jira] [Commented] (KAFKA-3764) Error processing append operation on partition

2016-05-31 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307936#comment-15307936
 ] 

Ismael Juma commented on KAFKA-3764:


Yes, it's definitely useful to have the bug report. It is a bit hard to 
diagnose the problem from the information provided. A lot of things changed on 
the server side as you can see from the upgrade notes you linked to. I was 
trying to figure out if this is an issue that's specific to ruby-kafka or your 
workload.

With regards to what the client could be doing wrong, if the produce request 
version was wrong, it could lead to an error like that one.

> Error processing append operation on partition
> --
>
> Key: KAFKA-3764
> URL: https://issues.apache.org/jira/browse/KAFKA-3764
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Martin Nowak
>
> After updating Kafka from 0.9.0.1 to 0.10.0.0 I'm getting plenty of `Error 
> processing append operation on partition` errors. This happens with 
> ruby-kafka as producer and enabled snappy compression.

[jira] [Commented] (KAFKA-3693) Race condition between highwatermark-checkpoint thread and handleLeaderAndIsrRequest at broker start-up

2016-05-31 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307932#comment-15307932
 ] 

Maysam Yabandeh commented on KAFKA-3693:


[~junrao] Any thoughts?

> Race condition between highwatermark-checkpoint thread and 
> handleLeaderAndIsrRequest at broker start-up
> ---
>
> Key: KAFKA-3693
> URL: https://issues.apache.org/jira/browse/KAFKA-3693
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.1
>Reporter: Maysam Yabandeh
>
> Upon broker start-up, a race between the highwatermark-checkpoint thread
> writing the replication-offset-checkpoint file and the
> handleLeaderAndIsrRequest thread reading from it causes the high watermark
> for some partitions to be reset to 0. In the good case, this results in the
> replica truncating its entire log to 0 and hence fetching terabytes of data
> from the lead broker, which sometimes leads to hours of downtime. We observed
> bad cases where the reset offset propagated to the
> recovery-point-offset-checkpoint file, making a lead broker truncate its log.
> This seems to have the potential for data loss if the truncation happens at
> both follower and leader brokers.
> This is the particular faulty scenario manifested in our tests:
> # The broker restarts and receive LeaderAndIsr from the controller
> # LeaderAndIsr message however does not contain all the partitions (probably 
> because other brokers were churning at the same time)
> # becomeLeaderOrFollower calls getOrCreatePartition and updates the 
> allPartitions with the partitions included in the LeaderAndIsr message {code}
>   def getOrCreatePartition(topic: String, partitionId: Int): Partition = {
> var partition = allPartitions.get((topic, partitionId))
> if (partition == null) {
>   allPartitions.putIfNotExists((topic, partitionId), new Partition(topic, 
> partitionId, time, this))
> {code}
> # replication-offset-checkpoint jumps in, taking a snapshot of (the partial) 
> allReplicas' high watermark into replication-offset-checkpoint file {code}  
> def checkpointHighWatermarks() {
> val replicas = 
> allPartitions.values.map(_.getReplica(config.brokerId)).collect{case 
> Some(replica) => replica}{code} hence rewriting the previous highwatermarks.
> # Later becomeLeaderOrFollower calls makeLeaders and makeFollowers which read 
> the (now partial) file through Partition::getOrCreateReplica {code}
>   val checkpoint = 
> replicaManager.highWatermarkCheckpoints(log.dir.getParentFile.getAbsolutePath)
>   val offsetMap = checkpoint.read
>   if (!offsetMap.contains(TopicAndPartition(topic, partitionId)))
> info("No checkpointed highwatermark is found for partition 
> [%s,%d]".format(topic, partitionId))
> {code}
> We are not entirely sure whether the initial LeaderAndIsr message including a 
> subset of partitions is critical in making this race condition manifest or 
> not. But it is an important detail, since it clarifies that a solution based 
> on not letting the highwatermark-checkpoint thread jump in the middle of 
> processing a LeaderAndIsr message would not suffice.
> The solution we are thinking of is to force initializing allPartitions by the 
> partitions listed in the replication-offset-checkpoint (and perhaps 
> recovery-point-offset-checkpoint file too) when a server starts.
> Thoughts?
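The faulty sequence in the description can be reduced to a deterministic sketch. The class and map names below are illustrative stand-ins for the broker's `allPartitions` map and checkpoint file, not the real Kafka code:

```java
import java.util.HashMap;
import java.util.Map;

// Deterministic replay of the KAFKA-3693 race. Names are illustrative
// stand-ins, not the real broker classes.
public class CheckpointRace {

    // Replays the faulty sequence and returns the high watermark that
    // getOrCreateReplica would recover for the partition left out of the
    // partial LeaderAndIsr request.
    public static long recoveredHighWatermark() {
        // replication-offset-checkpoint as it existed before the restart
        Map<String, Long> checkpointFile = new HashMap<>();
        checkpointFile.put("t-0", 500L);
        checkpointFile.put("t-1", 700L);
        checkpointFile.put("t-2", 900L);

        // becomeLeaderOrFollower registers only the partitions named in
        // the (partial) LeaderAndIsr request
        Map<String, Long> allPartitions = new HashMap<>();
        allPartitions.put("t-0", checkpointFile.get("t-0"));
        allPartitions.put("t-1", checkpointFile.get("t-1"));
        // "t-2" was missing from the request, so it is never registered

        // checkpointHighWatermarks() jumps in and rewrites the file from
        // the partial map -- "t-2" silently disappears from the checkpoint
        checkpointFile.clear();
        checkpointFile.putAll(allPartitions);

        // Later, getOrCreateReplica reads the file back; a missing entry
        // defaults the high watermark to 0, triggering a full truncation
        return checkpointFile.getOrDefault("t-2", 0L);
    }

    public static void main(String[] args) {
        // prints 0: the 900L checkpoint for t-2 has been lost
        System.out.println("recovered high watermark for t-2: "
                + recoveredHighWatermark());
    }
}
```

This also shows why the proposed fix (seeding `allPartitions` from the checkpoint files at start-up) works: the rewrite step would then copy all three entries, not two.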



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3772) MirrorMaker crashes on Corrupted Message

2016-05-31 Thread James Ranson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Ranson updated KAFKA-3772:

Description: 
We recently came across an issue where a message on our source kafka cluster 
became corrupted. When MirrorMaker tried to consume this message, the thread 
crashed and caused the entire process to also crash. Each time we attempted to 
restart MM, it crashed on the same message. There is no information in the MM 
logs about which message it was trying to consume (what topic, what offset, 
etc). So the only way we were able to get past the issue was to go into the 
zookeeper tree for our mirror consumer group and increment the offset for every 
partition on every topic until the MM process could start without crashing. 
This is not a tenable operational solution. MirrorMaker should gracefully skip 
corrupt messages since they will never be able to be replicated anyway.

{noformat}2016-05-26 20:02:26,787 FATAL  MirrorMaker$MirrorMakerThread - [{}] 
[mirrormaker-thread-3] Mirror maker thread failure due to
kafka.message.InvalidMessageException: Message is corrupt (stored crc = 
33747148, computed crc = 3550736267)
at kafka.message.Message.ensureValid(Message.scala:167)
at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:101)
at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:33)
at 
kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
at 
kafka.tools.MirrorMaker$MirrorMakerOldConsumer.hasData(MirrorMaker.scala:483)
at kafka.tools.MirrorMaker$MirrorMakerThread.run(MirrorMaker.scala:394)

2016-05-26 20:02:27,580 FATAL  MirrorMaker$MirrorMakerThread - [{}] 
[mirrormaker-thread-3] Mirror maker thread exited abnormally, stopping the 
whole mirror maker.{noformat}

  was:
We recently came across an issue where a message on our source kafka cluster 
became corrupted. When MirrorMaker tried to consume this message, the thread 
crashed and caused the entire process to also crash. Each time we attempted to 
restart MM, it crashed on the same message. There is no information in the MM 
logs about which message it was trying to consume (what topic, what offset, 
etc). So the only way we were able to get past the issue was to go into the 
zookeeper tree for our mirror consumer group and increment the offset for every 
partition on every topic until the MM process could start without crashing. 
This is not a tenable operational solution. MirrorMaker should gracefully skip 
corrupt messages since they will never be able to be replicated anyway.

```2016-05-26 20:02:26,787 FATAL  MirrorMaker$MirrorMakerThread - [{}] 
[mirrormaker-thread-3] Mirror maker thread failure due to
kafka.message.InvalidMessageException: Message is corrupt (stored crc = 
33747148, computed crc = 3550736267)
at kafka.message.Message.ensureValid(Message.scala:167)
at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:101)
at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:33)
at 
kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
at 
kafka.tools.MirrorMaker$MirrorMakerOldConsumer.hasData(MirrorMaker.scala:483)
at kafka.tools.MirrorMaker$MirrorMakerThread.run(MirrorMaker.scala:394)

2016-05-26 20:02:27,580 FATAL  MirrorMaker$MirrorMakerThread - [{}] 
[mirrormaker-thread-3] Mirror maker thread exited abnormally, stopping the 
whole mirror maker.```
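The graceful-skip behaviour the ticket asks for can be sketched as follows. `Msg` and its CRC layout are simplified stand-ins, not the real `kafka.message.Message` wire format; the point is only that a CRC mismatch is logged and skipped rather than allowed to kill the thread:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;
import java.nio.charset.StandardCharsets;

// Hedged sketch of "skip corrupt messages instead of crashing".
// Msg is a toy stand-in for a Kafka message, not the real format.
public class SkipCorrupt {
    public record Msg(long storedCrc, byte[] payload) {}

    public static long crc(byte[] payload) {
        CRC32 c = new CRC32();
        c.update(payload);
        return c.getValue();
    }

    // Mirror loop: validate each message; log and skip corrupt ones
    // instead of letting an InvalidMessageException stop the whole batch.
    public static List<String> mirror(List<Msg> in) {
        List<String> out = new ArrayList<>();
        for (Msg m : in) {
            long computed = crc(m.payload());
            if (computed != m.storedCrc()) {
                System.err.println("skipping corrupt message (stored crc="
                        + m.storedCrc() + ", computed crc=" + computed + ")");
                continue; // graceful skip, as the ticket proposes
            }
            out.add(new String(m.payload(), StandardCharsets.UTF_8));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] good = "ok".getBytes(StandardCharsets.UTF_8);
        List<Msg> batch = List.of(
                new Msg(crc(good), good),
                new Msg(33747148L, "corrupt".getBytes(StandardCharsets.UTF_8)),
                new Msg(crc(good), good));
        // the corrupt middle message is dropped, the two good ones survive
        System.out.println(mirror(batch));
    }
}
```

Logging the topic, partition, and offset of the skipped message at this point would also address the complaint that the MM logs give no clue which message was at fault.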


> MirrorMaker crashes on Corrupted Message
> 
>
> Key: KAFKA-3772
> URL: https://issues.apache.org/jira/browse/KAFKA-3772
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.1
>Reporter: James Ranson
>  Labels: mirror-maker
>
> We recently came across an issue where a message on our source kafka cluster 
> became corrupted. When MirrorMaker tried to consume this message, the thread 
> crashed and caused the entire process to also crash. Each time we attempted 
> to restart MM, it crashed on the same message. There is no information in the 
> MM logs about which message it was trying to consume (what topic, what 
> offset, etc). So the only way we were able to get past the issue was to go 
> into the zookeeper tree for our mirror consumer group and increment the 
> offset for every partition on every topic until the MM process could start 
> without crashing. This is not a tenable operational solution. MirrorMaker 
> should gracefully skip corrupt messages since they will never be able to be 
> replicated anyway.
> {noformat}2016-05-26 20:02:26,787 FATAL  MirrorMaker$MirrorMakerThread - [{}] 
> 

[jira] [Created] (KAFKA-3772) MirrorMaker crashes on Corrupted Message

2016-05-31 Thread James Ranson (JIRA)
James Ranson created KAFKA-3772:
---

 Summary: MirrorMaker crashes on Corrupted Message
 Key: KAFKA-3772
 URL: https://issues.apache.org/jira/browse/KAFKA-3772
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.9.0.1
Reporter: James Ranson


We recently came across an issue where a message on our source kafka cluster 
became corrupted. When MirrorMaker tried to consume this message, the thread 
crashed and caused the entire process to also crash. Each time we attempted to 
restart MM, it crashed on the same message. There is no information in the MM 
logs about which message it was trying to consume (what topic, what offset, 
etc). So the only way we were able to get past the issue was to go into the 
zookeeper tree for our mirror consumer group and increment the offset for every 
partition on every topic until the MM process could start without crashing. 
This is not a tenable operational solution. MirrorMaker should gracefully skip 
corrupt messages since they will never be able to be replicated anyway.

```2016-05-26 20:02:26,787 FATAL  MirrorMaker$MirrorMakerThread - [{}] 
[mirrormaker-thread-3] Mirror maker thread failure due to
kafka.message.InvalidMessageException: Message is corrupt (stored crc = 
33747148, computed crc = 3550736267)
at kafka.message.Message.ensureValid(Message.scala:167)
at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:101)
at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:33)
at 
kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
at 
kafka.tools.MirrorMaker$MirrorMakerOldConsumer.hasData(MirrorMaker.scala:483)
at kafka.tools.MirrorMaker$MirrorMakerThread.run(MirrorMaker.scala:394)

2016-05-26 20:02:27,580 FATAL  MirrorMaker$MirrorMakerThread - [{}] 
[mirrormaker-thread-3] Mirror maker thread exited abnormally, stopping the 
whole mirror maker.```



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Garbage free Kafka client

2016-05-31 Thread Mikael Ståldal
Hi.

I am committer on the Apache Log4j 2 project. We are working on providing
garbage-free logging, as described here:
http://logging.apache.org/log4j/2.x/manual/garbagefree.html

I work on the Kafka appender in Log4j 2:
http://logging.apache.org/log4j/2.x/manual/appenders.html#KafkaAppender
https://github.com/apache/logging-log4j2/tree/master/log4j-core/src/main/java/org/apache/logging/log4j/core/appender/mom/kafka

I am now looking into whether it is feasible to support garbage-free operation
with the Kafka appender.

This would require some changes in the Kafka client library. The first step
would be to make it possible to send a message without creating a new
ProducerRecord object each time. Maybe an additional "low-level" send
method can be added to the Producer interface with unrolled arguments, like
this:

/**
 * Send the given record asynchronously and return a future which will
eventually contain the response information.
 *
 * @param topic The topic the record will be appended to
 * @param key The key that will be included in the record, or
{@literal null} for no key
 * @param value The record contents
 *
 * @return A future which will eventually contain the response information
 */
public Future<RecordMetadata> send(String topic, byte[] key, byte[] value);

Perhaps it would be good to bypass key and value serializers in this case.
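A toy version of the proposal, to show how a caller avoids per-message allocation. `LowLevelProducer` and `InMemoryProducer` are hypothetical names for illustration; the real change would extend `org.apache.kafka.clients.producer.Producer`, and the real return type would be `Future<RecordMetadata>` (a `Long` offset is used here as a stand-in):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

// Sketch of the proposed "unrolled" send overload. All names here are
// hypothetical illustrations, not existing Kafka client API.
public class GarbageFreeSend {

    public interface LowLevelProducer {
        // No ProducerRecord allocation on the caller's hot path.
        Future<Long> send(String topic, byte[] key, byte[] value);
    }

    // Toy implementation that just acknowledges with a fake offset.
    public static class InMemoryProducer implements LowLevelProducer {
        private long nextOffset = 0;

        @Override
        public CompletableFuture<Long> send(String topic, byte[] key, byte[] value) {
            return CompletableFuture.completedFuture(nextOffset++);
        }
    }

    // A garbage-free caller can reuse the same byte[] across calls,
    // since no ProducerRecord wrapper is created per message.
    public static long[] demo() {
        InMemoryProducer p = new InMemoryProducer();
        byte[] value = {1, 2, 3};
        long first = p.send("log4j", null, value).join();
        long second = p.send("log4j", null, value).join();
        return new long[]{first, second};
    }

    public static void main(String[] args) {
        long[] offsets = demo();
        System.out.println(offsets[0] + " " + offsets[1]); // prints "0 1"
    }
}
```

Bypassing the key/value serializers, as suggested above, fits the same pattern: the `byte[]` arguments are already serialized, so no serializer-side allocation occurs either.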

-- 

*Mikael Ståldal*
Senior software developer

*Magine TV*
mikael.stal...@magine.com
Grev Turegatan 3  | 114 46 Stockholm, Sweden  |   www.magine.com



[jira] [Commented] (KAFKA-3764) Error processing append operation on partition

2016-05-31 Thread Martin Nowak (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307885#comment-15307885
 ] 

Martin Nowak commented on KAFKA-3764:
-

Well, this bug report is mostly about the fact that a producer that worked with 
0.9.0.1 breaks when upgrading to 0.10.0.0. Whether this is an issue in Kafka or 
in the client isn't that interesting, but something certainly changed on the 
server side.
Might be interesting for http://kafka.apache.org/documentation.html#upgrade_10.

I'll try to debug this in more detail in the next few days. Any gut feeling 
about what the client might be doing wrong?


> Error processing append operation on partition
> --
>
> Key: KAFKA-3764
> URL: https://issues.apache.org/jira/browse/KAFKA-3764
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Martin Nowak
>
> After updating Kafka from 0.9.0.1 to 0.10.0.0, I'm getting plenty of `Error 
> processing append operation on partition` errors. This happens with 
> ruby-kafka as the producer and snappy compression enabled.
> {noformat}
> [2016-05-27 20:00:11,074] ERROR [Replica Manager on Broker 2]: Error 
> processing append operation on partition m2m-0 (kafka.server.ReplicaManager)
> kafka.common.KafkaException: 
> at 
> kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:159)
> at 
> kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:85)
> at 
> kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:64)
> at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:56)
> at 
> kafka.message.ByteBufferMessageSet$$anon$2.makeNextOuter(ByteBufferMessageSet.scala:357)
> at 
> kafka.message.ByteBufferMessageSet$$anon$2.makeNext(ByteBufferMessageSet.scala:369)
> at 
> kafka.message.ByteBufferMessageSet$$anon$2.makeNext(ByteBufferMessageSet.scala:324)
> at 
> kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:64)
> at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:56)
> at scala.collection.Iterator$class.foreach(Iterator.scala:893)
> at kafka.utils.IteratorTemplate.foreach(IteratorTemplate.scala:30)
> at 
> kafka.message.ByteBufferMessageSet.validateMessagesAndAssignOffsets(ByteBufferMessageSet.scala:427)
> at kafka.log.Log.liftedTree1$1(Log.scala:339)
> at kafka.log.Log.append(Log.scala:338)
> at kafka.cluster.Partition$$anonfun$11.apply(Partition.scala:443)
> at kafka.cluster.Partition$$anonfun$11.apply(Partition.scala:429)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:231)
> at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:237)
> at kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:429)
> at 
> kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:406)
> at 
> kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:392)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> at 
> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
> at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
> at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
> at 
> scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.AbstractTraversable.map(Traversable.scala:104)
> at 
> kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:392)
> at 
> kafka.server.ReplicaManager.appendMessages(ReplicaManager.scala:328)
> at kafka.server.KafkaApis.handleProducerRequest(KafkaApis.scala:405)
> at kafka.server.KafkaApis.handle(KafkaApis.scala:76)
> at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: failed to read chunk
> at 
> org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:433)
> at 
> org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:167)
> at java.io.DataInputStream.readFully(DataInputStream.java:195)
> at java.io.DataInputStream.readLong(DataInputStream.java:416)
> at 
> kafka.message.ByteBufferMessageSet$$anon$1.readMessageFromStream(ByteBufferMessageSet.scala:118)
> at 
> 

[jira] [Commented] (KAFKA-2318) replica manager repeatedly tries to fetch from partitions already moved during controlled shutdown

2016-05-31 Thread Wanli Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307514#comment-15307514
 ] 

Wanli Tian commented on KAFKA-2318:
---

same issue

> replica manager repeatedly tries to fetch from partitions already moved 
> during controlled shutdown
> --
>
> Key: KAFKA-2318
> URL: https://issues.apache.org/jira/browse/KAFKA-2318
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Rosenberg
>
> Using version 0.8.2.1.
> During a controlled shutdown, it seems like the left-hand is often not 
> talking to the right :)
> In this case, we see the ReplicaManager remove a fetcher for a partition, 
> truncate its log, and then apparently try to fetch data from that partition 
> repeatedly, spamming the log with "failed due to Leader not local for 
> partition" warnings.
> Below is a snippet (in this case it happened for partition 
> '__consumer_offsets,7' and '__consumer_offsets,47').  It went on for quite a 
> bit longer than included here.  The current broker is '99' here.
> {code}
> 2015-07-07 18:54:26,415  INFO [kafka-request-handler-0] 
> server.ReplicaFetcherManager - [ReplicaFetcherManager on broker 99] Removed 
> fetcher for partitions [__consumer_offsets,7]
> 2015-07-07 18:54:26,415  INFO [kafka-request-handler-0] log.Log - Truncating 
> log __consumer_offsets-7 to offset 0.
> 2015-07-07 18:54:26,421  WARN [kafka-request-handler-3] server.ReplicaManager 
> - [Replica Manager on Broker 99]: Fetch request with correlation id 6832556 
> from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] 
> failed due to Leader not local for partition [__consumer_offsets,7] on broker 
> 99
> 2015-07-07 18:54:26,429  WARN [kafka-request-handler-4] server.ReplicaManager 
> - [Replica Manager on Broker 99]: Fetch request with correlation id 4345717 
> from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] 
> failed due to Leader not local for partition [__consumer_offsets,7] on broker 
> 99
> 2015-07-07 18:54:26,430  WARN [kafka-request-handler-2] server.ReplicaManager 
> - [Replica Manager on Broker 99]: Fetch request with correlation id 4345718 
> from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] 
> failed due to Leader not local for partition [__consumer_offsets,7] on broker 
> 99
> 2015-07-07 18:54:26,431  WARN [kafka-request-handler-4] server.ReplicaManager 
> - [Replica Manager on Broker 99]: Fetch request with correlation id 4345719 
> from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] 
> failed due to Leader not local for partition [__consumer_offsets,7] on broker 
> 99
> 2015-07-07 18:54:26,432  WARN [kafka-request-handler-5] server.ReplicaManager 
> - [Replica Manager on Broker 99]: Fetch request with correlation id 4345720 
> from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] 
> failed due to Leader not local for partition [__consumer_offsets,7] on broker 
> 99
> 2015-07-07 18:54:26,433  WARN [kafka-request-handler-2] server.ReplicaManager 
> - [Replica Manager on Broker 99]: Fetch request with correlation id 4345721 
> from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] 
> failed due to Leader not local for partition [__consumer_offsets,7] on broker 
> 99
> 2015-07-07 18:54:26,434  WARN [kafka-request-handler-3] server.ReplicaManager 
> - [Replica Manager on Broker 99]: Fetch request with correlation id 4345722 
> from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] 
> failed due to Leader not local for partition [__consumer_offsets,7] on broker 
> 99
> 2015-07-07 18:54:26,436  WARN [kafka-request-handler-1] server.ReplicaManager 
> - [Replica Manager on Broker 99]: Fetch request with correlation id 4345723 
> from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] 
> failed due to Leader not local for partition [__consumer_offsets,7] on broker 
> 99
> 2015-07-07 18:54:26,437  WARN [kafka-request-handler-2] server.ReplicaManager 
> - [Replica Manager on Broker 99]: Fetch request with correlation id 4345724 
> from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] 
> failed due to Leader not local for partition [__consumer_offsets,7] on broker 
> 99
> 2015-07-07 18:54:26,438  WARN [kafka-request-handler-7] server.ReplicaManager 
> - [Replica Manager on Broker 99]: Fetch request with correlation id 4345725 
> from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] 
> failed due to Leader not local for partition [__consumer_offsets,7] on broker 
> 99
> 2015-07-07 18:54:26,438  INFO [kafka-request-handler-6] 
> server.ReplicaFetcherManager - [ReplicaFetcherManager on broker 99] Removed 
> fetcher for partitions [__consumer_offsets,47]
> 2015-07-07 18:54:26,438  

[jira] [Commented] (KAFKA-2081) testUncleanLeaderElectionEnabledByTopicOverride transient failure

2016-05-31 Thread Balint Molnar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307492#comment-15307492
 ] 

Balint Molnar commented on KAFKA-2081:
--

[~junrao] I think this one is not happening any more; I cannot reproduce it 
on my local machine.

> testUncleanLeaderElectionEnabledByTopicOverride transient failure
> -
>
> Key: KAFKA-2081
> URL: https://issues.apache.org/jira/browse/KAFKA-2081
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>  Labels: transient-unit-test-failure
>
> Saw the following failure.
> kafka.integration.UncleanLeaderElectionTest > 
> testUncleanLeaderElectionEnabledByTopicOverride FAILED
> junit.framework.AssertionFailedError: expected: but 
> was:
> at junit.framework.Assert.fail(Assert.java:47)
> at junit.framework.Assert.failNotEquals(Assert.java:277)
> at junit.framework.Assert.assertEquals(Assert.java:64)
> at junit.framework.Assert.assertEquals(Assert.java:71)
> at 
> kafka.integration.UncleanLeaderElectionTest.verifyUncleanLeaderElectionEnabled(UncleanLeaderElectionTest.scala:179)
> at 
> kafka.integration.UncleanLeaderElectionTest.testUncleanLeaderElectionEnabledByTopicOverride(UncleanLeaderElectionTest.scala:135)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1573) Transient test failures on LogTest.testCorruptLog

2016-05-31 Thread Balint Molnar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307481#comment-15307481
 ] 

Balint Molnar commented on KAFKA-1573:
--

[~gwenshap] I think this one is not failing any more; I tried to reproduce it 
on my local machine, but it always passed.

> Transient test failures on LogTest.testCorruptLog
> -
>
> Key: KAFKA-1573
> URL: https://issues.apache.org/jira/browse/KAFKA-1573
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Guozhang Wang
>  Labels: transient-unit-test-failure
> Fix For: 0.10.1.0
>
>
> Here is an example of the test failure trace:
> junit.framework.AssertionFailedError: expected:<87> but was:<68>
>   at junit.framework.Assert.fail(Assert.java:47)
>   at junit.framework.Assert.failNotEquals(Assert.java:277)
>   at junit.framework.Assert.assertEquals(Assert.java:64)
>   at junit.framework.Assert.assertEquals(Assert.java:130)
>   at junit.framework.Assert.assertEquals(Assert.java:136)
>   at 
> kafka.log.LogTest$$anonfun$testCorruptLog$1.apply$mcVI$sp(LogTest.scala:615)
>   at 
> scala.collection.immutable.Range$ByOne$class.foreach$mVc$sp(Range.scala:282)
>   at 
> scala.collection.immutable.Range$$anon$2.foreach$mVc$sp(Range.scala:265)
>   at kafka.log.LogTest.testCorruptLog(LogTest.scala:595)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.junit.internal.runners.TestMethodRunner.executeMethodBody(TestMethodRunner.java:99)
>   at 
> org.junit.internal.runners.TestMethodRunner.runUnprotected(TestMethodRunner.java:81)
>   at 
> org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
>   at 
> org.junit.internal.runners.TestMethodRunner.runMethod(TestMethodRunner.java:75)
>   at 
> org.junit.internal.runners.TestMethodRunner.run(TestMethodRunner.java:45)
>   at 
> org.junit.internal.runners.TestClassMethodsRunner.invokeTestMethod(TestClassMethodsRunner.java:71)
>   at 
> org.junit.internal.runners.TestClassMethodsRunner.run(TestClassMethodsRunner.java:35)
>   at 
> org.junit.internal.runners.TestClassRunner$1.runUnprotected(TestClassRunner.java:42)
>   at 
> org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
>   at 
> org.junit.internal.runners.TestClassRunner.run(TestClassRunner.java:52)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:80)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:47)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:69)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:49)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.messaging.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at $Proxy2.processTestClass(Unknown Source)
>   at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:103)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.messaging.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:355)
>   at 
> 

[jira] [Commented] (KAFKA-1534) transient unit test failure in testBasicPreferredReplicaElection

2016-05-31 Thread Balint Molnar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307435#comment-15307435
 ] 

Balint Molnar commented on KAFKA-1534:
--

[~nehanarkhede] I think this issue is already fixed, because I cannot reproduce 
it on my machine.

> transient unit test failure in testBasicPreferredReplicaElection
> 
>
> Key: KAFKA-1534
> URL: https://issues.apache.org/jira/browse/KAFKA-1534
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 0.8.2.0
>Reporter: Jun Rao
>Assignee: Abhishek Sharma
>  Labels: newbie, transient-unit-test-failure
>
> Saw the following transient failure. 
> kafka.admin.AdminTest > testBasicPreferredReplicaElection FAILED
> junit.framework.AssertionFailedError: Timing out after 5000 ms since 
> leader is not elected or changed for partition [test,1]
> at junit.framework.Assert.fail(Assert.java:47)
> at 
> kafka.utils.TestUtils$.waitUntilLeaderIsElectedOrChanged(TestUtils.scala:542)
> at 
> kafka.admin.AdminTest.testBasicPreferredReplicaElection(AdminTest.scala:310)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)