[GitHub] kafka pull request: Automatically expire RecordBatches before drai...

2016-03-29 Thread SoyeeDst
Github user SoyeeDst closed the pull request at:

https://github.com/apache/kafka/pull/1154


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3205) Error in I/O with host (java.io.EOFException) raised in producer

2016-03-29 Thread Jonathan Bond (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217352#comment-15217352
 ] 

Jonathan Bond commented on KAFKA-3205:
--

Our logs are being flooded with this message from the producer since we updated 
the broker to 0.9.  I was hoping somebody could review this PR to see if it is 
an acceptable solution to the problem.

> Error in I/O with host (java.io.EOFException) raised in producer
> 
>
> Key: KAFKA-3205
> URL: https://issues.apache.org/jira/browse/KAFKA-3205
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8.2.1, 0.9.0.0
>Reporter: Jonathan Raffre
>
> In a situation with a Kafka broker on 0.9 and producers still on 0.8.2.x, 
> producers seem to raise the following error after a variable amount of time 
> after startup:
> {noformat}
> 2016-01-29 14:33:13,066 WARN [] o.a.k.c.n.Selector: Error in I/O with 
> 172.22.2.170
> java.io.EOFException: null
> at 
> org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62)
>  ~[org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
> at org.apache.kafka.common.network.Selector.poll(Selector.java:248) 
> ~[org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
> at 
> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) 
> [org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
> at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) 
> [org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
> at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) 
> [org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66-internal]
> {noformat}
> This can be reproduced successfully by doing the following :
>  * Start a 0.8.2 producer connected to the 0.9 broker
>  * Wait 15 minutes, exactly
>  * See the error in the producer logs.
> Oddly, this also shows up in an active producer but after 10 minutes of 
> activity.
> Kafka's server.properties :
> {noformat}
> broker.id=1
> listeners=PLAINTEXT://:9092
> port=9092
> num.network.threads=2
> num.io.threads=2
> socket.send.buffer.bytes=1048576
> socket.receive.buffer.bytes=1048576
> socket.request.max.bytes=104857600
> log.dirs=/mnt/data/kafka
> num.partitions=4
> auto.create.topics.enable=false
> delete.topic.enable=true
> num.recovery.threads.per.data.dir=1
> log.retention.hours=48
> log.retention.bytes=524288000
> log.segment.bytes=52428800
> log.retention.check.interval.ms=6
> log.roll.hours=24
> log.cleanup.policy=delete
> log.cleaner.enable=true
> zookeeper.connect=127.0.0.1:2181
> zookeeper.connection.timeout.ms=100
> {noformat}
> Producer's configuration :
> {noformat}
>   compression.type = none
>   metric.reporters = []
>   metadata.max.age.ms = 30
>   metadata.fetch.timeout.ms = 6
>   acks = all
>   batch.size = 16384
>   reconnect.backoff.ms = 10
>   bootstrap.servers = [127.0.0.1:9092]
>   receive.buffer.bytes = 32768
>   retry.backoff.ms = 500
>   buffer.memory = 33554432
>   timeout.ms = 3
>   key.serializer = class 
> org.apache.kafka.common.serialization.StringSerializer
>   retries = 3
>   max.request.size = 500
>   block.on.buffer.full = true
>   value.serializer = class 
> org.apache.kafka.common.serialization.StringSerializer
>   metrics.sample.window.ms = 3
>   send.buffer.bytes = 131072
>   max.in.flight.requests.per.connection = 5
>   metrics.num.samples = 2
>   linger.ms = 0
>   client.id = 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3205) Error in I/O with host (java.io.EOFException) raised in producer

2016-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217346#comment-15217346
 ] 

ASF GitHub Bot commented on KAFKA-3205:
---

GitHub user bondj opened a pull request:

https://github.com/apache/kafka/pull/1166

KAFKA-3205 Support passive close by broker

An attempt to fix KAFKA-3205.  It appears the problem is that the broker 
has closed the connection passively, and the client should react appropriately.

In NetworkReceive.readFrom(), rather than throw an EOFException (which means 
the end of stream has been reached unexpectedly during input), return a 
negative byte count signifying an acceptable end of stream.

In Selector, if the channel is being passively closed, don't try to read any 
more data, don't try to write, and close the key.

I believe this will fix the problem.
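For illustration, here is a minimal, self-contained Java sketch of the idea (not the actual patch in the pull request; the helper names are made up): surface end-of-stream as a negative read count and let the caller close the connection quietly instead of logging an EOFException.

{code}
// A sketch of treating a passive close by the peer as a quiet disconnect.
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

public class PassiveCloseSketch {

    /**
     * Reads into the buffer; returns the number of bytes read, or -1 if the peer
     * closed the connection (instead of throwing EOFException as the 0.8.2 client does).
     */
    static int readOrSignalClose(ReadableByteChannel channel, ByteBuffer buffer) throws IOException {
        int bytesRead = channel.read(buffer);
        if (bytesRead < 0) {
            // End of stream: the broker closed the connection passively (e.g. idle timeout).
            return -1;
        }
        return bytesRead;
    }

    /** Caller side: treat -1 as an acceptable disconnect rather than an I/O error worth a WARN log. */
    static boolean pollOnce(ReadableByteChannel channel, ByteBuffer buffer) throws IOException {
        if (readOrSignalClose(channel, buffer) < 0) {
            channel.close();   // stop reading/writing on this connection and close it
            return false;      // connection gone; reconnect via the normal retry path
        }
        return true;
    }
}
{code}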

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bondj/kafka passiveClose

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1166.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1166


commit 5dc11015435a38a0d97efa2f46b4d9d9f41645b5
Author: Jonathan Bond 
Date:   2016-03-30T03:57:11Z

Support passive close by broker




> Error in I/O with host (java.io.EOFException) raised in producer
> 
>
> Key: KAFKA-3205
> URL: https://issues.apache.org/jira/browse/KAFKA-3205
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8.2.1, 0.9.0.0
>Reporter: Jonathan Raffre
>
> In a situation with a Kafka broker on 0.9 and producers still on 0.8.2.x, 
> producers seem to raise the following error after a variable amount of time 
> after startup:
> {noformat}
> 2016-01-29 14:33:13,066 WARN [] o.a.k.c.n.Selector: Error in I/O with 
> 172.22.2.170
> java.io.EOFException: null
> at 
> org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62)
>  ~[org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
> at org.apache.kafka.common.network.Selector.poll(Selector.java:248) 
> ~[org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
> at 
> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) 
> [org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
> at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) 
> [org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
> at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) 
> [org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66-internal]
> {noformat}
> This can be reproduced successfully by doing the following :
>  * Start a 0.8.2 producer connected to the 0.9 broker
>  * Wait 15 minutes, exactly
>  * See the error in the producer logs.
> Oddly, this also shows up in an active producer but after 10 minutes of 
> activity.
> Kafka's server.properties :
> {noformat}
> broker.id=1
> listeners=PLAINTEXT://:9092
> port=9092
> num.network.threads=2
> num.io.threads=2
> socket.send.buffer.bytes=1048576
> socket.receive.buffer.bytes=1048576
> socket.request.max.bytes=104857600
> log.dirs=/mnt/data/kafka
> num.partitions=4
> auto.create.topics.enable=false
> delete.topic.enable=true
> num.recovery.threads.per.data.dir=1
> log.retention.hours=48
> log.retention.bytes=524288000
> log.segment.bytes=52428800
> log.retention.check.interval.ms=6
> log.roll.hours=24
> log.cleanup.policy=delete
> log.cleaner.enable=true
> zookeeper.connect=127.0.0.1:2181
> zookeeper.connection.timeout.ms=100
> {noformat}
> Producer's configuration :
> {noformat}
>   compression.type = none
>   metric.reporters = []
>   metadata.max.age.ms = 30
>   metadata.fetch.timeout.ms = 6
>   acks = all
>   batch.size = 16384
>   reconnect.backoff.ms = 10
>   bootstrap.servers = [127.0.0.1:9092]
>   receive.buffer.bytes = 32768
>   retry.backoff.ms = 500
>   buffer.memory = 33554432
>   timeout.ms = 3
>   key.serializer = class 
> org.apache.kafka.common.serialization.StringSerializer
>   retries = 3
>   max.request.size = 500
>   block.on.buffer.full = true
>   value.serializer = class 
> org.apache.kafka.common.serialization.StringSerializer
>   metrics.sample.window.ms = 3
>   send.buffer.bytes = 131072
>   max.in.flight.requests.per.connection = 5
>   metrics.num.samples = 2
>   linger.ms = 0
>   client.id = 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3471) min.insync.replicas isn't respected when there's a delaying follower who still in ISR

2016-03-29 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217295#comment-15217295
 ] 

Jiangjie Qin commented on KAFKA-3471:
-

I think it depends on what you actually want. Shortening 
replica.lag.time.max.ms will let the leader kick the slow broker out of the ISR 
more quickly, but might increase the chance of under-replicated partition 
churn. Usually we want the producer to handle it, so tuning the buffer size, 
retries, and retry backoff seems like a reasonable solution. It lets the 
producer buffer messages, so it is more resilient to such broker-side issues.
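
For illustration, a minimal Java sketch of that producer-side tuning (the broker address and values are placeholders, not recommendations):

{code}
// A sketch of a producer configured to ride out short periods where the partition rejects writes.
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ResilientProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("acks", "all");
        props.put("retries", 10);                      // retry transient "not enough replicas" errors
        props.put("retry.backoff.ms", 500);            // wait between retries
        props.put("buffer.memory", 64 * 1024 * 1024);  // room to keep buffering while retrying
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("T", "key", "value"));
        }
    }
}
{code}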

> min.insync.replicas isn't respected when there's a delaying follower who 
> still in ISR
> -
>
> Key: KAFKA-3471
> URL: https://issues.apache.org/jira/browse/KAFKA-3471
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.1
>Reporter: Yuto Kawamura
>
> h2. tl;dr;
> Partition.checkEnoughReplicasReachOffset should look at the number of followers 
> that have already caught up to requiredOffset, instead of the high watermark, 
> when deciding whether enough replicas have acknowledged a produce request.
> h2. Description
> Just recently I found an interesting metric on our Kafka cluster.
> During the peak time, the number of produce requests significantly decreased 
> on a single broker only. Let's say this broker's id=1.
> - broker-1 holds leadership for 3 partitions of the topic T.
> - Each producer is configured with acks=all.
> - broker-1 contains some topics, and each topic is configured to have 3 
> replicas, with min.insync.replicas set to 2.
> - For partition 1 of topic T (T-1) there are 3 replicas, namely: 
> broker-1 (leader), broker-2 (follower-A), broker-3 (follower-B).
> Looking at the logs of broker-1, there were lots of log lines indicating ISR 
> expand and shrink happening frequently for T-1.
> After investigating for a while, we restarted broker-1 and the continuous ISR 
> expand/shrink unexpectedly went away.
> Since it is highly likely a state-corruption issue (because it was fixed by a 
> simple restart) and it has never reproduced since the broker restart, we 
> unfortunately lost the clues to understand what was actually happening, so to 
> this day I don't know the cause of this phenomenon.
> Meanwhile, we continued investigating why frequent ISR shrink/expand reduces 
> the number of produce requests, and found that the Kafka broker doesn't 
> appear to respect min.insync.replicas as the documentation of this config 
> describes.
> Here's the scenario:
> 1. Everything working well.
>ISR(=LogEndOffset): leader=3, follower-A=2, follower-B=2
>HW: 2
> 2. The producer client produces some records. For simplicity it contains only one 
> record, so the LogEndOffset is updated to 4, and the request is put into 
> purgatory since it has requiredOffset=4 while the HW stays at 2.
>ISR(=LogEndOffset): leader=4, follower-A=2, follower-B=2
>HW: 2
> 3. follower-A performs a fetch and updates its LogEndOffset to 4. IIUC, the 
> request received at 2. could be considered succeeded at this point, since it requires 
> only 2 out of 3 replicas to be in sync (min.insync.replicas=2, acks=all), but 
> it's not with the current implementation because of the HW (ref: 
> https://github.com/apache/kafka/blob/1fbe445dde71df0023a978c5e54dd229d3d23e1b/core/src/main/scala/kafka/cluster/Partition.scala#L325).
>ISR(=LogEndOffset): leader=4, follower-A=4, follower-B=2
>HW: 2
> 4. For some reason, follower-B couldn't perform a fetch for a while. At this moment 
> follower-B is still in the ISR because of replica.lag.time.max.ms, meaning it still 
> affects the HW.
>ISR(=LogEndOffset): leader=4, follower-A=4, follower-B=2
>HW: 2
> 5. The produce request received at 2. times out, is considered failed, and the 
> client retries. No incoming requests for T-1 can succeed during this 
> period.
>ISR(=LogEndOffset): leader=4, follower-A=4, follower-B=2
>HW: 2
> 6. The leader decides to drop follower-B from the ISR because of 
> replica.lag.time.max.ms. The HW increases to 4 and all produce requests can now 
> be processed successfully.
>ISR(=LogEndOffset): leader=4, follower-A=4
>HW: 4
> 7. After a while follower-B comes back and catches up to the LogEndOffset, so 
> the leader lets it back into the ISR. The situation goes back to 1. and the cycle 
> continues.
> So my understanding is that records on a producer accumulate into a single 
> batch while the produce request for T-1 is blocked (and retried) during 4-6, 
> and that's why the total number of requests decreased significantly on 
> broker-1 while the total number of messages hasn't changed.
> As I commented on 3., the leader should consider a produce request succeeded 
> after it confirms min.insync.replicas's number of acks, so the current 
> 
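To make the contrast in the tl;dr concrete, here is a minimal, hypothetical Java sketch (not the actual Partition.scala logic) comparing the high-watermark check with counting replicas that have reached requiredOffset, using the offsets from step 3 above:

{code}
// Hypothetical, standalone sketch; not kafka.cluster.Partition code.
import java.util.Arrays;
import java.util.List;

public class AckCheckSketch {

    // Current behavior (simplified): acks=all succeeds only once the HW passes requiredOffset.
    static boolean enoughByHighWatermark(long highWatermark, long requiredOffset) {
        return highWatermark >= requiredOffset;
    }

    // Behavior described in the tl;dr: succeed once min.insync.replicas replicas
    // (leader included) have caught up to requiredOffset.
    static boolean enoughByCaughtUpReplicas(List<Long> replicaLogEndOffsets,
                                            long requiredOffset, int minInsyncReplicas) {
        long caughtUp = replicaLogEndOffsets.stream()
                                            .filter(leo -> leo >= requiredOffset)
                                            .count();
        return caughtUp >= minInsyncReplicas;
    }

    public static void main(String[] args) {
        // Step 3 of the scenario: leader=4, follower-A=4, follower-B=2, HW=2, requiredOffset=4.
        List<Long> logEndOffsets = Arrays.asList(4L, 4L, 2L);
        System.out.println(enoughByHighWatermark(2, 4));                    // false -> request stays in purgatory
        System.out.println(enoughByCaughtUpReplicas(logEndOffsets, 4, 2));  // true  -> could already be acked
    }
}
{code}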

[jira] [Commented] (KAFKA-1718) "Message Size Too Large" error when only small messages produced with Snappy

2016-03-29 Thread Swapnesh Gandhi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217246#comment-15217246
 ] 

Swapnesh Gandhi commented on KAFKA-1718:


I am seeing the same issue.
Is this compressed messageSet split into small messages at some point? If so, 
when?
Does this impact replication in any way? I am thinking performance-wise.

If I bump up the max message size to solve this problem, will that impact 
performance, given that I have many small messages rather than a single large 
message?

> "Message Size Too Large" error when only small messages produced with Snappy
> 
>
> Key: KAFKA-1718
> URL: https://issues.apache.org/jira/browse/KAFKA-1718
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.1.1
>Reporter: Evan Huus
>Priority: Critical
>
> I'm the primary author of the Go bindings, and while I originally received 
> this as a bug against my bindings, I'm coming to the conclusion that it's a 
> bug in the broker somehow.
> Specifically, take a look at the last two kafka packets in the following 
> packet capture: https://dl.dropboxusercontent.com/u/171647/kafka.pcapng (you 
> will need a trunk build of Wireshark to fully decode the kafka part of the 
> packets).
> The produce request contains two partitions on one topic. Each partition has 
> one message set (sizes 977205 bytes and 967362 bytes respectively). Each 
> message set is a sequential collection of snappy-compressed messages, each 
> message of size 46899. When uncompressed, each message contains a message set 
> of 999600 bytes, containing a sequence of uncompressed 1024-byte messages.
> However, the broker responds to this with a MessageSizeTooLarge error, full 
> stacktrace from the broker logs being:
> kafka.common.MessageSizeTooLargeException: Message size is 1070127 bytes 
> which exceeds the maximum configured message size of 112.
>   at kafka.log.Log$$anonfun$append$1.apply(Log.scala:267)
>   at kafka.log.Log$$anonfun$append$1.apply(Log.scala:265)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at kafka.utils.IteratorTemplate.foreach(IteratorTemplate.scala:32)
>   at kafka.log.Log.append(Log.scala:265)
>   at kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:354)
>   at 
> kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:376)
>   at 
> kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:366)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>   at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>   at 
> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
>   at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
>   at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>   at kafka.server.KafkaApis.appendToLocalLog(KafkaApis.scala:366)
>   at kafka.server.KafkaApis.handleProducerRequest(KafkaApis.scala:292)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:185)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:42)
>   at java.lang.Thread.run(Thread.java:695)
> Since as far as I can tell none of the sizes in the actual produced packet 
> exceed the defined maximum, I can only assume that the broker is 
> miscalculating something somewhere and throwing the exception improperly.
> ---
> This issue can be reliably reproduced using an out-of-the-box binary download 
> of 0.8.1.1 and the following gist: 
> https://gist.github.com/eapache/ce0f15311c605a165ce7 (you will need to use 
> the `producer-ng` branch of the Sarama library).
> ---
> I am happy to provide any more information you might need, or to do relevant 
> experiments etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (KAFKA-3338) Add print and writeAsText functions to the Streams DSL

2016-03-29 Thread Bill Bejeck (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on KAFKA-3338 started by Bill Bejeck.
--
> Add print and writeAsText functions to the Streams DSL
> --
>
> Key: KAFKA-3338
> URL: https://issues.apache.org/jira/browse/KAFKA-3338
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Guozhang Wang
>Assignee: Bill Bejeck
>  Labels: newbie++
> Fix For: 0.10.1.0
>
>
> We want to provide some REPL-like pattern for users for better debuggability. 
> More concretely, we want to allow users to easily inspect their intermediate 
> data streams in the topology while running the application. Theoretically 
> this can be done by using a breakpoint, or by calling System.out.println() 
> inside the operator, or through finer-grained trace-level logging. But a more 
> user-friendly API would be to add a print() function to the KStream / KTable 
> object, like:
> {code}
> // Prints the elements in this stream to the stdout, i.e. "System.out" of the 
> JVM
> KStream#print(/* optional serde */);  
> KTable#print(/* optional serde */);  
> // Writes the stream as text file(s) to the specified location.
> KStream#writeAsText(String filePath, /* optional serde */);
> KTable#writeAsText(String filePath, /* optional serde */);
> {code}
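
For illustration, a short sketch of how the proposed methods might be used once available (the print()/writeAsText() calls are the proposal above, not an existing API; the surrounding builder/config classes follow the 0.10-era Streams DSL):

{code}
// A sketch assuming the proposed print()/writeAsText() methods land on KStream.
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class PrintExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "print-example");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        KStreamBuilder builder = new KStreamBuilder();
        KStream<String, String> stream = builder.stream("input-topic");

        stream.print();                           // proposed: dump records to stdout for inspection
        stream.writeAsText("/tmp/stream-dump");   // proposed: write records to a text file

        new KafkaStreams(builder, props).start();
    }
}
{code}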



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: MINOR: Advance system test ducktape dependency...

2016-03-29 Thread granders
GitHub user granders opened a pull request:

https://github.com/apache/kafka/pull/1165

MINOR: Advance system test ducktape dependency from 0.3.10 to 0.4.0

The previous version of ducktape was found to have a memory leak, which caused 
occasional failures in nightly runs.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/confluentinc/kafka 
minor-advance-ducktape-to-0.4.0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1165.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1165


commit 74ce1af3206b1bb2e72c57995a070da87ddab357
Author: Geoff Anderson 
Date:   2016-03-29T23:17:19Z

Advanced system test ducktape dependency from 0.3.10 to 0.4.0




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: kafka-trunk-jdk8 #490

2016-03-29 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] KAFKA-3425: add missing upgrade notes

--
[...truncated 1613 lines...]
kafka.producer.AsyncProducerTest > testNoBroker PASSED

kafka.producer.AsyncProducerTest > testProduceAfterClosed PASSED

kafka.producer.AsyncProducerTest > testJavaProducer PASSED

kafka.producer.AsyncProducerTest > testIncompatibleEncoder PASSED

kafka.producer.SyncProducerTest > testReachableServer PASSED

kafka.producer.SyncProducerTest > testMessageSizeTooLarge PASSED

kafka.producer.SyncProducerTest > testNotEnoughReplicas PASSED

kafka.producer.SyncProducerTest > testMessageSizeTooLargeWithAckZero PASSED

kafka.producer.SyncProducerTest > testProducerCanTimeout PASSED

kafka.producer.SyncProducerTest > testProduceRequestWithNoResponse PASSED

kafka.producer.SyncProducerTest > testEmptyProduceRequest PASSED

kafka.producer.SyncProducerTest > testProduceCorrectlyReceivesResponse PASSED

kafka.tools.ConsoleProducerTest > testParseKeyProp PASSED

kafka.tools.ConsoleProducerTest > testValidConfigsOldProducer PASSED

kafka.tools.ConsoleProducerTest > testInvalidConfigs PASSED

kafka.tools.ConsoleProducerTest > testValidConfigsNewProducer PASSED

kafka.tools.ConsoleConsumerTest > shouldLimitReadsToMaxMessageLimit PASSED

kafka.tools.ConsoleConsumerTest > shouldParseValidNewConsumerValidConfig PASSED

kafka.tools.ConsoleConsumerTest > shouldParseConfigsFromFile PASSED

kafka.tools.ConsoleConsumerTest > shouldParseValidOldConsumerValidConfig PASSED

kafka.security.auth.PermissionTypeTest > testFromString PASSED

kafka.security.auth.ResourceTypeTest > testFromString PASSED

kafka.security.auth.OperationTest > testFromString PASSED

kafka.security.auth.AclTest > testAclJsonConversion PASSED

kafka.security.auth.ZkAuthorizationTest > testIsZkSecurityEnabled PASSED

kafka.security.auth.ZkAuthorizationTest > testZkUtils PASSED

kafka.security.auth.ZkAuthorizationTest > testZkAntiMigration PASSED

kafka.security.auth.ZkAuthorizationTest > testZkMigration PASSED

kafka.security.auth.ZkAuthorizationTest > testChroot PASSED

kafka.security.auth.ZkAuthorizationTest > testDelete PASSED

kafka.security.auth.ZkAuthorizationTest > testDeleteRecursive PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testAllowAllAccess PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testLocalConcurrentModificationOfResourceAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testNoAclFound PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testDistributedConcurrentModificationOfResourceAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testAclManagementAPIs PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testWildCardAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testTopicAcl PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testSuperUserHasAccess PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testDenyTakesPrecedence PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testNoAclFoundOverride PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testHighConcurrencyModificationOfResourceAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testLoadCache PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithCollision PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.PrimitiveApiTest > testMultiProduce PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch PASSED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.PlaintextTopicMetadataTest > 

[jira] [Commented] (KAFKA-2426) A Kafka node tries to connect to itself through its advertised hostname

2016-03-29 Thread JIRA

[ 
https://issues.apache.org/jira/browse/KAFKA-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217007#comment-15217007
 ] 

Mikaël Cluseau commented on KAFKA-2426:
---

Hi,

It depends on how you add the host to the container's /etc/hosts. I think it's 
overwritten by Docker on start, so you have to add it after the 
container has started (but before Kafka resolves it). The easiest way is probably 
to use the --add-host parameter.

Regarding iptables & NAT, chances are that you must enable hairpin mode 
on the container's host-side port interface for a NATed packet to be sent back on 
the same interface (echo 1 > /sys/class/net/vethXXX/brport/hairpin_mode).

> A Kafka node tries to connect to itself through its advertised hostname
> ---
>
> Key: KAFKA-2426
> URL: https://issues.apache.org/jira/browse/KAFKA-2426
> Project: Kafka
>  Issue Type: Bug
>  Components: network
>Affects Versions: 0.8.2.1
> Environment: Docker https://github.com/wurstmeister/kafka-docker, 
> managed by a Kubernetes cluster, with an "iptables proxy".
>Reporter: Mikaël Cluseau
>Assignee: Jun Rao
>
> Hi,
> when used behind a firewall, Apache Kafka nodes are trying to connect to 
> themselves using their advertised hostnames. This means that if you have a 
> service IP managed by the docker's host using *only* iptables DNAT rules, the 
> node's connection to "itself" times out.
> This is the case in any setup where a host will DNAT the service IP to the 
> instance's IP and send the packet back on the same interface over a Linux 
> Bridge port not configured in "hairpin" mode. It's because of this: 
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/net/bridge/br_forward.c#n30
> The specific part of the kubernetes issue is here: 
> https://github.com/BenTheElder/kubernetes/issues/3#issuecomment-123925060 .
> The timeout means that even if the partition's leader is elected, it then 
> fails to accept writes from the other members, causing a write lock and 
> generating very heavy logs (as fast as Kafka usually is, but through log4j 
> this time ;)).
> This also means that the normal Docker case works by going through the 
> userspace proxy, which necessarily impacts performance.
> The workaround for us was to add a "127.0.0.2 advertised-hostname" to 
> /etc/hosts in the container startup script.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-trunk-jdk7 #1158

2016-03-29 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] KAFKA-3425: add missing upgrade notes

--
[...truncated 1551 lines...]

kafka.log.LogTest > testTruncateTo PASSED

kafka.log.LogTest > testCleanShutdownFile PASSED

kafka.log.OffsetIndexTest > lookupExtremeCases PASSED

kafka.log.OffsetIndexTest > appendTooMany PASSED

kafka.log.OffsetIndexTest > randomLookupTest PASSED

kafka.log.OffsetIndexTest > testReopen PASSED

kafka.log.OffsetIndexTest > appendOutOfOrder PASSED

kafka.log.OffsetIndexTest > truncate PASSED

kafka.log.LogSegmentTest > testRecoveryWithCorruptMessage PASSED

kafka.log.LogSegmentTest > testRecoveryFixesCorruptIndex PASSED

kafka.log.LogSegmentTest > testReadFromGap PASSED

kafka.log.LogSegmentTest > testTruncate PASSED

kafka.log.LogSegmentTest > testReadBeforeFirstOffset PASSED

kafka.log.LogSegmentTest > testCreateWithInitFileSizeAppendMessage PASSED

kafka.log.LogSegmentTest > testChangeFileSuffixes PASSED

kafka.log.LogSegmentTest > testMaxOffset PASSED

kafka.log.LogSegmentTest > testNextOffsetCalculation PASSED

kafka.log.LogSegmentTest > testReadOnEmptySegment PASSED

kafka.log.LogSegmentTest > testReadAfterLast PASSED

kafka.log.LogSegmentTest > testCreateWithInitFileSizeClearShutdown PASSED

kafka.log.LogSegmentTest > testTruncateFull PASSED

kafka.log.CleanerTest > testBuildOffsetMap PASSED

kafka.log.CleanerTest > testSegmentGrouping PASSED

kafka.log.CleanerTest > testCleanSegmentsWithAbort PASSED

kafka.log.CleanerTest > testSegmentGroupingWithSparseOffsets PASSED

kafka.log.CleanerTest > testRecoveryAfterCrash PASSED

kafka.log.CleanerTest > testLogToClean PASSED

kafka.log.CleanerTest > testCleaningWithDeletes PASSED

kafka.log.CleanerTest > testCleanSegments PASSED

kafka.log.CleanerTest > testCleaningWithUnkeyedMessages PASSED

kafka.log.OffsetMapTest > testClear PASSED

kafka.log.OffsetMapTest > testBasicValidation PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testHeartbeatWrongCoordinator 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testDescribeGroupStable PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testHeartbeatIllegalGeneration 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testDescribeGroupWrongCoordinator PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testDescribeGroupRebalancing 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testLeaderFailureInSyncGroup 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testGenerationIdIncrementsOnRebalance PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testSyncGroupFromIllegalGeneration PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testInvalidGroupId PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testHeartbeatUnknownGroup 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testListGroupsIncludesStableGroups PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testHeartbeatDuringRebalanceCausesRebalanceInProgress PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupInconsistentGroupProtocol PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupSessionTimeoutTooLarge PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupSessionTimeoutTooSmall PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testSyncGroupEmptyAssignment 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testCommitOffsetWithDefaultGeneration PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupFromUnchangedLeaderShouldRebalance PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testHeartbeatRebalanceInProgress PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testLeaveGroupUnknownGroup 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testListGroupsIncludesRebalancingGroups PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testSyncGroupFollowerAfterLeader PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testCommitOffsetInAwaitingSync 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testJoinGroupWrongCoordinator 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupUnknownConsumerExistingGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testSyncGroupFromUnknownGroup 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupInconsistentProtocolType PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testCommitOffsetFromUnknownGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testLeaveGroupWrongCoordinator 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testLeaveGroupUnknownConsumerExistingGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupUnknownConsumerNewGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupFromUnchangedFollowerDoesNotRebalance PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 

Jenkins build is back to normal : kafka-trunk-jdk8 #489

2016-03-29 Thread Apache Jenkins Server
See 



[GitHub] kafka pull request: MINOR: a simple benchmark for Streams

2016-03-29 Thread ymatsuda
GitHub user ymatsuda opened a pull request:

https://github.com/apache/kafka/pull/1164

MINOR: a simple benchmark for Streams

@guozhangwang @miguno 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ymatsuda/kafka perf

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1164.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1164


commit 8549122435b71b41a9270d0dc2c70b863c0cd064
Author: Yasuhiro Matsuda 
Date:   2016-03-29T22:15:02Z

MINOR: a simple benchmark for Streams




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-3262) Make KafkaStreams debugging friendly

2016-03-29 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3262:
-
Fix Version/s: 0.10.0.1

> Make KafkaStreams debugging friendly
> 
>
> Key: KAFKA-3262
> URL: https://issues.apache.org/jira/browse/KAFKA-3262
> Project: Kafka
>  Issue Type: Sub-task
>  Components: kafka streams
>Affects Versions: 0.10.0.0
>Reporter: Yasuhiro Matsuda
> Fix For: 0.10.0.1
>
>
> Currently KafkaStreams polls records in the same thread as the data processing 
> thread. This makes debugging user code, as well as KafkaStreams itself, 
> difficult. When the thread is suspended by the debugger, the next heartbeat 
> of the consumer tied to the thread won't be sent until the thread is resumed. 
> This often results in missed heartbeats and causes a group rebalance. So it 
> may well be a completely different context when the thread hits the 
> breakpoint the next time.
> We should consider using separate threads for polling and processing.
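
For illustration, a minimal sketch (not KafkaStreams internals) of that direction using the plain consumer API: poll on one thread and hand records to a separate processing thread through a queue, so a breakpoint in processing code does not stall the poll loop. Note that heartbeats are still tied to poll() in the 0.9/0.10 consumer, so the polling thread must keep polling; the bounded queue shown here is only a simple way to limit over-fetching.

{code}
// A sketch of decoupling poll() from record processing (topic/group names are placeholders).
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class DecoupledPollingSketch {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "debug-friendly");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        BlockingQueue<ConsumerRecord<String, String>> queue = new ArrayBlockingQueue<>(1000);

        Thread processor = new Thread(() -> {
            try {
                while (true) {
                    ConsumerRecord<String, String> record = queue.take();
                    // A breakpoint here no longer suspends the polling thread directly.
                    System.out.println(record.value());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        processor.start();

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("input-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(100);
                for (ConsumerRecord<String, String> record : records) {
                    queue.put(record); // note: a full queue would still block polling
                }
            }
        }
    }
}
{code}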



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3262) Make KafkaStreams debugging friendly

2016-03-29 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216934#comment-15216934
 ] 

Guozhang Wang commented on KAFKA-3262:
--

I agree that for the development cycle we should enforce "single thread".

[~jkreps] What do you mean by "Another issue is that the clients default to 
debug logging"?

> Make KafkaStreams debugging friendly
> 
>
> Key: KAFKA-3262
> URL: https://issues.apache.org/jira/browse/KAFKA-3262
> Project: Kafka
>  Issue Type: Sub-task
>  Components: kafka streams
>Affects Versions: 0.10.0.0
>Reporter: Yasuhiro Matsuda
> Fix For: 0.10.0.1
>
>
> Currently KafkaStreams polls records in the same thread as the data processing 
> thread. This makes debugging user code, as well as KafkaStreams itself, 
> difficult. When the thread is suspended by the debugger, the next heartbeat 
> of the consumer tied to the thread won't be sent until the thread is resumed. 
> This often results in missed heartbeats and causes a group rebalance. So it 
> may well be a completely different context when the thread hits the 
> breakpoint the next time.
> We should consider using separate threads for polling and processing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-trunk-jdk7 #1157

2016-03-29 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] HOTFIX: RocksDBStore must clear dirty flags after flush

--
[...truncated 1589 lines...]

kafka.log.LogSegmentTest > testCreateWithInitFileSizeClearShutdown PASSED

kafka.log.LogSegmentTest > testTruncateFull PASSED

kafka.log.CleanerTest > testBuildOffsetMap PASSED

kafka.log.CleanerTest > testSegmentGrouping PASSED

kafka.log.CleanerTest > testCleanSegmentsWithAbort PASSED

kafka.log.CleanerTest > testSegmentGroupingWithSparseOffsets PASSED

kafka.log.CleanerTest > testRecoveryAfterCrash PASSED

kafka.log.CleanerTest > testLogToClean PASSED

kafka.log.CleanerTest > testCleaningWithDeletes PASSED

kafka.log.CleanerTest > testCleanSegments PASSED

kafka.log.CleanerTest > testCleaningWithUnkeyedMessages PASSED

kafka.log.OffsetMapTest > testClear PASSED

kafka.log.OffsetMapTest > testBasicValidation PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testHeartbeatWrongCoordinator 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testDescribeGroupStable PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testHeartbeatIllegalGeneration 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testDescribeGroupWrongCoordinator PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testDescribeGroupRebalancing 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testLeaderFailureInSyncGroup 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testGenerationIdIncrementsOnRebalance PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testSyncGroupFromIllegalGeneration PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testInvalidGroupId PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testHeartbeatUnknownGroup 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testListGroupsIncludesStableGroups PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testHeartbeatDuringRebalanceCausesRebalanceInProgress PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupInconsistentGroupProtocol PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupSessionTimeoutTooLarge PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupSessionTimeoutTooSmall PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testSyncGroupEmptyAssignment 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testCommitOffsetWithDefaultGeneration PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupFromUnchangedLeaderShouldRebalance PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testHeartbeatRebalanceInProgress PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testLeaveGroupUnknownGroup 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testListGroupsIncludesRebalancingGroups PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testSyncGroupFollowerAfterLeader PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testCommitOffsetInAwaitingSync 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testJoinGroupWrongCoordinator 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupUnknownConsumerExistingGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testSyncGroupFromUnknownGroup 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupInconsistentProtocolType PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testCommitOffsetFromUnknownGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testLeaveGroupWrongCoordinator 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testLeaveGroupUnknownConsumerExistingGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupUnknownConsumerNewGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testJoinGroupFromUnchangedFollowerDoesNotRebalance PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testValidJoinGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testSyncGroupLeaderAfterFollower PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testSyncGroupFromUnknownMember 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testValidLeaveGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testDescribeGroupInactiveGroup 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testSyncGroupNotCoordinator 
PASSED

kafka.coordinator.GroupCoordinatorResponseTest > 
testHeartbeatUnknownConsumerExistingGroup PASSED

kafka.coordinator.GroupCoordinatorResponseTest > testValidHeartbeat PASSED

kafka.coordinator.MemberMetadataTest > testMatchesSupportedProtocols PASSED

kafka.coordinator.MemberMetadataTest > testMetadata PASSED

kafka.coordinator.MemberMetadataTest > testMetadataRaisesOnUnsupportedProtocol 
PASSED

kafka.coordinator.MemberMetadataTest > testVoteForPreferredProtocol PASSED

kafka.coordinator.MemberMetadataTest > testVoteRaisesOnNoSupportedProtocols 
PASSED


[jira] [Created] (KAFKA-3485) Configuration to bind Kafka Producer to specific network interface

2016-03-29 Thread Kyle Kavanagh (JIRA)
Kyle Kavanagh created KAFKA-3485:


 Summary: Configuration to bind Kafka Producer to specific network 
interface
 Key: KAFKA-3485
 URL: https://issues.apache.org/jira/browse/KAFKA-3485
 Project: Kafka
  Issue Type: Improvement
Reporter: Kyle Kavanagh


We would like to provide a producer configuration that allows the user to 
configure the producer selector to be bound to a specific network interface.  We would 
like to run multiple instances of our producer application on the same physical 
server, but would like to segregate producer traffic between multiple NICs.
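
For context, a minimal, standalone Java sketch of what such a configuration would do at the socket level (plain NIO, not an existing Kafka producer option; the addresses are hypothetical):

{code}
// Hypothetical illustration only: bind the client socket to a chosen local NIC before connecting.
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;

public class BindToInterfaceSketch {
    public static void main(String[] args) throws IOException {
        try (SocketChannel channel = SocketChannel.open()) {
            // Port 0 = ephemeral local port; 10.0.0.5 stands in for the desired NIC's address.
            channel.bind(new InetSocketAddress("10.0.0.5", 0));
            channel.connect(new InetSocketAddress("broker1.example.com", 9092));
            System.out.println("Connected from " + channel.getLocalAddress());
        }
    }
}
{code}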



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3425) Add missing notes to upgrade docs

2016-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216866#comment-15216866
 ] 

ASF GitHub Bot commented on KAFKA-3425:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1159


> Add missing notes to upgrade docs
> -
>
> Key: KAFKA-3425
> URL: https://issues.apache.org/jira/browse/KAFKA-3425
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> We're missing a few notes from the upgrade documentation:
> - KIP-45 introduced an incompatible API change for the Kafka consumer, which 
> should be called out in the upgrade documentation.
> - New consumer now has option to exclude internal topics.
> - Old scala producers are deprecated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3425) Add missing notes to upgrade docs

2016-03-29 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3425:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Issue resolved by pull request 1159
[https://github.com/apache/kafka/pull/1159]

> Add missing notes to upgrade docs
> -
>
> Key: KAFKA-3425
> URL: https://issues.apache.org/jira/browse/KAFKA-3425
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> We're missing a few notes from the upgrade documentation:
> - KIP-45 introduced an incompatible API change for the Kafka consumer, which 
> should be called out in the upgrade documentation.
> - New consumer now has option to exclude internal topics.
> - Old scala producers are deprecated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3425: add missing upgrade notes

2016-03-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1159


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [VOTE] 0.10.0.0 RC1

2016-03-29 Thread Guozhang Wang
Hi Gwen:


We found a critical bug in Kafka Streams that heavily impacts its
performance:


https://github.com/apache/kafka/pull/1163


It has been merged to the 0.10.0 branch. Can we roll out another RC?


Guozhang



On Mon, Mar 28, 2016 at 10:01 PM, Dana Powers  wrote:

> +1 -- verified that all kafka-python integration tests now pass
>
> On Mon, Mar 28, 2016 at 2:34 PM, Gwen Shapira  wrote:
>
> > Hello Kafka users, developers and client-developers,
> >
> > This is the second candidate for release of Apache Kafka 0.10.0.0.
> >
> > This is a major release that includes:
> >
> > (1) New message format including timestamps
> >
> > (2) client interceptor API
> >
> > (3) Kafka Streams.
> >
> >
> > Since this is a major release, we will give people more time to try it
> > out and give feedback.
> >
> > Release notes for the 0.10.0.0 release:
> > http://home.apache.org/~gwenshap/0.10.0.0-rc1/RELEASE_NOTES.html
> >
> > *** Please download, test and vote by Monday, April 4, 4pm PT
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> > http://kafka.apache.org/KEYS
> >
> > * Release artifacts to be voted upon (source and binary):
> > http://home.apache.org/~gwenshap/0.10.0.0-rc1/
> >
> > * Maven artifacts to be voted upon:
> > https://repository.apache.org/content/groups/staging/
> >
> > * scala-doc: http://home.apache.org/~gwenshap/0.10.0.0-rc1/scaladoc
> >
> > * java-doc: http://home.apache.org/~gwenshap/0.10.0.0-rc1/javadoc/
> >
> > * tag to be voted upon (off 0.10.0 branch) is the 0.10.0.0
> > tag:
> >
> https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=759940658d805b1262101dce0ea9a9d562c5f30d
> >
> > * Documentation: http://kafka.apache.org/0100/documentation.html
> >
> > * Protocol: http://kafka.apache.org/0100/protocol.html
> >
> > /**
> >
> > Thanks,
> >
> > Gwen
> >
>



-- 
-- Guozhang


[jira] [Created] (KAFKA-3484) Transient failure in SslConsumerTest.testListTopics

2016-03-29 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-3484:
--

 Summary: Transient failure in SslConsumerTest.testListTopics
 Key: KAFKA-3484
 URL: https://issues.apache.org/jira/browse/KAFKA-3484
 Project: Kafka
  Issue Type: Sub-task
Reporter: Jason Gustafson


{code}
java.lang.AssertionError: expected:<5> but was:<6>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at kafka.api.BaseConsumerTest.testListTopics(BaseConsumerTest.scala:174)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:105)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:56)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:64)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:49)
at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.messaging.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 
org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:106)
at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.messaging.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:360)
at 
org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:54)
at 
org.gradle.internal.concurrent.StoppableExecutorImpl$1.run(StoppableExecutorImpl.java:40)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-3358) Only request metadata updates once we have topics or a pattern subscription

2016-03-29 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216760#comment-15216760
 ] 

Jason Gustafson edited comment on KAFKA-3358 at 3/29/16 8:16 PM:
-

The main problem is that the current topic metadata request doesn't give us the 
capability to actually fetch empty topic metadata (a fix for this problem was 
proposed in the patch for KAFKA-3306). If we had that, then this issue would 
resolve itself: if the Metadata contains no topics, we would only retrieve 
broker metadata. To fix this problem currently, we have to skip the metadata 
fetch entirely until we have a non-empty topic list. That would mean that the 
bootstrap broker list would stay around until that time, which may or may not 
be a problem depending on how the user has configured it (see discussion in 
KAFKA-3068). I wonder if we are better off waiting until the "empty means 
empty" topic metadata fix is checked in rather than attempting a short-term 
workaround?

[~junrao] Any thoughts?


was (Author: hachikuji):
The main problem is that the current topic metadata request doesn't give us the 
capability to actually fetch empty topic metadata (a fix for this problem was 
proposed in the patch for KAFKA-3306). If we had that, then this patch would 
resolve itself: if the Metadata contains no topics, we would only retrieve 
broker metadata. To fix this problem currently, we have to skip the metadata 
fetch entirely until we have a non-empty topic list. That would mean that the 
bootstrap broker list would stay around until that time, which may or may not 
be a problem depending on how the user has configured it (see discussion in 
KAFKA-3068). I wonder if we are better off waiting until the "empty means 
empty" topic metadata fix is checked in rather than attempting a short-term 
workaround?

[~junrao] Any thoughts?

> Only request metadata updates once we have topics or a pattern subscription
> ---
>
> Key: KAFKA-3358
> URL: https://issues.apache.org/jira/browse/KAFKA-3358
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.9.0.0, 0.9.0.1
>Reporter: Ismael Juma
>Assignee: Jason Gustafson
>Priority: Critical
> Fix For: 0.10.1.0
>
>
> The current code requests a metadata update for _all_ topics which can cause 
> major load issues in large clusters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3358) Only request metadata updates once we have topics or a pattern subscription

2016-03-29 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216760#comment-15216760
 ] 

Jason Gustafson commented on KAFKA-3358:


The main problem is that the current topic metadata request doesn't give us the 
capability to actually fetch empty topic metadata (a fix for this problem was 
proposed in the patch for KAFKA-3306). If we had that, then this patch would 
resolve itself: if the Metadata contains no topics, we would only retrieve 
broker metadata. To fix this problem currently, we have to skip the metadata 
fetch entirely until we have a non-empty topic list. That would mean that the 
bootstrap broker list would stay around until that time, which may or may not 
be a problem depending on how the user has configured it (see discussion in 
KAFKA-3068). I wonder if we are better off waiting until the "empty means 
empty" topic metadata fix is checked in rather than attempting a short-term 
workaround?

[~junrao] Any thoughts?
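
For illustration, a tiny hypothetical sketch of the short-term guard being discussed (a made-up helper, not NetworkClient/Metadata internals): only request topic metadata once there is a non-empty subscription or a pattern subscription, so an empty topic list is never interpreted as "all topics".

{code}
// Hypothetical helper illustrating the discussed guard; not actual client code.
import java.util.Set;

public class MetadataFetchGuard {
    // Request topic metadata only when there is something concrete to ask about.
    static boolean shouldRequestTopicMetadata(Set<String> subscribedTopics,
                                              boolean hasPatternSubscription) {
        return hasPatternSubscription || !subscribedTopics.isEmpty();
    }
}
{code}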

> Only request metadata updates once we have topics or a pattern subscription
> ---
>
> Key: KAFKA-3358
> URL: https://issues.apache.org/jira/browse/KAFKA-3358
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.9.0.0, 0.9.0.1
>Reporter: Ismael Juma
>Assignee: Jason Gustafson
>Priority: Critical
> Fix For: 0.10.1.0
>
>
> The current code requests a metadata update for _all_ topics which can cause 
> major load issues in large clusters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2426) A Kafka node tries to connect to itself through its advertised hostname

2016-03-29 Thread khenaidoo nursimulu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216726#comment-15216726
 ] 

khenaidoo nursimulu commented on KAFKA-2426:


Hello,

I am having the same issue (with both Kafka 0.9 and 0.8).  I have Kafka running 
in a Docker container (Ubuntu) and I advertised the host and port of the 
physical machine for remote publishers to send events.  I get the exception 
below in the controller.log file when the Kafka server is attempting to connect 
to itself using the advertised IP and port.  The IP and port are not reachable 
from within the container (the advertised host and port are NATed to the Docker 
IP and port 9092).  I tried the above workarounds as well as trying to change the 
iptables rules within the container, but nothing seems to work.  Is there a 
config that would get the controller to use localhost instead of the advertised 
IP and port?

Thanks
Khen

[2016-03-29 19:09:39,813] WARN [Controller-0-to-broker-0-send-thread], 
Controller 0's connection to broker Node(0, 10.1.89.26, 8198) was unsuccessful 
(kafka.controller.RequestSendThread)
java.io.IOException: Connection to Node(0, 10.1.89.26, 8198) failed
at 
kafka.utils.NetworkClientBlockingOps$$anonfun$blockingReady$extension$1.apply(NetworkClientBlockingOps.scala:62)
at 
kafka.utils.NetworkClientBlockingOps$$anonfun$blockingReady$extension$1.apply(NetworkClientBlockingOps.scala:58)
at 
kafka.utils.NetworkClientBlockingOps$$anonfun$kafka$utils$NetworkClientBlockingOps$$pollUntil$extension$2.apply(NetworkClientBlockingOps.scala:106)
at 
kafka.utils.NetworkClientBlockingOps$$anonfun$kafka$utils$NetworkClientBlockingOps$$pollUntil$extension$2.apply(NetworkClientBlockingOps.scala:105)
at 
kafka.utils.NetworkClientBlockingOps$.recurse$1(NetworkClientBlockingOps.scala:129)
at 
kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollUntilFound$extension(NetworkClientBlockingOps.scala:139)
at 
kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollUntil$extension(NetworkClientBlockingOps.scala:105)
at 
kafka.utils.NetworkClientBlockingOps$.blockingReady$extension(NetworkClientBlockingOps.scala:58)
at 
kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:225)
at 
kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:172)
at 
kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:171)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)



> A Kafka node tries to connect to itself through its advertised hostname
> ---
>
> Key: KAFKA-2426
> URL: https://issues.apache.org/jira/browse/KAFKA-2426
> Project: Kafka
>  Issue Type: Bug
>  Components: network
>Affects Versions: 0.8.2.1
> Environment: Docker https://github.com/wurstmeister/kafka-docker, 
> managed by a Kubernetes cluster, with an "iptables proxy".
>Reporter: Mikaël Cluseau
>Assignee: Jun Rao
>
> Hi,
> when used behind a firewall, Apache Kafka nodes are trying to connect to 
> themselves using their advertised hostnames. This means that if you have a 
> service IP managed by the docker's host using *only* iptables DNAT rules, the 
> node's connection to "itself" times out.
> This is the case in any setup where a host will DNAT the service IP to the 
> instance's IP and send the packet back on the same interface over a Linux 
> Bridge port not configured in "hairpin" mode. It's because of this: 
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/net/bridge/br_forward.c#n30
> The specific part of the kubernetes issue is here: 
> https://github.com/BenTheElder/kubernetes/issues/3#issuecomment-123925060 .
> The timeout means that even if the partition's leader is elected, it then 
> fails to accept writes from the other members, causing a write lock and 
> generating very heavy logs (as fast as Kafka usually is, but through log4j 
> this time ;)).
> This also means that the normal Docker case works by going through the 
> userspace-proxy, which necessarily impacts performance.
> The workaround for us was to add a "127.0.0.2 advertised-hostname" to 
> /etc/hosts in the container startup script.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: HOTFIX: RocksDBStore must clear dirty flags af...

2016-03-29 Thread ymatsuda
GitHub user ymatsuda opened a pull request:

https://github.com/apache/kafka/pull/1163

HOTFIX: RocksDBStore must clear dirty flags after flush

@guozhangwang 
Without clearing the dirty flags, RocksDBStore will perform a flush for every 
new record. This bug made the store painfully slow. 
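
For context, a minimal sketch of the behaviour the fix restores (illustrative 
names only, not the actual RocksDBStore code):

    class DirtyTrackingStore {
        private boolean dirty = false;

        void put(byte[] key, byte[] value) {
            // write to the underlying store ...
            dirty = true;
        }

        void flush() {
            if (!dirty) {
                return;        // nothing new since the last flush
            }
            // underlyingDb.flush();
            dirty = false;     // the missing step: clear the flag after flushing
        }
    }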


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ymatsuda/kafka clear_dirty_flag

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1163.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1163


commit 53690e66d3a9bf564a9bb64fe53b92f9a14c9dd8
Author: Yasuhiro Matsuda 
Date:   2016-03-29T19:09:05Z

HOTFIX: RocksDBStore must clear dirty flags after flush




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-3483) Restructure ducktape tests to simplify running subsets of tests

2016-03-29 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KAFKA-3483:
---
Status: Patch Available  (was: Open)

> Restructure ducktape tests to simplify running subsets of tests
> ---
>
> Key: KAFKA-3483
> URL: https://issues.apache.org/jira/browse/KAFKA-3483
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Grant Henke
>Assignee: Grant Henke
>
> Provides a convenient way of running ducktape tests for a single component 
> (core, connect, streams, etc). It also separates tests from benchmarks. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3483) Restructure ducktape tests to simplify running subsets of tests

2016-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216516#comment-15216516
 ] 

ASF GitHub Bot commented on KAFKA-3483:
---

GitHub user granthenke opened a pull request:

https://github.com/apache/kafka/pull/1162

KAFKA-3483: Restructure ducktape tests to simplify running subsets of…

… tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/granthenke/kafka ducktape-structure

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1162.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1162


commit 99affed54d29909d81c9f204b6d97d5ef5b5ebea
Author: Grant Henke 
Date:   2016-03-29T15:18:33Z

KAFKA-3483: Restructure ducktape tests to simplify running subsets of tests




> Restructure ducktape tests to simplify running subsets of tests
> ---
>
> Key: KAFKA-3483
> URL: https://issues.apache.org/jira/browse/KAFKA-3483
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Grant Henke
>Assignee: Grant Henke
>
> Provides a convenient way of running ducktape tests for a single component 
> (core, connect, streams, etc). It also separates tests from benchmarks. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3483: Restructure ducktape tests to simp...

2016-03-29 Thread granthenke
GitHub user granthenke opened a pull request:

https://github.com/apache/kafka/pull/1162

KAFKA-3483: Restructure ducktape tests to simplify running subsets of…

… tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/granthenke/kafka ducktape-structure

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1162.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1162


commit 99affed54d29909d81c9f204b6d97d5ef5b5ebea
Author: Grant Henke 
Date:   2016-03-29T15:18:33Z

KAFKA-3483: Restructure ducktape tests to simplify running subsets of tests




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-3483) Restructure ducktape tests to simplify running subsets of tests

2016-03-29 Thread Grant Henke (JIRA)
Grant Henke created KAFKA-3483:
--

 Summary: Restructure ducktape tests to simplify running subsets of 
tests
 Key: KAFKA-3483
 URL: https://issues.apache.org/jira/browse/KAFKA-3483
 Project: Kafka
  Issue Type: Improvement
Reporter: Grant Henke
Assignee: Grant Henke


Provides a convenient way of running ducktape tests for a single component 
(core, connect, streams, etc). It also separates tests from benchmarks. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Messages corrupted in kafka

2016-03-29 Thread Roger Hoover
Are you using snappy compression?  There was a bug with snappy that caused 
corrupt messages.

Sent from my iPhone

> On Mar 29, 2016, at 8:15 AM, sunil kalva  wrote:
> 
> Hi
> Do we store message crc also on disk, and server verifies same when we are
> reading messages back from disk?
> And how to handle errors when we use async publish ?
> 
>> On Fri, Mar 25, 2016 at 4:17 AM, Becket Qin  wrote:
>> 
>> You mentioned that you saw few corrupted messages, (< 0.1%). If so are you
>> able to see some corrupted messages if you produce, say, 10M messages?
>> 
>> On Wed, Mar 23, 2016 at 9:40 PM, sunil kalva 
>> wrote:
>> 
>>> I am using java client and kafka 0.8.2, since events are corrupted in
>>> kafka broker i cant read and replay them again.
>>> 
 On Thu, Mar 24, 2016 at 9:42 AM, Becket Qin 
>>> wrote:
>>> 
 Hi Sunil,
 
 The messages in Kafka has a CRC stored with each of them. When consumer
 receives a message, it will compute the CRC from the message bytes and
 compare it to the stored CRC. If the computed CRC and stored CRC does
>> not
 match, that indicates the message has corrupted. I am not sure in your
>>> case
 why the message is corrupted. Corrupted message seems to  be pretty
>> rare
 because the broker actually validate the CRC before it stores the
>>> messages
 on to the disk.
 
 Is this problem reproduceable? If so, can you find out the messages
>> that
 are corrupted? Also, are you using the Java clients or some other
>>> clients?
 
 Jiangjie (Becket) Qin
 
 On Wed, Mar 23, 2016 at 8:28 PM, sunil kalva 
 wrote:
 
> can some one help me out here.
> 
> On Wed, Mar 23, 2016 at 7:36 PM, sunil kalva 
> wrote:
> 
>> Hi
>> I am seeing few messages getting corrupted in kafka, It is not
 happening
>> frequently and percentage is also very very less (less than 0.1%).
>> 
>> Basically i am publishing thrift events in byte array format to
>> kafka
>> topics(with out encoding like base64), and i also see more events
>>> than
 i
>> publish (i confirm this by looking at the offset for that topic).
>> For example if i publish 100 events and i see 110 as offset for
>> that
> topic
>> (since it is in production i could not get exact messages which
>>> causing
>> this problem, and we will only realize this problem when we consume
> because
>> our thrift deserialization fails).
>> 
>> So my question is, is there any magic byte which actually
>> determines
 the
>> boundary of the message which is same as the byte i am sending or
>> or
 for
>> any n/w issues messages get chopped and stores as one message to
 multiple
>> messages on server side ?
>> 
>> tx
>> SunilKalva
>> 
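
For readers of this thread, a rough sketch of the CRC check described above 
(simplified; the real Kafka record CRC covers more fields than just the 
payload bytes):

    import java.util.zip.CRC32;

    final class CrcCheck {
        // Recompute the checksum over the payload bytes and compare it with
        // the CRC stored alongside the record; a mismatch means corruption.
        static boolean isValid(long storedCrc, byte[] payload) {
            CRC32 crc = new CRC32();
            crc.update(payload, 0, payload.length);
            return crc.getValue() == storedCrc;
        }
    }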


[jira] [Commented] (KAFKA-3334) First message on new topic not actually being sent, no exception thrown

2016-03-29 Thread Ashish K Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216345#comment-15216345
 ] 

Ashish K Singh commented on KAFKA-3334:
---

[~becket_qin] thanks for the quick review. I think there are always two 
categories of users: naive users who do not want to deal with the complexities 
of Kafka internals, and pro users who know the details of Kafka. Almost always, 
users start with very basic information and then, with time, become well aware 
of Kafka's expected behavior and how to get it to work. Documentation and 
tutorials are of utmost importance to these new users. If we put a great deal 
of information in exceptions, it will still be puzzling to users, since it only 
appears when they hit the issue. IMHO, exceptions are not the best way of 
communicating known potential issues to users. The exception in the first few 
messages with auto topic creation comes up as an issue/question just once; once 
users know the reason, they never ask the same question again. Quite often the 
response we get to such explanations is, "why is this not doc'd". To me, it 
would be better to provide this information in the docs, but I can be convinced 
otherwise. Whether we put it in ops or in the FAQ, and what exactly to put 
there, is up for debate. Also, I do not think we should be providing info on 
how the internals work, rather just info on what users might see and how they 
can avoid it if it happens. [~ijuma] [~salex89] What do you guys suggest?

> First message on new topic not actually being sent, no exception thrown
> ---
>
> Key: KAFKA-3334
> URL: https://issues.apache.org/jira/browse/KAFKA-3334
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.0
> Environment: Linux, Java
>Reporter: Aleksandar Stojadinovic
>Assignee: Ashish K Singh
> Fix For: 0.10.1.0
>
>
> Although I've seen this issue pop around the internet in a few forms, I'm not 
> sure it is yet properly fixed. 
> When publishing to a new topic, with auto create-enabled, the java client 
> (0.9.0) shows this WARN message in the log, and the message is not sent 
> obviously:
> org.apache.kafka.clients.NetworkClient - Error while fetching metadata with 
> correlation id 0 : {file.topic=LEADER_NOT_AVAILABLE}
> In the meantime I see in the console the message that a log for partition is 
> created. The next messages are patched through normally, but the first one is 
> never sent. No exception is ever thrown, either by calling get on the future, 
> or with the async usage, like everything is perfect.
> I notice when I leave my application blocked on the get call, in the 
> debugger, then the message may be processed, but with significant delay. This 
> is consistent with another issue I found for the python client. Also, if I 
> call partitionsFor previously, the topic is created and the message is sent. 
> But it seems silly to call it every time, just to mitigate this issue.
> {code}
> Future<RecordMetadata> recordMetadataFuture = producer.send(new 
> ProducerRecord<>(topic, key, file));
> RecordMetadata recordMetadata = recordMetadataFuture.get(30, 
> TimeUnit.SECONDS);
> {code}
> I hope I'm clear enough.
> Related similar (but not same) issues:
> https://issues.apache.org/jira/browse/KAFKA-1124
> https://github.com/dpkp/kafka-python/issues/150
> http://stackoverflow.com/questions/35187933/how-to-resolve-leader-not-available-kafka-error-when-trying-to-consume



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-3480) Autogenerate metrics documentation

2016-03-29 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216310#comment-15216310
 ] 

Jason Gustafson edited comment on KAFKA-3480 at 3/29/16 4:47 PM:
-

[~wushujames] Feel free to pick this up if you're interested. I actually 
haven't given it a lot of thought, but the JMX approach sounds a little iffy. 
You'd have to also run the consumer and producer to collect their metrics and 
I'm not sure all metrics are initialized upon startup (e.g. topic or node 
metrics are initialized dynamically), so you'd have to have the clients do some 
real work. It might also be difficult to extract patterns from the registered 
metrics. 

I would think doing something like ConfigDef would be a more promising approach 
if you can find a way to support variable substitution in the names (e.g. 
"topics.\{topic\}.bytes-fetched"). We'd probably want the documentation to 
include the variables anyway. There is definitely some nontrivial work here, 
but it would be a valuable contribution.


was (Author: hachikuji):
[~wushujames] Feel free to pick this up if you're interested. I actually 
haven't given it a lot of thought, but the JMX approach sounds a little iffy. 
You'd have to also run the consumer and producer to collect their metrics and 
I'm not sure all metrics are initialized upon startup (e.g. topic or node 
metrics are initialized dynamically), so you'd have to have the clients do some 
real work. It might also be difficult to extract patterns from the registered 
metrics. 

I would think doing something like ConfigDef would be a more promising approach 
if you can find a way to support variable substitution in the names (e.g. 
"topics.{topic}.bytes-fetched"). We'd probably want the documentation to 
include the variables anyway. There is definitely some nontrivial work here, 
but it would be a valuable contribution.

> Autogenerate metrics documentation
> --
>
> Key: KAFKA-3480
> URL: https://issues.apache.org/jira/browse/KAFKA-3480
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>
> Metrics documentation is done manually, which means it's hard to keep it 
> current. It would be nice to have automatic generation of this documentation 
> as we have for configs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3480) Autogenerate metrics documentation

2016-03-29 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216310#comment-15216310
 ] 

Jason Gustafson commented on KAFKA-3480:


[~wushujames] Feel free to pick this up if you're interested. I actually 
haven't given it a lot of thought, but the JMX approach sounds a little iffy. 
You'd have to also run the consumer and producer to collect their metrics and 
I'm not sure all metrics are initialized upon startup (e.g. topic or node 
metrics are initialized dynamically), so you'd have to have the clients do some 
real work. It might also be difficult to extract patterns from the registered 
metrics. 

I would think doing something like ConfigDef would be a more promising approach 
if you can find a way to support variable substitution in the names (e.g. 
"topics.{topic}.bytes-fetched"). We'd probably want the documentation to 
include the variables anyway. There is definitely some nontrivial work here, 
but it would be a valuable contribution.
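
As a rough illustration of the template idea (hypothetical names; not an 
actual Kafka API):

{code}
// Sketch: declare metric names with {variable} placeholders up front, so the
// documentation table can be generated without running real clients.
final class MetricDocTemplate {
    final String nameTemplate;   // e.g. "topics.{topic}.bytes-fetched"
    final String description;

    MetricDocTemplate(String nameTemplate, String description) {
        this.nameTemplate = nameTemplate;
        this.description = description;
    }
}

final class MetricDocRegistry {
    private final java.util.List<MetricDocTemplate> templates = new java.util.ArrayList<>();

    void register(String nameTemplate, String description) {
        templates.add(new MetricDocTemplate(nameTemplate, description));
    }

    // Render a simple table, analogous to what ConfigDef does for configs.
    String toDocTable() {
        StringBuilder sb = new StringBuilder("Name | Description\n");
        for (MetricDocTemplate t : templates) {
            sb.append(t.nameTemplate).append(" | ").append(t.description).append('\n');
        }
        return sb.toString();
    }
}
{code}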

> Autogenerate metrics documentation
> --
>
> Key: KAFKA-3480
> URL: https://issues.apache.org/jira/browse/KAFKA-3480
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>
> Metrics documentation is done manually, which means it's hard to keep it 
> current. It would be nice to have automatic generation of this documentation 
> as we have for configs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Apache Kafka Meetup today, at San Jose Convention Center

2016-03-29 Thread Guozhang Wang
Hello Apache Kafka folks,

We invite you to join us for the March Apache Kafka Meetup today (Tuesday,
March 29) at San Jose Convention Center, starting at 6pm:

http://www.meetup.com/http-kafka-apache-org/events/229424437/

We have two great talks today:

--

Title: Introduction to Kafka Streams Abstract (by Guozhang Wang,
Confluent):

In the past few years Apache Kafka has emerged as the world's most
popular real-time data streaming platform backbone. In this talk, we
introduce Kafka Streams, the latest addition to the Apache Kafka project,
which is a new stream processing library natively integrated with Kafka.

Kafka Streams has a very low barrier to entry, easy operationalization, and
a natural DSL for writing stream processing applications. As such it is the
most convenient yet scalable option to analyze, transform, or otherwise
process data that is backed by Kafka. We will provide the audience with an
overview of Kafka Streams including its design and API, typical use cases,
code examples, and an outlook of its upcoming roadmap. We will also compare
Kafka Streams' light-weight library approach with heavier, framework-based
tools such as Spark Streaming or Storm, which require you to understand and
operate a whole different infrastructure for processing real-time data in
Kafka.

--

Title: Streaming Analytics at 300 billion events/day with Kafka, Samza, and
Druid (by Xavier Léauté, Metamarkets)

Abstract: Wonder what it takes to scale Kafka, Samza, and Druid to handle
complex analytics workloads at petabyte size? We will share a high level
overview of the Metamarkets realtime stack, the lessons learned scaling our
real-time processing to over 3 million events per second, and how we
leverage extensive metric collection to handle heterogeneous processing
workloads, while keeping down operational complexity and cost. Built
entirely on open source, our stack performs streaming joins using Kafka and
Samza, feeding into Druid to serve 1 million interactive queries per day.

--

Thanks,

-- Guozhang


Re: Messages corrupted in kafka

2016-03-29 Thread sunil kalva
Hi
Do we also store the message CRC on disk, and does the server verify it when we
are reading messages back from disk?
And how should we handle errors when we use async publish?

On Fri, Mar 25, 2016 at 4:17 AM, Becket Qin  wrote:

> You mentioned that you saw few corrupted messages, (< 0.1%). If so are you
> able to see some corrupted messages if you produce, say, 10M messages?
>
> On Wed, Mar 23, 2016 at 9:40 PM, sunil kalva 
> wrote:
>
> >  I am using java client and kafka 0.8.2, since events are corrupted in
> > kafka broker i cant read and replay them again.
> >
> > On Thu, Mar 24, 2016 at 9:42 AM, Becket Qin 
> wrote:
> >
> > > Hi Sunil,
> > >
> > > The messages in Kafka has a CRC stored with each of them. When consumer
> > > receives a message, it will compute the CRC from the message bytes and
> > > compare it to the stored CRC. If the computed CRC and stored CRC does
> not
> > > match, that indicates the message has corrupted. I am not sure in your
> > case
> > > why the message is corrupted. Corrupted message seems to  be pretty
> rare
> > > because the broker actually validate the CRC before it stores the
> > messages
> > > on to the disk.
> > >
> > > Is this problem reproduceable? If so, can you find out the messages
> that
> > > are corrupted? Also, are you using the Java clients or some other
> > clients?
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Wed, Mar 23, 2016 at 8:28 PM, sunil kalva 
> > > wrote:
> > >
> > > > can some one help me out here.
> > > >
> > > > On Wed, Mar 23, 2016 at 7:36 PM, sunil kalva 
> > > > wrote:
> > > >
> > > > > Hi
> > > > > I am seeing few messages getting corrupted in kafka, It is not
> > > happening
> > > > > frequently and percentage is also very very less (less than 0.1%).
> > > > >
> > > > > Basically i am publishing thrift events in byte array format to
> kafka
> > > > > topics(with out encoding like base64), and i also see more events
> > than
> > > i
> > > > > publish (i confirm this by looking at the offset for that topic).
> > > > > For example if i publish 100 events and i see 110 as offset for
> that
> > > > topic
> > > > > (since it is in production i could not get exact messages which
> > causing
> > > > > this problem, and we will only realize this problem when we consume
> > > > because
> > > > > our thrift deserialization fails).
> > > > >
> > > > > So my question is, is there any magic byte which actually
> determines
> > > the
> > > > > boundary of the message which is same as the byte i am sending or
> or
> > > for
> > > > > any n/w issues messages get chopped and stores as one message to
> > > multiple
> > > > > messages on server side ?
> > > > >
> > > > > tx
> > > > > SunilKalva
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] KIP-43: Kafka SASL enhancements

2016-03-29 Thread Jun Rao
Rajini,

Thanks for the update. +1 on the proposal.

Jun

On Tue, Mar 29, 2016 at 3:32 AM, Rajini Sivaram <
rajinisiva...@googlemail.com> wrote:

> Jun,
>
> Thank you for reviewing the KIP. Answers below:
>
> 1. Yes, broker can specify *sasl.mechanism. *It is used for all client-mode
> connections including that in inter-broker communication.
>
> 2. If *sasl.enabled.mechanisms* is not specified, the default value of
> {'GSSAPI'} is used. If it is specified, only the protocols specified are
> enabled. This enables brokers to be run with SASL without enabling GSSAPI
> (as we do). Since GSSAPI requires complex Kerberos set up, it is useful to
> have the ability to turn it off.
>
> 3. For the default SASL/PLAIN implementation included in Kafka, username
> (authentication ID) is returned as principal.
>
> I will update the KIP to clarify these points.
>
> Thanks,
>
> Rajini
>
>
> On Mon, Mar 28, 2016 at 6:17 PM, Jun Rao  wrote:
>
> > Hi, Rajini,
> >
> > Sorry for the late response. The revised KIP looks good overall. Just a
> few
> > minor comments below.
> >
> > 1. Since the broker can also act as a client too (for inter broker
> > communication), sasl.mechanism can also be specified in the broker
> > configuration, right?
> > 2. Since we enable GSSAPI by default, is it true that one only needs to
> > specify non-GSSAPI mechanisms in sasl.enabled.mechanisms?
> > 3. For SASL/PLAIN, could we describe what the Principal will
> > Authenticator.principal()
> > return?
> >
> > I will also take a look at the patch. However, since we are getting
> pretty
> > close to 0.10.0.0 release, I think we likely will have to leave this out
> of
> > 0.10.0.0.
> >
> > Thanks,
> >
> > Jun
> >
> > On Thu, Mar 24, 2016 at 2:21 PM, Gwen Shapira  wrote:
> >
> > > I'm afraid it will be a challenge.
> > >
> > > I see few options:
> > > 1. Jun should be back in the office tomorrow. If he votes +1 and agrees
> > > that the PR is ready to merge and is safe and important enough to
> > > double-commit - this could get in yet.
> > > 2. Same as above, but not in time for the Monday release candidate. In
> > this
> > > case, we can get it into 0.10.0.0 if we find other blockers and need to
> > > roll-out another RC.
> > > 3. (most likely) We will finish the vote and review but not in time for
> > > 0.10.0.0. In this case, 0.10.1.0.0 should be out in around 3 month, and
> > > we'll get it in there. You'll be in good company with KIP-35, KIP-4,
> > KIP-48
> > > and few other things that are close to done, are super critical but are
> > > just not ready in time. Thats why we are trying to release more often.
> > >
> > > Gwen
> > >
> > > On Thu, Mar 24, 2016 at 2:08 PM, Rajini Sivaram <
> > > rajinisiva...@googlemail.com> wrote:
> > >
> > > > Gwen,
> > > >
> > > > Ah, I clearly don't know the rules. So it looks like it would not
> > really
> > > be
> > > > possible to get this into 0.10.0.0 after all.
> > > >
> > > > Rajini
> > > >
> > > > On Thu, Mar 24, 2016 at 8:38 PM, Gwen Shapira 
> > wrote:
> > > >
> > > > > Rajini,
> > > > >
> > > > > I think the vote didn't pass yet?
> > > > > If I can see correctly, Harsha and I are the only committers who
> > voted,
> > > > so
> > > > > we are missing a 3rd vote.
> > > > >
> > > > > Gwen
> > > > >
> > > > > On Thu, Mar 24, 2016 at 11:24 AM, Rajini Sivaram <
> > > > > rajinisiva...@googlemail.com> wrote:
> > > > >
> > > > > > Gwen,
> > > > > >
> > > > > > Thank you. I have pinged Ismael, Harsha and Jun Rao for PR
> review.
> > If
> > > > any
> > > > > > of them has time for reviewing the PR, I will update the PR over
> > the
> > > > > > weekend. If you can suggest any other reviewers, I can ping them
> > too.
> > > > > >
> > > > > > Many thanks.
> > > > > >
> > > > > > On Thu, Mar 24, 2016 at 5:03 PM, Gwen Shapira  >
> > > > wrote:
> > > > > >
> > > > > > > This can be discussed in the review.
> > > > > > > If there's good test coverage, is low risk and passes review
> and
> > > gets
> > > > > > > merged before Monday morning...
> > > > > > >
> > > > > > > We won't be doing an extra release candidate just for this
> > though.
> > > > > > >
> > > > > > > Gwen
> > > > > > >
> > > > > > > On Thu, Mar 24, 2016 at 1:21 AM, Rajini Sivaram <
> > > > > > > rajinisiva...@googlemail.com> wrote:
> > > > > > >
> > > > > > > > Gwen,
> > > > > > > >
> > > > > > > > Is it still possible to include this in 0.10.0.0?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Rajini
> > > > > > > >
> > > > > > > > On Wed, Mar 23, 2016 at 11:08 PM, Gwen Shapira <
> > > g...@confluent.io>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Sorry! Got distracted by the impending release!
> > > > > > > > >
> > > > > > > > > +1 on the current revision of the KIP.
> > > > > > > > >
> > > > > > > > > On Wed, Mar 23, 2016 at 3:33 PM, Harsha 
> > > wrote:
> > > > > > > > >
> > > > > > > > > > Any update on this. Gwen since 

[jira] [Commented] (KAFKA-3476) -Xloggc is not recognised by IBM java

2016-03-29 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215837#comment-15215837
 ] 

Rajini Sivaram commented on KAFKA-3476:
---

You can export new values for GC and performance opts prior to running Kafka 
scripts with IBM Java. For example:
{quote}
export KAFKA_GC_LOG_OPTS="-Xverbosegclog:$TRACE_DIR/server-gc.log 
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps"
export KAFKA_JVM_PERFORMANCE_OPTS="-server -Djava.awt.headless=true"
{quote}


>  -Xloggc is not recognised by IBM java
> --
>
> Key: KAFKA-3476
> URL: https://issues.apache.org/jira/browse/KAFKA-3476
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, tools
>Affects Versions: 0.9.0.0
>Reporter: Khirod Patra
>
> Getting below error on AIX server.
> NOTE : java version is :
> --
> java version "1.8.0"
> Java(TM) SE Runtime Environment (build pap6480-20150129_02)
> IBM J9 VM (build 2.8, JRE 1.8.0 AIX ppc64-64 Compressed References 
> 20150116_231420 (JIT enabled, AOT enabled)
> J9VM - R28_Java8_GA_20150116_2030_B231420
> JIT  - tr.r14.java_20150109_82886.02
> GC   - R28_Java8_GA_20150116_2030_B231420_CMPRSS
> J9CL - 20150116_231420)
> JCL - 20150123_01 based on Oracle jdk8u31-b12
> Error :
> ---
> kafka-run-class.sh -name zookeeper -loggc  
> org.apache.zookeeper.server.quorum.QuorumPeerMain 
> ../config/zookeeper.properties
> 
> <verbosegc xmlns="http://www.ibm.com/j9/verbosegc" 
> version="R28_Java8_GA_20150116_2030_B231420_CMPRSS">
> JVMJ9VM007E Command-line option unrecognised: 
> -Xloggc:/home/test_user/containers/kafka_2.11-0.9.0.0/bin/../logs/zookeeper-gc.log
> 
> Error: Could not create the Java Virtual Machine.
> Error: A fatal exception has occurred. Program will exit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] KIP-43: Kafka SASL enhancements

2016-03-29 Thread Rajini Sivaram
Jun,

Thank you for reviewing the KIP. Answers below:

1. Yes, broker can specify *sasl.mechanism. *It is used for all client-mode
connections including that in inter-broker communication.

2. If *sasl.enabled.mechanisms* is not specified, the default value of
{'GSSAPI'} is used. If it is specified, only the protocols specified are
enabled. This enables brokers to be run with SASL without enabling GSSAPI
(as we do). Since GSSAPI requires complex Kerberos set up, it is useful to
have the ability to turn it off.

3. For the default SASL/PLAIN implementation included in Kafka, username
(authentication ID) is returned as principal.

I will update the KIP to clarify these points.
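
To make the split concrete, a small sketch of a client configured along these
lines (property names as proposed in the KIP; the JAAS login configuration is
assumed to be supplied separately):

    import java.util.Properties;

    public class SaslClientConfigSketch {
        public static void main(String[] args) {
            // Client side: exactly one mechanism is chosen via sasl.mechanism.
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9093");
            props.put("security.protocol", "SASL_PLAINTEXT");
            props.put("sasl.mechanism", "PLAIN");   // default is GSSAPI

            // Broker side (server.properties), per the answers above, lists every
            // mechanism it will accept, e.g. sasl.enabled.mechanisms=PLAIN
            // (GSSAPI is only the default when the property is not set at all).
        }
    }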

Thanks,

Rajini


On Mon, Mar 28, 2016 at 6:17 PM, Jun Rao  wrote:

> Hi, Rajini,
>
> Sorry for the late response. The revised KIP looks good overall. Just a few
> minor comments below.
>
> 1. Since the broker can also act as a client too (for inter broker
> communication), sasl.mechanism can also be specified in the broker
> configuration, right?
> 2. Since we enable GSSAPI by default, is it true that one only needs to
> specify non-GSSAPI mechanisms in sasl.enabled.mechanisms?
> 3. For SASL/PLAIN, could we describe what the Principal will
> Authenticator.principal()
> return?
>
> I will also take a look at the patch. However, since we are getting pretty
> close to 0.10.0.0 release, I think we likely will have to leave this out of
> 0.10.0.0.
>
> Thanks,
>
> Jun
>
> On Thu, Mar 24, 2016 at 2:21 PM, Gwen Shapira  wrote:
>
> > I'm afraid it will be a challenge.
> >
> > I see few options:
> > 1. Jun should be back in the office tomorrow. If he votes +1 and agrees
> > that the PR is ready to merge and is safe and important enough to
> > double-commit - this could get in yet.
> > 2. Same as above, but not in time for the Monday release candidate. In
> this
> > case, we can get it into 0.10.0.0 if we find other blockers and need to
> > roll-out another RC.
> > 3. (most likely) We will finish the vote and review but not in time for
> > 0.10.0.0. In this case, 0.10.1.0.0 should be out in around 3 month, and
> > we'll get it in there. You'll be in good company with KIP-35, KIP-4,
> KIP-48
> > and few other things that are close to done, are super critical but are
> > just not ready in time. Thats why we are trying to release more often.
> >
> > Gwen
> >
> > On Thu, Mar 24, 2016 at 2:08 PM, Rajini Sivaram <
> > rajinisiva...@googlemail.com> wrote:
> >
> > > Gwen,
> > >
> > > Ah, I clearly don't know the rules. So it looks like it would not
> really
> > be
> > > possible to get this into 0.10.0.0 after all.
> > >
> > > Rajini
> > >
> > > On Thu, Mar 24, 2016 at 8:38 PM, Gwen Shapira 
> wrote:
> > >
> > > > Rajini,
> > > >
> > > > I think the vote didn't pass yet?
> > > > If I can see correctly, Harsha and I are the only committers who
> voted,
> > > so
> > > > we are missing a 3rd vote.
> > > >
> > > > Gwen
> > > >
> > > > On Thu, Mar 24, 2016 at 11:24 AM, Rajini Sivaram <
> > > > rajinisiva...@googlemail.com> wrote:
> > > >
> > > > > Gwen,
> > > > >
> > > > > Thank you. I have pinged Ismael, Harsha and Jun Rao for PR review.
> If
> > > any
> > > > > of them has time for reviewing the PR, I will update the PR over
> the
> > > > > weekend. If you can suggest any other reviewers, I can ping them
> too.
> > > > >
> > > > > Many thanks.
> > > > >
> > > > > On Thu, Mar 24, 2016 at 5:03 PM, Gwen Shapira 
> > > wrote:
> > > > >
> > > > > > This can be discussed in the review.
> > > > > > If there's good test coverage, is low risk and passes review and
> > gets
> > > > > > merged before Monday morning...
> > > > > >
> > > > > > We won't be doing an extra release candidate just for this
> though.
> > > > > >
> > > > > > Gwen
> > > > > >
> > > > > > On Thu, Mar 24, 2016 at 1:21 AM, Rajini Sivaram <
> > > > > > rajinisiva...@googlemail.com> wrote:
> > > > > >
> > > > > > > Gwen,
> > > > > > >
> > > > > > > Is it still possible to include this in 0.10.0.0?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Rajini
> > > > > > >
> > > > > > > On Wed, Mar 23, 2016 at 11:08 PM, Gwen Shapira <
> > g...@confluent.io>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Sorry! Got distracted by the impending release!
> > > > > > > >
> > > > > > > > +1 on the current revision of the KIP.
> > > > > > > >
> > > > > > > > On Wed, Mar 23, 2016 at 3:33 PM, Harsha 
> > wrote:
> > > > > > > >
> > > > > > > > > Any update on this. Gwen since the KIP is adjusted to
> address
> > > the
> > > > > > > > > pluggable classes we should make a move on this.
> > > > > > > > >
> > > > > > > > > Rajini,
> > > > > > > > >Can you restart the voting thread.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Harsha
> > > > > > > > >
> > > > > > > > > On Wed, Mar 16, 2016, at 06:42 AM, Rajini Sivaram wrote:
> > > > > > > > > > As discussed in the KIP meeting 

Re: Kafka Connect ++ Kafka Streams

2016-03-29 Thread Michal Haris
Replies from the dev list weren't getting to my original sender account. I
could create a pull request, but it has some patchy code in ConnectEmbedded
and dependencies which shouldn't be there (connect-runtime), as it was
mostly intended for imagining how the integrated connect and streams
topology would look - but if you are happy for :stream:examples to have
dependencies that can be removed after connect-api has proper embedded
support, then I can send a pull request your way.

On 25 March 2016 at 21:38, Guozhang Wang  wrote:

> I am thinking maybe we can even consider pulling the project as a whole
> into examples instead of adding the connector and streams implementation
> separately into Kafka Connect and Kafka Streams if Michal is interested to
> filing a PR: currently the examples folder only contains consumer /
> producer demos which is packaged as kafka.examples.
>
> Guozhang
>
>
> On Fri, Mar 25, 2016 at 9:28 AM, Neha Narkhede  wrote:
>
> > Michal -- This is really cool. Mind submitting a pull request?
> >
> > Also, would you like your IRC connector to be featured on the Kafka
> > Connector Hub ?
> >
> > On Fri, Mar 25, 2016 at 9:08 AM, Michal Hariš 
> > wrote:
> >
> > > So I had a go and hacked it up here: ConnectEmbedded.java
> > > <
> > >
> >
> https://github.com/amient/affinity-stack/blob/master/dev/connectors/connect-runtime/src/main/java/io/amient/kafka/connect/ConnectEmbedded.java
> > > >
> > >
> > >
> > > And this is how the wikipedia demo looks with it: hello-kafka-streams
> > > <
> > >
> >
> https://github.com/amient/affinity-stack/blob/master/dev/hello-kafka-streams/src/main/java/io/amient/kafka/streams/wikipedia/WikipediaStreamAppMain.java
> > > >
> > >
> > >
> > > As a side-effect there is a generic IRC connector too:
> kafka-connect-irc
> > > <
> > >
> >
> https://github.com/amient/affinity-stack/tree/master/dev/connectors/kafka-connect-irc/src/main/java/io/amient/kafka/connect/irc
> > > >
> > >
> > > It's kind of neat to have topology encapsulating connect and streams
> in a
> > > single instance that can just be scaled together symmetrically.
> > >
> > > Overall this was one of the most fun hack I had in a long time and the
> > > result compared to the Samza equivalent looks clean and lightweight. It
> > > also allows for zero-downtime with appropriate combination of
> deployment
> > > strategy and replication, which is something that was quite tricky with
> > > Samza and  YARN host affinity.
> > >
> > > One thing though I can't get my head around is why in Kafka Connect
> there
> > > has to be a custom internal schema format  for the in-memory runtime
> > > instead of just using Avro as the internal - the systems that talk in
> > Avro
> > > would have a performance gain and non-Avro guys would have converters
> the
> > > same way they have them now.
> > >
> > >
> > > On Thu, Mar 24, 2016 at 11:46 AM, Michal Hariš <
> michal.har...@gmail.com>
> > > wrote:
> > >
> > > > Hello Kafka people!
> > > >
> > > > Great to see Kafka Streams coming along, the design validates (and in
> > > many
> > > > way supersedes) my own findings from working with various stream
> > > processing
> > > > systems/frameworks and eventually ending-up using just a small custom
> > > > library built directly around Kafka.
> > > >
> > > > I have set out yesterday to translate Hello Samza (the wikipedia feed
> > > > example) into Kafka Streams application. Now because this workflow
> > starts
> > > > by polling wikipedia IRC and publishes to a topic from which the
> stream
> > > > processors pick-up it would be nice to have this first part done by
> > Kafka
> > > > Connect but:
> > > >
> > > > 1. IRC channels are not seekable and Kafka Connect architecture
> claims
> > > > that all sources must be seekable - is this still suitable ? (I guess
> > yes
> > > > as FileStreamSourceTask can read from stdin which is similar)
> > > >
> > > > 2. I would like to have ConnectEmbedded (as opposed to
> > ConnectStandalone
> > > > or ConnectDistributed) which is similar to ConnectDistributed, just
> > > without
> > > > the rest server - i.e. say I have the WikipediaFeedConnector and I
> want
> > > to
> > > > launch it programatically from all the instances along-side the Kafka
> > > > Streams - but reusing the connect distributed coordination so that
> only
> > > one
> > > > instance actually reads the IRC data but another instance picks up
> work
> > > if
> > > > that one dies - does it sound like a bad idea for some design reason
> ?
> > -
> > > > the only problem I see is rather technical that the coordination
> > process
> > > > uses the rest server for some actions.
> > > >
> > > > Cheers,
> > > > Michal
> > > >
> > >
> >
> >
> >
> > --
> > Thanks,
> > Neha
> >
>
>
>
> --
> -- Guozhang
>



-- 
Michal Haris
Technical Architect
direct line: +44 (0) 207 749 0229
www.visualdna.com | t: +44 (0) 207 734 7033
31 Old Nichol Street

Request for edit access to Kafka wiki

2016-03-29 Thread Ganesh Nikam
Hi All,



I want to publish a C++ Kafka client. I have my git repository ready. Now I
want to add an entry for this new client on the Kafka “Clients” page
(Confluence wiki page).

I created a login for Confluence and logged in with it, but I am not
able to edit the page. Do I need to do anything else to get write
access?



If you can give me write access, that would be very helpful. Here is
my Confluence user name:

User name : ganesh.nikam





Regards

Ganesh Nikam


RE: [VOTE] 0.10.0.0 RC0

2016-03-29 Thread Simon Cooper
Sure, many thanks for looking at it

SimonC

-Original Message-
From: Gwen Shapira [mailto:g...@confluent.io] 
Sent: 28 March 2016 22:30
To: dev@kafka.apache.org
Subject: Re: [VOTE] 0.10.0.0 RC0

Since Dana and few others found critical issues for the release, I've rolled 
out another RC.

I'll send the vote thread in a second.

Simon, since KAFKA-3296 doesn't have a solution yet, I could not include it in 
the release candidate.
We can roll out another RC if we find a solution, or we can defer this to
0.10.0.1 if we don't manage to fix this issue in the next week or two.

Gwen

On Tue, Mar 22, 2016 at 3:33 AM, Simon Cooper < 
simon.coo...@featurespace.co.uk> wrote:

> I would like KAFKA-3296 to be looked at for 0.10, this is causing us 
> significant issues in our testing. There's a comment in the bug that 
> this is possibly caused by KAFKA-3215, but that was fixed in 0.9, and 
> KAFKA-3296 is reproducible in 0.9, so it could be a regression.
>
> SimonC
>
> -Original Message-
> From: Gwen Shapira [mailto:g...@confluent.io]
> Sent: 21 March 2016 20:53
> To: dev@kafka.apache.org; us...@kafka.apache.org; 
> kafka-clie...@googlegroups.com
> Subject: [VOTE] 0.10.0.0 RC0
>
> Hello Kafka users, developers and client-developers,
>
> This is the first candidate for release of Apache Kafka 0.10.0.0.
> This is a major release that includes: (1) New message format 
> including timestamps (2) client interceptor API (3) Kafka Streams. 
> Since this is a major release, we will give people more time to try it 
> out and give feedback.
>
> Release notes for the 0.10.0.0 release:
> http://home.apache.org/~gwenshap/0.10.0.0-rc0/RELEASE_NOTES.HTML
>
> *** Please download, test and vote by Monday, March 28, 9am PT
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> http://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> http://home.apache.org/~gwenshap/0.10.0.0-rc0/
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/
>
> * scala-doc
> http://home.apache.org/~gwenshap/0.10.0.0-rc0/scaladoc
>
> * java-doc
> http://home.apache.org/~gwenshap/0.10.0.0-rc0/javadoc/
>
> * tag to be voted upon (off 0.10.0 branch) is the 0.10.0.0 tag:
>
> https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=72fd542633
> a95a8bd5bdc9fdca56042b643cb4b0
>
> * Documentation:
> http://kafka.apache.org/0100/documentation.html
>
> * Protocol:
> http://kafka.apache.org/0100/protocol.html
>
> /**
>
> Thanks,
>
> Gwen
>


[jira] [Commented] (KAFKA-3334) First message on new topic not actually being sent, no exception thrown

2016-03-29 Thread Aleksandar Stojadinovic (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215668#comment-15215668
 ] 

Aleksandar Stojadinovic commented on KAFKA-3334:


I'm following the discussion carefully.

An exception would be my preferred choice as an end user. However, now I know 
what the root cause of the exception would be. If I were a new user (evaluating 
Kafka, maybe), and if it were not properly documented that there might be an 
exception at the start, then after encountering an exception I might be left 
wondering why my cluster is not working.

It's also not completely clear to me: is it the producer's responsibility to 
always issue a TopicMetadataRequest before sending a message, which should 
supposedly resolve the "silent fail" issue?

In the meantime I have moved to manual topic creation in risky places.
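
For reference, a small sketch of the partitionsFor workaround mentioned in the 
description below (a stopgap only; producer, topic, key and file are the same 
as in the quoted snippet):

{code}
// Workaround sketch: force a metadata fetch (and auto topic creation) before
// the first send, so the initial record is not silently dropped while
// LEADER_NOT_AVAILABLE is being resolved. partitionsFor blocks until the
// topic metadata is available.
producer.partitionsFor(topic);

Future<RecordMetadata> future = producer.send(new ProducerRecord<>(topic, key, file));
RecordMetadata metadata = future.get(30, TimeUnit.SECONDS);
{code}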

> First message on new topic not actually being sent, no exception thrown
> ---
>
> Key: KAFKA-3334
> URL: https://issues.apache.org/jira/browse/KAFKA-3334
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.0
> Environment: Linux, Java
>Reporter: Aleksandar Stojadinovic
>Assignee: Ashish K Singh
> Fix For: 0.10.1.0
>
>
> Although I've seen this issue pop around the internet in a few forms, I'm not 
> sure it is yet properly fixed. 
> When publishing to a new topic, with auto create-enabled, the java client 
> (0.9.0) shows this WARN message in the log, and the message is not sent 
> obviously:
> org.apache.kafka.clients.NetworkClient - Error while fetching metadata with 
> correlation id 0 : {file.topic=LEADER_NOT_AVAILABLE}
> In the meantime I see in the console the message that a log for partition is 
> created. The next messages are patched through normally, but the first one is 
> never sent. No exception is ever thrown, either by calling get on the future, 
> or with the async usage, like everything is perfect.
> I notice when I leave my application blocked on the get call, in the 
> debugger, then the message may be processed, but with significant delay. This 
> is consistent with another issue I found for the python client. Also, if I 
> call partitionsFor previously, the topic is created and the message is sent. 
> But it seems silly to call it every time, just to mitigate this issue.
> {code}
> Future<RecordMetadata> recordMetadataFuture = producer.send(new 
> ProducerRecord<>(topic, key, file));
> RecordMetadata recordMetadata = recordMetadataFuture.get(30, 
> TimeUnit.SECONDS);
> {code}
> I hope I'm clear enough.
> Related similar (but not same) issues:
> https://issues.apache.org/jira/browse/KAFKA-1124
> https://github.com/dpkp/kafka-python/issues/150
> http://stackoverflow.com/questions/35187933/how-to-resolve-leader-not-available-kafka-error-when-trying-to-consume



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-3482) [DOC] - Remove the part about LinkedIn running ZK 3.3.X

2016-03-29 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-3482.

Resolution: Duplicate

This is a duplicate of KAFKA-2930 (which has a PR).

> [DOC] - Remove the part about LinkedIn running ZK 3.3.X
> ---
>
> Key: KAFKA-3482
> URL: https://issues.apache.org/jira/browse/KAFKA-3482
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>
> LinkedIn, and everyone else is currently running ZK 3.4.5 / 3.4.6 or maybe 
> higher.
> 3.3.X is definitely not recommended any more. Lets get it out of our docs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)