Re: [VOTE] KIP-160: Augment KStream.print() to allow users pass in extra parameters in the printed string

2017-06-22 Thread James Chain
Hi all,

I applied the original idea to KStream#writeAsText() and also updated my
pull request.
Please re-review and re-cast the vote.

James Chien


Re: Kafka streams KStream and ktable join issue

2017-06-22 Thread Matthias J. Sax
Hi,

Can you reproduce the error reliably? Are you using 0.10.2.0 or 0.10.2.1?

It's unclear to me how an NPE can occur. It seems to happen within the
Streams library. Might be a bug. Not sure atm.


-Matthias

On 6/22/17 9:43 AM, Shekar Tippur wrote:
> Hello,
> 
> I am trying to perform a simple join operation. I am using Kafka 0.10.2
> 
> I have a "raw" table and a "cache" topics and just 1 partition in my local
> environment.
> 
> The KTable has these entries:
> 
> {"Joe": {"location": "US", "gender": "male"}}
> {"Julie": {"location": "US", "gender": "female"}}
> {"Kawasaki": {"location": "Japan", "gender": "male"}}
> 
> The KStream gets an event:
> 
> {"user": "Joe", "custom": {"choice":"vegan"}}
> 
> I want the joined output to be:
> 
> {"user": "Joe", "custom": {"choice":"vegan","enriched":*{"location": "US",
> "gender": "male"}*} }
> 
> I want to take what's in the KTable and add it to the "enriched" section of
> the output stream.
> 
> I have defined a serde:
> 
> //This is the same serde code from the example.
> 
> final TestStreamsSerializer jsonSerializer = new
> TestStreamsSerializer();
> final TestStreamsDeserialzer jsonDeserializer = new
> TestStreamsDeserialzer();
> final Serde jsonSerde = Serdes.serdeFrom(jsonSerializer,
> jsonDeserializer);
> 
> //
> 
> KStream raw = builder.stream(Serdes.String(),
> jsonSerde, "raw");
> KTable  cache = builder.table("cache", "local-cache");
> 
> raw.leftJoin(cache,
> (record1, record2) -> record1.get("user") + "-" + 
> record2).to("output");
> 
> I am having trouble understanding how to call the join API.
> 
> With the above code, I seem to get an error:
> 
> [2017-06-22 09:23:31,836] ERROR User provided listener
> org.apache.kafka.streams.processor.internals.StreamThread$1 for group
> streams-pipe failed on partition assignment
> (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
> 
> java.lang.NullPointerException
> 
> at org.rocksdb.RocksDB.put(RocksDB.java:488)
> 
> at
> org.apache.kafka.streams.state.internals.RocksDBStore.putInternal(RocksDBStore.java:254)
> 
> at
> org.apache.kafka.streams.state.internals.RocksDBStore.access$000(RocksDBStore.java:67)
> 
> at
> org.apache.kafka.streams.state.internals.RocksDBStore$1.restore(RocksDBStore.java:164)
> 
> at
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.restoreActiveState(ProcessorStateManager.java:242)
> 
> at
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.register(ProcessorStateManager.java:201)
> 
> at
> org.apache.kafka.streams.processor.internals.AbstractProcessorContext.register(AbstractProcessorContext.java:99)
> 
> at
> org.apache.kafka.streams.state.internals.RocksDBStore.init(RocksDBStore.java:160)
> 
> at
> org.apache.kafka.streams.state.internals.MeteredKeyValueStore$7.run(MeteredKeyValueStore.java:100)
> 
> at
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:188)
> 
> at
> org.apache.kafka.streams.state.internals.MeteredKeyValueStore.init(MeteredKeyValueStore.java:131)
> 
> at
> org.apache.kafka.streams.state.internals.CachingKeyValueStore.init(CachingKeyValueStore.java:63)
> 
> at
> org.apache.kafka.streams.processor.internals.AbstractTask.initializeStateStores(AbstractTask.java:86)
> 
> at
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:141)
> 
> at
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:864)
> 
> at
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1237)
> 
> at
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1210)
> 
> at
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:967)
> 
> at
> org.apache.kafka.streams.processor.internals.StreamThread.access$600(StreamThread.java:69)
> 
> at
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:234)
> 
> at
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:259)
> 
> at
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:352)
> 
> at
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
> 
> at
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:290)
> 
> at
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1029)
> 
> at
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
> 
> at
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:592)
> 
> at
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:361)
> 
> [2017-06-22 09:23:31,849] WARN stream-thread [StreamThread-1] Unexpected
> state transition from 
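For reference, a KStream-KTable join matches on the record key, so a sketch of the
enrichment above has to assume both topics are keyed by the user name (e.g. "Joe") and
that the JsonNode serde from the post is reused; class, method and topic names below are
illustrative only:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.ObjectNode;

import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;
import org.apache.kafka.streams.kstream.KTable;

class EnrichmentJoinSketch {

    static KStreamBuilder build(final Serde<JsonNode> jsonSerde) {
        final KStreamBuilder builder = new KStreamBuilder();
        final KStream<String, JsonNode> raw = builder.stream(Serdes.String(), jsonSerde, "raw");
        final KTable<String, JsonNode> cache = builder.table(Serdes.String(), jsonSerde, "cache", "local-cache");

        // The ValueJoiner gets the stream value and the matching table value (null if no match).
        final KStream<String, JsonNode> enriched = raw.leftJoin(cache, (event, user) -> {
            final ObjectNode out = event.deepCopy();
            if (user != null) {
                // Attach the table row under custom.enriched, as in the desired output.
                ((ObjectNode) out.get("custom")).set("enriched", user);
            }
            return out;
        });
        enriched.to(Serdes.String(), jsonSerde, "output");
        return builder;
    }
}

If the stream records are not keyed by the user name, a selectKey() (plus a through() to
repartition) would be needed first so the join keys line up.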

[GitHub] kafka pull request #3417: MINOR: Correct the ConsumerPerformance print forma...

2017-06-22 Thread ConeyLiu
GitHub user ConeyLiu opened a pull request:

https://github.com/apache/kafka/pull/3417

MINOR: Correct the ConsumerPerformance print format

Currently, the output of `ConsumerPerformance` looks strange. The `header` 
format is as follows:
```
"time, threadId, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, 
nMsg.sec"
```
while the `body` is printed as follows:
```
println("%s, %d, %.4f, %.4f, %d, %.4f".format(dateFormat.format(endMs), id, 
totalMBRead,
1000.0 * (mbRead / elapsedMs), messagesRead, ((messagesRead - 
lastMessagesRead) / elapsedMs) * 1000.0))
```

So we get the following result:
```
time, data.consumeed.in.MB, MB.sec, data.consumeed.in.nMsg, nMsg.Sec
09:52:00, 0, 1100.3086, 220.0177, 563358, 112649.0702
```
So the `header` and `body` are mismatched.

This PR also makes the functions more readable.
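One way to avoid such mismatches, illustrated in Java rather than the actual Scala code, 
is to keep the header and the row format in lockstep so every printed value has a 
matching column:
```
public class CsvOutputSketch {
    public static void main(String[] args) {
        String header = String.join(", ",
                "time", "threadId", "data.consumed.in.MB", "MB.sec",
                "data.consumed.in.nMsg", "nMsg.sec");
        String row = String.format("%s, %d, %.4f, %.4f, %d, %.4f",
                "09:52:00", 0, 1100.3086, 220.0177, 563358, 112649.0702);
        System.out.println(header); // same six columns...
        System.out.println(row);    // ...as the six formatted values
    }
}
```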

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ConeyLiu/kafka consumertest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3417.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3417


commit 7e90edb4b7deb828a96d41aa0d66a8dd5b872c0d
Author: Xianyang Liu 
Date:   2017-06-23T02:06:03Z

small fix

commit b0889209f4cd245c6e7de39d694de2dd0e9dfb8b
Author: Xianyang Liu 
Date:   2017-06-23T02:06:31Z

Merge remote-tracking branch 'kafka/trunk' into consumertest




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [VOTE] 0.11.0.0 RC2

2017-06-22 Thread Ismael Juma
A quick note on notable changes since rc1:

1. A significant performance improvement if transactions are enabled:
https://github.com/apache/kafka/commit/f239f1f839f8bcbd80cce2a4a8643e15d340be8e
2. Fixed a controller regression if many brokers are started
simultaneously:
https://github.com/apache/kafka/commit/c0033b0e0b9e56242752c82f15c6388d041914a1
3. Fixed a couple of Connect regressions:
https://github.com/apache/kafka/commit/c029960bf4ae2cd79b22886f4ee519c4af0bcc8b
and
https://github.com/apache/kafka/commit/1d65f15f2b656b7817eeaf6ee1d36eb3e2cf063f
4. Fixed an important log cleaner issue:
https://github.com/apache/kafka/commit/186e3d5efc79ed803f0915d472ace77cbec88694

Full diff:
https://github.com/apache/kafka/compare/5b351216621f52a471c21826d0dec3ce3187e697...0.11.0.0-rc2

Ismael

On Fri, Jun 23, 2017 at 2:16 AM, Ismael Juma  wrote:

> Hello Kafka users, developers and client-developers,
>
> This is the third candidate for release of Apache Kafka 0.11.0.0.
>
> This is a major version release of Apache Kafka. It includes 32 new KIPs.
> See the release notes and release plan (https://cwiki.apache.org/
> confluence/display/KAFKA/Release+Plan+0.11.0.0) for more details. A few
> feature highlights:
>
> * Exactly-once delivery and transactional messaging
> * Streams exactly-once semantics
> * Admin client with support for topic, ACLs and config management
> * Record headers
> * Request rate quotas
> * Improved resiliency: replication protocol improvement and
> single-threaded controller
> * Richer and more efficient message format
>
> Release notes for the 0.11.0.0 release:
> http://home.apache.org/~ijuma/kafka-0.11.0.0-rc2/RELEASE_NOTES.html
>
> *** Please download, test and vote by Tuesday, June 27, 6pm PT
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> http://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> http://home.apache.org/~ijuma/kafka-0.11.0.0-rc2/
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>
> * Javadoc:
> http://home.apache.org/~ijuma/kafka-0.11.0.0-rc2/javadoc/
>
> * Tag to be voted upon (off 0.11.0 branch) is the 0.11.0.0 tag:
> https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=
> 8698fa1f41102f1664b05baa4d6953fc9564d91e
>
> * Documentation:
> http://kafka.apache.org/0110/documentation.html
>
> * Protocol:
> http://kafka.apache.org/0110/protocol.html
>
> * Successful Jenkins builds for the 0.11.0 branch:
> Unit/integration tests: https://builds.apache.org/job/
> kafka-0.11.0-jdk7/187/
> System tests: pending (will send an update tomorrow)
>
> /**
>
> Thanks,
> Ismael
>


[VOTE] 0.11.0.0 RC2

2017-06-22 Thread Ismael Juma
Hello Kafka users, developers and client-developers,

This is the third candidate for release of Apache Kafka 0.11.0.0.

This is a major version release of Apache Kafka. It includes 32 new KIPs.
See the release notes and release plan (
https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+0.11.0.0)
for more details. A few feature highlights:

* Exactly-once delivery and transactional messaging
* Streams exactly-once semantics
* Admin client with support for topic, ACLs and config management
* Record headers
* Request rate quotas
* Improved resiliency: replication protocol improvement and single-threaded
controller
* Richer and more efficient message format

Release notes for the 0.11.0.0 release:
http://home.apache.org/~ijuma/kafka-0.11.0.0-rc2/RELEASE_NOTES.html

*** Please download, test and vote by Tuesday, June 27, 6pm PT

Kafka's KEYS file containing PGP keys we use to sign the release:
http://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
http://home.apache.org/~ijuma/kafka-0.11.0.0-rc2/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/org/apache/kafka/

* Javadoc:
http://home.apache.org/~ijuma/kafka-0.11.0.0-rc2/javadoc/

* Tag to be voted upon (off 0.11.0 branch) is the 0.11.0.0 tag:
https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=8698fa1f41102f1664b05baa4d6953fc9564d91e

* Documentation:
http://kafka.apache.org/0110/documentation.html

* Protocol:
http://kafka.apache.org/0110/protocol.html

* Successful Jenkins builds for the 0.11.0 branch:
Unit/integration tests: https://builds.apache.org/job/kafka-0.11.0-jdk7/187/
System tests: pending (will send an update tomorrow)

/**

Thanks,
Ismael


Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-22 Thread Matthias J. Sax
I also think that one config is better, with two default
implementations: failing and log-and-continue.

However, I think we should fail by default. Similar to the timestamp
extractor, "silent" data loss is no good default behavior IMHO.


-Matthias

On 6/22/17 11:00 AM, Eno Thereska wrote:
> Answers inline: 
> 
>> On 22 Jun 2017, at 03:26, Guozhang Wang  wrote:
>>
>> Thanks for the updated KIP, some more comments:
>>
>> 1. The config name is "default.deserialization.exception.handler" while the
>> interface class name is "RecordExceptionHandler", which is more general
>> than the intended purpose. Could we rename the class name accordingly?
> 
> Sure.
> 
> 
>>
>> 2. Could you describe the full implementation of "DefaultExceptionHandler",
>> currently it is not clear to me how it is implemented with the configured
>> value.
>>
>> In addition, I think we do not need to include an additional
>> "DEFAULT_DESERIALIZATION_EXCEPTION_RESPONSE_CONFIG" as the configure()
>> function is mainly used for users to pass any customized parameters that are
>> out of the Streams library; plus adding such an additional config sounds
>> over-complicated for a default exception handler. Instead I'd suggest we
>> just provide two handlers (or three if people feel strong about the
>> LogAndThresholdExceptionHandler), one for FailOnExceptionHandler and one
>> for LogAndContinueOnExceptionHandler. And we can set
>> LogAndContinueOnExceptionHandler
>> by default.
>>
> 
> That's what I had originally. Jay mentioned he preferred one default class, 
> with config options.
> So with that approach, you'd have 2 config options, one for failing, one for 
> continuing, and the one
> exception handler would take those options during its configure() call.
> 
> After checking the other exception handlers in the code, I might revert back 
> to what I originally had (2 default handlers) 
> as Guozhang also re-suggests, but still have the interface extend 
> Configurable. Guozhang, you ok with that? In that case
> there is no need for the response config option.
> 
> Thanks
> Eno
> 
> 
>>
>> Guozhang
>>
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Jun 21, 2017 at 1:39 AM, Eno Thereska wrote:
>>
>>> Thanks Guozhang,
>>>
>>> I’ve updated the KIP and hopefully addressed all the comments so far. In
>>> the process also changed the name of the KIP to reflect its scope better:
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-161%3A+streams+deserialization+exception+handlers
>>>
>>> Any other feedback appreciated, otherwise I’ll start the vote soon.
>>>
>>> Thanks
>>> Eno
>>>
 On Jun 12, 2017, at 6:28 AM, Guozhang Wang  wrote:

 Eno, Thanks for bringing this proposal up and sorry for getting late on
 this. Here are my two cents:

 1. First some meta comments regarding "fail fast" v.s. "making
>>> progress". I
 agree that in general we should better "enforce user to do the right
>>> thing"
 in system design, but we also need to keep in mind that Kafka is a
 multi-tenant system, i.e. from a Streams app's pov you probably would not
 control the whole streaming processing pipeline end-to-end. E.g. Your
>>> input
 data may not be controlled by yourself; it could be written by another
>>> app,
 or another team in your company, or even a different organization, and if
 an error happens maybe you cannot fix "to do the right thing" just by
 yourself in time. In such an environment I think it is important to leave
 the door open to let users be more resilient. So I find the current
 proposal which does leave the door open for either fail-fast or make
 progress quite reasonable.

 2. On the other hand, if the question is whether we should provide a
 built-in "send to bad queue" handler from the library, I think that might
 be an overkill: with some tweaks (see my detailed comments below) on the
 API we can allow users to implement such handlers pretty easily. In
>>> fact, I
 feel even "LogAndThresholdExceptionHandler" is not necessary as a
>>> built-in
 handler, as it would then require users to specify the threshold via
 configs, etc. I think letting people provide such "eco-libraries" may be
 better.

 3. Regarding the CRC error: today we validate CRC on both the broker end
 upon receiving produce requests and on consumer end upon receiving fetch
 responses; and if the CRC validation fails in the former case it would
>>> not
 be appended to the broker logs. So if we do see a CRC failure on the
 consumer side it has to be that either we have a flipped bit on the
>>> broker
 disks or over the wire. For the first 

Jenkins build is back to normal : kafka-0.11.0-jdk7 #187

2017-06-22 Thread Apache Jenkins Server
See 




Minimum Replication Factor

2017-06-22 Thread Stephane Maarek
Hi all,

 

Interested in getting people’s opinion on something.

The problem I have is that some people launch a Streams app in our cluster but 
forget to set a replication factor > 1. Then it's a pain to increase the 
topic's RF when we notice some topic partitions going offline because we 
reboot brokers.
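
For reference, the setting that gets forgotten is Streams' replication.factor, which 
defaults to 1. A minimal sketch of setting it explicitly (application id and bootstrap 
servers are placeholders):

import java.util.Properties;

import org.apache.kafka.streams.StreamsConfig;

public class StreamsReplicationConfig {
    public static Properties streamsProps() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        // Applied to the internal changelog/repartition topics that Streams creates.
        props.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 3);
        return props;
    }
}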

 

I have two solutions for this, which I'm interested in hearing opinions on:

1. Make the replication.factor in Kafka Streams "opinionated / smart" by changing 
the default to a dynamic min(3, # brokers).
2. Create a "minimum.replication.factor" setting in the Kafka brokers. If any topic is 
created with an RF less than the minimum, Kafka says no and doesn't create the topic. 
That would ensure no topics get "miscreated" in production clusters and ease the pain 
for devs, devops and support.
 

Thoughts? 

My preference goes towards 2). 

 

Cheers!

Stephane 



Re: CreateTopicResponse Error Code 41

2017-06-22 Thread Ismael Juma
Yes, it needs to be sent to the Controller. Metadata response has the
controller id.

Ismael

On Fri, Jun 23, 2017 at 12:45 AM, Vineet Goel  wrote:

> Hi,
>
> I get an Error (Code 41) when sending a CreateTopicRequest to any of the
> brokers except 1. Why might this be? Does this request need to be sent to a
> specific broker?
>
> Best,
> Vineet
>
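
For what it's worth, error code 41 is NOT_CONTROLLER. With 0.11 the Java AdminClient
does the controller lookup for you; a minimal sketch (broker address and topic settings
are placeholders):

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // The client fetches metadata first, learns which broker is the
            // controller, and routes the CreateTopics request there.
            admin.createTopics(Collections.singleton(new NewTopic("my-topic", 3, (short) 2)))
                 .all()
                 .get();
        }
    }
}

If you are implementing the protocol directly, send a Metadata request first and use the
controller_id field in the response to pick the broker to target.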


CreateTopicResponse Error Code 41

2017-06-22 Thread Vineet Goel
Hi,

I get an Error (Code 41) when sending a CreateTopicRequest to any of the
brokers except 1. Why might this be? Does this request need to be sent to a
specific broker?

Best,
Vineet


[GitHub] kafka pull request #2575: MINOR: update AWS test setup guide

2017-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2575


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3416: MINOR: improve test README

2017-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3416


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3416: MINOR: improve test README

2017-06-22 Thread mjsax
GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/3416

MINOR: improve test README



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka minor-aws

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3416.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3416


commit df1e189ea751af809bab69079ccd63dbd237dc0c
Author: Matthias J. Sax 
Date:   2017-06-22T23:13:45Z

MINOR: improve test README




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is back to normal : kafka-trunk-jdk7 #2450

2017-06-22 Thread Apache Jenkins Server
See 




[GitHub] kafka pull request #3414: HOTFIX: reduce log verbosity on commit

2017-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3414


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-5504) Kafka controller is not getting elected

2017-06-22 Thread Ashish Kumar (JIRA)
Ashish Kumar created KAFKA-5504:
---

 Summary: Kafka controller is not getting elected
 Key: KAFKA-5504
 URL: https://issues.apache.org/jira/browse/KAFKA-5504
 Project: Kafka
  Issue Type: Bug
  Components: controller
Affects Versions: 0.9.0.1
Reporter: Ashish Kumar


I have a Kafka cluster of 20 nodes and have been facing an under-replicated 
topics issue for the last few days, so I decided to restart the broker that was 
acting as the controller. But after the restart I'm getting the logs below on 
all the brokers (it seems the controller election never settles and a new 
leader keeps getting elected):

[2017-06-23 02:59:50,388] INFO Result of znode creation is: NODEEXISTS 
(kafka.utils.ZKCheckedEphemeral)
[2017-06-23 02:59:50,396] INFO New leader is 12 
(kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
[2017-06-23 02:59:50,410] INFO Rolled new log segment for 
'dpe_feedback_rating_history-4' in 0 ms. (kafka.log.Log)
[2017-06-23 02:59:51,585] INFO Creating /controller (is it secure? false) 
(kafka.utils.ZKCheckedEphemeral)
[2017-06-23 02:59:51,590] INFO Result of znode creation is: NODEEXISTS 
(kafka.utils.ZKCheckedEphemeral)
[2017-06-23 02:59:51,609] INFO New leader is 11 
(kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
[2017-06-23 02:59:52,792] INFO Creating /controller (is it secure? false) 
(kafka.utils.ZKCheckedEphemeral)
[2017-06-23 02:59:52,799] INFO Result of znode creation is: NODEEXISTS 
(kafka.utils.ZKCheckedEphemeral)
[2017-06-23 02:59:52,808] INFO New leader is 12 
(kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
[2017-06-23 02:59:54,122] INFO New leader is 3 
(kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
[2017-06-23 02:59:55,504] INFO Creating /controller (is it secure? false) 
(kafka.utils.ZKCheckedEphemeral)
[2017-06-23 02:59:55,512] INFO Result of znode creation is: NODEEXISTS 
(kafka.utils.ZKCheckedEphemeral)
[2017-06-23 02:59:55,520] INFO New leader is 11 
(kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
[2017-06-23 02:59:56,695] INFO Creating /controller (is it secure? false) 
(kafka.utils.ZKCheckedEphemeral)
[2017-06-23 02:59:56,701] INFO Result of znode creation is: NODEEXISTS 
(kafka.utils.ZKCheckedEphemeral)
[2017-06-23 02:59:56,709] INFO New leader is 11 
(kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
[2017-06-23 02:59:57,949] INFO Creating /controller (is it secure? false) 
(kafka.utils.ZKCheckedEphemeral)
[2017-06-23 02:59:57,955] INFO Result of znode creation is: NODEEXISTS 
(kafka.utils.ZKCheckedEphemeral)
[2017-06-23 02:59:57,965] INFO New leader is 12 
(kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
[2017-06-23 02:59:59,378] INFO Creating /controller (is it secure? false) 
(kafka.utils.ZKCheckedEphemeral)
[2017-06-23 02:59:59,384] INFO Result of znode creation is: NODEEXISTS 
(kafka.utils.ZKCheckedEphemeral)
[2017-06-23 02:59:59,395] INFO New leader is 12 
(kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
.
Please help.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: kafka-trunk-jdk7 #2449

2017-06-22 Thread Apache Jenkins Server
See 

--
[...truncated 2.42 MB...]

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryStateWithNonZeroSizedCache PASSED

org.apache.kafka.streams.integration.KStreamAggregationDedupIntegrationTest > 
shouldReduce STARTED

org.apache.kafka.streams.integration.KStreamAggregationDedupIntegrationTest > 
shouldReduce PASSED

org.apache.kafka.streams.integration.KStreamAggregationDedupIntegrationTest > 
shouldGroupByKey STARTED

org.apache.kafka.streams.integration.KStreamAggregationDedupIntegrationTest > 
shouldGroupByKey PASSED

org.apache.kafka.streams.integration.KStreamAggregationDedupIntegrationTest > 
shouldReduceWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationDedupIntegrationTest > 
shouldReduceWindowed PASSED

org.apache.kafka.streams.KafkaStreamsTest > testIllegalMetricsConfig STARTED

org.apache.kafka.streams.KafkaStreamsTest > testIllegalMetricsConfig PASSED

org.apache.kafka.streams.KafkaStreamsTest > shouldNotGetAllTasksWhenNotRunning 
STARTED

org.apache.kafka.streams.KafkaStreamsTest > shouldNotGetAllTasksWhenNotRunning 
PASSED

org.apache.kafka.streams.KafkaStreamsTest > 
testInitializesAndDestroysMetricsReporters STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
testInitializesAndDestroysMetricsReporters PASSED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetTaskWithKeyAndPartitionerWhenNotRunning STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetTaskWithKeyAndPartitionerWhenNotRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testLegalMetricsConfig STARTED

org.apache.kafka.streams.KafkaStreamsTest > testLegalMetricsConfig PASSED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetTaskWithKeyAndSerializerWhenNotRunning STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetTaskWithKeyAndSerializerWhenNotRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testToString STARTED

org.apache.kafka.streams.KafkaStreamsTest > testToString PASSED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldReturnFalseOnCloseWhenThreadsHaventTerminated STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldReturnFalseOnCloseWhenThreadsHaventTerminated PASSED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetAllTasksWithStoreWhenNotRunning STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetAllTasksWithStoreWhenNotRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartOnceClosed STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartOnceClosed PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup PASSED

org.apache.kafka.streams.KafkaStreamsTest > testNumberDefaultMetrics STARTED

org.apache.kafka.streams.KafkaStreamsTest > testNumberDefaultMetrics PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning 
STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice PASSED

org.apache.kafka.streams.StreamsConfigTest > testGetConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > testGetConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptAtLeastOnce STARTED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptAtLeastOnce PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerEnableIdempotenceIfEosDisabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerEnableIdempotenceIfEosDisabled PASSED

org.apache.kafka.streams.StreamsConfigTest > defaultSerdeShouldBeConfigured 
STARTED

org.apache.kafka.streams.StreamsConfigTest > defaultSerdeShouldBeConfigured 
PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSetDifferentDefaultsIfEosEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSetDifferentDefaultsIfEosEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldNotOverrideUserConfigRetriesIfExactlyOnceEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldNotOverrideUserConfigRetriesIfExactlyOnceEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > testGetProducerConfigs STARTED


[GitHub] kafka pull request #3413: KAFKA-5502: read current brokers from zookeeper up...

2017-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3413


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3415: MINOR: Make 'Topic-Level Configs' a doc section fo...

2017-06-22 Thread vahidhashemian
GitHub user vahidhashemian opened a pull request:

https://github.com/apache/kafka/pull/3415

MINOR: Make 'Topic-Level Configs' a doc section for easier access



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vahidhashemian/kafka 
doc/make_topic_config_a_section

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3415.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3415


commit 231af5933abf1665ef77af76417cc851d26de58f
Author: Vahid Hashemian 
Date:   2017-06-22T21:16:13Z

MINOR: Make 'Topic-Level Configs' a doc section for easier access




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is back to normal : kafka-trunk-jdk8 #1747

2017-06-22 Thread Apache Jenkins Server
See 



[jira] [Created] (KAFKA-5503) Idempotent producer ignores shutdown while fetching ProducerId

2017-06-22 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-5503:
--

 Summary: Idempotent producer ignores shutdown while fetching 
ProducerId
 Key: KAFKA-5503
 URL: https://issues.apache.org/jira/browse/KAFKA-5503
 Project: Kafka
  Issue Type: Sub-task
Affects Versions: 0.11.0.0
Reporter: Jason Gustafson
 Fix For: 0.11.0.1


When using the idempotent producer, we initially block the sender thread while 
we attempt to get the ProducerId. During this time, a concurrent call to 
close() will be ignored.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: kafka-trunk-jdk7 #2448

2017-06-22 Thread Apache Jenkins Server
See 


Changes:

[cshapi] KAFKA-5498: ConfigDef derived from another ConfigDef did not correctly

--
[...truncated 967.04 KB...]
kafka.log.ProducerStateManagerTest > 
testTruncateAndReloadRemovesOutOfRangeSnapshots PASSED

kafka.log.ProducerStateManagerTest > testStartOffset STARTED

kafka.log.ProducerStateManagerTest > testStartOffset PASSED

kafka.log.ProducerStateManagerTest > testProducerSequenceInvalidWrapAround 
STARTED

kafka.log.ProducerStateManagerTest > testProducerSequenceInvalidWrapAround 
PASSED

kafka.log.ProducerStateManagerTest > testTruncateHead STARTED

kafka.log.ProducerStateManagerTest > testTruncateHead PASSED

kafka.log.ProducerStateManagerTest > 
testNonTransactionalAppendWithOngoingTransaction STARTED

kafka.log.ProducerStateManagerTest > 
testNonTransactionalAppendWithOngoingTransaction PASSED

kafka.log.ProducerStateManagerTest > testSkipSnapshotIfOffsetUnchanged STARTED

kafka.log.ProducerStateManagerTest > testSkipSnapshotIfOffsetUnchanged PASSED

kafka.log.LogCleanerLagIntegrationTest > cleanerTest[0] STARTED

kafka.log.LogCleanerLagIntegrationTest > cleanerTest[0] PASSED

kafka.log.LogCleanerLagIntegrationTest > cleanerTest[1] STARTED

kafka.log.LogCleanerLagIntegrationTest > cleanerTest[1] PASSED

kafka.log.LogCleanerLagIntegrationTest > cleanerTest[2] STARTED

kafka.log.LogCleanerLagIntegrationTest > cleanerTest[2] PASSED

kafka.log.LogCleanerLagIntegrationTest > cleanerTest[3] STARTED

kafka.log.LogCleanerLagIntegrationTest > cleanerTest[3] PASSED

kafka.controller.ControllerIntegrationTest > classMethod STARTED

kafka.controller.ControllerIntegrationTest > classMethod FAILED
java.lang.AssertionError: Found unexpected threads, 
allThreads=Set(ThrottledRequestReaper-Produce, metrics-meter-tick-thread-2, 
SessionTracker, main, Signal Dispatcher, Reference Handler, 
ExpirationReaper-0-Produce, ForkJoinPool-1-worker-1, 
ExpirationReaper-0-DeleteRecords, ThrottledRequestReaper-Fetch, 
ThrottledRequestReaper-Request, /0:0:0:0:0:0:0:1:40078 to 
/0:0:0:0:0:0:0:1:43548 workers Thread 2, /0:0:0:0:0:0:0:1:40078 to 
/0:0:0:0:0:0:0:1:43548 workers Thread 3, Test worker, kafka-request-handler-3, 
SyncThread:0, ReplicaFetcherThread-0-1, 
ZkClient-EventThread-27006-127.0.0.1:38549, Test 
worker-SendThread(127.0.0.1:38549), NIOServerCxn.Factory:/127.0.0.1:0, Test 
worker-EventThread, ExpirationReaper-0-Fetch, Finalizer, ProcessThread(sid:0 
cport:38549):, kafka-coordinator-heartbeat-thread | group1, 
metrics-meter-tick-thread-1)

kafka.controller.ControllerIntegrationTest > classMethod STARTED

kafka.controller.ControllerIntegrationTest > classMethod FAILED
java.lang.AssertionError: Found unexpected threads, 
allThreads=Set(ThrottledRequestReaper-Produce, metrics-meter-tick-thread-2, 
SessionTracker, main, Signal Dispatcher, Reference Handler, 
ExpirationReaper-0-Produce, ForkJoinPool-1-worker-1, 
ExpirationReaper-0-DeleteRecords, ThrottledRequestReaper-Fetch, 
ThrottledRequestReaper-Request, /0:0:0:0:0:0:0:1:40078 to 
/0:0:0:0:0:0:0:1:43548 workers Thread 2, /0:0:0:0:0:0:0:1:40078 to 
/0:0:0:0:0:0:0:1:43548 workers Thread 3, Test worker, kafka-request-handler-3, 
SyncThread:0, ReplicaFetcherThread-0-1, 
ZkClient-EventThread-27006-127.0.0.1:38549, Test 
worker-SendThread(127.0.0.1:38549), NIOServerCxn.Factory:/127.0.0.1:0, Test 
worker-EventThread, ExpirationReaper-0-Fetch, Finalizer, ProcessThread(sid:0 
cport:38549):, kafka-coordinator-heartbeat-thread | group1, 
metrics-meter-tick-thread-1)

kafka.controller.ControllerEventManagerTest > testEventThatThrowsException 
STARTED

kafka.controller.ControllerEventManagerTest > testEventThatThrowsException 
PASSED

kafka.controller.ControllerEventManagerTest > testSuccessfulEvent STARTED

kafka.controller.ControllerEventManagerTest > testSuccessfulEvent PASSED

kafka.controller.ControllerFailoverTest > classMethod STARTED

kafka.controller.ControllerFailoverTest > classMethod FAILED
java.lang.AssertionError: Found unexpected threads, 
allThreads=Set(ThrottledRequestReaper-Produce, metrics-meter-tick-thread-2, 
SessionTracker, main, Signal Dispatcher, Reference Handler, 
ExpirationReaper-0-Produce, ForkJoinPool-1-worker-1, 
ExpirationReaper-0-DeleteRecords, ThrottledRequestReaper-Fetch, 
ThrottledRequestReaper-Request, /0:0:0:0:0:0:0:1:40078 to 
/0:0:0:0:0:0:0:1:43548 workers Thread 2, /0:0:0:0:0:0:0:1:40078 to 
/0:0:0:0:0:0:0:1:43548 workers Thread 3, Test worker, kafka-request-handler-3, 
SyncThread:0, ReplicaFetcherThread-0-1, 
ZkClient-EventThread-27006-127.0.0.1:38549, Test 
worker-SendThread(127.0.0.1:38549), NIOServerCxn.Factory:/127.0.0.1:0, Test 
worker-EventThread, ExpirationReaper-0-Fetch, Finalizer, ProcessThread(sid:0 
cport:38549):, kafka-coordinator-heartbeat-thread | group1, 
metrics-meter-tick-thread-1)

kafka.controller.ControllerFailoverTest > classMethod STARTED


[GitHub] kafka pull request #3414: HOTFIX: reduce log verbosity on commit

2017-06-22 Thread mjsax
GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/3414

HOTFIX: reduce log verbosity on commit



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka hotfix-commit-logging

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3414.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3414


commit eea1e86874482ff391c73cccad1cd7513d2fd98c
Author: Matthias J. Sax 
Date:   2017-06-22T20:27:51Z

HOTFIX: reduce log verbosity on commit




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3413: KAFKA-5502: read current brokers from zookeeper up...

2017-06-22 Thread onurkaraman
GitHub user onurkaraman opened a pull request:

https://github.com/apache/kafka/pull/3413

KAFKA-5502: read current brokers from zookeeper upon processing broker 
change

Dong Lin's testing of the 0.11.0 release revealed a controller-side 
performance regression in clusters with many brokers and many partitions when 
bringing up many brokers simultaneously.

The regression is caused by KAFKA-5028: a Watcher receives WatchedEvent 
notifications from the raw ZooKeeper client EventThread. A WatchedEvent only 
contains the following information:
- KeeperState
- EventType
- path

Note that it does not actually contain the current data or current set of 
children associated with the data/child change notification. It is up to the 
user to do this lookup to see the current data or set of children.

ZkClient is itself a Watcher. When it receives a WatchedEvent, it puts a 
ZkEvent into its own queue which its own ZkEventThread processes. Users of 
ZkClient interact with these notifications through listeners (IZkDataListener, 
IZkChildListener). IZkDataListener actually expects as input the current data 
of the watched znode, and likewise IZkChildListener actually expects as input 
the current set of children of the watched znode. In order to provide this 
information to the listeners, the ZkEventThread, when processing the ZkEvent in 
its queue, looks up the information (either the current data or current set of 
children), simultaneously sets up the next watch, and passes the result to the 
listener.

The regression introduced in KAFKA-5028 is the time at which we lookup the 
information needed for the event processing.

In the past, the result of the lookup done by the ZkEventThread during ZkEvent 
processing was passed into the listener, which processed it immediately after. For 
instance in ZkClient.fireChildChangedEvents:
```
List<String> children = getChildren(path);
listener.handleChildChange(path, children);
```
Now, however, there are multiple listeners that pass information looked up 
by the ZkEventThread into a ControllerEvent which gets processed potentially 
much later. For instance in BrokerChangeListener:
```
class BrokerChangeListener(controller: KafkaController) extends 
IZkChildListener with Logging {
  override def handleChildChange(parentPath: String, currentChilds: 
java.util.List[String]): Unit = {
import JavaConverters._

controller.addToControllerEventQueue(controller.BrokerChange(currentChilds.asScala))
  }
}
```

In terms of impact, this:
- increases the odds of working with stale information by the time the 
ControllerEvent gets processed.
- can cause the cluster to take a long time to stabilize if you bring up 
many brokers simultaneously.

In terms of how to solve it:
- (short term) just ignore the ZkClient's information lookup and repeat the 
lookup at the start of the ControllerEvent. This is the approach taken in this 
ticket.
- (long term) try to remove a queue. This basically means getting rid of 
ZkClient. This is likely the approach that will be taken in KAFKA-5501.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/onurkaraman/kafka KAFKA-5502

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3413.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3413


commit 585defde3400e62faaa65c5175e6db9e82c8ad18
Author: Onur Karaman 
Date:   2017-06-22T20:05:29Z

KAFKA-5502: read current brokers from zookeeper upon processing broker 
change

Dong Lin's testing of the 0.11.0 release revealed a controller-side 
performance regression in clusters with many brokers and many partitions when 
bringing up many brokers simultaneously.

The regression is caused by KAFKA-5028: a Watcher receives WatchedEvent 
notifications from the raw ZooKeeper client EventThread. A WatchedEvent only 
contains the following information:
- KeeperState
- EventType
- path

Note that it does not actually contain the current data or current set of 
children associated with the data/child change notification. It is up to the 
user to do this lookup to see the current data or set of children.

ZkClient is itself a Watcher. When it receives a WatchedEvent, it puts a 
ZkEvent into its own queue which its own ZkEventThread processes. Users of 
ZkClient interact with these notifications through listeners (IZkDataListener, 
IZkChildListener). IZkDataListener actually expects as input the current data 
of the watched znode, and likewise IZkChildListener actually expects as input 
the current set of children of the watched znode. In order to provide this 
information to the listeners, the ZkEventThread, when 

[GitHub] kafka pull request #3412: KAFKA-5498: ConfigDef derived from another ConfigD...

2017-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3412


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-5498) Connect validation API stops returning recommendations for some fields after the right sequence of requests

2017-06-22 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira resolved KAFKA-5498.
-
   Resolution: Fixed
Fix Version/s: 0.11.1.0

Issue resolved by pull request 3412
[https://github.com/apache/kafka/pull/3412]

> Connect validation API stops returning recommendations for some fields after 
> the right sequence of requests
> ---
>
> Key: KAFKA-5498
> URL: https://issues.apache.org/jira/browse/KAFKA-5498
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.11.0.0, 0.11.1.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> If you issue the right sequence of requests against this API, it starts 
> behaving differently, omitting certain fields (at a minimum, recommended 
> values, which is how I noticed this). If you start with
> {code}
> $ curl -X PUT -H "Content-Type: application/json" --data '{"connector.class": 
> "org.apache.kafka.connect.file.FileStreamSourceConnector", "name": "file", 
> "transforms": "foo"}' 
> http://localhost:8083/connector-plugins/org.apache.kafka.connect.file.FileStreamSourceConnector/config/validate
>   | jq
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
> 100  5845  100  5730  100   115  36642735 --:--:-- --:--:-- --:--:-- 36496
> {
>   "name": "org.apache.kafka.connect.file.FileStreamSourceConnector",
>   "error_count": 4,
>   "groups": [
> "Common",
> "Transforms",
> "Transforms: foo"
>   ],
>   "configs": [
> {
>   "definition": {
> "name": "name",
> "type": "STRING",
> "required": true,
> "default_value": null,
> "importance": "HIGH",
> "documentation": "Globally unique name to use for this connector.",
> "group": "Common",
> "width": "MEDIUM",
> "display_name": "Connector name",
> "dependents": [],
> "order": 1
>   },
>   "value": {
> "name": "name",
> "value": "file",
> "recommended_values": [],
> "errors": [],
> "visible": true
>   }
> },
> {
>   "definition": {
> "name": "connector.class",
> "type": "STRING",
> "required": true,
> "default_value": null,
> "importance": "HIGH",
> "documentation": "Name or alias of the class for this connector. Must 
> be a subclass of org.apache.kafka.connect.connector.Connector. If the 
> connector is org.apache.kafka.connect.file.FileStreamSinkConnector, you can 
> either specify this full name,  or use \"FileStreamSink\" or 
> \"FileStreamSinkConnector\" to make the configuration a bit shorter",
> "group": "Common",
> "width": "LONG",
> "display_name": "Connector class",
> "dependents": [],
> "order": 2
>   },
>   "value": {
> "name": "connector.class",
> "value": "org.apache.kafka.connect.file.FileStreamSourceConnector",
> "recommended_values": [],
> "errors": [],
> "visible": true
>   }
> },
> {
>   "definition": {
> "name": "tasks.max",
> "type": "INT",
> "required": false,
> "default_value": "1",
> "importance": "HIGH",
> "documentation": "Maximum number of tasks to use for this connector.",
> "group": "Common",
> "width": "SHORT",
> "display_name": "Tasks max",
> "dependents": [],
> "order": 3
>   },
>   "value": {
> "name": "tasks.max",
> "value": "1",
> "recommended_values": [],
> "errors": [],
> "visible": true
>   }
> },
> {
>   "definition": {
> "name": "key.converter",
> "type": "CLASS",
> "required": false,
> "default_value": null,
> "importance": "LOW",
> "documentation": "Converter class used to convert between Kafka 
> Connect format and the serialized form that is written to Kafka. This 
> controls the format of the keys in messages written to or read from Kafka, 
> and since this is independent of connectors it allows any connector to work 
> with any serialization format. Examples of common formats include JSON and 
> Avro.",
> "group": "Common",
> "width": "SHORT",
> "display_name": "Key converter class",
> "dependents": [],
> "order": 4
>   },
>   "value": {
> "name": "key.converter",
> "value": null,
> "recommended_values": [],
> "errors": [],
> "visible": true
>   }
> },
>

[jira] [Created] (KAFKA-5502) read current brokers from zookeeper upon processing broker change

2017-06-22 Thread Onur Karaman (JIRA)
Onur Karaman created KAFKA-5502:
---

 Summary: read current brokers from zookeeper upon processing 
broker change
 Key: KAFKA-5502
 URL: https://issues.apache.org/jira/browse/KAFKA-5502
 Project: Kafka
  Issue Type: Sub-task
Reporter: Onur Karaman
Assignee: Onur Karaman


[~lindong]'s testing of the 0.11.0 release revealed a controller-side 
performance regression in clusters with many brokers and many partitions when 
bringing up many brokers simultaneously.

The regression is caused by KAFKA-5028: a Watcher receives WatchedEvent 
notifications from the raw ZooKeeper client EventThread. A WatchedEvent only 
contains the following information:
- KeeperState
- EventType
- path

Note that it does not actually contain the current data or current set of 
children associated with the data/child change notification. It is up to the 
user to do this lookup to see the current data or set of children.

ZkClient is itself a Watcher. When it receives a WatchedEvent, it puts a 
ZkEvent into its own queue which its own ZkEventThread processes. Users of 
ZkClient interact with these notifications through listeners (IZkDataListener, 
IZkChildListener). IZkDataListener actually expects as input the current data 
of the watched znode, and likewise IZkChildListener actually expects as input 
the current set of children of the watched znode. In order to provide this 
information to the listeners, the ZkEventThread, when processing the ZkEvent in 
its queue, looks up the information (either the current data or current set of 
children), simultaneously sets up the next watch, and passes the result to the 
listener.

The regression introduced in KAFKA-5028 is the time at which we lookup the 
information needed for the event processing.

In the past, the result of the lookup done by the ZkEventThread during ZkEvent 
processing was passed into the listener, which processed it immediately after. For 
instance in ZkClient.fireChildChangedEvents:
{code}
List<String> children = getChildren(path);
listener.handleChildChange(path, children);
{code}
Now, however, there are multiple listeners that pass information looked up by 
the ZkEventThread into a ControllerEvent which gets processed potentially much 
later. For instance in BrokerChangeListener:
{code}
class BrokerChangeListener(controller: KafkaController) extends 
IZkChildListener with Logging {
  override def handleChildChange(parentPath: String, currentChilds: 
java.util.List[String]): Unit = {
import JavaConverters._

controller.addToControllerEventQueue(controller.BrokerChange(currentChilds.asScala))
  }
}
{code}

In terms of impact, this:
- increases the odds of working with stale information by the time the 
ControllerEvent gets processed.
- can cause the cluster to take a long time to stabilize if you bring up many 
brokers simultaneously.

In terms of how to solve it:
- (short term) just ignore the ZkClient's information lookup and repeat the 
lookup at the start of the ControllerEvent. This is the approach taken in this 
ticket (a rough sketch of the idea follows below).
- (long term) try to remove a queue. This basically means getting rid of 
ZkClient. This is likely the approach that will be taken in KAFKA-5501.
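
For illustration, the short-term approach in plain-Java form (the real change is in the 
Scala controller; the types here are stand-ins, not actual controller classes): the 
snapshot captured at notification time is ignored and the broker list is re-read when 
the queued event is actually processed.

{code}
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Stand-in for a queued controller event: the child list delivered with the
// watch notification is deliberately not stored here.
final class BrokerChangeEvent implements Runnable {
    private final Supplier<List<String>> readBrokerIds;   // e.g. a fresh getChildren("/brokers/ids")
    private final Consumer<List<String>> reconcile;       // controller logic applied to the fresh view

    BrokerChangeEvent(Supplier<List<String>> readBrokerIds, Consumer<List<String>> reconcile) {
        this.readBrokerIds = readBrokerIds;
        this.reconcile = reconcile;
    }

    @Override
    public void run() {
        // Fresh read at processing time instead of the (possibly stale) snapshot
        // captured when the notification fired.
        reconcile.accept(readBrokerIds.get());
    }
}
{code}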



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5501) use async zookeeper apis everywhere

2017-06-22 Thread Onur Karaman (JIRA)
Onur Karaman created KAFKA-5501:
---

 Summary: use async zookeeper apis everywhere
 Key: KAFKA-5501
 URL: https://issues.apache.org/jira/browse/KAFKA-5501
 Project: Kafka
  Issue Type: Sub-task
Reporter: Onur Karaman
Assignee: Onur Karaman


Synchronous zookeeper writes mean that we wait an entire round trip before 
doing the next write. These synchronous writes are happening at a per-partition 
granularity in several places, so partition-heavy clusters suffer from the 
controller doing many sequential round trips to zookeeper.
* PartitionStateMachine.electLeaderForPartition updates leaderAndIsr in 
zookeeper on transition to OnlinePartition. This gets triggered per-partition 
sequentially with synchronous writes during controlled shutdown of the shutting 
down broker's replicas for which it is the leader.
* ReplicaStateMachine updates leaderAndIsr in zookeeper on transition to 
OfflineReplica when calling KafkaController.removeReplicaFromIsr. This gets 
triggered per-partition sequentially with synchronous writes for failed or 
controlled shutdown brokers.
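
For illustration, the general shape of pipelining such writes with ZooKeeper's async API 
instead of blocking on each round trip (plain Java; paths, data and version handling are 
placeholders rather than the actual controller code):

{code}
import java.util.List;
import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.ZooKeeper;

final class AsyncZkWrites {
    // Issue all writes back to back and wait once at the end, instead of waiting
    // one round trip per write.
    static void writeAll(ZooKeeper zk, List<String> paths, byte[] data) throws InterruptedException {
        CountDownLatch done = new CountDownLatch(paths.size());
        for (String path : paths) {
            // -1 means "any version" here; real leaderAndIsr updates are versioned.
            zk.setData(path, data, -1, (rc, p, ctx, stat) -> done.countDown(), null);
        }
        done.await();
    }
}
{code}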



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] kafka pull request #3412: KAFKA-5498: ConfigDef derived from another ConfigD...

2017-06-22 Thread ewencp
GitHub user ewencp opened a pull request:

https://github.com/apache/kafka/pull/3412

KAFKA-5498: ConfigDef derived from another ConfigDef did not correctly 
compute parentless configs



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ewencp/kafka 
kafka-5498-base-configdef-parentless-configs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3412.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3412


commit 81020788083a9fe18bc465f1ab39fb4dac122f0a
Author: Ewen Cheslack-Postava 
Date:   2017-06-22T19:03:47Z

KAFKA-5498: ConfigDef derived from another ConfigDef did not correctly 
compute parentless configs




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: kafka-trunk-jdk8 #1746

2017-06-22 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFKA-5490; Skip empty record batches in the consumer

[ismael] MINOR: Switch ZK client logging to INFO

--
[...truncated 992.95 KB...]
kafka.utils.CommandLineUtilsTest > testParseArgs STARTED

kafka.utils.CommandLineUtilsTest > testParseArgs PASSED

kafka.utils.CommandLineUtilsTest > testParseEmptyArgAsValid STARTED

kafka.utils.CommandLineUtilsTest > testParseEmptyArgAsValid PASSED

kafka.utils.ReplicationUtilsTest > classMethod STARTED

kafka.utils.ReplicationUtilsTest > classMethod FAILED
java.lang.AssertionError: Found unexpected threads, 
allThreads=Set(ThrottledRequestReaper-Produce, metrics-meter-tick-thread-2, 
/0:0:0:0:0:0:0:1:41069 to /0:0:0:0:0:0:0:1:36745 workers Thread 2, main, 
SessionTracker, Signal Dispatcher, ZkClient-EventThread-33295-127.0.0.1:51309, 
/0:0:0:0:0:0:0:1:41069 to /0:0:0:0:0:0:0:1:36745 workers Thread 3, Reference 
Handler, ExpirationReaper-0-Produce, Test worker-SendThread(127.0.0.1:51309), 
ExpirationReaper-0-DeleteRecords, ReplicaFetcherThread-0-0, 
ThrottledRequestReaper-Fetch, kafka-request-handler-1, 
ThrottledRequestReaper-Request, ForkJoinPool-1-worker-3, Test worker, 
SyncThread:0, NIOServerCxn.Factory:/127.0.0.1:0, Test worker-EventThread, 
ExpirationReaper-0-Fetch, Finalizer, kafka-coordinator-heartbeat-thread | 
group1, ProcessThread(sid:0 cport:51309):, metrics-meter-tick-thread-1)

kafka.utils.ReplicationUtilsTest > classMethod STARTED

kafka.utils.ReplicationUtilsTest > classMethod FAILED
java.lang.AssertionError: Found unexpected threads, 
allThreads=Set(ThrottledRequestReaper-Produce, metrics-meter-tick-thread-2, 
/0:0:0:0:0:0:0:1:41069 to /0:0:0:0:0:0:0:1:36745 workers Thread 2, main, 
SessionTracker, Signal Dispatcher, ZkClient-EventThread-33295-127.0.0.1:51309, 
/0:0:0:0:0:0:0:1:41069 to /0:0:0:0:0:0:0:1:36745 workers Thread 3, Reference 
Handler, ExpirationReaper-0-Produce, Test worker-SendThread(127.0.0.1:51309), 
ExpirationReaper-0-DeleteRecords, ReplicaFetcherThread-0-0, 
ThrottledRequestReaper-Fetch, kafka-request-handler-1, 
ThrottledRequestReaper-Request, ForkJoinPool-1-worker-3, Test worker, 
SyncThread:0, NIOServerCxn.Factory:/127.0.0.1:0, Test worker-EventThread, 
ExpirationReaper-0-Fetch, Finalizer, kafka-coordinator-heartbeat-thread | 
group1, ProcessThread(sid:0 cport:51309):, metrics-meter-tick-thread-1)

kafka.utils.JsonTest > testJsonEncoding STARTED

kafka.utils.JsonTest > testJsonEncoding PASSED

kafka.utils.ShutdownableThreadTest > testShutdownWhenCalledAfterThreadStart 
STARTED

kafka.utils.ShutdownableThreadTest > testShutdownWhenCalledAfterThreadStart 
PASSED

kafka.utils.SchedulerTest > testMockSchedulerNonPeriodicTask STARTED

kafka.utils.SchedulerTest > testMockSchedulerNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testMockSchedulerPeriodicTask STARTED

kafka.utils.SchedulerTest > testMockSchedulerPeriodicTask PASSED

kafka.utils.SchedulerTest > testNonPeriodicTask STARTED

kafka.utils.SchedulerTest > testNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testRestart STARTED

kafka.utils.SchedulerTest > testRestart PASSED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler STARTED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler PASSED

kafka.utils.SchedulerTest > testPeriodicTask STARTED

kafka.utils.SchedulerTest > testPeriodicTask PASSED

kafka.utils.ZkUtilsTest > classMethod STARTED

kafka.utils.ZkUtilsTest > classMethod FAILED
java.lang.AssertionError: Found unexpected threads, 
allThreads=Set(ThrottledRequestReaper-Produce, metrics-meter-tick-thread-2, 
/0:0:0:0:0:0:0:1:41069 to /0:0:0:0:0:0:0:1:36745 workers Thread 2, main, 
SessionTracker, Signal Dispatcher, ZkClient-EventThread-33295-127.0.0.1:51309, 
/0:0:0:0:0:0:0:1:41069 to /0:0:0:0:0:0:0:1:36745 workers Thread 3, Reference 
Handler, ExpirationReaper-0-Produce, Test worker-SendThread(127.0.0.1:51309), 
ExpirationReaper-0-DeleteRecords, ReplicaFetcherThread-0-0, 
ThrottledRequestReaper-Fetch, kafka-request-handler-1, 
ThrottledRequestReaper-Request, ForkJoinPool-1-worker-3, Test worker, 
SyncThread:0, NIOServerCxn.Factory:/127.0.0.1:0, Test worker-EventThread, 
ExpirationReaper-0-Fetch, Finalizer, kafka-coordinator-heartbeat-thread | 
group1, ProcessThread(sid:0 cport:51309):, metrics-meter-tick-thread-1)

kafka.utils.ZkUtilsTest > classMethod STARTED

kafka.utils.ZkUtilsTest > classMethod FAILED
java.lang.AssertionError: Found unexpected threads, 
allThreads=Set(ThrottledRequestReaper-Produce, metrics-meter-tick-thread-2, 
/0:0:0:0:0:0:0:1:41069 to /0:0:0:0:0:0:0:1:36745 workers Thread 2, main, 
SessionTracker, Signal Dispatcher, ZkClient-EventThread-33295-127.0.0.1:51309, 
/0:0:0:0:0:0:0:1:41069 to /0:0:0:0:0:0:0:1:36745 workers Thread 3, Reference 
Handler, ExpirationReaper-0-Produce, Test worker-SendThread(127.0.0.1:51309), 
ExpirationReaper-0-DeleteRecords, 

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-06-22 Thread Eno Thereska
Answers inline: 

> On 22 Jun 2017, at 03:26, Guozhang Wang  wrote:
> 
> Thanks for the updated KIP, some more comments:
> 
> 1. The config name is "default.deserialization.exception.handler" while the
> interface class name is "RecordExceptionHandler", which is more general
> than the intended purpose. Could we rename the class name accordingly?

Sure.


> 
> 2. Could you describe the full implementation of "DefaultExceptionHandler",
> currently it is not clear to me how it is implemented with the configured
> value.
> 
> In addition, I think we do not need to include an additional
> "DEFAULT_DESERIALIZATION_EXCEPTION_RESPONSE_CONFIG" as the configure()
> function is mainly used for users to pass any customized parameters that are
> outside of the Streams library; plus adding such an additional config sounds
> over-complicated for a default exception handler. Instead I'd suggest we
> just provide two handlers (or three if people feel strong about the
> LogAndThresholdExceptionHandler), one for FailOnExceptionHandler and one
> for LogAndContinueOnExceptionHandler. And we can set
> LogAndContinueOnExceptionHandler
> by default.
> 

That's what I had originally. Jay mentioned he preferred one default class, with
config options. So with that approach, you'd have two config options, one for
failing and one for continuing, and the one exception handler would take those
options during its configure() call.

After checking the other exception handlers in the code, I might revert to what
I originally had (two default handlers), as Guozhang also re-suggests, but still
have the interface extend Configurable. Guozhang, are you OK with that? In that
case there is no need for the response config option.

Thanks
Eno
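
A minimal sketch of the two built-in handlers discussed above, assuming an
interface that extends Configurable; the interface name, method signature and
response enum below are illustrative only, not the final KIP-161 API.

import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.Configurable;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustrative shape of the handler interface (names are assumptions).
public interface RecordExceptionHandler extends Configurable {
    enum Response { CONTINUE, FAIL }

    Response handle(ConsumerRecord<byte[], byte[]> record, Exception exception);
}

// Built-in option 1: stop processing on any deserialization error.
class FailOnExceptionHandler implements RecordExceptionHandler {
    @Override
    public void configure(Map<String, ?> configs) { /* nothing to configure */ }

    @Override
    public Response handle(ConsumerRecord<byte[], byte[]> record, Exception exception) {
        return Response.FAIL;
    }
}

// Built-in option 2 (the suggested default): log the bad record and keep going.
class LogAndContinueOnExceptionHandler implements RecordExceptionHandler {
    private static final Logger LOG = LoggerFactory.getLogger(LogAndContinueOnExceptionHandler.class);

    @Override
    public void configure(Map<String, ?> configs) { /* nothing to configure */ }

    @Override
    public Response handle(ConsumerRecord<byte[], byte[]> record, Exception exception) {
        LOG.warn("Skipping record at offset {} of {}-{} due to deserialization error",
                record.offset(), record.topic(), record.partition(), exception);
        return Response.CONTINUE;
    }
}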


> 
> Guozhang
> 
> 
> 
> 
> 
> 
> 
> 
> On Wed, Jun 21, 2017 at 1:39 AM, Eno Thereska  >
> wrote:
> 
>> Thanks Guozhang,
>> 
>> I’ve updated the KIP and hopefully addressed all the comments so far. In
>> the process also changed the name of the KIP to reflect its scope better:
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-161%3A+streams+deserialization+exception+handlers
>> 
>> Any other feedback appreciated, otherwise I’ll start the vote soon.
>> 
>> Thanks
>> Eno
>> 
>>> On Jun 12, 2017, at 6:28 AM, Guozhang Wang  wrote:
>>> 
>>> Eno, Thanks for bringing this proposal up and sorry for getting late on
>>> this. Here are my two cents:
>>> 
>>> 1. First some meta comments regarding "fail fast" v.s. "making
>> progress". I
>>> agree that in general we should better "enforce user to do the right
>> thing"
>>> in system design, but we also need to keep in mind that Kafka is a
>>> multi-tenant system, i.e. from a Streams app's pov you probably would not
>>> control the whole streaming processing pipeline end-to-end. E.g. Your
>> input
>>> data may not be controlled by yourself; it could be written by another
>> app,
>>> or another team in your company, or even a different organization, and if
>>> an error happens maybe you cannot fix "to do the right thing" just by
>>> yourself in time. In such an environment I think it is important to leave
>>> the door open to let users be more resilient. So I find the current
>>> proposal which does leave the door open for either fail-fast or make
>>> progress quite reasonable.
>>> 
>>> 2. On the other hand, if the question is whether we should provide a
>>> built-in "send to bad queue" handler from the library, I think that might
>>> be an overkill: with some tweaks (see my detailed comments below) on the
>>> API we can allow users to implement such handlers pretty easily. In
>> fact, I
>>> feel even "LogAndThresholdExceptionHandler" is not necessary as a
>> built-in
>>> handler, as it would then require users to specify the threshold via
>>> configs, etc. I think letting people provide such "eco-libraries" may be
>>> better.
>>> 
>>> 3. Regarding the CRC error: today we validate CRC on both the broker end
>>> upon receiving produce requests and on consumer end upon receiving fetch
>>> responses; and if the CRC validation fails in the former case it would
>> not
>>> be appended to the broker logs. So if we do see a CRC failure on the
>>> consumer side it has to be that either we have a flipped bit on the
>> broker
>>> disks or over the wire. For the first case it is fatal while for the
>> second
>>> it is retriable. Unfortunately we cannot tell which case it is when
>> seeing
>>> CRC validation failures. But in either case, just skipping and making
>>> progress seems not a good choice here, and hence I would personally
>> exclude
>>> these errors from the general serde errors to NOT leave the door open of
>>> making progress.
>>> 
>>> Currently such errors 

[jira] [Resolved] (KAFKA-2611) Document MirrorMaker

2017-06-22 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira resolved KAFKA-2611.
-
Resolution: Fixed

MirrorMaker is actually documented.

> Document MirrorMaker
> 
>
> Key: KAFKA-2611
> URL: https://issues.apache.org/jira/browse/KAFKA-2611
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>
> It's been part of our platform for a while; it will be nice to add some
> documentation around how to use it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5500) it is impossible to have custom Login Modules for PLAIN SASL mechanism

2017-06-22 Thread Anton Patrushev (JIRA)
Anton Patrushev created KAFKA-5500:
--

 Summary: it is impossible to have custom Login Modules for PLAIN 
SASL mechanism
 Key: KAFKA-5500
 URL: https://issues.apache.org/jira/browse/KAFKA-5500
 Project: Kafka
  Issue Type: Wish
Reporter: Anton Patrushev
Priority: Minor


This change -
 
https://github.com/apache/kafka/commit/275c5e1df237808fe72b8d9933f826949d4b5781#diff-3e86ea3ab586f9b6f920c00508a0d5bcR95
 - makes it impossible to have login modules other than PlainLoginModule used for
the PLAIN SASL mechanism. Could it be changed so that it does not depend on a
particular login module class name?
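
For context, the kind of configuration the reporter presumably wants to work is
sketched below: naming a custom login module for the PLAIN mechanism via the
client-side sasl.jaas.config property (KIP-85 style). The module class name is
made up, and whether this exact code path is the one affected by the linked
commit is not stated in the issue.

import java.util.Properties;

// Hypothetical example only: com.example.CustomPlainLoginModule is an assumed
// class name standing in for "any login module other than PlainLoginModule".
Properties props = new Properties();
props.put("security.protocol", "SASL_SSL");
props.put("sasl.mechanism", "PLAIN");
props.put("sasl.jaas.config",
        "com.example.CustomPlainLoginModule required username=\"alice\" password=\"alice-secret\";");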



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [VOTE] KIP-168: Add TotalTopicCount metric per cluster

2017-06-22 Thread Onur Karaman
+1

On Thu, Jun 22, 2017 at 10:05 AM, Dong Lin  wrote:

> Thanks for the KIP. +1 (non-binding)
>
> On Wed, Jun 21, 2017 at 1:17 PM, Abhishek Mendhekar <
> abhishek.mendhe...@gmail.com> wrote:
>
> > Hi Kafka Dev,
> >
> > I did like to start the voting on -
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 168%3A+Add+TotalTopicCount+metric+per+cluster
> >
> > Discussions will continue on -
> > http://mail-archives.apache.org/mod_mbox/kafka-dev/201706.
> > mbox/%3CCAMcwe-ugep-UiSn9TkKEMwwTM%3DAzGC4jPro9LnyYRezyZg_NKA%
> > 40mail.gmail.com%3E
> >
> > Thanks,
> > Abhishek
> >
>


Re: [VOTE] KIP-168: Add TotalTopicCount metric per cluster

2017-06-22 Thread Dong Lin
Thanks for the KIP. +1 (non-binding)

On Wed, Jun 21, 2017 at 1:17 PM, Abhishek Mendhekar <
abhishek.mendhe...@gmail.com> wrote:

> Hi Kafka Dev,
>
> I did like to start the voting on -
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 168%3A+Add+TotalTopicCount+metric+per+cluster
>
> Discussions will continue on -
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201706.
> mbox/%3CCAMcwe-ugep-UiSn9TkKEMwwTM%3DAzGC4jPro9LnyYRezyZg_NKA%
> 40mail.gmail.com%3E
>
> Thanks,
> Abhishek
>


[jira] [Created] (KAFKA-5499) Double check how we handle exceptions when commits fail

2017-06-22 Thread Eno Thereska (JIRA)
Eno Thereska created KAFKA-5499:
---

 Summary: Double check how we handle exceptions when commits fail
 Key: KAFKA-5499
 URL: https://issues.apache.org/jira/browse/KAFKA-5499
 Project: Kafka
  Issue Type: Sub-task
  Components: streams
Reporter: Eno Thereska


When a task does a lot of processing between calls to poll(), it can miss a
rebalance. It only finds that out once it tries to commit(), since it will get an
exception. Double-check what is supposed to happen on such an exception, e.g.,
should the thread fail, or should it continue?
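
A rough consumer-level sketch of the situation described above (illustrative
code, not from the ticket; consumerProps and processSlowly are assumed helpers):

import java.util.Collections;
import org.apache.kafka.clients.consumer.CommitFailedException;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// If processing between poll() calls exceeds max.poll.interval.ms, the group
// rebalances and the next commit fails with CommitFailedException.
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
consumer.subscribe(Collections.singletonList("input-topic")); // hypothetical topic

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    processSlowly(records); // hypothetical long-running processing step
    try {
        consumer.commitSync();
    } catch (CommitFailedException e) {
        // The group has already rebalanced; our partitions may now belong to
        // another member. Should the (Streams) thread fail here, or rejoin and
        // continue? That is the question this ticket asks.
        throw e; // or log and continue -- behaviour under discussion
    }
}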



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is back to normal : kafka-trunk-jdk7 #2446

2017-06-22 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-0.11.0-jdk7 #185

2017-06-22 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFKA-5490; Skip empty record batches in the consumer

[ismael] MINOR: Switch ZK client logging to INFO

--
[...truncated 966.79 KB...]
kafka.utils.UtilsTest > testCsvMap PASSED

kafka.utils.UtilsTest > testInLock STARTED

kafka.utils.UtilsTest > testInLock PASSED

kafka.utils.UtilsTest > testSwallow STARTED

kafka.utils.UtilsTest > testSwallow PASSED

kafka.producer.AsyncProducerTest > testFailedSendRetryLogic STARTED

kafka.producer.AsyncProducerTest > testFailedSendRetryLogic PASSED

kafka.producer.AsyncProducerTest > testQueueTimeExpired STARTED

kafka.producer.AsyncProducerTest > testQueueTimeExpired PASSED

kafka.producer.AsyncProducerTest > testPartitionAndCollateEvents STARTED

kafka.producer.AsyncProducerTest > testPartitionAndCollateEvents PASSED

kafka.producer.AsyncProducerTest > testBatchSize STARTED

kafka.producer.AsyncProducerTest > testBatchSize PASSED

kafka.producer.AsyncProducerTest > testSerializeEvents STARTED

kafka.producer.AsyncProducerTest > testSerializeEvents PASSED

kafka.producer.AsyncProducerTest > testProducerQueueSize STARTED

kafka.producer.AsyncProducerTest > testProducerQueueSize PASSED

kafka.producer.AsyncProducerTest > testRandomPartitioner STARTED

kafka.producer.AsyncProducerTest > testRandomPartitioner PASSED

kafka.producer.AsyncProducerTest > testInvalidConfiguration STARTED

kafka.producer.AsyncProducerTest > testInvalidConfiguration PASSED

kafka.producer.AsyncProducerTest > testInvalidPartition STARTED

kafka.producer.AsyncProducerTest > testInvalidPartition PASSED

kafka.producer.AsyncProducerTest > testNoBroker STARTED

kafka.producer.AsyncProducerTest > testNoBroker PASSED

kafka.producer.AsyncProducerTest > testProduceAfterClosed STARTED

kafka.producer.AsyncProducerTest > testProduceAfterClosed PASSED

kafka.producer.AsyncProducerTest > testJavaProducer STARTED

kafka.producer.AsyncProducerTest > testJavaProducer PASSED

kafka.producer.AsyncProducerTest > testIncompatibleEncoder STARTED

kafka.producer.AsyncProducerTest > testIncompatibleEncoder PASSED

kafka.producer.SyncProducerTest > testReachableServer STARTED

kafka.producer.SyncProducerTest > testReachableServer PASSED

kafka.producer.SyncProducerTest > testMessageSizeTooLarge STARTED

kafka.producer.SyncProducerTest > testMessageSizeTooLarge PASSED

kafka.producer.SyncProducerTest > testNotEnoughReplicas STARTED

kafka.producer.SyncProducerTest > testNotEnoughReplicas PASSED

kafka.producer.SyncProducerTest > testMessageSizeTooLargeWithAckZero STARTED

kafka.producer.SyncProducerTest > testMessageSizeTooLargeWithAckZero PASSED

kafka.producer.SyncProducerTest > testProducerCanTimeout STARTED

kafka.producer.SyncProducerTest > testProducerCanTimeout PASSED

kafka.producer.SyncProducerTest > testProduceRequestWithNoResponse STARTED

kafka.producer.SyncProducerTest > testProduceRequestWithNoResponse PASSED

kafka.producer.SyncProducerTest > testEmptyProduceRequest STARTED

kafka.producer.SyncProducerTest > testEmptyProduceRequest PASSED

kafka.producer.SyncProducerTest > testProduceCorrectlyReceivesResponse STARTED

kafka.producer.SyncProducerTest > testProduceCorrectlyReceivesResponse PASSED

kafka.producer.ProducerTest > testSendToNewTopic STARTED

kafka.producer.ProducerTest > testSendToNewTopic PASSED

kafka.producer.ProducerTest > testAsyncSendCanCorrectlyFailWithTimeout STARTED

kafka.producer.ProducerTest > testAsyncSendCanCorrectlyFailWithTimeout PASSED

kafka.producer.ProducerTest > testSendNullMessage STARTED

kafka.producer.ProducerTest > testSendNullMessage PASSED

kafka.producer.ProducerTest > testUpdateBrokerPartitionInfo STARTED

kafka.producer.ProducerTest > testUpdateBrokerPartitionInfo PASSED

kafka.producer.ProducerTest > testSendWithDeadBroker STARTED

kafka.producer.ProducerTest > testSendWithDeadBroker PASSED

unit.kafka.server.KafkaApisTest > 
shouldRespondWithUnsupportedForMessageFormatOnHandleWriteTxnMarkersWhenMagicLowerThanRequired
 STARTED

unit.kafka.server.KafkaApisTest > 
shouldRespondWithUnsupportedForMessageFormatOnHandleWriteTxnMarkersWhenMagicLowerThanRequired
 PASSED

unit.kafka.server.KafkaApisTest > 
shouldThrowUnsupportedVersionExceptionOnHandleTxnOffsetCommitRequestWhenInterBrokerProtocolNotSupported
 STARTED

unit.kafka.server.KafkaApisTest > 
shouldThrowUnsupportedVersionExceptionOnHandleTxnOffsetCommitRequestWhenInterBrokerProtocolNotSupported
 PASSED

unit.kafka.server.KafkaApisTest > 
shouldThrowUnsupportedVersionExceptionOnHandleAddPartitionsToTxnRequestWhenInterBrokerProtocolNotSupported
 STARTED

unit.kafka.server.KafkaApisTest > 
shouldThrowUnsupportedVersionExceptionOnHandleAddPartitionsToTxnRequestWhenInterBrokerProtocolNotSupported
 PASSED

unit.kafka.server.KafkaApisTest > 
shouldAppendToLogOnWriteTxnMarkersWhenCorrectMagicVersion STARTED

unit.kafka.server.KafkaApisTest > 

[jira] [Created] (KAFKA-5498) Connect validation API stops returning recommendations for some fields after the right sequence of requests

2017-06-22 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-5498:


 Summary: Connect validation API stops returning recommendations 
for some fields after the right sequence of requests
 Key: KAFKA-5498
 URL: https://issues.apache.org/jira/browse/KAFKA-5498
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
 Fix For: 0.11.0.0


If you issue the right sequence of requests against this API, it starts
behaving differently, omitting certain fields (at a minimum, recommended
values, which is how I noticed this). If you start with

{code}
$ curl -X PUT -H "Content-Type: application/json" --data '{"connector.class": 
"org.apache.kafka.connect.file.FileStreamSourceConnector", "name": "file", 
"transforms": "foo"}' 
http://localhost:8083/connector-plugins/org.apache.kafka.connect.file.FileStreamSourceConnector/config/validate
  | jq
{
  "name": "org.apache.kafka.connect.file.FileStreamSourceConnector",
  "error_count": 4,
  "groups": [
"Common",
"Transforms",
"Transforms: foo"
  ],
  "configs": [
{
  "definition": {
"name": "name",
"type": "STRING",
"required": true,
"default_value": null,
"importance": "HIGH",
"documentation": "Globally unique name to use for this connector.",
"group": "Common",
"width": "MEDIUM",
"display_name": "Connector name",
"dependents": [],
"order": 1
  },
  "value": {
"name": "name",
"value": "file",
"recommended_values": [],
"errors": [],
"visible": true
  }
},
{
  "definition": {
"name": "connector.class",
"type": "STRING",
"required": true,
"default_value": null,
"importance": "HIGH",
"documentation": "Name or alias of the class for this connector. Must 
be a subclass of org.apache.kafka.connect.connector.Connector. If the connector 
is org.apache.kafka.connect.file.FileStreamSinkConnector, you can either 
specify this full name,  or use \"FileStreamSink\" or 
\"FileStreamSinkConnector\" to make the configuration a bit shorter",
"group": "Common",
"width": "LONG",
"display_name": "Connector class",
"dependents": [],
"order": 2
  },
  "value": {
"name": "connector.class",
"value": "org.apache.kafka.connect.file.FileStreamSourceConnector",
"recommended_values": [],
"errors": [],
"visible": true
  }
},
{
  "definition": {
"name": "tasks.max",
"type": "INT",
"required": false,
"default_value": "1",
"importance": "HIGH",
"documentation": "Maximum number of tasks to use for this connector.",
"group": "Common",
"width": "SHORT",
"display_name": "Tasks max",
"dependents": [],
"order": 3
  },
  "value": {
"name": "tasks.max",
"value": "1",
"recommended_values": [],
"errors": [],
"visible": true
  }
},
{
  "definition": {
"name": "key.converter",
"type": "CLASS",
"required": false,
"default_value": null,
"importance": "LOW",
"documentation": "Converter class used to convert between Kafka Connect 
format and the serialized form that is written to Kafka. This controls the 
format of the keys in messages written to or read from Kafka, and since this is 
independent of connectors it allows any connector to work with any 
serialization format. Examples of common formats include JSON and Avro.",
"group": "Common",
"width": "SHORT",
"display_name": "Key converter class",
"dependents": [],
"order": 4
  },
  "value": {
"name": "key.converter",
"value": null,
"recommended_values": [],
"errors": [],
"visible": true
  }
},
{
  "definition": {
"name": "value.converter",
"type": "CLASS",
"required": false,
"default_value": null,
"importance": "LOW",
"documentation": "Converter class used to convert between Kafka Connect 
format and the serialized form that is written to Kafka. This controls the 
format of the values in messages written to or read from Kafka, and since this 
is independent of connectors it allows any connector to work with any 
serialization format. Examples of common formats include JSON and Avro.",
"group": "Common",
"width": "SHORT",
"display_name": "Value converter 

Kafka streams KStream and ktable join issue

2017-06-22 Thread Shekar Tippur
Hello,

I am trying to perform a simple join operation. I am using Kafka 0.10.2

I have "raw" and "cache" topics, each with just 1 partition, in my local
environment.

The KTable has these entries:

{"Joe": {"location": "US", "gender": "male"}}
{"Julie": {"location": "US", "gender": "female"}}
{"Kawasaki": {"location": "Japan", "gender": "male"}}

The KStream gets an event:

{"user": "Joe", "custom": {"choice":"vegan"}}

I want the join output to be:

{"user": "Joe", "custom": {"choice":"vegan","enriched":*{"location": "US",
"gender": "male"}*} }

I want to take what's in the KTable and add it to the "enriched" section of the
output stream.

I have defined a serde:

//This is the same serde code from the example.

final TestStreamsSerializer jsonSerializer = new
TestStreamsSerializer();
final TestStreamsDeserialzer jsonDeserializer = new
TestStreamsDeserialzer();
final Serde jsonSerde = Serdes.serdeFrom(jsonSerializer,
jsonDeserializer);

//

KStream raw = builder.stream(Serdes.String(),
jsonSerde, "raw");
KTable  cache = builder.table("cache", "local-cache");

raw.leftJoin(cache,
(record1, record2) -> record1.get("user") + "-" + record2).to("output");

I am having trouble understanding how to call the join API.
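
For reference, a minimal sketch of one way the enrichment could be expressed with
the 0.10.2 DSL, assuming both topics carry Jackson JsonNode values keyed by the
user name (so the stream and table co-partition); the joiner body is illustrative
and not taken from the code above:

// Assumes: KStream<String, JsonNode> raw, KTable<String, JsonNode> cache,
// and that each event value is an ObjectNode with a "custom" object field.
KStream<String, JsonNode> enriched = raw.leftJoin(cache, (event, profile) -> {
    ObjectNode out = event.deepCopy();              // copy the incoming event
    ObjectNode custom = (ObjectNode) out.get("custom");
    if (profile != null) {                          // left join: profile may be null
        custom.set("enriched", profile);            // Jackson 2.x ObjectNode#set
    }
    return (JsonNode) out;
});

enriched.to(Serdes.String(), jsonSerde, "output");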

With the above code, I seem to get an error:

[2017-06-22 09:23:31,836] ERROR User provided listener
org.apache.kafka.streams.processor.internals.StreamThread$1 for group
streams-pipe failed on partition assignment
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)

java.lang.NullPointerException

at org.rocksdb.RocksDB.put(RocksDB.java:488)

at
org.apache.kafka.streams.state.internals.RocksDBStore.putInternal(RocksDBStore.java:254)

at
org.apache.kafka.streams.state.internals.RocksDBStore.access$000(RocksDBStore.java:67)

at
org.apache.kafka.streams.state.internals.RocksDBStore$1.restore(RocksDBStore.java:164)

at
org.apache.kafka.streams.processor.internals.ProcessorStateManager.restoreActiveState(ProcessorStateManager.java:242)

at
org.apache.kafka.streams.processor.internals.ProcessorStateManager.register(ProcessorStateManager.java:201)

at
org.apache.kafka.streams.processor.internals.AbstractProcessorContext.register(AbstractProcessorContext.java:99)

at
org.apache.kafka.streams.state.internals.RocksDBStore.init(RocksDBStore.java:160)

at
org.apache.kafka.streams.state.internals.MeteredKeyValueStore$7.run(MeteredKeyValueStore.java:100)

at
org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:188)

at
org.apache.kafka.streams.state.internals.MeteredKeyValueStore.init(MeteredKeyValueStore.java:131)

at
org.apache.kafka.streams.state.internals.CachingKeyValueStore.init(CachingKeyValueStore.java:63)

at
org.apache.kafka.streams.processor.internals.AbstractTask.initializeStateStores(AbstractTask.java:86)

at
org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:141)

at
org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:864)

at
org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1237)

at
org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1210)

at
org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:967)

at
org.apache.kafka.streams.processor.internals.StreamThread.access$600(StreamThread.java:69)

at
org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:234)

at
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:259)

at
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:352)

at
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)

at
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:290)

at
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1029)

at
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)

at
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:592)

at
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:361)

[2017-06-22 09:23:31,849] WARN stream-thread [StreamThread-1] Unexpected
state transition from ASSIGNING_PARTITIONS to NOT_RUNNING.
(org.apache.kafka.streams.processor.internals.StreamThread)

Exception in thread "StreamThread-1"
org.apache.kafka.streams.errors.StreamsException: stream-thread
[StreamThread-1] Failed to rebalance

at
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:598)

at
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:361)

Caused by: java.lang.NullPointerException

at org.rocksdb.RocksDB.put(RocksDB.java:488)

at

[GitHub] kafka pull request #3411: KAFKA-5487: upgrade and downgrade streams app syst...

2017-06-22 Thread enothereska
GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/3411

KAFKA-5487: upgrade and downgrade streams app system test



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka 
KAFKA-5487-upgrade-test-streams

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3411.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3411


commit eae7e3953f0f1028862d3d2b84104a4806cc6f31
Author: Eno Thereska 
Date:   2017-06-22T10:52:34Z

Checkpoint

commit ac0ebfa6aee0d7a385058e2342d65ccbf515182b
Author: Eno Thereska 
Date:   2017-06-22T16:35:03Z

Checkpoint 2




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [VOTE] 0.11.0.0 RC1

2017-06-22 Thread Tom Crayford
That's fair, and nice find with the transaction performance improvement!

Once the RC is out, we'll do a final round of performance testing with the
new ProducerPerformance changes enabled.

I think it's fair that this shouldn't delay the release. Is there an
official stance on what should and shouldn't delay a release documented
somewhere?

Thanks

Tom Crayford
Heroku Kafka
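
For reference, the transactional-producer call sequence that the ProducerPerformance
changes need to exercise looks roughly like this; a minimal sketch against the 0.11
producer API, with illustrative configuration values:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.ProducerFencedException;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("enable.idempotence", "true");
props.put("transactional.id", "perf-test-1"); // illustrative id

KafkaProducer<String, String> producer = new KafkaProducer<>(props);
producer.initTransactions(); // must be called once, before any send
try {
    producer.beginTransaction();
    for (int i = 0; i < 1000; i++) {
        producer.send(new ProducerRecord<>("perf-topic", "key-" + i, "value-" + i));
    }
    producer.commitTransaction();
} catch (ProducerFencedException e) {
    producer.close(); // fenced by another producer with the same transactional.id
} catch (KafkaException e) {
    producer.abortTransaction(); // transient error: abort and optionally retry
} finally {
    producer.close();
}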

On Thu, Jun 22, 2017 at 4:45 PM, Ismael Juma  wrote:

> Hi Tom,
>
> We are going to do another RC to include Apurva's significant performance
> improvement when transactions are enabled:
>
> https://github.com/apache/kafka/commit/f239f1f839f8bcbd80cce2a4a8643e
> 15d340be8e
>
> Given that, we can also include the ProducerPerformance changes that
> Apurva did to find and fix the performance issue.
>
> In my opinion, the ProducerPerformance change alone would not be enough
> reason for another RC as users can run the tool from trunk to test older
> releases. In any case, this is hypothetical at this point. :)
>
> And thanks for continuing your testing, it's very much appreciated!
>
> Ismael
>
> On Wed, Jun 21, 2017 at 8:03 PM, Tom Crayford 
> wrote:
>
>> That looks better than mine, nice! I think the tooling matters a lot to
>> the usability of the product we're shipping, being able to test out Kafka's
>> features on your own hardware/setup is very important to knowing if it can
>> work.
>>
>> On Wed, Jun 21, 2017 at 8:01 PM, Apurva Mehta 
>> wrote:
>>
>>> Hi Tom,
>>>
>>> I actually made modifications to the produce performance tool to do real
>>> transactions earlier this week as part of our benchmarking (results
>>> published here: bit.ly/kafka-eos-perf). I just submitted that patch
>>> here:
>>> https://github.com/apache/kafka/pull/3400/files
>>>
>>> I think my version is more complete since it runs the full gamut of APIs:
>>> initTransactions, beginTransaction, commitTransaction. Also, it is the
>>> version used for our published benchmarks.
>>>
>>> I am not sure that this tool is a blocker for the release though, since
>>> it
>>> doesn't really affect the usability of the feature in any way.
>>>
>>> Thanks,
>>> Apurva
>>>
>>> On Wed, Jun 21, 2017 at 11:12 AM, Tom Crayford 
>>> wrote:
>>>
>>> > Hi there,
>>> >
>>> > I'm -1 (non-binding) on shipping this RC.
>>> >
>>> > Heroku has carried on performance testing with 0.11 RC1. We have
>>> updated
>>> > our test setup to use 0.11.0.0 RC1 client libraries. Without any of the
>>> > transactional features enabled, we get slightly better performance than
>>> > 0.10.2.1 with 10.2.1 client libraries.
>>> >
>>> > However, we attempted to run a performance test today with
>>> transactions,
>>> > idempotence and consumer read_committed enabled, but couldn't, because
>>> > enabling transactions requires the producer to call `initTransactions`
>>> > before starting to send messages, and the producer performance tool
>>> doesn't
>>> > allow for that.
>>> >
>>> > I'm -1 (non-binding) on shipping this RC in this state, because users
>>> > expect to be able to use the inbuilt performance testing tools, and
>>> > preventing them from testing the impact of the new features using the
>>> > inbuilt tools isn't great. I made a PR for this:
>>> > https://github.com/apache/kafka/pull/3398 (the change is very small).
>>> > Happy
>>> > to make a jira as well, if that makes sense.
>>> >
>>> > Thanks
>>> >
>>> > Tom Crayford
>>> > Heroku Kafka
>>> >
>>> > On Tue, Jun 20, 2017 at 8:32 PM, Vahid S Hashemian <
>>> > vahidhashem...@us.ibm.com> wrote:
>>> >
>>> > > Hi Ismael,
>>> > >
>>> > > Thanks for running the release.
>>> > >
>>> > > Running tests ('gradlew.bat test') on my Windows 64-bit VM results in
>>> > > these checkstyle errors:
>>> > >
>>> > > :clients:checkstyleMain
>>> > > [ant:checkstyle] [ERROR]
>>> > > C:\Users\User\Downloads\kafka-0.11.0.0-src\clients\src\main\
>>> > > java\org\apache\kafka\common\protocol\Errors.java:89:1:
>>> > > Class Data Abstraction Coupling is 57 (max allowed is 20) classes
>>> > > [ApiExceptionBuilder, BrokerNotAvailableException,
>>> > > ClusterAuthorizationException, ConcurrentTransactionsException,
>>> > > ControllerMovedException, CoordinatorLoadInProgressException,
>>> > > CoordinatorNotAvailableException, CorruptRecordException,
>>> > > DuplicateSequenceNumberException, GroupAuthorizationException,
>>> > > IllegalGenerationException, IllegalSaslStateException,
>>> > > InconsistentGroupProtocolException, InvalidCommitOffsetSizeExcepti
>>> on,
>>> > > InvalidConfigurationException, InvalidFetchSizeException,
>>> > > InvalidGroupIdException, InvalidPartitionsException,
>>> > > InvalidPidMappingException, InvalidReplicaAssignmentException,
>>> > > InvalidReplicationFactorException, InvalidRequestException,
>>> > > InvalidRequiredAcksException, InvalidSessionTimeoutException,
>>> > > InvalidTimestampException, InvalidTopicException,
>>> > > InvalidTxnStateException, InvalidTxnTimeoutException,

[GitHub] kafka pull request #3409: MINOR: Switch ZK client logging to INFO

2017-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3409


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3408: KAFKA-5490: Skip empty record batches in the consu...

2017-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3408


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [VOTE] 0.11.0.0 RC1

2017-06-22 Thread Ismael Juma
Hi Tom,

We are going to do another RC to include Apurva's significant performance
improvement when transactions are enabled:

https://github.com/apache/kafka/commit/f239f1f839f8bcbd80cce2a4a8643e15d340be8e

Given that, we can also include the ProducerPerformance changes that Apurva
did to find and fix the performance issue.

In my opinion, the ProducerPerformance change alone would not be enough
reason for another RC as users can run the tool from trunk to test older
releases. In any case, this is hypothetical at this point. :)

And thanks for continuing your testing, it's very much appreciated!

Ismael

On Wed, Jun 21, 2017 at 8:03 PM, Tom Crayford  wrote:

> That looks better than mine, nice! I think the tooling matters a lot to
> the usability of the product we're shipping, being able to test out Kafka's
> features on your own hardware/setup is very important to knowing if it can
> work.
>
> On Wed, Jun 21, 2017 at 8:01 PM, Apurva Mehta  wrote:
>
>> Hi Tom,
>>
>> I actually made modifications to the produce performance tool to do real
>> transactions earlier this week as part of our benchmarking (results
>> published here: bit.ly/kafka-eos-perf). I just submitted that patch here:
>> https://github.com/apache/kafka/pull/3400/files
>>
>> I think my version is more complete since it runs the full gamut of APIs:
>> initTransactions, beginTransaction, commitTransaction. Also, it is the
>> version used for our published benchmarks.
>>
>> I am not sure that this tool is a blocker for the release though, since it
>> doesn't really affect the usability of the feature in any way.
>>
>> Thanks,
>> Apurva
>>
>> On Wed, Jun 21, 2017 at 11:12 AM, Tom Crayford 
>> wrote:
>>
>> > Hi there,
>> >
>> > I'm -1 (non-binding) on shipping this RC.
>> >
>> > Heroku has carried on performance testing with 0.11 RC1. We have updated
>> > our test setup to use 0.11.0.0 RC1 client libraries. Without any of the
>> > transactional features enabled, we get slightly better performance than
>> > 0.10.2.1 with 10.2.1 client libraries.
>> >
>> > However, we attempted to run a performance test today with transactions,
>> > idempotence and consumer read_committed enabled, but couldn't, because
>> > enabling transactions requires the producer to call `initTransactions`
>> > before starting to send messages, and the producer performance tool
>> doesn't
>> > allow for that.
>> >
>> > I'm -1 (non-binding) on shipping this RC in this state, because users
>> > expect to be able to use the inbuilt performance testing tools, and
>> > preventing them from testing the impact of the new features using the
>> > inbuilt tools isn't great. I made a PR for this:
>> > https://github.com/apache/kafka/pull/3398 (the change is very small).
>> > Happy
>> > to make a jira as well, if that makes sense.
>> >
>> > Thanks
>> >
>> > Tom Crayford
>> > Heroku Kafka
>> >
>> > On Tue, Jun 20, 2017 at 8:32 PM, Vahid S Hashemian <
>> > vahidhashem...@us.ibm.com> wrote:
>> >
>> > > Hi Ismael,
>> > >
>> > > Thanks for running the release.
>> > >
>> > > Running tests ('gradlew.bat test') on my Windows 64-bit VM results in
>> > > these checkstyle errors:
>> > >
>> > > :clients:checkstyleMain
>> > > [ant:checkstyle] [ERROR]
>> > > C:\Users\User\Downloads\kafka-0.11.0.0-src\clients\src\main\
>> > > java\org\apache\kafka\common\protocol\Errors.java:89:1:
>> > > Class Data Abstraction Coupling is 57 (max allowed is 20) classes
>> > > [ApiExceptionBuilder, BrokerNotAvailableException,
>> > > ClusterAuthorizationException, ConcurrentTransactionsException,
>> > > ControllerMovedException, CoordinatorLoadInProgressException,
>> > > CoordinatorNotAvailableException, CorruptRecordException,
>> > > DuplicateSequenceNumberException, GroupAuthorizationException,
>> > > IllegalGenerationException, IllegalSaslStateException,
>> > > InconsistentGroupProtocolException, InvalidCommitOffsetSizeException,
>> > > InvalidConfigurationException, InvalidFetchSizeException,
>> > > InvalidGroupIdException, InvalidPartitionsException,
>> > > InvalidPidMappingException, InvalidReplicaAssignmentException,
>> > > InvalidReplicationFactorException, InvalidRequestException,
>> > > InvalidRequiredAcksException, InvalidSessionTimeoutException,
>> > > InvalidTimestampException, InvalidTopicException,
>> > > InvalidTxnStateException, InvalidTxnTimeoutException,
>> > > LeaderNotAvailableException, NetworkException, NotControllerException,
>> > > NotCoordinatorException, NotEnoughReplicasAfterAppendException,
>> > > NotEnoughReplicasException, NotLeaderForPartitionException,
>> > > OffsetMetadataTooLarge, OffsetOutOfRangeException,
>> > > OperationNotAttemptedException, OutOfOrderSequenceException,
>> > > PolicyViolationException, ProducerFencedException,
>> > > RebalanceInProgressException, RecordBatchTooLargeException,
>> > > RecordTooLargeException, ReplicaNotAvailableException,
>> > > SecurityDisabledException, TimeoutException,
>> 

Re: [VOTE] 0.11.0.0 RC0

2017-06-22 Thread Ismael Juma
Thanks Tom. Sounds good. :)

Ismael

On Thu, Jun 22, 2017 at 4:21 PM, Tom Crayford  wrote:

> Hi Ismael,
>
> Sure. It's a standard heroku plan, the `extended-2`, which has 8 brokers.
> We did several rounds of performance testing, some with low numbers of
> partitions and topics (16 partitions with one topic, which we typically see
> the highest throughput on) and some with many more (many thousands of
> partitions and topics). Replication factor was always 3. Partitions per
> broker was roughly equal in each round (we just use the inbuilt kafka
> scripts to create topics). Lastly, all tests run were with SSL enabled for
> both client and inter-broker traffic.
>
> These are the same performance tests we've run for previous versions, and
> wrote about when testing 0.10 here: https://blog.heroku.com/
> apache-kafka-010-evaluating-performance-in-distributed-systems
>
> In all performance tests we've ever done, Kafka has been network limited,
> not disk or CPU. This, I think is common for Kafka setups and workloads,
> but it does change significantly with what hardware you're using.
>
> Because our performance tests run across cloud instances where there might
> be contention and variability, we typically do multiple runs on different
> clusters with each setup before reporting results.
>
> Thanks
>
> Tom Crayford
> Heroku Kafka
>
> On Thu, Jun 22, 2017 at 4:08 PM, Ismael Juma  wrote:
>
>> Hi Tom,
>>
>> Would you be able to share the details of the Kafka Cluster that you used
>> for performance testing? We are interested in the number of brokers,
>> topics, partitions per broker and replication factor. Thanks!
>>
>> Ismael
>>
>> On Mon, Jun 19, 2017 at 3:15 PM, Tom Crayford 
>> wrote:
>>
>>> Hello,
>>>
>>> Heroku has been testing 0.11.0.0 RC0, mostly focussed on backwards
>>> compatibility and performance. So far, we note a slight performance
>>> increase from older versions when using not-current clients
>>>
>>> Testing a 0.9 client against 0.10.2.1 vs 0.11.0.0 rc0: 0.11.0.0 rc0 has
>>> slightly higher throughput for both consumers and producers. We expect
>>> this
>>> because the message format improvements should lead to greater
>>> efficiency.
>>>
>>> We have not yet tested a 0.11.0.0 rc0 client against a 0.11.0.0 rc0
>>> cluster, because our test setup needs updating for that.
>>>
>>> We've tested simple demo apps against 0.11.0.0rc0 (that we've run against
>>> older clusters):
>>> http://github.com/heroku/heroku-kafka-demo-ruby
>>> https://github.com/heroku/heroku-kafka-demo-node
>>> https://github.com/heroku/heroku-kafka-demo-java
>>> https://github.com/heroku/heroku-kafka-demo-go
>>>
>>> This comprises a range of community supported clients: ruby-kafka,
>>> no-kafka, the main JVM client and sarama.
>>>
>>> We didn't see any notable issues there, but it's worth noting that all of
>>> these demo apps do little more than produce and consume messages.
>>>
>>> We have also tested failure handling in 0.11.0.0 rc0, by terminating
>>> nodes.
>>> Note that this does *not* test any of the new exactly-once features, just
>>> "can I terminate a broker whilst producing to/consuming from the cluster.
>>> We see the same behaviour as 0.10.2.1 there, just a round of errors from
>>> the client, like this:
>>>
>>> org.apache.kafka.common.errors.NetworkException: The server disconnected
>>> before a response was received.
>>>
>>> but that's expected.
>>>
>>> We have tested creating and deleting topics heavily, including deleting a
>>> topic in the middle of broker failure (the controller waits for the
>>> broker
>>> to come back before being deleted, as expected)
>>>
>>> We have also tested upgrading a 0.10.2.1 cluster to 0.11.0.0 rc0 without
>>> issue
>>>
>>> We have also tested partition preferred leader election (manual, with the
>>> admin script), and partition rebalancing to grow and shrink clusters.
>>>
>>> We have not yet tested the exactly once features, because various core
>>> committers said that they didn't expect this feature to be perfect in
>>> this
>>> release. We expect to test this this week though.
>>>
>>> Given that the blockers fixed between RC0 and RC1 haven't changed much in
>>> the areas we tested, I think the positive results here still apply.
>>>
>>> Thanks
>>>
>>> Tom Crayford
>>> Heroku Kafka
>>>
>>> On Thu, Jun 8, 2017 at 2:55 PM, Ismael Juma  wrote:
>>>
>>> > Hello Kafka users, developers and client-developers,
>>> >
>>> > This is the first candidate for release of Apache Kafka 0.11.0.0. It's
>>> > worth noting that there are a small number of unresolved issues
>>> (including
>>> > documentation and system tests) related to the new AdminClient and
>>> > Exactly-once functionality[1] that we hope to resolve in the next few
>>> days.
>>> > To encourage early testing, we are releasing the first release
>>> candidate
>>> > now, but there will be at least one more release candidate.
>>> >
>>> > Any and all testing 

[GitHub] kafka pull request #3401: MINOR: explain producer naming within Streams

2017-06-22 Thread mjsax
Github user mjsax closed the pull request at:

https://github.com/apache/kafka/pull/3401


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3410: KAFKA-4913: prevent creation of window stores with...

2017-06-22 Thread dguy
GitHub user dguy opened a pull request:

https://github.com/apache/kafka/pull/3410

KAFKA-4913: prevent creation of window stores with less than 2 segments

Throw IllegalArgumentException when attempting to create a `WindowStore` 
via `Stores` or directly with `RocksDBWindowStoreSupplier` when it has less 
than 2 segments.
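
A rough illustration of the guard described above (not the actual patch; the
parameter name is assumed):

// Hypothetical validation, mirroring the PR description: reject window store
// configurations that would end up with fewer than 2 segments.
if (numSegments < 2) {
    throw new IllegalArgumentException(
            "numSegments must be >= 2 (got " + numSegments + ")");
}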

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dguy/kafka kafka-4913

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3410.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3410


commit 16a0772f39a91195941cdb4be9e4639bdb20c84e
Author: Damian Guy 
Date:   2017-06-22T15:30:48Z

prevent creation of window store with < 2 segments




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: kafka-trunk-jdk7 #2445

2017-06-22 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] KAFKA-4785; Records from internal repartitioning topics should 
always

--
[...truncated 895.30 KB...]

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnUndefinedOffsetIfUndefinedEpochRequested STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnUndefinedOffsetIfUndefinedEpochRequested PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldIncreaseAndTrackEpochsAsLeadersChangeManyTimes STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldIncreaseAndTrackEpochsAsLeadersChangeManyTimes PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldSupportEpochsThatDoNotStartFromZero STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldSupportEpochsThatDoNotStartFromZero PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnUnsupportedIfRequestedEpochLessThanFirstEpoch STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnUnsupportedIfRequestedEpochLessThanFirstEpoch PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnNextAvailableEpochIfThereIsNoExactEpochForTheOneRequested STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnNextAvailableEpochIfThereIsNoExactEpochForTheOneRequested PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldAddEpochAndMessageOffsetToCache STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldAddEpochAndMessageOffsetToCache PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldDropEntriesOnEpochBoundaryWhenRemovingLatestEntries STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldDropEntriesOnEpochBoundaryWhenRemovingLatestEntries PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldUpdateSavedOffsetWhenOffsetToClearToIsBetweenEpochs STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldUpdateSavedOffsetWhenOffsetToClearToIsBetweenEpochs PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotResetEpochHistoryTailIfUndefinedPassed STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotResetEpochHistoryTailIfUndefinedPassed PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnUnsupportedIfNoEpochRecorded STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnUnsupportedIfNoEpochRecorded PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldRetainLatestEpochOnClearAllEarliest STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldRetainLatestEpochOnClearAllEarliest PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldPersistEpochsBetweenInstances STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldPersistEpochsBetweenInstances PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotClearAnythingIfOffsetToFirstOffset STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotClearAnythingIfOffsetToFirstOffset PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotLetOffsetsGoBackwardsEvenIfEpochsProgress STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotLetOffsetsGoBackwardsEvenIfEpochsProgress PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldGetFirstOffsetOfSubsequentEpochWhenOffsetRequestedForPreviousEpoch STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldGetFirstOffsetOfSubsequentEpochWhenOffsetRequestedForPreviousEpoch PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldUpdateOffsetBetweenEpochBoundariesOnClearEarliest2 STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldUpdateOffsetBetweenEpochBoundariesOnClearEarliest2 PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldClearEarliestOnEmptyCache 
STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldClearEarliestOnEmptyCache 
PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldPreserveResetOffsetOnClearEarliestIfOneExists STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldPreserveResetOffsetOnClearEarliestIfOneExists PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldUpdateOffsetBetweenEpochBoundariesOnClearEarliest STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldUpdateOffsetBetweenEpochBoundariesOnClearEarliest PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnInvalidOffsetIfEpochIsRequestedWhichIsNotCurrentlyTracked STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnInvalidOffsetIfEpochIsRequestedWhichIsNotCurrentlyTracked PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldFetchEndOffsetOfEmptyCache 
STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldFetchEndOffsetOfEmptyCache 
PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldRetainLatestEpochOnClearAllEarliestAndUpdateItsOffset STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldRetainLatestEpochOnClearAllEarliestAndUpdateItsOffset PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 

Re: [VOTE] 0.11.0.0 RC0

2017-06-22 Thread Tom Crayford
Hi Ismael,

Sure. It's a standard heroku plan, the `extended-2`, which has 8 brokers.
We did several rounds of performance testing, some with low numbers of
partitions and topics (16 partitions with one topic, which we typically see
the highest throughput on) and some with many more (many thousands of
partitions and topics). Replication factor was always 3. Partitions per
broker was roughly equal in each round (we just use the inbuilt kafka
scripts to create topics). Lastly, all tests run were with SSL enabled for
both client and inter-broker traffic.

These are the same performance tests we've run for previous versions, and
wrote about when testing 0.10 here:
https://blog.heroku.com/apache-kafka-010-evaluating-performance-in-distributed-systems

In all performance tests we've ever done, Kafka has been network limited,
not disk or CPU. This, I think is common for Kafka setups and workloads,
but it does change significantly with what hardware you're using.

Because our performance tests run across cloud instances where there might
be contention and variability, we typically do multiple runs on different
clusters with each setup before reporting results.

Thanks

Tom Crayford
Heroku Kafka

On Thu, Jun 22, 2017 at 4:08 PM, Ismael Juma  wrote:

> Hi Tom,
>
> Would you be able to share the details of the Kafka Cluster that you used
> for performance testing? We are interested in the number of brokers,
> topics, partitions per broker and replication factor. Thanks!
>
> Ismael
>
> On Mon, Jun 19, 2017 at 3:15 PM, Tom Crayford 
> wrote:
>
>> Hello,
>>
>> Heroku has been testing 0.11.0.0 RC0, mostly focussed on backwards
>> compatibility and performance. So far, we note a slight performance
>> increase from older versions when using not-current clients
>>
>> Testing a 0.9 client against 0.10.2.1 vs 0.11.0.0 rc0: 0.11.0.0 rc0 has
>> slightly higher throughput for both consumers and producers. We expect
>> this
>> because the message format improvements should lead to greater efficiency.
>>
>> We have not yet tested a 0.11.0.0 rc0 client against a 0.11.0.0 rc0
>> cluster, because our test setup needs updating for that.
>>
>> We've tested simple demo apps against 0.11.0.0rc0 (that we've run against
>> older clusters):
>> http://github.com/heroku/heroku-kafka-demo-ruby
>> https://github.com/heroku/heroku-kafka-demo-node
>> https://github.com/heroku/heroku-kafka-demo-java
>> https://github.com/heroku/heroku-kafka-demo-go
>>
>> This comprises a range of community supported clients: ruby-kafka,
>> no-kafka, the main JVM client and sarama.
>>
>> We didn't see any notable issues there, but it's worth noting that all of
>> these demo apps do little more than produce and consume messages.
>>
>> We have also tested failure handling in 0.11.0.0 rc0, by terminating
>> nodes.
>> Note that this does *not* test any of the new exactly-once features, just
>> "can I terminate a broker whilst producing to/consuming from the cluster.
>> We see the same behaviour as 0.10.2.1 there, just a round of errors from
>> the client, like this:
>>
>> org.apache.kafka.common.errors.NetworkException: The server disconnected
>> before a response was received.
>>
>> but that's expected.
>>
>> We have tested creating and deleting topics heavily, including deleting a
>> topic in the middle of broker failure (the controller waits for the broker
>> to come back before being deleted, as expected)
>>
>> We have also tested upgrading a 0.10.2.1 cluster to 0.11.0.0 rc0 without
>> issue
>>
>> We have also tested partition preferred leader election (manual, with the
>> admin script), and partition rebalancing to grow and shrink clusters.
>>
>> We have not yet tested the exactly once features, because various core
>> committers said that they didn't expect this feature to be perfect in this
>> release. We expect to test this this week though.
>>
>> Given that the blockers fixed between RC0 and RC1 haven't changed much in
>> the areas we tested, I think the positive results here still apply.
>>
>> Thanks
>>
>> Tom Crayford
>> Heroku Kafka
>>
>> On Thu, Jun 8, 2017 at 2:55 PM, Ismael Juma  wrote:
>>
>> > Hello Kafka users, developers and client-developers,
>> >
>> > This is the first candidate for release of Apache Kafka 0.11.0.0. It's
>> > worth noting that there are a small number of unresolved issues
>> (including
>> > documentation and system tests) related to the new AdminClient and
>> > Exactly-once functionality[1] that we hope to resolve in the next few
>> days.
>> > To encourage early testing, we are releasing the first release candidate
>> > now, but there will be at least one more release candidate.
>> >
>> > Any and all testing is welcome, but the following areas are worth
>> > highlighting:
>> >
>> > 1. Client developers should verify that their clients can
>> produce/consume
>> > to/from 0.11.0 brokers (ideally with compressed and uncompressed data).
>> > Even though we have compatibility 

Re: [VOTE] 0.11.0.0 RC0

2017-06-22 Thread Ismael Juma
Hi Tom,

Would you be able to share the details of the Kafka Cluster that you used
for performance testing? We are interested in the number of brokers,
topics, partitions per broker and replication factor. Thanks!

Ismael

On Mon, Jun 19, 2017 at 3:15 PM, Tom Crayford  wrote:

> Hello,
>
> Heroku has been testing 0.11.0.0 RC0, mostly focussed on backwards
> compatibility and performance. So far, we note a slight performance
> increase from older versions when using not-current clients
>
> Testing a 0.9 client against 0.10.2.1 vs 0.11.0.0 rc0: 0.11.0.0 rc0 has
> slightly higher throughput for both consumers and producers. We expect this
> because the message format improvements should lead to greater efficiency.
>
> We have not yet tested a 0.11.0.0 rc0 client against a 0.11.0.0 rc0
> cluster, because our test setup needs updating for that.
>
> We've tested simple demo apps against 0.11.0.0rc0 (that we've run against
> older clusters):
> http://github.com/heroku/heroku-kafka-demo-ruby
> https://github.com/heroku/heroku-kafka-demo-node
> https://github.com/heroku/heroku-kafka-demo-java
> https://github.com/heroku/heroku-kafka-demo-go
>
> This comprises a range of community supported clients: ruby-kafka,
> no-kafka, the main JVM client and sarama.
>
> We didn't see any notable issues there, but it's worth noting that all of
> these demo apps do little more than produce and consume messages.
>
> We have also tested failure handling in 0.11.0.0 rc0, by terminating nodes.
> Note that this does *not* test any of the new exactly-once features, just
> "can I terminate a broker whilst producing to/consuming from the cluster.
> We see the same behaviour as 0.10.2.1 there, just a round of errors from
> the client, like this:
>
> org.apache.kafka.common.errors.NetworkException: The server disconnected
> before a response was received.
>
> but that's expected.
>
> We have tested creating and deleting topics heavily, including deleting a
> topic in the middle of broker failure (the controller waits for the broker
> to come back before being deleted, as expected)
>
> We have also tested upgrading a 0.10.2.1 cluster to 0.11.0.0 rc0 without
> issue
>
> We have also tested partition preferred leader election (manual, with the
> admin script), and partition rebalancing to grow and shrink clusters.
>
> We have not yet tested the exactly once features, because various core
> committers said that they didn't expect this feature to be perfect in this
> release. We expect to test this this week though.
>
> Given that the blockers fixed between RC0 and RC1 haven't changed much in
> the areas we tested, I think the positive results here still apply.
>
> Thanks
>
> Tom Crayford
> Heroku Kafka
>
> On Thu, Jun 8, 2017 at 2:55 PM, Ismael Juma  wrote:
>
> > Hello Kafka users, developers and client-developers,
> >
> > This is the first candidate for release of Apache Kafka 0.11.0.0. It's
> > worth noting that there are a small number of unresolved issues
> (including
> > documentation and system tests) related to the new AdminClient and
> > Exactly-once functionality[1] that we hope to resolve in the next few
> days.
> > To encourage early testing, we are releasing the first release candidate
> > now, but there will be at least one more release candidate.
> >
> > Any and all testing is welcome, but the following areas are worth
> > highlighting:
> >
> > 1. Client developers should verify that their clients can produce/consume
> > to/from 0.11.0 brokers (ideally with compressed and uncompressed data).
> > Even though we have compatibility tests for older Java clients and we
> have
> > verified that librdkafka works fine, the only way to be sure is to test
> > every client.
> > 2. Performance and stress testing. Heroku and LinkedIn have helped with
> > this in the past (and issues have been found and fixed).
> > 3. End users can verify that their apps work correctly with the new
> > release.
> >
> > This is a major version release of Apache Kafka. It includes 32 new KIPs.
> > See
> > the release notes and release plan (https://cwiki.apache.org/
> > confluence/display/KAFKA/Release+Plan+0.11.0.0) for more details. A few
> > feature highlights:
> >
> > * Exactly-once delivery and transactional messaging
> > * Streams exactly-once semantics
> > * Admin client with support for topic, ACLs and config management
> > * Record headers
> > * Request rate quotas
> > * Improved resiliency: replication protocol improvement and
> single-threaded
> > controller
> > * Richer and more efficient message format
> >
> > Release notes for the 0.11.0.0 release:
> > http://home.apache.org/~ijuma/kafka-0.11.0.0-rc0/RELEASE_NOTES.html
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> > http://kafka.apache.org/KEYS
> >
> > * Release artifacts to be voted upon (source and binary):
> > http://home.apache.org/~ijuma/kafka-0.11.0.0-rc0/
> >
> > * Maven artifacts to be voted upon:
> > 

[GitHub] kafka pull request #3409: MINOR: Switch ZK client logging to INFO

2017-06-22 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/3409

MINOR: Switch ZK client logging to INFO



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka tweak-log-config

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3409.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3409


commit 412a6c58ba3b027863d7f9db20ec25d6f32e559d
Author: Ismael Juma 
Date:   2017-06-22T14:55:31Z

MINOR: Switch ZK client logging to INFO




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3293: KAFKA-4658: Improve test coverage InMemoryKeyValue...

2017-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3293


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] KIP-170: Enhanced TopicCreatePolicy and introduction of TopicDeletePolicy

2017-06-22 Thread Ismael Juma
Thanks for the KIP, Edoardo. A few comments:

1. Have you considered extending RequestMetadata with the additional
information you need? We could add Cluster to it, which has topic
assignment information, for example. This way, there would be no need for a
V2 interface.

2. Something else that could be useful is passing an instance of `Session`
so that one can provide custom behaviour depending on the logged in user.
Would this be useful?

3. For the delete case, we may consider passing a class instead of just a
string to the validate method so that we have options if we need to extend
it.
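
To make (3) concrete, here is a rough sketch of what such an interface could
look like (purely hypothetical -- the interface name, the wrapper class and
its fields are illustrative only, not part of the KIP):

import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.Configurable;
import org.apache.kafka.common.errors.PolicyViolationException;
import org.apache.kafka.common.security.auth.KafkaPrincipal;

public interface TopicDeletePolicy extends Configurable, AutoCloseable {

    // Wrapping the request in a class (instead of passing a bare topic name)
    // lets us add fields later without breaking existing implementations.
    class RequestMetadata {
        private final String topic;
        private final Cluster cluster;            // assignment information (point 1)
        private final KafkaPrincipal principal;   // the logged-in user (point 2)

        public RequestMetadata(String topic, Cluster cluster, KafkaPrincipal principal) {
            this.topic = topic;
            this.cluster = cluster;
            this.principal = principal;
        }

        public String topic() { return topic; }
        public Cluster cluster() { return cluster; }
        public KafkaPrincipal principal() { return principal; }
    }

    // Implementations throw PolicyViolationException to reject the deletion.
    void validate(RequestMetadata requestMetadata) throws PolicyViolationException;
}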

4. Do we want to enhance the AlterConfigs policy as well?

Ismael

On Thu, Jun 22, 2017 at 2:41 PM, Edoardo Comar  wrote:

> Hi all,
>
> We've drafted "KIP-170: Enhanced TopicCreatePolicy and introduction of
> TopicDeletePolicy" for discussion:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-170%3A+Enhanced+
> TopicCreatePolicy+and+introduction+of+TopicDeletePolicy
>
> Please take a look. Your feedback is welcome and much needed.
>
> Thanks,
> Edoardo
> --
> Edoardo Comar
> IBM Message Hub
> eco...@uk.ibm.com
> IBM UK Ltd, Hursley Park, SO21 2JN
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>


[GitHub] kafka pull request #3290: KAFKA-4655: Improve test coverage of CompositeRead...

2017-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3290


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Long awaiting pull request

2017-06-22 Thread Arseniy Tashoyan
Friends,

I created a pull request 1.5 months ago and asked 5 or 6 people for
review:
https://github.com/apache/kafka/pull/3051
Nobody has answered. Maybe I am doing something wrong? I understand everybody
is busy, but it would be great to receive any communication.
I also cannot assign the corresponding Jira issue to myself, probably
because I don't have permissions; I cannot find out why.
Is there anybody on the other end of the wire?

@tashoyan


[DISCUSS] KIP-170: Enhanced TopicCreatePolicy and introduction of TopicDeletePolicy

2017-06-22 Thread Edoardo Comar
Hi all,

We've drafted "KIP-170: Enhanced TopicCreatePolicy and introduction of 
TopicDeletePolicy" for discussion:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-170%3A+Enhanced+TopicCreatePolicy+and+introduction+of+TopicDeletePolicy

Please take a look. Your feedback is welcome and much needed.

Thanks,
Edoardo
--
Edoardo Comar
IBM Message Hub
eco...@uk.ibm.com
IBM UK Ltd, Hursley Park, SO21 2JN
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU


[jira] [Resolved] (KAFKA-4059) Documentation still refers to AsyncProducer and SyncProducer

2017-06-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-4059.

   Resolution: Fixed
Fix Version/s: 0.11.0.0

> Documentation still refers to AsyncProducer and SyncProducer
> 
>
> Key: KAFKA-4059
> URL: https://issues.apache.org/jira/browse/KAFKA-4059
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.10.0.1
>Reporter: Andrew B
>Assignee: Tom Bentley
>  Labels: patch-available
> Fix For: 0.11.0.0
>
>
> The 0.10 docs are still referring to AsyncProducer and SyncProducer.
> See: https://github.com/apache/kafka/search?utf8=%E2%9C%93&q=AsyncProducer



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] kafka pull request #3372: Provide link to ZooKeeper within Quickstart

2017-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3372


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-5497) KIP-170: Enhanced CreateTopicPolicy and DeleteTopicPolicy

2017-06-22 Thread Edoardo Comar (JIRA)
Edoardo Comar created KAFKA-5497:


 Summary: KIP-170: Enhanced CreateTopicPolicy and DeleteTopicPolicy
 Key: KAFKA-5497
 URL: https://issues.apache.org/jira/browse/KAFKA-5497
 Project: Kafka
  Issue Type: Improvement
Reporter: Edoardo Comar
Assignee: Edoardo Comar


JIRA  for the implementation of KIP-170

https://cwiki.apache.org/confluence/display/KAFKA/KIP-170%3A+Enhanced+TopicCreatePolicy+and+introduction+of+TopicDeletePolicy



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] kafka pull request #3385: KAFKA-4059: Documentation still refers to AsyncPro...

2017-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3385


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #1535: KAFKA-3727 - ConsumerListener for UnknownTopicOrPa...

2017-06-22 Thread edoardocomar
Github user edoardocomar closed the pull request at:

https://github.com/apache/kafka/pull/1535


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #1545: KAFKA-3727 - ClientListener for UnknownTopicOrPart...

2017-06-22 Thread edoardocomar
Github user edoardocomar closed the pull request at:

https://github.com/apache/kafka/pull/1545


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3408: KAFKA-5490: Skip empty record batches in the consu...

2017-06-22 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/3408

KAFKA-5490: Skip empty record batches in the consumer

The proper fix for KAFKA-5490 (including tests) is in
https://github.com/apache/kafka/pull/3406.

This is just the consumer change that will allow us
to fix the broker without breaking the existing
consumers. This is a safe change even if we decide
to go with a different option for KAFKA-5490 and
I'd like to include it in RC2.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-5490-consumer-should-skip-empty-batches

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3408.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3408


commit b1121c48993b62684714b6884f392554d844971a
Author: Jason Gustafson 
Date:   2017-06-21T23:55:36Z

KAFKA-5490: Skip empty record batches in the consumer




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3402: KAFKA-5486: org.apache.kafka logging should go to ...

2017-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3402


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: Handling 2 to 3 Million Events before Kafka

2017-06-22 Thread SenthilKumar K
Hi Barton - I think we can use the async producer with callback APIs to
keep track of which events failed.
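
A minimal sketch of that (broker list, topic name and the event variables
below are just placeholders, not from this thread) could look like:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AsyncSendExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");   // placeholder
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        Producer<String, String> producer = new KafkaProducer<>(props);

        String eventKey = "host-42";                       // placeholder event
        String eventPayload = "{\"metric\":1}";

        // send() is asynchronous; the callback fires when the broker acks or the send fails
        producer.send(new ProducerRecord<>("events", eventKey, eventPayload),
            (metadata, exception) -> {
                if (exception != null) {
                    // failed event: log it, retry it, or divert it to a local spool
                    System.err.println("Failed to send " + eventKey + ": " + exception);
                }
            });

        producer.flush();
        producer.close();
    }
}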

--Senthil

On Thu, Jun 22, 2017 at 4:58 PM, SenthilKumar K 
wrote:

> Thanks Barton.. I'll look into these ..
>
> On Thu, Jun 22, 2017 at 7:12 AM, Garrett Barton 
> wrote:
>
>> Getting good concurrency in a webapp is more than doable.  Check out
>> these benchmarks:
>> https://www.techempower.com/benchmarks/#section=data-r14=ph=db
>> I linked to the single query one because thats closest to a single
>> operation like you will be doing.
>>
>> I'd also note if the data delivery does not need to be guaranteed you
>> could go faster switching the web servers over to UDP and using async mode
>> on the kafka producers.
>>
>> On Wed, Jun 21, 2017 at 2:23 PM, Tauzell, Dave <
>> dave.tauz...@surescripts.com> wrote:
>>
>>> I’m not really familiar with Netty so I won’t be of much help.   Maybe
>>> try posting on a Netty forum to see what they think?
>>> -Dave
>>>
>>> From: SenthilKumar K [mailto:senthilec...@gmail.com]
>>> Sent: Wednesday, June 21, 2017 10:28 AM
>>> To: Tauzell, Dave
>>> Cc: us...@kafka.apache.org; senthilec...@apache.org;
>>> dev@kafka.apache.org
>>> Subject: Re: Handling 2 to 3 Million Events before Kafka
>>>
>>> So netty would work for this case ?  I do have netty server and seems to
>>> be i'm not getting the expected results .. here is the git
>>> https://github.com/senthilec566/netty4-server , is this right
>>> implementation ?
>>>
>>> Cheers,
>>> Senthil
>>>
>>> On Wed, Jun 21, 2017 at 7:45 PM, Tauzell, Dave <
>>> dave.tauz...@surescripts.com>
>>> wrote:
>>> I see.
>>>
>>> 1.   You don’t want the 100k machines sending directly to kafka.
>>>
>>> 2.   You can only have a small number of web servers
>>>
>>> People certainly have web-servers handling over 100k concurrent
>>> connections.  See this for some examples:
>>> https://github.com/smallnest/C1000K-Servers .
>>>
>>> It seems possible with the right sort of kafka producer tuning.
>>>
>>> -Dave
>>>
>>> From: SenthilKumar K [mailto:senthilec...@gmail.com]
>>> Sent: Wednesday, June 21, 2017 8:55 AM
>>> To: Tauzell, Dave
>>> Cc: us...@kafka.apache.org;
>>> senthilec...@apache.org;
>>> dev@kafka.apache.org; Senthil kumar
>>> Subject: Re: Handling 2 to 3 Million Events before Kafka
>>>
>>> Thanks Jeyhun. Yes http server would be problematic here w.r.t network ,
>>> memory ..
>>>
>>> Hi Dave ,  The problem is not with Kafka , it's all about how do you
>>> handle huge data before kafka.  I did a simple test with 5 node Kafka
>>> Cluster which gives good result ( ~950 MB/s ) ..So Kafka side i dont see a
>>> scaling issue ...
>>>
>>> All we are trying is before kafka how do we handle messages from
>>> different servers ...  Webservers can send fast to kafka but still i can
>>> handle only 50k events per second which is less for my use case.. also i
>>> can't deploy 20 webservers to handle this load. I'm looking for an option
>>> what could be the best candidate before kafka , it should be super fast in
>>> getting all and send it to kafka producer ..
>>>
>>>
>>> --Senthil
>>>
>>> On Wed, Jun 21, 2017 at 6:53 PM, Tauzell, Dave <
>>> dave.tauz...@surescripts.com>
>>> wrote:
>>> What are your configurations?
>>>
>>> - production
>>> - brokers
>>> - consumers
>>>
>>> Is the problem that web servers cannot send to Kafka fast enough or your
>>> consumers cannot process messages off of kafka fast enough?
>>> What is the average size of these messages?
>>>
>>> -Dave
>>>
>>> -Original Message-
>>> From: SenthilKumar K [mailto:senthilec...@gmail.com]
>>> Sent: Wednesday, June 21, 2017 7:58 AM
>>> To: us...@kafka.apache.org
>>> Cc: senthilec...@apache.org; Senthil
>>> kumar; dev@kafka.apache.org
>>> Subject: Handling 2 to 3 Million Events before Kafka
>>>
>>> Hi Team ,   Sorry if this question is irrelevant to Kafka Group ...
>>>
>>> I have been trying to solve problem of handling 5 GB/sec ingestion.
>>> Kafka is really good candidate for us to handle this ingestion rate ..
>>>
>>>
>>> 100K machines > { Http Server (Jetty/Netty) } --> Kafka Cluster..
>>>
>>> I see the problem in Http Server where it can't handle beyond 50K events
>>> per instance ..  I'm thinking some other solution would be right choice
>>> before Kafka ..
>>>
>>> Anyone worked on similar use case and similar load ?
>>> Suggestions/Thoughts ?
>>>
>>> --Senthil

Re: Handling 2 to 3 Million Events before Kafka

2017-06-22 Thread SenthilKumar K
Thanks Barton.. I'll look into these ..

On Thu, Jun 22, 2017 at 7:12 AM, Garrett Barton 
wrote:

> Getting good concurrency in a webapp is more than doable.  Check out these
> benchmarks:
> https://www.techempower.com/benchmarks/#section=data-r14=ph=db
> I linked to the single query one because thats closest to a single
> operation like you will be doing.
>
> I'd also note if the data delivery does not need to be guaranteed you
> could go faster switching the web servers over to UDP and using async mode
> on the kafka producers.
>
> On Wed, Jun 21, 2017 at 2:23 PM, Tauzell, Dave <
> dave.tauz...@surescripts.com> wrote:
>
>> I’m not really familiar with Netty so I won’t be of much help.   Maybe
>> try posting on a Netty forum to see what they think?
>> -Dave
>>
>> From: SenthilKumar K [mailto:senthilec...@gmail.com]
>> Sent: Wednesday, June 21, 2017 10:28 AM
>> To: Tauzell, Dave
>> Cc: us...@kafka.apache.org; senthilec...@apache.org; dev@kafka.apache.org
>> Subject: Re: Handling 2 to 3 Million Events before Kafka
>>
>> So netty would work for this case ?  I do have netty server and seems to
>> be i'm not getting the expected results .. here is the git
>> https://github.com/senthilec566/netty4-server , is this right
>> implementation ?
>>
>> Cheers,
>> Senthil
>>
>> On Wed, Jun 21, 2017 at 7:45 PM, Tauzell, Dave <
>> dave.tauz...@surescripts.com> wrote:
>> I see.
>>
>> 1.   You don’t want the 100k machines sending directly to kafka.
>>
>> 2.   You can only have a small number of web servers
>>
>> People certainly have web-servers handling over 100k concurrent
>> connections.  See this for some examples:  https://github.com/smallnest/C
>> 1000K-Servers .
>>
>> It seems possible with the right sort of kafka producer tuning.
>>
>> -Dave
>>
>> From: SenthilKumar K [mailto:senthilec...@gmail.com]
>> Sent: Wednesday, June 21, 2017 8:55 AM
>> To: Tauzell, Dave
>> Cc: us...@kafka.apache.org;
>> senthilec...@apache.org;
>> dev@kafka.apache.org; Senthil kumar
>> Subject: Re: Handling 2 to 3 Million Events before Kafka
>>
>> Thanks Jeyhun. Yes http server would be problematic here w.r.t network ,
>> memory ..
>>
>> Hi Dave ,  The problem is not with Kafka , it's all about how do you
>> handle huge data before kafka.  I did a simple test with 5 node Kafka
>> Cluster which gives good result ( ~950 MB/s ) ..So Kafka side i dont see a
>> scaling issue ...
>>
>> All we are trying is before kafka how do we handle messages from
>> different servers ...  Webservers can send fast to kafka but still i can
>> handle only 50k events per second which is less for my use case.. also i
>> can't deploy 20 webservers to handle this load. I'm looking for an option
>> what could be the best candidate before kafka , it should be super fast in
>> getting all and send it to kafka producer ..
>>
>>
>> --Senthil
>>
>> On Wed, Jun 21, 2017 at 6:53 PM, Tauzell, Dave <
>> dave.tauz...@surescripts.com> wrote:
>> What are your configurations?
>>
>> - production
>> - brokers
>> - consumers
>>
>> Is the problem that web servers cannot send to Kafka fast enough or your
>> consumers cannot process messages off of kafka fast enough?
>> What is the average size of these messages?
>>
>> -Dave
>>
>> -Original Message-
>> From: SenthilKumar K [mailto:senthilec...@gmail.com]
>> Sent: Wednesday, June 21, 2017 7:58 AM
>> To: us...@kafka.apache.org
>> Cc: senthilec...@apache.org; Senthil
>> kumar; dev@kafka.apache.org
>> Subject: Handling 2 to 3 Million Events before Kafka
>>
>> Hi Team ,   Sorry if this question is irrelevant to Kafka Group ...
>>
>> I have been trying to solve problem of handling 5 GB/sec ingestion. Kafka
>> is really good candidate for us to handle this ingestion rate ..
>>
>>
>> 100K machines > { Http Server (Jetty/Netty) } --> Kafka Cluster..
>>
>> I see the problem in Http Server where it can't handle beyond 50K events
>> per instance ..  I'm thinking some other solution would be right choice
>> before Kafka ..
>>
>> Anyone worked on similar use case and similar load ? Suggestions/Thoughts
>> ?
>>
>> --Senthil
>>
>>
>>
>


[GitHub] kafka pull request #3106: KAFKA-4785: Records from internal repartitioning t...

2017-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3106


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] Streams DSL/StateStore Refactoring

2017-06-22 Thread Damian Guy
Thanks everyone. My latest attempt is below. It builds on the fluent
approach, but I think it is slightly nicer.
I agree with some of what Eno said about mixing configy stuff in the DSL,
but I think that enabling caching and enabling logging are things that
aren't actually config. I'd probably not add withLogConfig(...) (even
though it is below) as this is actually config and we already have a way of
doing that, via the StateStoreSupplier. Arguably we could use the
StateStoreSupplier for disabling caching etc, but as it stands that is a
bit of a tedious process for someone that just wants to use the default
storage engine, but not have caching enabled.

There is also an orthogonal concern that Guozhang alluded to: if you
want to plug in a custom storage engine and you want it to be logged etc.,
you would currently need to implement that yourself. Ideally we can provide
a way to wrap the custom store with logging, metrics, etc. I
need to think about where this fits; it is probably more appropriate on the
Stores API.

final KeyValueMapper keyMapper = null;
// count with mapped key
final KTable count = stream.grouped()
.withKeyMapper(keyMapper)
.withKeySerde(Serdes.Long())
.withValueSerde(Serdes.String())
.withQueryableName("my-store")
.count();

// windowed count
final KTable windowedCount = stream.grouped()
.withQueryableName("my-window-store")
.windowed(TimeWindows.of(10L).until(10))
.count();

// windowed reduce
final Reducer windowedReducer = null;
final KTable windowedReduce = stream.grouped()
.withQueryableName("my-window-store")
.windowed(TimeWindows.of(10L).until(10))
.reduce(windowedReducer);

final Aggregator aggregator = null;
final Initializer init = null;

// aggregate
final KTable aggregate = stream.grouped()
.withQueryableName("my-aggregate-store")
.aggregate(aggregator, init, Serdes.Long());

final StateStoreSupplier stateStoreSupplier = null;

// aggregate with custom store
final KTable aggWithCustomStore = stream.grouped()
.withStateStoreSupplier(stateStoreSupplier)
.aggregate(aggregator, init);

// disable caching
stream.grouped()
.withQueryableName("name")
.withCachingEnabled(false)
.count();

// disable logging
stream.grouped()
.withQueryableName("q")
.withLoggingEnabled(false)
.count();

// override log config
final Reducer reducer = null;
stream.grouped()
.withLogConfig(Collections.singletonMap("segment.size", "10"))
.reduce(reducer);


If anyone wants to play around with this you can find the code here:
https://github.com/dguy/kafka/tree/dsl-experiment

Note: It won't actually work as most of the methods just return null.

Thanks,
Damian


On Thu, 22 Jun 2017 at 11:18 Ismael Juma  wrote:

> Thanks Damian. I think both options have pros and cons. And both are better
> than overload abuse.
>
> The fluent API approach reads better, no mention of builder or build
> anywhere. The main downside is that the method signatures are a little less
> clear. By reading the method signature, one doesn't necessarily know what
> it returns. Also, one needs to figure out the special method (`table()` in
> this case) that gives you what you actually care about (`KTable` in this
> case). Not major issues, but worth mentioning while doing the comparison.
>
> The builder approach avoids the issues mentioned above, but it doesn't read
> as well.
>
> Ismael
>
> On Wed, Jun 21, 2017 at 3:37 PM, Damian Guy  wrote:
>
> > Hi,
> >
> > I'd like to get a discussion going around some of the API choices we've
> > made in the DSL. In particular those that relate to stateful operations
> > (though this could expand).
> > As it stands we lean heavily on overloaded methods in the API, i.e, there
> > are 9 overloads for KGroupedStream.count(..)! It is becoming noisy and i
> > feel it is only going to get worse as we add more optional params. In
> > particular we've had some requests to be able to turn caching off, or
> > change log configs,  on a per operator basis (note this can be done now
> if
> > you pass in a StateStoreSupplier, but this can be a bit cumbersome).
> >
> > So this is a bit of an open question. How can we change the DSL overloads
> > so that it flows, is simple to use and understand, and is easily extended
> > in the future?
> >
> > One option would be to use a fluent API approach for providing the
> optional
> > params, so something like this:
> >
> > groupedStream.count()
> >.withStoreName("name")
> >.withCachingEnabled(false)
> >.withLoggingEnabled(config)
> >.table()
> >
> >
> >
> > Another option would be to provide a Builder to the count method, so it
> > would look something like this:
> > 

[jira] [Created] (KAFKA-5496) Consistency in documentation

2017-06-22 Thread Tom Bentley (JIRA)
Tom Bentley created KAFKA-5496:
--

 Summary: Consistency in documentation
 Key: KAFKA-5496
 URL: https://issues.apache.org/jira/browse/KAFKA-5496
 Project: Kafka
  Issue Type: Improvement
  Components: documentation
Reporter: Tom Bentley
Assignee: Tom Bentley
Priority: Minor


The documentation is full of inconsistencies, including

* Some tool examples feature a {{>}} prompt, but others do not.
* Code/config in {{}} tags with different amounts of indentation (often 
there's no actual need for indentation at all)
* Missing or inconsistent typographical conventions for file and script names, 
class and method names etc, making some of the documentation harder to read.
* {{}} tags for syntax highlighting, but no syntax 
highlighting on the site.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] kafka pull request #3407: KAFKA-3881: Remove the replacing logic from "." to...

2017-06-22 Thread tombentley
GitHub user tombentley opened a pull request:

https://github.com/apache/kafka/pull/3407

KAFKA-3881: Remove the replacing logic from "." to "_" in Fetcher



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tombentley/kafka KAFKA-3881

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3407.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3407


commit be0de3a22b30648e2c811ee1442145702024af7a
Author: Tom Bentley 
Date:   2017-06-22T10:28:58Z

KAFKA-3881: Remove the replacing logic from "." to "_" in Fetcher




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] Streams DSL/StateStore Refactoring

2017-06-22 Thread Ismael Juma
Thanks Damian. I think both options have pros and cons. And both are better
than overload abuse.

The fluent API approach reads better, no mention of builder or build
anywhere. The main downside is that the method signatures are a little less
clear. By reading the method signature, one doesn't necessarily know what
it returns. Also, one needs to figure out the special method (`table()` in
this case) that gives you what you actually care about (`KTable` in this
case). Not major issues, but worth mentioning while doing the comparison.

The builder approach avoids the issues mentioned above, but it doesn't read
as well.

Ismael

On Wed, Jun 21, 2017 at 3:37 PM, Damian Guy  wrote:

> Hi,
>
> I'd like to get a discussion going around some of the API choices we've
> made in the DSL. In particular those that relate to stateful operations
> (though this could expand).
> As it stands we lean heavily on overloaded methods in the API, i.e, there
> are 9 overloads for KGroupedStream.count(..)! It is becoming noisy and i
> feel it is only going to get worse as we add more optional params. In
> particular we've had some requests to be able to turn caching off, or
> change log configs,  on a per operator basis (note this can be done now if
> you pass in a StateStoreSupplier, but this can be a bit cumbersome).
>
> So this is a bit of an open question. How can we change the DSL overloads
> so that it flows, is simple to use and understand, and is easily extended
> in the future?
>
> One option would be to use a fluent API approach for providing the optional
> params, so something like this:
>
> groupedStream.count()
>.withStoreName("name")
>.withCachingEnabled(false)
>.withLoggingEnabled(config)
>.table()
>
>
>
> Another option would be to provide a Builder to the count method, so it
> would look something like this:
> groupedStream.count(new
> CountBuilder("storeName").withCachingEnabled(false).build())
>
> Another option is to say: Hey we don't need this, what are you on about!
>
> The above has focussed on state store related overloads, but the same ideas
> could  be applied to joins etc, where we presently have many join methods
> and many overloads.
>
> Anyway, i look forward to hearing your opinions.
>
> Thanks,
> Damian
>


Re: [DISCUSS] Streams DSL/StateStore Refactoring

2017-06-22 Thread Jan Filipiak

Hi Eno,

I am less interested in the user-facing interface and more in the actual
implementation. Any hints on where I can follow the discussion on this? I
still want to discuss upstreaming of KAFKA-3705 with someone.


Best Jan


On 21.06.2017 17:24, Eno Thereska wrote:

(cc’ing user-list too)

Given that we already have StateStoreSuppliers that are configurable using the 
fluent-like API, probably it’s worth discussing the other examples with joins 
and serdes first since those have many overloads and are in need of some TLC.

So following your example, I guess you’d have something like:
.join()
.withKeySerdes(…)
.withValueSerdes(…)
.withJoinType(“outer”)

etc?

I like the approach since it still remains declarative and it’d reduce the 
number of overloads by quite a bit.

Eno


On Jun 21, 2017, at 3:37 PM, Damian Guy  wrote:

Hi,

I'd like to get a discussion going around some of the API choices we've
made in the DSL. In particular those that relate to stateful operations
(though this could expand).
As it stands we lean heavily on overloaded methods in the API, i.e, there
are 9 overloads for KGroupedStream.count(..)! It is becoming noisy and i
feel it is only going to get worse as we add more optional params. In
particular we've had some requests to be able to turn caching off, or
change log configs,  on a per operator basis (note this can be done now if
you pass in a StateStoreSupplier, but this can be a bit cumbersome).

So this is a bit of an open question. How can we change the DSL overloads
so that it flows, is simple to use and understand, and is easily extended
in the future?

One option would be to use a fluent API approach for providing the optional
params, so something like this:

groupedStream.count()
   .withStoreName("name")
   .withCachingEnabled(false)
   .withLoggingEnabled(config)
   .table()



Another option would be to provide a Builder to the count method, so it
would look something like this:
groupedStream.count(new
CountBuilder("storeName").withCachingEnabled(false).build())

Another option is to say: Hey we don't need this, what are you on about!

The above has focussed on state store related overloads, but the same ideas
could  be applied to joins etc, where we presently have many join methods
and many overloads.

Anyway, i look forward to hearing your opinions.

Thanks,
Damian




Re: [DISCUSS] Streams DSL/StateStore Refactoring

2017-06-22 Thread Eno Thereska
Note that while I agree with the initial proposal (withKeySerdes, withJoinType, 
etc), I don't agree with things like .materialize(), .enableCaching(), 
.enableLogging(). 

The former maintain the declarative DSL, while the latter break the declarative
part by mixing system decisions into the DSL. I think there is a difference
between the two proposals.

Eno

> On 22 Jun 2017, at 03:46, Guozhang Wang  wrote:
> 
> I have been thinking about reducing all these overloaded functions for
> stateful operations (there are some other places that introduce overloaded
> functions, but let's focus on these only in this discussion). What I used to
> have is to use some "materialize" function on the KTables, like:
> 
> ---
> 
> // specifying the topology
> 
> KStream stream1 = builder.stream();
> KTable table1 = stream1.groupby(...).aggregate(initializer, aggregator,
> sessionMerger, sessionWindows);  // do not allow to pass-in a state store
> supplier here any more
> 
> // additional specs along with the topology above
> 
> table1.materialize("queryableStoreName"); // or..
> table1.materialize("queryableStoreName").enableCaching().enableLogging();
> // or..
> table1.materialize(stateStoreSupplier); // add the metrics / logging /
> caching / windowing functionalities on top of the store, or..
> table1.materialize(stateStoreSupplier).enableCaching().enableLogging(); //
> etc..
> 
> ---
> 
> But thinking about it more, I feel Damian's first proposal is better since
> my proposal would likely break the concatenation (e.g. we may not be
> able to do sth. like "table1.filter().map().groupBy().aggregate()" if we
> want to use different specs for the intermediate filtered KTable).
> 
> 
> But since this is an incompatible change, and we are going to remove the
> compatibility annotations soon, it means we only have one chance and we
> really have to make it right. So I'd call out for anyone to try rewriting
> your examples / demo code with the proposed new API and see if it feels
> natural; for example, if I want to use a different storage engine than the
> default RocksDB engine, how could I easily specify that with the proposed
> APIs?
> 
> Meanwhile Damian could you provide a formal set of APIs for people to
> exercise on them? Also could you briefly describe how custom storage
> engines could be swapped in with the above APIs?
> 
> 
> 
> Guozhang
> 
> 
> On Wed, Jun 21, 2017 at 9:08 AM, Eno Thereska 
> wrote:
> 
>> To make it clear, it’s outlined by Damian, I just copy pasted what he told
>> me in person :)
>> 
>> Eno
>> 
>>> On Jun 21, 2017, at 4:40 PM, Bill Bejeck  wrote:
>>> 
>>> +1 for the approach outlined above by Eno.
>>> 
>>> On Wed, Jun 21, 2017 at 11:28 AM, Damian Guy 
>> wrote:
>>> 
 Thanks Eno.
 
 Yes i agree. We could apply this same approach to most of the operations
 where we have multiple overloads, i.e., we have a single method for each
 operation that takes the required parameters and everything else is
 specified as you have done above.
 
 On Wed, 21 Jun 2017 at 16:24 Eno Thereska 
>> wrote:
 
> (cc’ing user-list too)
> 
> Given that we already have StateStoreSuppliers that are configurable
 using
> the fluent-like API, probably it’s worth discussing the other examples
 with
> joins and serdes first since those have many overloads and are in need
>> of
> some TLC.
> 
> So following your example, I guess you’d have something like:
> .join()
>  .withKeySerdes(…)
>  .withValueSerdes(…)
>  .withJoinType(“outer”)
> 
> etc?
> 
> I like the approach since it still remains declarative and it’d reduce
 the
> number of overloads by quite a bit.
> 
> Eno
> 
>> On Jun 21, 2017, at 3:37 PM, Damian Guy  wrote:
>> 
>> Hi,
>> 
>> I'd like to get a discussion going around some of the API choices
>> we've
>> made in the DSL. In particular those that relate to stateful
>> operations
>> (though this could expand).
>> As it stands we lean heavily on overloaded methods in the API, i.e,
 there
>> are 9 overloads for KGroupedStream.count(..)! It is becoming noisy and
 i
>> feel it is only going to get worse as we add more optional params. In
>> particular we've had some requests to be able to turn caching off, or
>> change log configs,  on a per operator basis (note this can be done
>> now
> if
>> you pass in a StateStoreSupplier, but this can be a bit cumbersome).
>> 
>> So this is a bit of an open question. How can we change the DSL
 overloads
>> so that it flows, is simple to use and understand, and is easily
 extended
>> in the future?
>> 
>> One option would be to use a fluent API approach for providing 

Build failed in Jenkins: kafka-trunk-jdk7 #2444

2017-06-22 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] MINOR: Turn off caching in demos for more understandable outputs

--
[...truncated 1.16 MB...]
kafka.server.ClientQuotaManagerTest > testUserClientIdQuotaParsing PASSED

kafka.server.ClientQuotaManagerTest > 
testUserClientQuotaParsingIdWithDefaultClientIdQuota STARTED

kafka.server.ClientQuotaManagerTest > 
testUserClientQuotaParsingIdWithDefaultClientIdQuota PASSED

kafka.server.ReplicaFetcherThreadTest > shouldFetchLeaderEpochOnFirstFetchOnly 
STARTED

kafka.server.ReplicaFetcherThreadTest > shouldFetchLeaderEpochOnFirstFetchOnly 
PASSED

kafka.server.ReplicaFetcherThreadTest > 
shouldPollIndefinitelyIfLeaderReturnsAnyException STARTED

kafka.server.ReplicaFetcherThreadTest > 
shouldPollIndefinitelyIfLeaderReturnsAnyException PASSED

kafka.server.ReplicaFetcherThreadTest > 
shouldTruncateToHighWatermarkIfLeaderReturnsUndefinedOffset STARTED

kafka.server.ReplicaFetcherThreadTest > 
shouldTruncateToHighWatermarkIfLeaderReturnsUndefinedOffset PASSED

kafka.server.ReplicaFetcherThreadTest > 
shouldTruncateToOffsetSpecifiedInEpochOffsetResponse STARTED

kafka.server.ReplicaFetcherThreadTest > 
shouldTruncateToOffsetSpecifiedInEpochOffsetResponse PASSED

kafka.server.ReplicaFetcherThreadTest > shouldHandleExceptionFromBlockingSend 
STARTED

kafka.server.ReplicaFetcherThreadTest > shouldHandleExceptionFromBlockingSend 
PASSED

kafka.server.ReplicaFetcherThreadTest > 
shouldNotIssueLeaderEpochRequestIfInterbrokerVersionBelow11 STARTED

kafka.server.ReplicaFetcherThreadTest > 
shouldNotIssueLeaderEpochRequestIfInterbrokerVersionBelow11 PASSED

kafka.server.ReplicaFetcherThreadTest > 
shouldMovePartitionsOutOfTruncatingLogState STARTED

kafka.server.ReplicaFetcherThreadTest > 
shouldMovePartitionsOutOfTruncatingLogState PASSED

kafka.server.ReplicaFetcherThreadTest > 
shouldFilterPartitionsMadeLeaderDuringLeaderEpochRequest STARTED

kafka.server.ReplicaFetcherThreadTest > 
shouldFilterPartitionsMadeLeaderDuringLeaderEpochRequest PASSED

kafka.server.ReplicaManagerQuotasTest > shouldGetBothMessagesIfQuotasAllow 
STARTED

kafka.server.ReplicaManagerQuotasTest > shouldGetBothMessagesIfQuotasAllow 
PASSED

kafka.server.ReplicaManagerQuotasTest > 
shouldExcludeSubsequentThrottledPartitions STARTED

kafka.server.ReplicaManagerQuotasTest > 
shouldExcludeSubsequentThrottledPartitions PASSED

kafka.server.ReplicaManagerQuotasTest > 
shouldGetNoMessagesIfQuotasExceededOnSubsequentPartitions STARTED

kafka.server.ReplicaManagerQuotasTest > 
shouldGetNoMessagesIfQuotasExceededOnSubsequentPartitions PASSED

kafka.server.ReplicaManagerQuotasTest > shouldIncludeInSyncThrottledReplicas 
STARTED

kafka.server.ReplicaManagerQuotasTest > shouldIncludeInSyncThrottledReplicas 
PASSED

kafka.server.ServerGenerateClusterIdTest > classMethod STARTED

kafka.server.ServerGenerateClusterIdTest > classMethod FAILED
java.lang.AssertionError: Found unexpected threads, 
allThreads=Set(ThrottledRequestReaper-Produce, kafka-request-handler-7, 
Reference Handler, kafka-request-handler-1, ThrottledRequestReaper-Request, 
ExpirationReaper-0-Heartbeat, kafka-request-handler-3, 
kafka-socket-acceptor-ListenerName(SSL)-SSL-0, 
kafka-network-thread-1-ListenerName(SSL)-SSL-0, ExpirationReaper-1-Heartbeat, 
SensorExpiryThread, kafka-log-cleaner-thread-0, 
kafka-network-thread-1-ListenerName(SSL)-SSL-2, kafka-scheduler-8, 
metrics-meter-tick-thread-1, TxnMarkerSenderThread-1, 
ZkClient-EventThread-8251-127.0.0.1:36694, Signal Dispatcher, 
ExpirationReaper-0-Rebalance, kafka-network-thread-0-ListenerName(SSL)-SSL-1, 
/0:0:0:0:0:0:0:1:48904 to /0:0:0:0:0:0:0:1:44344 workers Thread 2, 
controller-event-thread, transaction-log-manager-0, 
ExpirationReaper-1-Rebalance, kafka-scheduler-4, ExpirationReaper-1-Fetch, 
kafka-scheduler-6, kafka-scheduler-0, 
ZkClient-EventThread-8263-127.0.0.1:36694, ExpirationReaper-0-Fetch, 
Controller-0-to-broker-1-send-thread, ExpirationReaper-1-Produce, Finalizer, 
ExpirationReaper-1-DeleteRecords, kafka-scheduler-2, kafka-request-handler-6, 
Test worker-SendThread(127.0.0.1:36694), ExpirationReaper-0-Produce, 
kafka-request-handler-0, kafka-request-handler-2, Test worker, 
ZkClient-EventThread-8296-127.0.0.1:36694, kafka-request-handler-4, 
kafka-network-thread-1-ListenerName(SSL)-SSL-1, 
ZkClient-EventThread-8219-127.0.0.1:36694, ForkJoinPool-1-worker-39, 
kafka-network-thread-0-ListenerName(SSL)-SSL-0, kafka-scheduler-9, 
metrics-meter-tick-thread-2, main, 
kafka-network-thread-0-ListenerName(SSL)-SSL-2, 
ExpirationReaper-0-DeleteRecords, /0:0:0:0:0:0:0:1:48904 to 
/0:0:0:0:0:0:0:1:44344 workers Thread 3, kafka-scheduler-3, 
ThrottledRequestReaper-Fetch, ExpirationReaper-1-topic, kafka-scheduler-5, 
Controller-0-to-broker-0-send-thread, kafka-scheduler-7, 
group-metadata-manager-0, Test worker-EventThread, TxnMarkerSenderThread-0, 
ExpirationReaper-0-topic, 

[GitHub] kafka pull request #3403: MINOR: Turn off caching in demos for more understa...

2017-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3403


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3291: KAFKA-4659: Improve test coverage of CachingKeyVal...

2017-06-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3291


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---