[jira] [Commented] (KAFKA-3135) Unexpected delay before fetch response transmission

2016-12-30 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15788677#comment-15788677
 ] 

Ewen Cheslack-Postava commented on KAFKA-3135:
--

[~jeffwidman] I've updated the affected versions, and I'll track this for the 
0.10.2.0 release. Do we have an idea of how critical this issue is? It seems 
like it should be hit fairly commonly (given the theory that it's due to TCP 
persist timer issues), but we've seen few comments since Mar/Apr of this 
year. Is this still a critical issue, or should we downgrade it to a 
major/minor/nice-to-have improvement?

> Unexpected delay before fetch response transmission
> ---
>
> Key: KAFKA-3135
> URL: https://issues.apache.org/jira/browse/KAFKA-3135
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.0, 0.10.1.0, 0.9.0.1, 0.10.0.0, 0.10.0.1, 0.10.1.1
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Critical
> Fix For: 0.10.2.0
>
>
> From the user list, Krzysztof Ciesielski reports the following:
> {quote}
> Scenario description:
> First, a producer writes 50 elements into a topic
> Then, a consumer starts to read, polling in a loop.
> When "max.partition.fetch.bytes" is set to a relatively small value, each
> "consumer.poll()" returns a batch of messages.
> If this value is left as default, the output tends to look like this:
> Poll returned 13793 elements
> Poll returned 13793 elements
> Poll returned 13793 elements
> Poll returned 13793 elements
> Poll returned 0 elements
> Poll returned 0 elements
> Poll returned 0 elements
> Poll returned 0 elements
> Poll returned 13793 elements
> Poll returned 13793 elements
> As we can see, there are weird "gaps" when poll returns 0 elements for some
> time. What is the reason for that? Maybe there are some good practices
> about setting "max.partition.fetch.bytes" which I don't follow?
> {quote}
> The gist to reproduce this problem is here: 
> https://gist.github.com/kciesielski/054bb4359a318aa17561.
> After some initial investigation, the delay appears to be in the server's 
> networking layer. Basically I see a delay of 5 seconds from the time that 
> Selector.send() is invoked in SocketServer.Processor with the fetch response 
> to the time that the send is completed. Using netstat in the middle of the 
> delay shows the following output:
> {code}
> tcp4   0  0  10.191.0.30.55455  10.191.0.30.9092   ESTABLISHED
> tcp4   0 102400  10.191.0.30.9092   10.191.0.30.55454  ESTABLISHED
> {code}
> From this, it looks like the data reaches the send buffer, but needs to be 
> flushed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3135) Unexpected delay before fetch response transmission

2016-12-30 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-3135:
-
Affects Version/s: 0.10.1.0
   0.9.0.1
   0.10.0.0
   0.10.0.1
   0.10.1.1



[jira] [Commented] (KAFKA-3539) KafkaProducer.send() may block even though it returns the Future

2016-12-30 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15788664#comment-15788664
 ] 

Ewen Cheslack-Postava commented on KAFKA-3539:
--

http://mail-archives.apache.org/mod_mbox/kafka-users/201604.mbox/%3C01978027-5580-4AAB-A254-3585870FBD69%40hortonworks.com%3E
 is the original message that's the source of this issue.

> KafkaProducer.send() may block even though it returns the Future
> 
>
> Key: KAFKA-3539
> URL: https://issues.apache.org/jira/browse/KAFKA-3539
> Project: Kafka
>  Issue Type: Bug
>Reporter: Oleg Zhurakousky
>Assignee: Manikumar Reddy
>Priority: Critical
>
> You can get more details from the us...@kafka.apache.org by searching on the 
> thread with the subject "KafkaProducer block on send".
> The bottom line is that method that returns Future must never block, since it 
> essentially violates the Future contract as it was specifically designed to 
> return immediately passing control back to the user to check for completion, 
> cancel etc.
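The Future contract the report refers to can be illustrated with plain JDK
classes, independent of Kafka. `NonBlockingSend` below is a hypothetical
stand-in, not KafkaProducer: submit() hands the work to another thread and
returns immediately, so the caller, not the callee, decides when to block.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Generic illustration of the Future contract discussed above (not Kafka
// code): send() returns immediately and the caller chooses when to block.
public class NonBlockingSend {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    public Future<String> send(String record) {
        // Returns right away; the "send" completes asynchronously.
        return executor.submit(() -> "acked:" + record);
    }

    public void close() {
        executor.shutdown();
    }

    public static void main(String[] args) throws Exception {
        NonBlockingSend sender = new NonBlockingSend();
        Future<String> f = sender.send("hello"); // does not block
        System.out.println(f.get());             // blocks only here, by choice
        sender.close();
    }
}
```

The complaint in this issue is that KafkaProducer.send() can block before it
ever hands back the Future (e.g. while fetching metadata), which breaks the
caller's ability to opt out of waiting.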





[jira] [Commented] (KAFKA-4550) current trunk unstable

2016-12-30 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15788627#comment-15788627
 ] 

Ewen Cheslack-Postava commented on KAFKA-4550:
--

One of the failures reported in this issue is already being addressed in 
KAFKA-4528.

> current trunk unstable
> --
>
> Key: KAFKA-4550
> URL: https://issues.apache.org/jira/browse/KAFKA-4550
> Project: Kafka
>  Issue Type: Sub-task
>Affects Versions: 0.10.2.0
>Reporter: radai rosenblatt
> Attachments: run1.log, run2.log, run3.log, run4.log, run5.log
>
>
> on latest trunk (commit hash 908b6d1148df963d21a70aaa73a7a87571b965a9)
> when running the exact same build 5 times, I get:
> 3 failures (on 3 separate runs):
>kafka.api.SslProducerSendTest > testFlush FAILED java.lang.AssertionError: 
> No request is complete
>org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
> queryOnRebalance[1] FAILED java.lang.AssertionError: Condition not met within 
> timeout 6. Did not receive 1 number of records
>kafka.producer.ProducerTest > testAsyncSendCanCorrectlyFailWithTimeout 
> FAILED java.lang.AssertionError: Message set should have 1 message
> 1 success
> 1 stall (build hung)





[jira] [Commented] (KAFKA-4550) current trunk unstable

2016-12-30 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15788626#comment-15788626
 ] 

Ewen Cheslack-Postava commented on KAFKA-4550:
--

Seems like the second issue is still unresolved and reported as KAFKA-4528, so 
we should defer to that more specific issue. First issue had a JIRA associated 
with it, but it has already been resolved, so it seems this may be a new issue.



[jira] [Commented] (KAFKA-4556) unordered messages when multiple topics are combined in single topic through stream

2016-12-30 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15788347#comment-15788347
 ] 

Ewen Cheslack-Postava commented on KAFKA-4556:
--

Can you be more specific about what your streams topology looks like and is 
doing? First, timestamps are not strictly guaranteed to be monotonically 
increasing (especially when specified by the producer). But more importantly, 
ordering in Kafka is only guaranteed within a partition. In the output topic, 
are you writing the messages using keys such that the messages would end up on 
the same partition and preserve ordering? If you aren't, then they will be 
produced to different partitions and it is expected that downstream consumers 
might see them in a different order. (Also note that to preserve ordering in 
the face of errors, you'd also need to make sure your producer uses settings to 
avoid reordering due to errors, i.e. max.in.flight.requests.per.connection=1, 
acks=all, retries=infinite.)
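As a rough illustration of those ordering-related settings (not an official
recommendation; the broker address is a placeholder and Integer.MAX_VALUE
stands in for "infinite" retries), the producer configuration would look
something like:

```java
import java.util.Properties;

// Sketch of the producer settings mentioned above that avoid reordering on
// retries; "localhost:9092" is a placeholder address.
public class OrderingProducerConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // One in-flight request per connection: a retried batch cannot
        // leapfrog a batch sent after it.
        props.put("max.in.flight.requests.per.connection", "1");
        // Wait for all in-sync replicas to acknowledge each send.
        props.put("acks", "all");
        // Keep retrying rather than dropping (and thus reordering) on error.
        props.put("retries", Integer.toString(Integer.MAX_VALUE));
        return props;
    }
}
```

Note these settings trade throughput for ordering: with only one in-flight
request per connection, the producer cannot pipeline batches to a broker.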

> unordered messages when multiple topics are combined in single topic through 
> stream
> ---
>
> Key: KAFKA-4556
> URL: https://issues.apache.org/jira/browse/KAFKA-4556
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, producer, streams
>Affects Versions: 0.10.0.1
>Reporter: Savdeep Singh
>
> When binding the builder with multiple topics, the single resultant topic has 
> an unordered set of messages.
> The issue is at the millisecond level, when messages with the same millisecond 
> timestamp are added to the topics.
> Scenario: (1 producer p1; 2 topics t1 and t2; streams pick from these 2 
> topics and save into the resulting topic t3; 4 partitions of t3 and 4 
> consumers of the 4 partitions)
> Case: When p1 adds messages with the same millisecond timestamp into t1 and 
> t2, the stream combines them to form t3. When t3 is consumed, the messages 
> from the same millisecond arrive in a different order.





[jira] [Commented] (KAFKA-4572) Kafka connect for windows is missing

2016-12-30 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15788234#comment-15788234
 ] 

Ewen Cheslack-Postava commented on KAFKA-4572:
--

[~vahid] The version he included looks basically the same as the one checked 
in, so perhaps there's still some issue with log4j paths?

> Kafka connect for windows is missing
> 
>
> Key: KAFKA-4572
> URL: https://issues.apache.org/jira/browse/KAFKA-4572
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.1
> Environment: Windows.
>Reporter: Aravind Krishnan
>Assignee: Ewen Cheslack-Postava
>
> Unable to find the connect-standalone for windows. When created as below
> IF [%1] EQU [] (
>   echo USAGE: %0 connect-standalone.properties
>   EXIT /B 1
> )
> SetLocal
> rem Using pushd popd to set BASE_DIR to the absolute path
> pushd %~dp0..\..
> set BASE_DIR=%CD%
> popd
> rem Log4j settings
> IF ["%KAFKA_LOG4J_OPTS%"] EQU [""] (
>   set 
> KAFKA_LOG4J_OPTS=-Dlog4j.configuration=file:%BASE_DIR%/config/tools-log4j.properties
> )
> %~dp0kafka-run-class.bat org.apache.kafka.connect.cli.ConnectStandalone %*
> EndLocal
> Not able to see any of the below logs in FileStreamSourceTask
>   log.trace("Read {} bytes from {}", nread, logFilename());
> (OR)
> log.trace("Read a line from {}", logFilename());





[jira] [Commented] (KAFKA-3297) More optimally balanced partition assignment strategy (new consumer)

2016-12-30 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15788229#comment-15788229
 ] 

Ewen Cheslack-Postava commented on KAFKA-3297:
--

Yeah, part of the improvement of KIP-49 seems to be included in step 2 of the 
algorithm (start from the least loaded consumers rather than just using a 
CircularIterator). But there are a few more improvements (specifically in 
optimizing the order of processing the topics) that would still probably need 
to be incorporated.

> More optimally balanced partition assignment strategy (new consumer)
> 
>
> Key: KAFKA-3297
> URL: https://issues.apache.org/jira/browse/KAFKA-3297
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Reporter: Andrew Olson
>Assignee: Andrew Olson
> Fix For: 0.10.2.0
>
>
> While the roundrobin partition assignment strategy is an improvement over the 
> range strategy, when the consumer topic subscriptions are not identical 
> (previously disallowed but will be possible as of KAFKA-2172) it can produce 
> heavily skewed assignments. As suggested 
> [here|https://issues.apache.org/jira/browse/KAFKA-2172?focusedCommentId=14530767&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14530767]
>  it would be nice to have a strategy that attempts to assign an equal number 
> of partitions to each consumer in a group, regardless of how similar their 
> individual topic subscriptions are. We can accomplish this by tracking the 
> number of partitions assigned to each consumer, and having the partition 
> assignment loop assign each partition to a consumer interested in that topic 
> with the least number of partitions assigned. 
> Additionally, we can optimize the distribution fairness by adjusting the 
> partition assignment order:
> * Topics with fewer consumers are assigned first.
> * In the event of a tie for least consumers, the topic with more partitions 
> is assigned first.
> The general idea behind these two rules is to keep the most flexible 
> assignment choices available as long as possible by starting with the most 
> constrained partitions/consumers.
> This JIRA addresses the new consumer. For the original high-level consumer, 
> see KAFKA-2435.
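The least-loaded rule and the two topic-ordering rules described above can be
sketched in plain Java. This is a simplified model of the proposal, not the
actual patch; class and variable names are arbitrary, and partitions are
represented as "topic-N" strings:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified model of the balanced assignment described above: topics with
// fewer subscribed consumers go first (ties broken by more partitions), and
// each partition goes to the interested consumer with the fewest partitions.
public class FairAssignor {
    public static Map<String, List<String>> assign(
            Map<String, Integer> partitionsPerTopic,
            Map<String, Set<String>> subscriptions) {
        Map<String, List<String>> assignment = new HashMap<>();
        for (String consumer : subscriptions.keySet())
            assignment.put(consumer, new ArrayList<>());

        // Most constrained topics first: fewest subscribers, then most partitions.
        List<String> topics = new ArrayList<>(partitionsPerTopic.keySet());
        topics.sort(Comparator
                .comparingInt((String t) -> subscriberCount(subscriptions, t))
                .thenComparingInt(t -> -partitionsPerTopic.get(t)));

        for (String topic : topics) {
            for (int p = 0; p < partitionsPerTopic.get(topic); p++) {
                String leastLoaded = null;
                for (String c : subscriptions.keySet()) {
                    if (!subscriptions.get(c).contains(topic))
                        continue; // not interested in this topic
                    if (leastLoaded == null
                            || assignment.get(c).size() < assignment.get(leastLoaded).size())
                        leastLoaded = c;
                }
                if (leastLoaded != null)
                    assignment.get(leastLoaded).add(topic + "-" + p);
            }
        }
        return assignment;
    }

    private static int subscriberCount(Map<String, Set<String>> subs, String topic) {
        int n = 0;
        for (Set<String> s : subs.values())
            if (s.contains(topic)) n++;
        return n;
    }
}
```

For example, with c1 subscribed to {t1, t2}, c2 subscribed only to {t1}, and
two partitions per topic, this yields a 2/2 split, where iterating
partitions round-robin without tracking load can skew toward c1.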





[jira] [Updated] (KAFKA-2196) remove roundrobin identical topic constraint in consumer coordinator

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-2196:
-
Fix Version/s: 0.9.0.0

> remove roundrobin identical topic constraint in consumer coordinator
> 
>
> Key: KAFKA-2196
> URL: https://issues.apache.org/jira/browse/KAFKA-2196
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Onur Karaman
>Assignee: Onur Karaman
> Fix For: 0.9.0.0
>
> Attachments: KAFKA-2196.patch
>
>
> roundrobin doesn't need to make all consumers have identical topic 
> subscriptions.





[jira] [Commented] (KAFKA-2196) remove roundrobin identical topic constraint in consumer coordinator

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786960#comment-15786960
 ] 

Ewen Cheslack-Postava commented on KAFKA-2196:
--

Based on git history, it looks like 0.9.0.0; I'll update the fix version 
accordingly.



[jira] [Commented] (KAFKA-3297) More optimally balanced partition assignment strategy (new consumer)

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786948#comment-15786948
 ] 

Ewen Cheslack-Postava commented on KAFKA-3297:
--

[~jeffwidman] I don't think KIP-54 supersedes this. KIP-54 is about maintaining 
stability across multiple reassignments, while KIP-49 would apply to the *first* 
assignment. The two may interact, but they have different goals.

If the vote thread for this KIP went through, it seems like it might be easy to 
get it included -- since new consumer assignments are performed by a single 
group member, there aren't any real compatibility concerns.



[jira] [Updated] (KAFKA-3297) More optimally balanced partition assignment strategy (new consumer)

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-3297:
-
Component/s: consumer



[jira] [Commented] (KAFKA-3502) Build is killed during kafka streams tests due to `pure virtual method called` error

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786941#comment-15786941
 ] 

Ewen Cheslack-Postava commented on KAFKA-3502:
--

I've seen this in a number of Jenkins builds recently too, e.g.

{quote}
org.apache.kafka.streams.KeyValueTest > shouldHaveSaneEqualsAndHashCode STARTED

org.apache.kafka.streams.KeyValueTest > shouldHaveSaneEqualsAndHashCode PASSED
pure virtual method called
{quote}
from https://builds.apache.org/job/kafka-pr-jdk7-scala2.10/399/console

Transient integration test failures still seem to dominate, but this seems to 
be increasing the frequency of test failures for PRs recently.

> Build is killed during kafka streams tests due to `pure virtual method 
> called` error
> 
>
> Key: KAFKA-3502
> URL: https://issues.apache.org/jira/browse/KAFKA-3502
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ashish K Singh
>  Labels: transient-unit-test-failure
>
> Build failed due to failure in streams' test. Not clear which test led to 
> this.
> Jenkins console: 
> https://builds.apache.org/job/kafka-trunk-git-pr-jdk7/3210/console
> {code}
> org.apache.kafka.streams.kstream.internals.KTableFilterTest > testValueGetter 
> PASSED
> org.apache.kafka.streams.kstream.internals.KStreamFlatMapTest > testFlatMap 
> PASSED
> org.apache.kafka.streams.kstream.internals.KTableAggregateTest > testAggBasic 
> PASSED
> org.apache.kafka.streams.kstream.internals.KStreamFlatMapValuesTest > 
> testFlatMapValues PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testMerge PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testFrom PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testNewName PASSED
> pure virtual method called
> terminate called without an active exception
> :streams:test FAILED
> FAILURE: Build failed with an exception.
> * What went wrong:
> Execution failed for task ':streams:test'.
> > Process 'Gradle Test Executor 4' finished with non-zero exit value 134
> {code}
> Tried reproducing the issue locally, but could not.





[jira] [Commented] (KAFKA-2019) Old consumer RoundRobinAssignor clusters by consumer

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786931#comment-15786931
 ] 

Ewen Cheslack-Postava commented on KAFKA-2019:
--

[~jeffwidman] I updated the title to clarify. It may make sense to start 
tracking new consumer stuff via a different component so they can be easily 
differentiated (although I don't like the idea of permanently labelling it "new 
consumer").

re: merging, agreed, it seems unlikely at this point since nobody has pushed 
for it to get merged for the past 18 months, although it would be completely 
reasonable as a new class to avoid compatibility concerns.

> Old consumer RoundRobinAssignor clusters by consumer
> 
>
> Key: KAFKA-2019
> URL: https://issues.apache.org/jira/browse/KAFKA-2019
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Reporter: Joseph Holsten
>Assignee: Neha Narkhede
>Priority: Minor
> Attachments: 0001-sort-consumer-thread-ids-by-hashcode.patch, 
> KAFKA-2019.patch
>
>
> When rolling out a change today, I noticed that some of my consumers are 
> "greedy", taking far more partitions than others.
> The cause is that the RoundRobinAssignor is using a list of ConsumerThreadIds 
> sorted by toString, which is {{ "%s-%d".format(consumer, threadId)}}. This 
> causes each consumer's threads to be adjacent to each other.
> One possible fix would be to define ConsumerThreadId.hashCode, and sort by 
> that.
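The clustering effect is easy to reproduce with plain string sorting
(illustrative names below, not the actual ConsumerThreadId class):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Demonstrates the adjacency problem described above: ids formatted as
// "%s-%d" sort lexicographically by consumer name first, so all of one
// consumer's threads cluster together in the round-robin order.
public class ThreadIdOrdering {
    public static List<String> sortedByToString(List<String> ids) {
        List<String> copy = new ArrayList<>(ids);
        Collections.sort(copy); // the order a toString-based sort produces
        return copy;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList(
                "consumerB-0", "consumerA-0", "consumerB-1", "consumerA-1");
        // All of consumerA's threads come before any of consumerB's:
        System.out.println(sortedByToString(ids));
        // [consumerA-0, consumerA-1, consumerB-0, consumerB-1]
    }
}
```

Sorting by hashCode instead, as the reporter suggests, would effectively
shuffle the consumers' threads together, so consecutive partitions in the
round-robin would tend to land on different consumers.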





[jira] [Updated] (KAFKA-2019) Old consumer RoundRobinAssignor clusters by consumer

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-2019:
-
Summary: Old consumer RoundRobinAssignor clusters by consumer  (was: 
RoundRobinAssignor clusters by consumer)



[jira] [Commented] (KAFKA-2331) Kafka does not spread partitions in a topic among all consumers evenly

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786920#comment-15786920
 ] 

Ewen Cheslack-Postava commented on KAFKA-2331:
--

This seems like a separate issue -- it sounds like the consumers are joining 
the group but not getting assigned any partitions when there is only 1 topic 
and 10 partitions. Both range and round-robin assignment should have the same 
behavior in that case. But the split that is shown (3,2,1,1,1,1,1) doesn't seem 
likely either. It seems more likely that it is an artifact of aggregating all 
partitions each thread saw messages for, without taking into account that when 
the first couple of instances join there will be a period when they are trying 
to fetch data for *all* partitions and can validly see data for more than 1 
partition.

This is quite old and is for the old consumer, but if someone wanted to tackle 
it, one strategy might be to use a ConsumerRebalanceListener to get more info 
about how partitions are being assigned. That might also reveal other issues 
such as some members not successfully joining the group.

> Kafka does not spread partitions in a topic among all consumers evenly
> --
>
> Key: KAFKA-2331
> URL: https://issues.apache.org/jira/browse/KAFKA-2331
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Affects Versions: 0.8.1.1
>Reporter: Stefan Miklosovic
>
> I want to have 1 topic with 10 partitions. I am using default configuration 
> of Kafka. I create 1 topic with 10 partitions by that helper script and now I 
> am about to produce messages to it.
> The thing is that even though all partitions are indeed consumed, some 
> consumers have more than 1 partition assigned even though I have a number of 
> consumer threads equal to the partitions in the topic, hence some threads are 
> idle.
> Let's describe it in more detail.
> I know the common guidance that you need one consumer thread per partition. I 
> want to be able to commit offsets per partition, and this is possible only 
> when I have 1 thread per consumer connector per partition (I am using the 
> high-level consumer).
> So I create 10 threads, in each thread I am calling 
> Consumer.createJavaConsumerConnector() where I am doing this
> topicCountMap.put("mytopic", 1);
> and in the end I have 1 iterator which consumes messages from 1 partition.
> When I do this 10 times, I have 10 consumers, one consumer per thread per 
> partition, where I can commit offsets independently per partition. If I put a 
> number other than 1 in the topic map, I would end up with more than 1 
> consumer thread for that topic for the given consumer instance, so if I were 
> to commit offsets with the created consumer instance, it would commit them for 
> all threads, which is not desired.
> But the thing is that when I use consumers, only 7 consumers are involved and 
> it seems that other consumer threads are idle but I do not know why.
> The thing is that I am creating these consumer threads in a loop. So I start 
> first thread (submit to executor service), then another, then another and so 
> on.
> So the scenario is that the first consumer gets all 10 partitions, then the 
> 2nd connects so it is split between these two, 5 and 5 (or something similar), 
> then the other threads connect.
> I understand this as a partition rebalancing among all consumers so it 
> behaves well in such sense that if more consumers are being created, 
> partition rebalancing occurs between these consumers so every consumer should 
> have some partitions to operate upon.
> But from the results I see that there are only 7 consumers, and according to 
> consumed messages it seems they are split like 3,2,1,1,1,1,1 partition-wise. 
> Yes, these 7 consumers covered all 10 partitions, but why do consumers with 
> more than 1 partition not split and give partitions to the remaining 3 
> consumers?
> I am pretty much wondering what is happening with the remaining 3 threads and 
> why they do not "grab" partitions from consumers which have more than 1 
> partition assigned.





[jira] [Updated] (KAFKA-2331) Kafka does not spread partitions in a topic among all consumers evenly

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-2331:
-
Component/s: (was: core)
 consumer
 clients

> Kafka does not spread partitions in a topic among all consumers evenly
> --
>
> Key: KAFKA-2331
> URL: https://issues.apache.org/jira/browse/KAFKA-2331
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Affects Versions: 0.8.1.1
>Reporter: Stefan Miklosovic
>
> I want to have 1 topic with 10 partitions. I am using the default 
> configuration of Kafka. I create 1 topic with 10 partitions with the helper 
> script and now I am about to produce messages to it.
> The thing is that even though all partitions are indeed consumed, some 
> consumers have more than 1 partition assigned even though I have a number of 
> consumer threads equal to the partitions in the topic, hence some threads are 
> idle.
> Let's describe it in more detail.
> I know the common advice that you need one consumer thread per partition. I 
> want to be able to commit offsets per partition, and this is possible only 
> when I have 1 thread per consumer connector per partition (I am using the 
> high-level consumer).
> So I create 10 threads, and in each thread I call 
> Consumer.createJavaConsumerConnector(), where I do this:
> topicCountMap.put("mytopic", 1);
> and in the end I have 1 iterator which consumes messages from 1 partition.
> When I do this 10 times, I have 10 consumers, one consumer per thread per 
> partition, where I can commit offsets independently per partition. If I put a 
> number other than 1 in the topic map, I would end up with more than 1 
> consumer thread for that topic for the given consumer instance, so if I were 
> to commit offsets with the created consumer instance, it would commit them 
> for all threads, which is not desired.
> But the thing is that when I use the consumers, only 7 consumers are 
> involved, and it seems that the other consumer threads are idle, but I do not 
> know why.
> I am creating these consumer threads in a loop: I start the first thread 
> (submit it to an executor service), then another, then another and so on.
> So the scenario is that the first consumer gets all 10 partitions, then the 
> 2nd connects so the partitions are split between these two, 5 and 5 (or 
> something similar), then the other threads connect.
> I understand this as partition rebalancing among all consumers, and it 
> behaves well in the sense that as more consumers are created, partition 
> rebalancing occurs between them, so every consumer should have some 
> partitions to operate upon.
> But from the results I see that there are only 7 consumers, and according to 
> the consumed messages the partitions seem to be split 3,2,1,1,1,1,1 among 
> them. Yes, these 7 consumers covered all 10 partitions, but why do the 
> consumers with more than 1 partition not split off and give partitions to the 
> remaining 3 consumers?
> I am wondering what is happening with the remaining 3 threads and why they do 
> not "grab" partitions from consumers which have more than 1 partition 
> assigned.
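The even split the reporter expected can be sketched with a toy range-style assignor. This is illustrative only, not Kafka's actual assignment code; `range_assign` is a hypothetical helper. With 10 partitions and 10 consumers, every consumer should end up with exactly one partition.

```python
# Toy range-style assignor (illustrative, not Kafka source): sorted
# consumers each take a contiguous slice of the sorted partition list.
def range_assign(partitions, consumers):
    consumers = sorted(consumers)
    per, extra = divmod(len(partitions), len(consumers))
    assignment, start = {}, 0
    for i, consumer in enumerate(consumers):
        count = per + (1 if i < extra else 0)  # first `extra` consumers get one more
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment

# With 10 partitions and 10 consumers, each consumer gets exactly one.
even = range_assign(list(range(10)), [f"c{i}" for i in range(10)])
```

With fewer consumers than partitions the same sketch shows the expected mild skew (4,3,3 for 3 consumers), but never idle consumers while others hold several partitions, which is what makes the reported 3,2,1,1,1,1,1 split look like a bug.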





[jira] [Comment Edited] (KAFKA-4567) Connect Producer and Consumer ignore ssl parameters configured for worker

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786745#comment-15786745
 ] 

Ewen Cheslack-Postava edited comment on KAFKA-4567 at 12/30/16 3:13 AM:


Given some future security features we may want to support (e.g. supporting 
different identities for different connectors), we probably don't want to just 
copy the worker-level security configs into the producer & consumer. It's 
annoying to have to duplicate them now, but we probably want to support more 
flexible combinations in the future, such as giving the workers unique 
credentials (limiting the ability to, e.g., write to the config/offsets/status 
topics) different from those used by producers and consumers (where we may 
want unique credentials both to apply ACLs and maybe to support things like 
delegation tokens in the future).

So I think the short term solution is probably to just update the docs to 
clarify that you'll currently need the settings both at the worker level and 
prefixed by {{producer.}} and {{consumer.}} if you're trying to use the same 
credentials for worker, producer, and consumer.
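The duplication the short-term fix requires can be sketched as follows. The keys are real Kafka client SSL settings; the paths are placeholders, and the dict stands in for a worker properties file.

```python
# Sketch (not Connect source): the short-term fix means repeating the
# worker-level SSL settings under the "producer." and "consumer." prefixes.
ssl_settings = {
    "security.protocol": "SSL",
    "ssl.keystore.location": "/path/to/keystore.jks",      # placeholder path
    "ssl.truststore.location": "/path/to/truststore.jks",  # placeholder path
}

worker_config = dict(ssl_settings)  # used by the worker's own internal clients
for prefix in ("producer.", "consumer."):
    for key, value in ssl_settings.items():
        worker_config[prefix + key] = value  # duplicated for data-plane clients
```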


was (Author: ewencp):
Given some future security features we may want to support (e.g. supporting 
different identities for different connectors), we probably don't want to just 
include the worker-level security configs into the producer & consumer. It's 
annoying to have to duplicate them now, but we probably want to support more 
flexible combinations in the future, such as having unique credentials for the 
workers (limiting the ability to, e.g., write to the config/offsets/status 
topics) than those used by producers and consumers (where we may want both 
unique credentials to apply ACLs and maybe support things like delegation 
tokens in the future).

So I think the short term solution is probably to just update the docs to 
clarify that you'll currently need the settings both at the worker level and 
prefixed by {{producer.} and {{consumer.}} if you're trying to use the same 
credentials for worker, producer, and consumer.

> Connect Producer and Consumer ignore ssl parameters configured for worker
> -
>
> Key: KAFKA-4567
> URL: https://issues.apache.org/jira/browse/KAFKA-4567
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.1
>Reporter: Sönke Liebau
>Assignee: Ewen Cheslack-Postava
>Priority: Minor
>
> When using Connect with an SSL-enabled Kafka cluster, the configuration 
> options are either documented in a somewhat misleading way, or handled 
> incorrectly.
> The documentation states the usual available SSL options 
> (ssl.keystore.location, ssl.truststore.location, ...) and these are picked up 
> and used for the producers and consumers that are used to communicate with 
> the status, offset and configs topics.
> For the producers and consumers that are used for the actual data, these 
> parameters are ignored as can be seen 
> [here|https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L98],
>  which results in plaintext communication on an SSL port, leading to an OOM 
> exception ([KAFKA-4493|https://issues.apache.org/jira/browse/KAFKA-4493]).
> So in order to get Connect to communicate with a secured cluster you need to 
> override all SSL configs with the prefixes _consumer._ and _producer._ and 
> duplicate the values already set at a global level.
> The documentation states: 
> bq. The most critical site-specific options, such as the Kafka bootstrap 
> servers, are already exposed via the standard worker configuration.
> Since the address for the cluster is exposed here, I would propose that there 
> is no reason not to also pass the SSL parameters through to the consumers and 
> producers, as it is clearly intended that communication happens with the same 
> cluster. 
> In fringe cases, these can still be overridden manually to achieve different 
> behavior.
> I am happy to create a pull request to address this or clarify the docs, 
> after we decide which one is the appropriate course of action.





[jira] [Commented] (KAFKA-4567) Connect Producer and Consumer ignore ssl parameters configured for worker

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786745#comment-15786745
 ] 

Ewen Cheslack-Postava commented on KAFKA-4567:
--

Given some future security features we may want to support (e.g. supporting 
different identities for different connectors), we probably don't want to just 
copy the worker-level security configs into the producer & consumer. It's 
annoying to have to duplicate them now, but we probably want to support more 
flexible combinations in the future, such as giving the workers unique 
credentials (limiting the ability to, e.g., write to the config/offsets/status 
topics) different from those used by producers and consumers (where we may 
want unique credentials both to apply ACLs and maybe to support things like 
delegation tokens in the future).

So I think the short term solution is probably to just update the docs to 
clarify that you'll currently need the settings both at the worker level and 
prefixed by {{producer.}} and {{consumer.}} if you're trying to use the same 
credentials for worker, producer, and consumer.

> Connect Producer and Consumer ignore ssl parameters configured for worker
> -
>
> Key: KAFKA-4567
> URL: https://issues.apache.org/jira/browse/KAFKA-4567
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.1
>Reporter: Sönke Liebau
>Assignee: Ewen Cheslack-Postava
>Priority: Minor
>
> When using Connect with an SSL-enabled Kafka cluster, the configuration 
> options are either documented in a somewhat misleading way, or handled 
> incorrectly.
> The documentation states the usual available SSL options 
> (ssl.keystore.location, ssl.truststore.location, ...) and these are picked up 
> and used for the producers and consumers that are used to communicate with 
> the status, offset and configs topics.
> For the producers and consumers that are used for the actual data, these 
> parameters are ignored as can be seen 
> [here|https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L98],
>  which results in plaintext communication on an SSL port, leading to an OOM 
> exception ([KAFKA-4493|https://issues.apache.org/jira/browse/KAFKA-4493]).
> So in order to get Connect to communicate with a secured cluster you need to 
> override all SSL configs with the prefixes _consumer._ and _producer._ and 
> duplicate the values already set at a global level.
> The documentation states: 
> bq. The most critical site-specific options, such as the Kafka bootstrap 
> servers, are already exposed via the standard worker configuration.
> Since the address for the cluster is exposed here, I would propose that there 
> is no reason not to also pass the SSL parameters through to the consumers and 
> producers, as it is clearly intended that communication happens with the same 
> cluster. 
> In fringe cases, these can still be overridden manually to achieve different 
> behavior.
> I am happy to create a pull request to address this or clarify the docs, 
> after we decide which one is the appropriate course of action.





[jira] [Commented] (KAFKA-2260) Allow specifying expected offset on produce

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786719#comment-15786719
 ] 

Ewen Cheslack-Postava commented on KAFKA-2260:
--

[~wwarshaw] That sounds right -- the epoch for the PID would ensure a single 
writer and then the actual offset wouldn't matter.

KIP-98 hasn't been voted on yet, so it'd be difficult to give a timeline now, 
but it seems unlikely to happen before the June release timeframe.

> Allow specifying expected offset on produce
> ---
>
> Key: KAFKA-2260
> URL: https://issues.apache.org/jira/browse/KAFKA-2260
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ben Kirwin
>Assignee: Ewen Cheslack-Postava
>Priority: Minor
> Attachments: KAFKA-2260.patch, expected-offsets.patch
>
>
> I'd like to propose a change that adds a simple CAS-like mechanism to the 
> Kafka producer. This update has a small footprint, but enables a bunch of 
> interesting uses in stream processing or as a commit log for process state.
> h4. Proposed Change
> In short:
> - Allow the user to attach a specific offset to each message produced.
> - The server assigns offsets to messages in the usual way. However, if the 
> expected offset doesn't match the actual offset, the server should fail the 
> produce request instead of completing the write.
> This is a form of optimistic concurrency control, like the ubiquitous 
> check-and-set -- but instead of checking the current value of some state, it 
> checks the current offset of the log.
> h4. Motivation
> Much like check-and-set, this feature is only useful when there's very low 
> contention. Happily, when Kafka is used as a commit log or as a 
> stream-processing transport, it's common to have just one producer (or a 
> small number) for a given partition -- and in many of these cases, predicting 
> offsets turns out to be quite useful.
> - We get the same benefits as the 'idempotent producer' proposal: a producer 
> can retry a write indefinitely and be sure that at most one of those attempts 
> will succeed; and if two producers accidentally write to the end of the 
> partition at once, we can be certain that at least one of them will fail.
> - It's possible to 'bulk load' Kafka this way -- you can write a list of n 
> messages consecutively to a partition, even if the list is much larger than 
> the buffer size or the producer has to be restarted.
> - If a process is using Kafka as a commit log -- reading from a partition to 
> bootstrap, then writing any updates to that same partition -- it can be sure 
> that it's seen all of the messages in that partition at the moment it does 
> its first (successful) write.
> There's a bunch of other similar use-cases here, but they all have roughly 
> the same flavour.
> h4. Implementation
> The major advantage of this proposal over other suggested transaction / 
> idempotency mechanisms is its minimality: it gives the 'obvious' meaning to a 
> currently-unused field, adds no new APIs, and requires very little new code 
> or additional work from the server.
> - Produced messages already carry an offset field, which is currently ignored 
> by the server. This field could be used for the 'expected offset', with a 
> sigil value for the current behaviour. (-1 is a natural choice, since it's 
> already used to mean 'next available offset'.)
> - We'd need a new error and error code for a 'CAS failure'.
> - The server assigns offsets to produced messages in 
> {{ByteBufferMessageSet.validateMessagesAndAssignOffsets}}. After this 
> change, the method would assign offsets in the same way -- but if they 
> don't match the offset in the message, we'd return an error instead of 
> completing the write.
> - To avoid breaking existing clients, this behaviour would need to live 
> behind some config flag. (Possibly global, but probably more useful 
> per-topic?)
> I understand all this is unsolicited and possibly strange: happy to answer 
> questions, and if this seems interesting, I'd be glad to flesh this out into 
> a full KIP or patch. (And apologies if this is the wrong venue for this sort 
> of thing!)
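The proposed semantics can be sketched as a small in-memory log. This is illustrative only, not Kafka code: an append carrying an expected offset succeeds only if it matches the offset the log would assign next, with -1 preserving today's behaviour, exactly as the sigil-value idea above describes.

```python
# Toy model of the proposed check-and-set produce (illustrative only).
class CasFailure(Exception):
    """Stand-in for the proposed 'CAS failure' error code."""

class Log:
    def __init__(self):
        self.messages = []

    def append(self, message, expected_offset=-1):
        next_offset = len(self.messages)
        # -1 keeps today's behaviour: accept whatever offset is next.
        if expected_offset != -1 and expected_offset != next_offset:
            raise CasFailure(f"expected {expected_offset}, next is {next_offset}")
        self.messages.append(message)
        return next_offset

log = Log()
log.append("a")                     # assigned offset 0, no expectation
log.append("b", expected_offset=1)  # succeeds: matches the next offset
```

A retried write with the same expected offset can then succeed at most once, which is the idempotence property the motivation section claims.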





[jira] [Commented] (KAFKA-4570) How to transfer extended fields in producing or consuming requests.

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786713#comment-15786713
 ] 

Ewen Cheslack-Postava commented on KAFKA-4570:
--

[~zander] You're correct that Kafka does not handle extra user metadata today. 
However, there is an ongoing discussion to [add headers to Kafka 
messages|https://cwiki.apache.org/confluence/display/KAFKA/KIP-82+-+Add+Record+Headers].

> How to transfer extended fields in producing or consuming requests.
> ---
>
> Key: KAFKA-4570
> URL: https://issues.apache.org/jira/browse/KAFKA-4570
> Project: Kafka
>  Issue Type: Wish
>  Components: clients
>Affects Versions: 0.10.1.1
>Reporter: zander
>Priority: Critical
>  Labels: features
>
> We have encountered a problem: we cannot transfer extended fields in produce 
> or consume requests to the broker.
> We want to validate the producers or consumers in a custom way other than 
> using SSL.
> In general, as in JMS, it is possible to transfer user-related fields to the 
> server.
> But it seems that Kafka does not support this; its protocol is very tight 
> and unable to carry user-defined fields.
> So is there any way to achieve this goal?
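Until record headers land, a common workaround is an application-level envelope: wrap the payload together with user metadata and serialize both into the message value. A hedged sketch follows; the field names are illustrative, not a Kafka convention.

```python
import json

# Envelope workaround (illustrative): carry user metadata inside the
# message value since the wire protocol has no per-record header fields.
def wrap(payload, metadata):
    return json.dumps({"metadata": metadata, "payload": payload}).encode("utf-8")

def unwrap(raw):
    envelope = json.loads(raw.decode("utf-8"))
    return envelope["payload"], envelope["metadata"]

raw = wrap("order-123", {"client-id": "billing", "auth-token": "abc"})
payload, metadata = unwrap(raw)
```

The broker cannot act on such metadata (so it does not replace SSL/SASL for broker-side validation), but consumers and intermediaries can.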





[jira] [Commented] (KAFKA-4571) Consumer fails to retrieve messages if started before producer

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786707#comment-15786707
 ] 

Ewen Cheslack-Postava commented on KAFKA-4571:
--

Since another consumer group ID consumes the message, I'm assuming you're using 
auto.offset.reset = earliest?

Does the consumer *never* consume messages, or do you just need to wait long 
enough? If you start the consumer before the topic exists, it will get 
metadata indicating there are no partitions. I think you'd need to wait for 
the metadata timeout before it would try to refresh the metadata and finally 
see the created topic partitions (the default metadata.max.age.ms is 5 
minutes).
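As a hypothetical illustration of the settings involved (the keys are real consumer configs; the values are examples, not recommendations), a consumer meant to pick up a late-created topic sooner would combine an earliest reset with a lower metadata age:

```python
# Example consumer settings for the scenario above (values illustrative).
consumer_config = {
    "bootstrap.servers": "localhost:9092",
    "group.id": "my-group",
    "auto.offset.reset": "earliest",   # read messages produced before joining
    "metadata.max.age.ms": 30_000,     # default is 300000 (5 minutes)
}
```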

> Consumer fails to retrieve messages if started before producer
> --
>
> Key: KAFKA-4571
> URL: https://issues.apache.org/jira/browse/KAFKA-4571
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.1.1
> Environment: Ubuntu Desktop 16.04 LTS, Oracle Java 8 1.8.0_101, Core 
> i7 4770K
>Reporter: Sergiu Hlihor
>
> In a configuration where topic was never created before, starting the 
> consumer before the producer leads to no message being consumed 
> (KafkaConsumer.pool() returns always an instance of ConsumerRecords with 0 
> count ). 
> Starting another consumer on the same group, same topic after messages were 
> produced is still not consuming them. Starting another consumer with another 
> groupId appears to be working.
> In the consumer logs I see: WARN  NetworkClient - Error while fetching 
> metadata with correlation id 1 : {measurements021=LEADER_NOT_AVAILABLE} 
> Both producer and consumer were launched from inside same JVM. 
> The configuration used is the standard one found in Kafka distribution. If 
> this is a configuration issue, please suggest any change that I should do.
> Thank you





[jira] [Updated] (KAFKA-4115) Grow default heap settings for distributed Connect from 256M to 1G

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4115:
-
Fix Version/s: 0.10.2.0

> Grow default heap settings for distributed Connect from 256M to 1G
> --
>
> Key: KAFKA-4115
> URL: https://issues.apache.org/jira/browse/KAFKA-4115
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Shikhar Bhushan
>Assignee: Shikhar Bhushan
> Fix For: 0.10.2.0
>
>
> Currently, both {{connect-standalone.sh}} and {{connect-distributed.sh}} 
> start the Connect JVM with the default heap settings from 
> {{kafka-run-class.sh}} of {{-Xmx256M}}.
> At least for distributed Connect, we should default to a much higher limit 
> like 1G. While the 'correct' sizing is workload-dependent, with a system 
> where you can run arbitrary connector plugins which may buffer data, we 
> should provide more headroom.





[jira] [Commented] (KAFKA-3947) kafka-reassign-partitions.sh should support dumping current assignment

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786244#comment-15786244
 ] 

Ewen Cheslack-Postava commented on KAFKA-3947:
--

[~kawamuray] Any progress on this? It got bumped to 0.10.2.0 since the patch 
was still in flight, and now we're getting closer to the 0.10.2 feature freeze 
(and this may need some public discussion, which can take time). Any follow-up 
based on the comments above?

> kafka-reassign-partitions.sh should support dumping current assignment
> --
>
> Key: KAFKA-3947
> URL: https://issues.apache.org/jira/browse/KAFKA-3947
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 0.10.0.0
>Reporter: Yuto Kawamura
>Assignee: Yuto Kawamura
>Priority: Minor
> Fix For: 0.10.2.0
>
>
> When I was building my own tool to perform reassignment of partitions, I 
> realized that there's no way to dump the current partition assignment in a 
> machine-parsable format such as JSON.
> Giving the {{\-\-generate}} option to the kafka-reassign-partitions.sh 
> script does dump the current assignment of the topics given by 
> {{\-\-topics-to-assign-json-file}}, but it's very inconvenient because:
> - I want the dump to contain all topics. That is, I want to skip generating 
> the list of current topics just to pass it to the generate command.
> - The output is concatenated with the result of the reassignment, so I can't 
> simply do something like: {{kafka-reassign-partitions.sh --generate ... > 
> current-assignment.json}}
> - There is no need to ask Kafka to generate a reassignment just to get the 
> current assignment in the first place.
> Here I'd like to add a {{\-\-dump}} option to kafka-reassign-partitions.sh.
> I was wondering whether this functionality should be provided by 
> {{kafka-reassign-partitions.sh}} or {{kafka-topics.sh}}, but now I think 
> {{kafka-reassign-partitions.sh}} is the proper place, as the resulting JSON 
> should be in the format of {{\-\-reassignment-json-file}}, which belongs to 
> this command.
> I will follow up with a patch implementing this shortly.
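For reference, a sketch of the machine-parsable output such a dump could emit, reusing the existing reassignment JSON layout of topic/partition/replicas entries. The helper function and its input shape are illustrative, not the tool's actual code.

```python
import json

# Illustrative: render a current assignment in the --reassignment-json-file
# layout ({"version": 1, "partitions": [{topic, partition, replicas}, ...]}).
def current_assignment_json(assignment):
    """assignment: {(topic, partition): [replica broker ids]}"""
    return json.dumps({
        "version": 1,
        "partitions": [
            {"topic": t, "partition": p, "replicas": replicas}
            for (t, p), replicas in sorted(assignment.items())
        ],
    })

dump = current_assignment_json({("mytopic", 0): [1, 2], ("mytopic", 1): [2, 3]})
```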





[jira] [Updated] (KAFKA-4404) Add knowledge of sign to numeric schema types

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4404:
-
Fix Version/s: 0.10.2.0
   Status: Patch Available  (was: Open)

> Add knowledge of sign to numeric schema types
> -
>
> Key: KAFKA-4404
> URL: https://issues.apache.org/jira/browse/KAFKA-4404
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 0.10.0.1
>Reporter: Andy Bryant
>Assignee: Ewen Cheslack-Postava
>Priority: Minor
> Fix For: 0.10.2.0
>
>
> For KafkaConnect schemas there is currently no concept of whether a numeric 
> field is signed or unsigned.
> Add an additional `signed` attribute (like `optional`) or make it explicit 
> that numeric types must be signed.
> You could encode this as a parameter on the schema, but this would not be 
> standard across all connectors.
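A sketch of the schema-parameter encoding the last sentence mentions. The `unsigned` parameter name is hypothetical and connector-specific, which is precisely the drawback being pointed out.

```python
# Hypothetical per-connector encoding of unsignedness via schema parameters
# (field names illustrative; not a standard Connect convention).
schema = {
    "type": "int32",
    "optional": False,
    "parameters": {"unsigned": "true"},  # ad hoc: other connectors won't know it
}
```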





[jira] [Resolved] (KAFKA-4092) retention.bytes should not be allowed to be less than segment.bytes

2016-12-27 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4092.
--
   Resolution: Fixed
 Reviewer: Ewen Cheslack-Postava
Fix Version/s: 0.10.2.0

> retention.bytes should not be allowed to be less than segment.bytes
> ---
>
> Key: KAFKA-4092
> URL: https://issues.apache.org/jira/browse/KAFKA-4092
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Reporter: Dustin Cote
>Assignee: Dustin Cote
>Priority: Minor
> Fix For: 0.10.2.0
>
>
> Right now retention.bytes can be as small as the user wants, but it doesn't 
> really get acted on for the active segment if retention.bytes is smaller 
> than segment.bytes. We shouldn't allow retention.bytes to be less than 
> segment.bytes, and we should validate that at startup.
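The proposed validation can be sketched as follows. This is illustrative, not Kafka's actual LogConfig code; a negative retention.bytes (meaning unlimited retention) is assumed to be exempt from the check.

```python
# Illustrative startup check: reject retention.bytes < segment.bytes.
def validate_retention(retention_bytes, segment_bytes):
    # Negative retention.bytes conventionally means "no size limit"; skip it.
    if 0 <= retention_bytes < segment_bytes:
        raise ValueError(
            f"retention.bytes ({retention_bytes}) must not be less than "
            f"segment.bytes ({segment_bytes})"
        )

validate_retention(retention_bytes=2 * 1024**3, segment_bytes=1024**3)  # OK
```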





[jira] [Updated] (KAFKA-4551) StreamsSmokeTest.test_streams intermittent failure

2016-12-26 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4551:
-
Labels: system-test-failure  (was: )

> StreamsSmokeTest.test_streams intermittent failure
> --
>
> Key: KAFKA-4551
> URL: https://issues.apache.org/jira/browse/KAFKA-4551
> Project: Kafka
>  Issue Type: Bug
>Reporter: Roger Hoover
>Assignee: Damian Guy
>Priority: Blocker
>  Labels: system-test-failure
> Fix For: 0.10.1.2
>
>
> {code}
> test_id:
> kafkatest.tests.streams.streams_smoke_test.StreamsSmokeTest.test_streams
> status: FAIL
> run time:   4 minutes 44.872 seconds
> 
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 123, in run
> data = self.run_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 176, in run_test
> return self.test_context.function(self.test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/streams/streams_smoke_test.py",
>  line 78, in test_streams
> node.account.ssh("grep SUCCESS %s" % self.driver.STDOUT_FILE, 
> allow_fail=False)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/cluster/remoteaccount.py",
>  line 253, in ssh
> raise RemoteCommandError(self, cmd, exit_status, stderr.read())
> RemoteCommandError: ubuntu@worker6: Command 'grep SUCCESS 
> /mnt/streams/streams.stdout' returned non-zero exit status 1.
> {code}
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-12-15--001.1481794587--apache--trunk--7049938/StreamsSmokeTest/test_streams/91.tgz





[jira] [Commented] (KAFKA-4551) StreamsSmokeTest.test_streams intermittent failure

2016-12-26 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15779574#comment-15779574
 ] 

Ewen Cheslack-Postava commented on KAFKA-4551:
--

Another case of this today: 
http://confluent-kafka-0-10-1-system-test-results.s3-us-west-2.amazonaws.com/2016-12-26--001.1482752503--apache--0.10.1--8c62858/report.html

This probably should have been a blocker for 0.10.1.1 unless we have an 
explanation for what's going wrong. I'm going to up the priority, assign to 
[~damianguy], and mark for 0.10.1.2 just to make sure this gets triaged at some 
point.

> StreamsSmokeTest.test_streams intermittent failure
> --
>
> Key: KAFKA-4551
> URL: https://issues.apache.org/jira/browse/KAFKA-4551
> Project: Kafka
>  Issue Type: Bug
>Reporter: Roger Hoover
>Priority: Blocker
> Fix For: 0.10.1.2
>
>
> {code}
> test_id:
> kafkatest.tests.streams.streams_smoke_test.StreamsSmokeTest.test_streams
> status: FAIL
> run time:   4 minutes 44.872 seconds
> 
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 123, in run
> data = self.run_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 176, in run_test
> return self.test_context.function(self.test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/streams/streams_smoke_test.py",
>  line 78, in test_streams
> node.account.ssh("grep SUCCESS %s" % self.driver.STDOUT_FILE, 
> allow_fail=False)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/cluster/remoteaccount.py",
>  line 253, in ssh
> raise RemoteCommandError(self, cmd, exit_status, stderr.read())
> RemoteCommandError: ubuntu@worker6: Command 'grep SUCCESS 
> /mnt/streams/streams.stdout' returned non-zero exit status 1.
> {code}
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-12-15--001.1481794587--apache--trunk--7049938/StreamsSmokeTest/test_streams/91.tgz





[jira] [Updated] (KAFKA-4551) StreamsSmokeTest.test_streams intermittent failure

2016-12-26 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4551:
-
Assignee: Damian Guy

> StreamsSmokeTest.test_streams intermittent failure
> --
>
> Key: KAFKA-4551
> URL: https://issues.apache.org/jira/browse/KAFKA-4551
> Project: Kafka
>  Issue Type: Bug
>Reporter: Roger Hoover
>Assignee: Damian Guy
>Priority: Blocker
> Fix For: 0.10.1.2
>
>
> {code}
> test_id:
> kafkatest.tests.streams.streams_smoke_test.StreamsSmokeTest.test_streams
> status: FAIL
> run time:   4 minutes 44.872 seconds
> 
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 123, in run
> data = self.run_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 176, in run_test
> return self.test_context.function(self.test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/streams/streams_smoke_test.py",
>  line 78, in test_streams
> node.account.ssh("grep SUCCESS %s" % self.driver.STDOUT_FILE, 
> allow_fail=False)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/cluster/remoteaccount.py",
>  line 253, in ssh
> raise RemoteCommandError(self, cmd, exit_status, stderr.read())
> RemoteCommandError: ubuntu@worker6: Command 'grep SUCCESS 
> /mnt/streams/streams.stdout' returned non-zero exit status 1.
> {code}
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-12-15--001.1481794587--apache--trunk--7049938/StreamsSmokeTest/test_streams/91.tgz





[jira] [Updated] (KAFKA-4551) StreamsSmokeTest.test_streams intermittent failure

2016-12-26 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4551:
-
Fix Version/s: 0.10.1.2

> StreamsSmokeTest.test_streams intermittent failure
> --
>
> Key: KAFKA-4551
> URL: https://issues.apache.org/jira/browse/KAFKA-4551
> Project: Kafka
>  Issue Type: Bug
>Reporter: Roger Hoover
>Priority: Blocker
> Fix For: 0.10.1.2
>
>
> {code}
> test_id:
> kafkatest.tests.streams.streams_smoke_test.StreamsSmokeTest.test_streams
> status: FAIL
> run time:   4 minutes 44.872 seconds
> 
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 123, in run
> data = self.run_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 176, in run_test
> return self.test_context.function(self.test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/streams/streams_smoke_test.py",
>  line 78, in test_streams
> node.account.ssh("grep SUCCESS %s" % self.driver.STDOUT_FILE, 
> allow_fail=False)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/cluster/remoteaccount.py",
>  line 253, in ssh
> raise RemoteCommandError(self, cmd, exit_status, stderr.read())
> RemoteCommandError: ubuntu@worker6: Command 'grep SUCCESS 
> /mnt/streams/streams.stdout' returned non-zero exit status 1.
> {code}
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-12-15--001.1481794587--apache--trunk--7049938/StreamsSmokeTest/test_streams/91.tgz





[jira] [Commented] (KAFKA-4558) throttling_test fails if the producer starts too fast.

2016-12-25 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15776893#comment-15776893
 ] 

Ewen Cheslack-Postava commented on KAFKA-4558:
--

And this is also happening, albeit less frequently, on the 3.1.x branch: 
http://confluent-kafka-0-10-1-system-test-results.s3-us-west-2.amazonaws.com/2016-12-25--001.1482666294--apache--0.10.1--8c62858/report.html

> throttling_test fails if the producer starts too fast.
> --
>
> Key: KAFKA-4558
> URL: https://issues.apache.org/jira/browse/KAFKA-4558
> Project: Kafka
>  Issue Type: Bug
>Reporter: Apurva Mehta
>Assignee: Apurva Mehta
>
> As described in https://issues.apache.org/jira/browse/KAFKA-4526, the 
> throttling test will fail if the producer in the produce-consume-validate 
> loop starts up before the consumer is fully initialized.
> We need to block the start of the producer until the consumer is ready to go. 
> The current plan is to poll the consumer for a particular metric (like, for 
> instance, partition assignment) which will act as a good proxy for successful 
> initialization. Currently, we just check for the existence of a process with 
> the PID, which is not a strong enough check, causing the test to fail 
> intermittently. 
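The gating described above can be sketched with a ducktape-style wait helper. This is a minimal stand-in, not the actual test code; `consumer_ready` in the usage comment is a hypothetical predicate that would check a metric such as partition assignment:

```python
import time

def wait_until(condition, timeout_sec=30, backoff_sec=0.5):
    """Poll condition() until it is truthy or the timeout expires
    (mirrors the shape of ducktape's wait_until utility)."""
    deadline = time.time() + timeout_sec
    while time.time() < deadline:
        if condition():
            return
        time.sleep(backoff_sec)
    raise RuntimeError("condition not met within %s seconds" % timeout_sec)

# hypothetical usage: block the producer start on consumer readiness
# wait_until(lambda: consumer_ready(), timeout_sec=60)
# producer.start()
```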



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-4527) Transient failure of ConnectDistributedTest.test_pause_and_resume_sink where paused connector produces messages

2016-12-24 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4527.
--
Resolution: Fixed
  Reviewer: Ewen Cheslack-Postava

> Transient failure of ConnectDistributedTest.test_pause_and_resume_sink where 
> paused connector produces messages
> ---
>
> Key: KAFKA-4527
> URL: https://issues.apache.org/jira/browse/KAFKA-4527
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect, system tests
>Reporter: Ewen Cheslack-Postava
>Assignee: Shikhar Bhushan
>  Labels: system-test-failure, system-tests
> Fix For: 0.10.2.0
>
>
> {quote}
> 
> test_id:
> kafkatest.tests.connect.connect_distributed_test.ConnectDistributedTest.test_pause_and_resume_sink
> status: FAIL
> run time:   40.164 seconds
> Paused sink connector should not consume any messages
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 123, in run
> data = self.run_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 176, in run_test
> return self.test_context.function(self.test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/connect/connect_distributed_test.py",
>  line 257, in test_pause_and_resume_sink
> assert num_messages == len(self.sink.received_messages()), "Paused sink 
> connector should not consume any messages"
> AssertionError: Paused sink connector should not consume any messages
> {quote}
> See one case here: 
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-12-12--001.1481535295--apache--trunk--62e043a/report.html
>  but it has also happened before, e.g. 
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-12-06--001.1481017508--apache--trunk--34aa538/report.html
> Thinking about the test, one simple possibility is that our approach to get 
> the number of messages produced/consumed during the test is flawed -- I think 
> we may not account for additional buffering between the connectors and the 
> process reading their output to determine what they have produced. However, 
> that's just a theory -- the minimal checking on the logs that I did didn't 
> reveal anything obviously wrong.





[jira] [Commented] (KAFKA-4558) throttling_test fails if the producer starts too fast.

2016-12-23 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15773299#comment-15773299
 ] 

Ewen Cheslack-Postava commented on KAFKA-4558:
--

There's another case that looks basically the same as this issue:

http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-12-23--001.1482484603--apache--trunk--76169f9/report.html

{quote}
test_id:
kafkatest.tests.core.replication_test.ReplicationTest.test_replication_with_broker_failure.security_protocol=SASL_SSL.failure_mode=hard_bounce.broker_type=controller
status: FAIL
run time:   3 minutes 27.556 seconds


9 acked message did not make it to the Consumer. They are: [3425, 3428, 
3404, 3407, 3410, 3413, 3416, 3419, 3422]. We validated that the first 9 of 
these missing messages correctly made it into Kafka's data files. This suggests 
they were lost on their way to the consumer.
Traceback (most recent call last):
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
 line 123, in run
data = self.run_test()
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
 line 176, in run_test
return self.test_context.function(self.test)
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/mark/_mark.py",
 line 321, in wrapper
return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/core/replication_test.py",
 line 155, in test_replication_with_broker_failure
self.run_produce_consume_validate(core_test_action=lambda: 
failures[failure_mode](self, broker_type))
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/produce_consume_validate.py",
 line 101, in run_produce_consume_validate
self.validate()
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/produce_consume_validate.py",
 line 163, in validate
assert success, msg
AssertionError: 9 acked message did not make it to the Consumer. They are: 
[3425, 3428, 3404, 3407, 3410, 3413, 3416, 3419, 3422]. We validated that the 
first 9 of these missing messages correctly made it into Kafka's data files. 
This suggests they were lost on their way to the consumer.
{quote}

These are in the middle of the set and are all 3 apart, presumably because there 
are 3 partitions in the topic and we are seeing a piece of one partition missing 
rather than all 3. I suspect this is fairly pervasive across the 
ProduceConsumeValidate tests, so it may not be "fixable" by addressing tests 
one-off.
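A quick check of that arithmetic, under the assumption that messages are distributed round-robin by sequence number across the 3 partitions:

```python
NUM_PARTITIONS = 3
missing = [3404, 3407, 3410, 3413, 3416, 3419, 3422, 3425, 3428]

# every missing id maps to the same partition under round-robin distribution
partitions = {m % NUM_PARTITIONS for m in missing}
print(partitions)  # a single partition

# and within that partition they are consecutive entries, i.e. one
# contiguous chunk of a single partition's stream went missing
positions = sorted(m // NUM_PARTITIONS for m in missing)
print(positions == list(range(positions[0], positions[0] + len(missing))))
```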

> throttling_test fails if the producer starts too fast.
> --
>
> Key: KAFKA-4558
> URL: https://issues.apache.org/jira/browse/KAFKA-4558
> Project: Kafka
>  Issue Type: Bug
>Reporter: Apurva Mehta
>Assignee: Apurva Mehta
>
> As described in https://issues.apache.org/jira/browse/KAFKA-4526, the 
> throttling test will fail if the producer in the produce-consume-validate 
> loop starts up before the consumer is fully initialized.
> We need to block the start of the producer until the consumer is ready to go. 
> The current plan is to poll the consumer for a particular metric (like, for 
> instance, partition assignment) which will act as a good proxy for successful 
> initialization. Currently, we just check for the existence of a process with 
> the PID, which is not a strong enough check, causing the test to fail 
> intermittently. 





[jira] [Updated] (KAFKA-4553) Connect's round robin assignment produces undesirable distribution of connectors/tasks

2016-12-17 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4553:
-
Status: Patch Available  (was: Open)

> Connect's round robin assignment produces undesirable distribution of 
> connectors/tasks
> --
>
> Key: KAFKA-4553
> URL: https://issues.apache.org/jira/browse/KAFKA-4553
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
>
> Currently the round robin assignment in Connect looks something like this:
> foreach connector {
>   assign connector to next worker
>   for each task in connector {
> assign task to next worker
>   }
> }
> For the most part we assume that connectors and tasks are effectively 
> equivalent units of work, but this is actually rarely the case. Connectors 
> are usually much lighter weight as they are just monitoring for changes in the 
> source/sink system and tasks are doing the heavy lifting. The way we are 
> currently doing round robin assignment then causes uneven distributions of 
> work in some cases that are not too uncommon.
> In particular, it gets bad if there are an even number of workers and 
> connectors that generate only a single task since this results in the even 
> #'d workers always getting assigned connectors and odd workers always getting 
> assigned tasks. An extreme case of this is when users start distributed mode 
> clusters with just a couple of workers to get started and deploy multiple 
> single-task connectors (e.g. CDC connectors like Debezium would be a common 
> example). All the connectors end up on one worker, all the tasks end up on 
> the other, and the second worker becomes overloaded.
> Although the ideal solution to this problem is to have a better idea of how 
> much load each connector/task will generate, I don't think we want to get 
> into the business of full-on cluster resource management. An alternative 
> which I think avoids this common pitfall without the risk of hitting another 
> common bad case is to change the algorithm to assign all the connectors 
> first, then all the tasks, i.e.
> foreach connector {
>   assign connector to next worker
> }
> foreach connector {
>   for each task in connector {
> assign task to next worker
>   }
> }





[jira] [Created] (KAFKA-4553) Connect's round robin assignment produces undesirable distribution of connectors/tasks

2016-12-17 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-4553:


 Summary: Connect's round robin assignment produces undesirable 
distribution of connectors/tasks
 Key: KAFKA-4553
 URL: https://issues.apache.org/jira/browse/KAFKA-4553
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 0.10.1.0
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava


Currently the round robin assignment in Connect looks something like this:

foreach connector {
  assign connector to next worker
  for each task in connector {
assign task to next worker
  }
}

For the most part we assume that connectors and tasks are effectively 
equivalent units of work, but this is actually rarely the case. Connectors are 
usually much lighter weight as they are just monitoring for changes in the 
source/sink system and tasks are doing the heavy lifting. The way we are 
currently doing round robin assignment then causes uneven distributions of work 
in some cases that are not too uncommon.

In particular, it gets bad if there are an even number of workers and 
connectors that generate only a single task since this results in the even #'d 
workers always getting assigned connectors and odd workers always getting 
assigned tasks. An extreme case of this is when users start distributed mode 
clusters with just a couple of workers to get started and deploy multiple 
single-task connectors (e.g. CDC connectors like Debezium would be a common 
example). All the connectors end up on one worker, all the tasks end up on the 
other, and the second worker becomes overloaded.

Although the ideal solution to this problem is to have a better idea of how 
much load each connector/task will generate, I don't think we want to get into 
the business of full-on cluster resource management. An alternative which I 
think avoids this common pitfall without the risk of hitting another common bad 
case is to change the algorithm to assign all the connectors first, then all 
the tasks, i.e.

foreach connector {
  assign connector to next worker
}
foreach connector {
  for each task in connector {
assign task to next worker
  }
}
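To make the difference concrete, here is a small simulation of both policies (a Python stand-in for the Java implementation; worker and connector names are made up, and it assumes the round-robin cursor carries over from the connector loop into the task loop):

```python
from itertools import cycle

def assign_interleaved(workers, connectors):
    """Current behavior: each connector followed immediately by its tasks."""
    assignment = {w: [] for w in workers}
    rr = cycle(workers)
    for name, num_tasks in connectors:
        assignment[next(rr)].append(name)  # the connector itself
        for t in range(num_tasks):
            assignment[next(rr)].append("%s-task%d" % (name, t))
    return assignment

def assign_connectors_first(workers, connectors):
    """Proposed behavior: all connectors first, then all tasks."""
    assignment = {w: [] for w in workers}
    rr = cycle(workers)
    for name, _ in connectors:
        assignment[next(rr)].append(name)
    for name, num_tasks in connectors:
        for t in range(num_tasks):
            assignment[next(rr)].append("%s-task%d" % (name, t))
    return assignment

workers = ["w0", "w1"]
connectors = [("c0", 1), ("c1", 1), ("c2", 1), ("c3", 1)]

# with an even worker count and single-task connectors, the current policy
# puts every connector on w0 and every (heavy) task on w1
print(assign_interleaved(workers, connectors))
# connectors-first gives each worker a mix of connectors and tasks
print(assign_connectors_first(workers, connectors))
```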






[jira] [Created] (KAFKA-4527) Transient failure of ConnectDistributedTest.test_pause_and_resume_sink where paused connector produces messages

2016-12-12 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-4527:


 Summary: Transient failure of 
ConnectDistributedTest.test_pause_and_resume_sink where paused connector 
produces messages
 Key: KAFKA-4527
 URL: https://issues.apache.org/jira/browse/KAFKA-4527
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect, system tests
Reporter: Ewen Cheslack-Postava
Assignee: Shikhar Bhushan
 Fix For: 0.10.2.0


{quote}

test_id:
kafkatest.tests.connect.connect_distributed_test.ConnectDistributedTest.test_pause_and_resume_sink
status: FAIL
run time:   40.164 seconds


Paused sink connector should not consume any messages
Traceback (most recent call last):
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
 line 123, in run
data = self.run_test()
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
 line 176, in run_test
return self.test_context.function(self.test)
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/connect/connect_distributed_test.py",
 line 257, in test_pause_and_resume_sink
assert num_messages == len(self.sink.received_messages()), "Paused sink 
connector should not consume any messages"
AssertionError: Paused sink connector should not consume any messages
{quote}

See one case here: 
http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-12-12--001.1481535295--apache--trunk--62e043a/report.html
 but it has also happened before, e.g. 
http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-12-06--001.1481017508--apache--trunk--34aa538/report.html

Thinking about the test, one simple possibility is that our approach to get the 
number of messages produced/consumed during the test is flawed -- I think we 
may not account for additional buffering between the connectors and the process 
reading their output to determine what they have produced. However, that's just 
a theory -- the minimal checking on the logs that I did didn't reveal anything 
obviously wrong.





[jira] [Commented] (KAFKA-4526) Transient failure in ThrottlingTest.test_throttled_reassignment

2016-12-12 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15743246#comment-15743246
 ] 

Ewen Cheslack-Postava commented on KAFKA-4526:
--

re: related test failures, we're also seeing this in the same test run:

{quote}

test_id:
kafkatest.tests.core.replication_test.ReplicationTest.test_replication_with_broker_failure.security_protocol=SASL_SSL.failure_mode=hard_bounce.broker_type=controller
status: FAIL
run time:   3 minutes 35.081 seconds


2 acked message did not make it to the Consumer. They are: [43137, 43140]. 
We validated that the first 2 of these missing messages correctly made it into 
Kafka's data files. This suggests they were lost on their way to the 
consumer.(There are also 1110 duplicate messages in the log - but that is an 
acceptable outcome)

Traceback (most recent call last):
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
 line 123, in run
data = self.run_test()
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
 line 176, in run_test
return self.test_context.function(self.test)
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/mark/_mark.py",
 line 321, in wrapper
return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/core/replication_test.py",
 line 155, in test_replication_with_broker_failure
self.run_produce_consume_validate(core_test_action=lambda: 
failures[failure_mode](self, broker_type))
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/produce_consume_validate.py",
 line 101, in run_produce_consume_validate
self.validate()
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/produce_consume_validate.py",
 line 163, in validate
assert success, msg
AssertionError: 2 acked message did not make it to the Consumer. They are: 
[43137, 43140]. We validated that the first 2 of these missing messages 
correctly made it into Kafka's data files. This suggests they were lost on 
their way to the consumer.(There are also 1110 duplicate messages in the log - 
but that is an acceptable outcome)
{quote}

These tests share common utilities, so the failures may not actually be related 
and may just produce similar error messages. However, the fact that they seem to 
have started happening at the same time is suspicious.

> Transient failure in ThrottlingTest.test_throttled_reassignment
> ---
>
> Key: KAFKA-4526
> URL: https://issues.apache.org/jira/browse/KAFKA-4526
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ewen Cheslack-Postava
>Assignee: Jason Gustafson
>  Labels: system-test-failure, system-tests
> Fix For: 0.10.2.0
>
>
> This test is seeing transient failures sometimes
> {quote}
> Module: kafkatest.tests.core.throttling_test
> Class:  ThrottlingTest
> Method: test_throttled_reassignment
> Arguments:
> {
>   "bounce_brokers": false
> }
> {quote}
> This happens with both bounce_brokers = true and false. Fails with
> {quote}
> AssertionError: 1646 acked message did not make it to the Consumer. They are: 
> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19...plus 
> 1626 more. Total Acked: 174799, Total Consumed: 173153. We validated that the 
> first 1000 of these missing messages correctly made it into Kafka's data 
> files. This suggests they were lost on their way to the consumer.
> {quote}
> See 
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-12-12--001.1481535295--apache--trunk--62e043a/report.html
>  for an example.
> Note that there are a number of similar bug reports for different tests: 
> https://issues.apache.org/jira/issues/?jql=text%20~%20%22acked%20message%20did%20not%20make%20it%20to%20the%20Consumer%22%20and%20project%20%3D%20Kafka
>  I am wondering if we have a wrong ack setting somewhere, one that we should 
> be specifying as acks=all but that is only defaulting to 0?
> It also seems interesting that the missing messages in these recent failures 
> seem to always start at 0...





[jira] [Created] (KAFKA-4526) Transient failure in ThrottlingTest.test_throttled_reassignment

2016-12-12 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-4526:


 Summary: Transient failure in 
ThrottlingTest.test_throttled_reassignment
 Key: KAFKA-4526
 URL: https://issues.apache.org/jira/browse/KAFKA-4526
 Project: Kafka
  Issue Type: Bug
Reporter: Ewen Cheslack-Postava
Assignee: Jason Gustafson
 Fix For: 0.10.2.0


This test is seeing transient failures sometimes
{quote}
Module: kafkatest.tests.core.throttling_test
Class:  ThrottlingTest
Method: test_throttled_reassignment
Arguments:
{
  "bounce_brokers": false
}
{quote}

This happens with both bounce_brokers = true and false. Fails with

{quote}
AssertionError: 1646 acked message did not make it to the Consumer. They are: 
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19...plus 
1626 more. Total Acked: 174799, Total Consumed: 173153. We validated that the 
first 1000 of these missing messages correctly made it into Kafka's data files. 
This suggests they were lost on their way to the consumer.
{quote}

See 
http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-12-12--001.1481535295--apache--trunk--62e043a/report.html
 for an example.

Note that there are a number of similar bug reports for different tests: 
https://issues.apache.org/jira/issues/?jql=text%20~%20%22acked%20message%20did%20not%20make%20it%20to%20the%20Consumer%22%20and%20project%20%3D%20Kafka
 I am wondering if we have a wrong ack setting somewhere, one that we should be 
specifying as acks=all but that is only defaulting to 0?

It also seems interesting that the missing messages in these recent failures 
seem to always start at 0...
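For reference, `acks=0` on the Java producer is fire-and-forget (no broker acknowledgement at all), while `acks=all` waits for the full in-sync replica set, which is what the test's "acked message" bookkeeping implicitly assumes. A minimal sketch of the settings in question (the keys are real producer config names; the values here are illustrative, not taken from the test code):

```python
# durability-relevant producer settings the test would need to pin down
producer_config = {
    "acks": "all",  # wait for all in-sync replicas, not fire-and-forget (0)
    "retries": 10,  # retry transient errors instead of dropping the record
}
print(producer_config)
```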





[jira] [Resolved] (KAFKA-4140) Update system tests to allow running tests in parallel

2016-12-11 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4140.
--
   Resolution: Fixed
Fix Version/s: 0.10.2.0

Issue resolved by pull request 1834
[https://github.com/apache/kafka/pull/1834]

> Update system tests to allow running tests in parallel
> --
>
> Key: KAFKA-4140
> URL: https://issues.apache.org/jira/browse/KAFKA-4140
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Geoff Anderson
>Assignee: Geoff Anderson
> Fix For: 0.10.2.0
>
>
> The framework used to run system tests will soon have the capability to run 
> tests in parallel. In our validations, we've found significant speedup with 
> modest increase in the size of the worker cluster, as well as much better 
> usage of the cluster resources.
> A few updates to the kafka system test services and tests are needed to take 
> full advantage of this:
> 1) cluster usage annotation - this provides a hint to the framework about 
> what cluster resources to set aside for a given test, and lets the driver 
> efficiently use the worker cluster.
> 2) eliminate a few canonical paths on the test driver. This is fine when 
> tests are run serially, but in parallel, different tests end up colliding on 
> these paths. The primary culprits here are security_config.py, and minikdc.py





[jira] [Resolved] (KAFKA-4154) Kafka Connect fails to shutdown if it has not completed startup

2016-12-06 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4154.
--
   Resolution: Fixed
 Reviewer: Ewen Cheslack-Postava
Fix Version/s: 0.10.2.0

Fixed by https://github.com/apache/kafka/pull/2201

> Kafka Connect fails to shutdown if it has not completed startup
> ---
>
> Key: KAFKA-4154
> URL: https://issues.apache.org/jira/browse/KAFKA-4154
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Shikhar Bhushan
>Assignee: Konstantine Karantasis
> Fix For: 0.10.2.0
>
>
> To reproduce:
> 1. Start Kafka Connect in distributed mode without Kafka running 
> {{./bin/connect-distributed.sh config/connect-distributed.properties}}
> 2. Ctrl+C fails to terminate the process
> thread dump:
> {noformat}
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.92-b14 mixed mode):
> "Thread-1" #13 prio=5 os_prio=31 tid=0x7fc29a18a800 nid=0x7007 waiting on 
> condition [0x73129000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007bd7d91d8> (a 
> java.util.concurrent.CountDownLatch$Sync)
>   at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>   at 
> java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
>   at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.stop(DistributedHerder.java:357)
>   at 
> org.apache.kafka.connect.runtime.Connect.stop(Connect.java:71)
>   at 
> org.apache.kafka.connect.runtime.Connect$ShutdownHook.run(Connect.java:93)
> "SIGINT handler" #27 daemon prio=9 os_prio=31 tid=0x7fc29aa6a000 
> nid=0x560f in Object.wait() [0x71a61000]
>java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   - waiting on <0x0007bd63db38> (a 
> org.apache.kafka.connect.runtime.Connect$ShutdownHook)
>   at java.lang.Thread.join(Thread.java:1245)
>   - locked <0x0007bd63db38> (a 
> org.apache.kafka.connect.runtime.Connect$ShutdownHook)
>   at java.lang.Thread.join(Thread.java:1319)
>   at 
> java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106)
>   at 
> java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
>   at java.lang.Shutdown.runHooks(Shutdown.java:123)
>   at java.lang.Shutdown.sequence(Shutdown.java:167)
>   at java.lang.Shutdown.exit(Shutdown.java:212)
>   - locked <0x0007b0244600> (a java.lang.Class for 
> java.lang.Shutdown)
>   at java.lang.Terminator$1.handle(Terminator.java:52)
>   at sun.misc.Signal$1.run(Signal.java:212)
>   at java.lang.Thread.run(Thread.java:745)
> "kafka-producer-network-thread | producer-1" #15 daemon prio=5 os_prio=31 
> tid=0x7fc29a0b7000 nid=0x7a03 runnable [0x72608000]
>java.lang.Thread.State: RUNNABLE
>   at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
>   at 
> sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
>   at 
> sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
>   at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>   - locked <0x0007bd7788d8> (a sun.nio.ch.Util$2)
>   - locked <0x0007bd7788e8> (a 
> java.util.Collections$UnmodifiableSet)
>   - locked <0x0007bd77> (a sun.nio.ch.KQueueSelectorImpl)
>   at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>   at 
> org.apache.kafka.common.network.Selector.select(Selector.java:470)
>   at 
> org.apache.kafka.common.network.Selector.poll(Selector.java:286)
>   at 
> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:235)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:134)
>   at java.lang.Thread.run(Thread.java:745)
> "DistributedHerder" #14 prio=5 os_prio=31 tid=0x7fc29a11e000 nid=0x7803 
> waiting 

[jira] [Resolved] (KAFKA-4306) Connect workers won't shut down if brokers are not available

2016-12-06 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4306.
--
   Resolution: Fixed
Fix Version/s: 0.10.2.0

Issue resolved by pull request 2201
[https://github.com/apache/kafka/pull/2201]

> Connect workers won't shut down if brokers are not available
> 
>
> Key: KAFKA-4306
> URL: https://issues.apache.org/jira/browse/KAFKA-4306
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.0
>Reporter: Gwen Shapira
>Assignee: Konstantine Karantasis
> Fix For: 0.10.2.0
>
>
> If brokers are not available and we try to shut down connect workers, sink 
> connectors will be stuck in a loop retrying to commit offsets:
> 2016-10-17 09:39:14,907] INFO Marking the coordinator 192.168.1.9:9092 (id: 
> 2147483647 rack: null) dead for group connect-dump-kafka-config1 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:600)
> [2016-10-17 09:39:14,907] ERROR Commit of 
> WorkerSinkTask{id=dump-kafka-config1-0} offsets threw an unexpected 
> exception:  (org.apache.kafka.connect.runtime.WorkerSinkTask:194)
> org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset 
> commit failed with a retriable exception. You should retry committing offsets.
> Caused by: 
> org.apache.kafka.common.errors.GroupCoordinatorNotAvailableException
> We should probably limit the number of retries before doing "unclean" 
> shutdown.
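The suggested fix amounts to bounding the retry loop. A sketch of that shape, where `commit_offsets` and the exception type stand in for the real Connect internals:

```python
class RetriableCommitError(Exception):
    """Stand-in for Kafka's RetriableCommitFailedException."""

def commit_with_limit(commit_offsets, max_retries=5):
    """Retry a retriable offset commit a bounded number of times, then
    give up so shutdown can proceed (uncleanly) instead of looping forever."""
    for _attempt in range(max_retries):
        try:
            commit_offsets()
            return True
        except RetriableCommitError:
            continue  # brokers still unavailable; try again up to the limit
    return False  # caller proceeds with unclean shutdown
```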





[jira] [Resolved] (KAFKA-4161) Decouple flush and offset commits

2016-12-01 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4161.
--
   Resolution: Fixed
Fix Version/s: 0.10.2.0

Issue resolved by pull request 2139
[https://github.com/apache/kafka/pull/2139]

> Decouple flush and offset commits
> -
>
> Key: KAFKA-4161
> URL: https://issues.apache.org/jira/browse/KAFKA-4161
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Shikhar Bhushan
>Assignee: Shikhar Bhushan
>  Labels: needs-kip
> Fix For: 0.10.2.0
>
>
> It is desirable to have, in addition to the time-based flush interval, volume 
> or size-based commits. E.g. a sink connector which is buffering in terms of 
> number of records may want to request a flush when the buffer is full, or 
> when sufficient amount of data has been buffered in a file.
> Having a method like say {{requestFlush()}} on the {{SinkTaskContext}} would 
> allow for connectors to have flexible policies around flushes. This would be 
> in addition to the time interval based flushes that are controlled with 
> {{offset.flush.interval.ms}}, for which the clock should be reset when any 
> kind of flush happens.
> We should probably also support requesting flushes via the 
> {{SourceTaskContext}} for consistency though a use-case doesn't come to mind 
> off the bat.
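A sketch of how a buffering sink might use such a hook. Note `request_flush` mirrors the *proposed* `requestFlush()`, which does not exist yet, and the class names are illustrative:

```python
class BufferingSinkTask:
    """Buffers records and asks the framework to flush when the buffer fills,
    in addition to the usual time-based flush interval."""

    def __init__(self, context, max_buffered=1000):
        self.context = context        # would be the SinkTaskContext
        self.max_buffered = max_buffered
        self.buffer = []

    def put(self, records):
        self.buffer.extend(records)
        if len(self.buffer) >= self.max_buffered:
            self.context.request_flush()  # proposed volume-based trigger

    def flush(self):
        flushed, self.buffer = self.buffer, []
        return flushed
```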





[jira] [Resolved] (KAFKA-3008) Connect should parallelize task start/stop

2016-12-01 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-3008.
--
   Resolution: Fixed
Fix Version/s: 0.10.2.0

Issue resolved by pull request 1788
[https://github.com/apache/kafka/pull/1788]

> Connect should parallelize task start/stop
> --
>
> Key: KAFKA-3008
> URL: https://issues.apache.org/jira/browse/KAFKA-3008
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 0.9.0.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Konstantine Karantasis
>Priority: Minor
> Fix For: 0.10.2.0
>
>
> The Herder implementations currently iterate over all connectors/tasks and 
> sequentially start/stop them. We should parallelize this. This is less 
> critical for {{StandaloneHerder}}, but pretty important for 
> {{DistributedHerder}} since it will generally be managing more tasks and any 
> delay starting/stopping a single task will impact every other task on the 
> node (and can ultimately result in incorrect behavior in the case of a single 
> offset commit in one connector taking too long preventing all of the rest 
> from committing offsets).
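The parallelization described above can be sketched with a plain ExecutorService; the {{startAll}} helper below is illustrative, under the assumption that each task's start is an independent {{Runnable}}, and is not the actual Herder code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelStartStop {
    // Start all tasks concurrently and wait for completion, so one slow
    // task start no longer delays every other task on the worker.
    static int startAll(List<Runnable> tasks, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<?>> futures = new ArrayList<>();
        for (Runnable task : tasks) {
            futures.add(pool.submit(task));
        }
        for (Future<?> f : futures) {
            f.get();  // surface any failure from an individual task
        }
        pool.shutdown();
        return futures.size();
    }

    public static void main(String[] args) throws Exception {
        List<Runnable> tasks = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            // Each dummy "task start" blocks for 50 ms, standing in for a
            // slow connector initialization or offset commit.
            tasks.add(() -> {
                try { Thread.sleep(50); } catch (InterruptedException ignored) { }
            });
        }
        long begin = System.nanoTime();
        int started = startAll(tasks, 8);
        long elapsedMs = (System.nanoTime() - begin) / 1_000_000;
        // Sequential execution would take ~400 ms; parallel is close to ~50 ms.
        System.out.println("started: " + started);
        System.out.println("faster than sequential: " + (elapsedMs < 400));
    }
}
```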





[jira] [Resolved] (KAFKA-4397) Refactor Connect backing stores for thread-safety

2016-11-29 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4397.
--
Resolution: Fixed

Issue resolved by pull request 2123
[https://github.com/apache/kafka/pull/2123]

> Refactor Connect backing stores for thread-safety
> -
>
> Key: KAFKA-4397
> URL: https://issues.apache.org/jira/browse/KAFKA-4397
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 0.10.1.0
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Minor
> Fix For: 0.10.2.0
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> In Kafka Connect there has been already significant provisioning for 
> multi-threaded execution with respect to classes implementing backing store 
> interfaces. 
> A requirement for 
> [KAFKA-3008|https://issues.apache.org/jira/browse/KAFKA-3008] is to tighten 
> thread-safety guarantees in these implementations, especially for 
> ConfigBackingStore and StatusBackingStore, and this will be the focus of the 
> current ticket. 
>  





[jira] [Updated] (KAFKA-4345) Run ducktape test for each pull request

2016-11-29 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4345:
-
Summary: Run ducktape test for each pull request  (was: Run decktape test 
for each pull request)

> Run ducktape test for each pull request
> ---
>
> Key: KAFKA-4345
> URL: https://issues.apache.org/jira/browse/KAFKA-4345
> Project: Kafka
>  Issue Type: Bug
>  Components: system tests
>Affects Versions: 0.10.0.1
>Reporter: Raghav Kumar Gautam
>Assignee: Raghav Kumar Gautam
> Fix For: 0.10.2.0
>
>
> As of now the ducktape tests that we have for Kafka are not run for pull 
> requests. We can run these tests using travis-ci.





[jira] [Reopened] (KAFKA-4345) Run decktape test for each pull request

2016-11-29 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava reopened KAFKA-4345:
--

> Run decktape test for each pull request
> ---
>
> Key: KAFKA-4345
> URL: https://issues.apache.org/jira/browse/KAFKA-4345
> Project: Kafka
>  Issue Type: Bug
>  Components: system tests
>Affects Versions: 0.10.0.1
>Reporter: Raghav Kumar Gautam
>Assignee: Raghav Kumar Gautam
> Fix For: 0.10.2.0
>
>
> As of now the ducktape tests that we have for Kafka are not run for pull 
> requests. We can run these tests using travis-ci.





[jira] [Updated] (KAFKA-4450) Add missing 0.10.1.x upgrade tests and ensure ongoing compatibility checks

2016-11-29 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4450:
-
Fix Version/s: (was: 0.10.1.1)

> Add missing 0.10.1.x upgrade tests and ensure ongoing compatibility checks
> --
>
> Key: KAFKA-4450
> URL: https://issues.apache.org/jira/browse/KAFKA-4450
> Project: Kafka
>  Issue Type: Bug
>  Components: system tests
>Affects Versions: 0.10.1.0
>Reporter: Ewen Cheslack-Postava
>Priority: Critical
> Fix For: 0.10.2.0
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> We have upgrade system tests, but we neglected to update them for the most 
> recent released versions (we only have LATEST_0_10_0 but not something from 
> 0_10_1).
> We should probably not only add these versions, but also a) make sure some 
> TRUNK version is always included since upgrade to trunk would always be 
> possible to avoid issues for anyone deploying off trunk (we want every commit 
> to trunk to be solid & compatible) and b) make sure there aren't gaps between 
> versions annotated on the test vs versions that are officially released 
> (which may not be easy statically with the decorators, but might be possible 
> by checking the kafkatest version against previous versions and checking for 
> gaps?).
> Perhaps we need to be able to get the most recent release/snapshot version 
> from the python code so we can always validate previous versions? Even if 
> that's possible, is there going to be a reliable way to get all the previous 
> released versions so we can make sure we have all upgrade tests in place?





[jira] [Commented] (KAFKA-4450) Add missing 0.10.1.x upgrade tests and ensure ongoing compatibility checks

2016-11-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15705874#comment-15705874
 ] 

Ewen Cheslack-Postava commented on KAFKA-4450:
--

Yeah, that makes sense [~ijuma], I think when I was reading through the test 
TRUNK confused me, but it's actually "current version for this branch", not 
necessarily what is currently on trunk. So I suppose what should actually 
happen is that we should add a step to creating a release branch + bumping 
versions for adding the new version to this test on trunk (or maybe not until 
the actual release since we get these installed by actually downloading the 
artifacts).

> Add missing 0.10.1.x upgrade tests and ensure ongoing compatibility checks
> --
>
> Key: KAFKA-4450
> URL: https://issues.apache.org/jira/browse/KAFKA-4450
> Project: Kafka
>  Issue Type: Bug
>  Components: system tests
>Affects Versions: 0.10.1.0
>Reporter: Ewen Cheslack-Postava
>Priority: Critical
> Fix For: 0.10.2.0
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> We have upgrade system tests, but we neglected to update them for the most 
> recent released versions (we only have LATEST_0_10_0 but not something from 
> 0_10_1).
> We should probably not only add these versions, but also a) make sure some 
> TRUNK version is always included since upgrade to trunk would always be 
> possible to avoid issues for anyone deploying off trunk (we want every commit 
> to trunk to be solid & compatible) and b) make sure there aren't gaps between 
> versions annotated on the test vs versions that are officially released 
> (which may not be easy statically with the decorators, but might be possible 
> by checking the kafkatest version against previous versions and checking for 
> gaps?).
> Perhaps we need to be able to get the most recent release/snapshot version 
> from the python code so we can always validate previous versions? Even if 
> that's possible, is there going to be a reliable way to get all the previous 
> released versions so we can make sure we have all upgrade tests in place?





[jira] [Commented] (KAFKA-4007) Improve fetch pipelining for low values of max.poll.records

2016-11-26 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15699134#comment-15699134
 ] 

Ewen Cheslack-Postava commented on KAFKA-4007:
--

[~enothereska] prefetching is based on the fetch requests, not any setting of 
max.poll.records. A new fetch request is only sent if the previous data is 
exhausted. If a user sets max.poll.records = 1, then a new request only gets 
sent when the data from the last request is completely exhausted. Since 
processing a single record is probably very fast, this isn't efficient -- we'd 
probably prefer to fetch data earlier since the network roundtrip (especially 
given the fetch.min.bytes means you could easily spend some time waiting on the 
broker/producers) may be relatively expensive.

The idea here is to send another fetch but delay processing until the previous 
fetch response has been fully processed. This pipelines data such that we could 
potentially have 2x the response data queued, but doesn't add any more 
overhead, still gives a chance for pipelining future data (given the extra 
buffer on the second batches of results), and doesn't defer fetching more data 
until the last minute (which max.poll.records would otherwise allow since it 
would allow for a single record to block requesting additional to satisfy 
subsequent poll() requests).

> Improve fetch pipelining for low values of max.poll.records
> ---
>
> Key: KAFKA-4007
> URL: https://issues.apache.org/jira/browse/KAFKA-4007
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Reporter: Jason Gustafson
>Assignee: Mickael Maison
>
> Currently the consumer will only send a prefetch for a partition after all 
> the records from the previous fetch have been consumed. This can lead to 
> suboptimal pipelining when max.poll.records is set very low since the 
> processing latency for a small set of records may be small compared to the 
> latency of a fetch. An improvement suggested by [~junrao] is to send the 
> fetch anyway even if we have unprocessed data buffered, but delay reading it 
> from the socket until that data has been consumed. Potentially the consumer 
> can delay reading _any_ pending fetch until it is ready to be returned to the 
> user, which may help control memory better. 
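The pipelining idea above can be modeled with a toy simulation: keep up to two fetch responses queued, issue the next fetch as soon as a slot frees up, but only return records from the older response until it is exhausted. Everything below (class, field, and method names) is invented for illustration and is not the consumer's real fetcher:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class FetchPipelining {
    static final int FETCH_SIZE = 4;
    int nextOffset = 0;                                 // next offset the "broker" serves
    final Deque<int[]> responses = new ArrayDeque<>();  // queued fetch responses
    int posInHead = 0;                                  // consumed portion of head response

    // Simulate sending fetch requests: allow a second fetch in flight while
    // the first response still has unconsumed records (at most 2 queued).
    void maybeSendFetch() {
        while (responses.size() < 2) {
            int[] batch = new int[FETCH_SIZE];
            for (int i = 0; i < FETCH_SIZE; i++) batch[i] = nextOffset++;
            responses.add(batch);
        }
    }

    // poll with max.poll.records = 1: return one record, prefetching eagerly.
    int pollOne() {
        maybeSendFetch();
        int[] head = responses.peek();
        int record = head[posInHead++];
        if (posInHead == head.length) {  // head response fully consumed
            responses.poll();
            posInHead = 0;
        }
        return record;
    }

    public static void main(String[] args) {
        FetchPipelining consumer = new FetchPipelining();
        StringBuilder seen = new StringBuilder();
        for (int i = 0; i < 6; i++) seen.append(consumer.pollOne()).append(' ');
        System.out.println("records: " + seen.toString().trim());
        // Even mid-batch there is always a second response queued, so the
        // next network roundtrip overlaps with record processing.
        System.out.println("queued responses: " + consumer.responses.size());
    }
}
```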





[jira] [Created] (KAFKA-4450) Add missing 0.10.1.x upgrade tests and ensure ongoing compatibility checks

2016-11-26 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-4450:


 Summary: Add missing 0.10.1.x upgrade tests and ensure ongoing 
compatibility checks
 Key: KAFKA-4450
 URL: https://issues.apache.org/jira/browse/KAFKA-4450
 Project: Kafka
  Issue Type: Bug
  Components: system tests
Affects Versions: 0.10.1.0
Reporter: Ewen Cheslack-Postava
Priority: Critical
 Fix For: 0.11.0.0, 0.10.1.1, 0.10.2.0


We have upgrade system tests, but we neglected to update them for the most 
recent released versions (we only have LATEST_0_10_0 but not something from 
0_10_1).

We should probably not only add these versions, but also a) make sure some 
TRUNK version is always included since upgrade to trunk would always be 
possible to avoid issues for anyone deploying off trunk (we want every commit 
to trunk to be solid & compatible) and b) make sure there aren't gaps between 
versions annotated on the test vs versions that are officially released (which 
may not be easy statically with the decorators, but might be possible by 
checking the kafkatest version against previous versions and checking for 
gaps?).

Perhaps we need to be able to get the most recent release/snapshot version from 
the python code so we can always validate previous versions? Even if that's 
possible, is there going to be a reliable way to get all the previous released 
versions so we can make sure we have all upgrade tests in place?





[jira] [Commented] (KAFKA-3959) __consumer_offsets wrong number of replicas at startup

2016-11-26 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15698985#comment-15698985
 ] 

Ewen Cheslack-Postava commented on KAFKA-3959:
--

[~toddpalino] That's fair, I think the tension is between trivial quickstart 
mode working and production settings. Then again, there are other things (even 
really simple things like the log.dirs) which presumably will differ between 
the two. Maybe the fix here is to both enforce it and update all quickstarts to 
use a config/quickstart-server.properties that has 
offsets.topic.replication.factor=1 and default.replication.factor=1.
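A hypothetical config/quickstart-server.properties along those lines (the file name is this comment's suggestion, not a file that currently ships with Kafka):

```
# Quickstart-only overrides: a single broker cannot host 3 replicas,
# so pin replication factors to 1 for the demo setup.
offsets.topic.replication.factor=1
default.replication.factor=1
```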

> __consumer_offsets wrong number of replicas at startup
> --
>
> Key: KAFKA-3959
> URL: https://issues.apache.org/jira/browse/KAFKA-3959
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, offset manager, replication
>Affects Versions: 0.9.0.1, 0.10.0.0
> Environment: Brokers of 3 kafka nodes running Red Hat Enterprise 
> Linux Server release 7.2 (Maipo)
>Reporter: Alban Hurtaud
>
> When creating a stack of 3 kafka brokers, the consumer starts faster than 
> the kafka nodes, and when it tries to read a topic, only one kafka node is 
> available.
> So the __consumer_offsets topic is created with a replication factor of 1 
> (instead of the configured 3):
> offsets.topic.replication.factor=3
> default.replication.factor=3
> min.insync.replicas=2
> Then, the other kafka nodes come up and exceptions are thrown because the 
> replica count for __consumer_offsets is 1 and min insync is 2.
> What I don't understand is: why is __consumer_offsets created with a 
> replication factor of 1 (when 1 broker is running) when server.properties 
> sets it to 3?
> To reproduce : 
> - Prepare 3 kafka nodes with the 3 lines above added to servers.properties.
> - Run one kafka,
> - Run one consumer (the __consumer_offsets is created with replicas =1)
> - Run 2 more kafka nodes





[jira] [Updated] (KAFKA-3963) Missing messages from the controller to brokers

2016-11-26 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-3963:
-
Fix Version/s: (was: 0.10.1.0)

> Missing messages from the controller to brokers
> ---
>
> Key: KAFKA-3963
> URL: https://issues.apache.org/jira/browse/KAFKA-3963
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Maysam Yabandeh
>Priority: Minor
>
> The controller takes messages from a queue and sends them to the designated 
> broker. If the controller times out waiting for a response from the broker 
> (30s), it closes the connection and retries after a backoff period; 
> however, it does not return the message to the queue. As a result, the 
> retry will start with the next message, and the previous message might 
> never have been received by the broker.
> {code}
> val QueueItem(apiKey, apiVersion, request, callback) = queue.take()
> ...
>   try {
> ...
>   clientResponse = 
> networkClient.blockingSendAndReceive(clientRequest)(time)
> ...
> }
>   } catch {
> case e: Throwable => // if the send was not successful, reconnect 
> to broker and resend the message
>   warn(("Controller %d epoch %d fails to send request %s to 
> broker %s. " +
> "Reconnecting to broker.").format(controllerId, 
> controllerContext.epoch,
>   request.toString, brokerNode.toString()), e)
>   networkClient.close(brokerNode.idString)
> ...
>   }
> {code}
> This could violate the semantics that developers had assumed when writing 
> the controller-broker protocol. For example, the controller code sends metadata 
> updates BEFORE sending LeaderAndIsrRequest when it communicates with a newly 
> joined broker for the first time. 
> {code}
>   def onBrokerStartup(newBrokers: Seq[Int]) {
> info("New broker startup callback for 
> %s".format(newBrokers.mkString(",")))
> val newBrokersSet = newBrokers.toSet
> // send update metadata request to all live and shutting down brokers. 
> Old brokers will get to know of the new
> // broker via this update.
> // In cases of controlled shutdown leaders will not be elected when a new 
> broker comes up. So at least in the
> // common controlled shutdown case, the metadata will reach the new 
> brokers faster
> 
> sendUpdateMetadataRequest(controllerContext.liveOrShuttingDownBrokerIds.toSeq)
> // the very first thing to do when a new broker comes up is send it the 
> entire list of partitions that it is
> // supposed to host. Based on that the broker starts the high watermark 
> threads for the input list of partitions
> val allReplicasOnNewBrokers = 
> controllerContext.replicasOnBrokers(newBrokersSet)
> replicaStateMachine.handleStateChanges(allReplicasOnNewBrokers, 
> OnlineReplica)
> {code}
> This is important because without the metadata cached in the broker the 
> LeaderAndIsrRequests that ask the broker to become a follower would fail 
> since there is no metadata for leader of the partition.
> {code}
> metadataCache.getAliveBrokers.find(_.id == newLeaderBrokerId) match {
>   // Only change partition state when the leader is available
>   case Some(leaderBroker) =>
> ...
>   case None =>
> // The leader broker should always be present in the metadata 
> cache.
> // If not, we should record the error message and abort the 
> transition process for this partition
> stateChangeLogger.error(("Broker %d received LeaderAndIsrRequest 
> with correlation id %d from controller" +
>   " %d epoch %d for partition [%s,%d] but cannot become follower 
> since the new leader %d is unavailable.")
> {code}
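One possible fix for the dropped-message behavior described above, sketched as a simulation: on a send failure, return the item to the front of the queue so the retry resends the same message and ordering is preserved. This is an illustrative Java model, not the controller's actual Scala code:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class RequeueOnFailure {
    // Drain the queue, simulating a transient send failure on one attempt.
    static List<String> drain(Deque<String> queue, int failOnAttempt) {
        List<String> delivered = new ArrayList<>();
        int attempt = 0;
        while (!queue.isEmpty()) {
            String item = queue.pollFirst();
            attempt++;
            boolean sendFailed = (attempt == failOnAttempt);
            if (sendFailed) {
                // The current controller code effectively drops 'item' here
                // and moves on; putting it back at the front preserves both
                // ordering and at-least-once delivery to the broker.
                queue.addFirst(item);
            } else {
                delivered.add(item);
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        Deque<String> queue = new ArrayDeque<>(
            List.of("UpdateMetadata", "LeaderAndIsr", "StopReplica"));
        // Fail the first send attempt: UpdateMetadata is retried, not skipped,
        // so it still arrives before LeaderAndIsr as the protocol assumes.
        System.out.println(drain(queue, 1));
    }
}
```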





[jira] [Commented] (KAFKA-4449) Add Serializer/Deserializer for POJO

2016-11-26 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15698784#comment-15698784
 ] 

Ewen Cheslack-Postava commented on KAFKA-4449:
--

[~habren] We actually try to keep serialization format specific stuff out of 
Kafka's core libraries. Why should we include this support for JSON (but not 
labelled as such), but not include it for Avro, protobufs, thrift, msgpack, 
capnproto, flatbuffers, etc?

In practice, we actually *do* include a JSON serializer already. See 
org.apache.kafka.connect.json.JsonSerializer for something that handles Jackson 
JsonNodes (but admittedly does not handle all POJOs currently). The only reason 
we include this is that we need *something* to demo the schema functionality 
that Kafka Connect supports and JSON happens to be very easy for demo purposes.

> Add Serializer/Deserializer for POJO
> 
>
> Key: KAFKA-4449
> URL: https://issues.apache.org/jira/browse/KAFKA-4449
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.10.0.1
>Reporter: Jason Guo
>Priority: Minor
>  Labels: easyfix, features
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Currently, there are only built-in serializers/deserializers for basic data 
> types (String, Long, etc.). It would be better to also have a 
> serializer/deserializer for POJOs.
> If we had this, users could serialize/deserialize all of their POJOs with it. 
> Otherwise, users may need to create a pair of serializers and deserializers 
> for each kind of POJO, just like the implementation in the streams example 
> PageViewTypedDemo
> https://github.com/apache/kafka/blob/trunk/streams/examples/src/main/java/org/apache/kafka/streams/examples/pageview/PageViewTypedDemo.java
> Let's take the above streams example: a Serde was created for PageView as 
> below
> final Serializer<PageView> pageViewSerializer = new 
> JsonPOJOSerializer<>();
> serdeProps.put("JsonPOJOClass", PageView.class);
> pageViewSerializer.configure(serdeProps, false);
> final Deserializer<PageView> pageViewDeserializer = new 
> JsonPOJODeserializer<>();
> serdeProps.put("JsonPOJOClass", PageView.class);
> pageViewDeserializer.configure(serdeProps, false);
> final Serde<PageView> pageViewSerde = 
> Serdes.serdeFrom(pageViewSerializer, pageViewDeserializer);
> If we had this POJO serializer/deserializer, the Serde could be created with 
> only one line
> Serdes.serdeFrom(RegionCount.class)
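To illustrate the idea, here is a minimal reflection-based POJO-to-JSON serializer. It is only a sketch: the JsonPOJOSerializer in the streams example is Jackson-based and configurable, whereas this toy handles only flat String/number fields and sorts them by name for deterministic output:

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Comparator;

public class PojoJson {
    // Serialize any flat POJO to a JSON byte array via reflection.
    static byte[] serialize(Object pojo) throws IllegalAccessException {
        Field[] fields = pojo.getClass().getDeclaredFields();
        // Sort by field name so the output does not depend on JVM field order.
        Arrays.sort(fields, Comparator.comparing(Field::getName));
        StringBuilder json = new StringBuilder("{");
        for (int i = 0; i < fields.length; i++) {
            fields[i].setAccessible(true);
            Object value = fields[i].get(pojo);
            json.append('"').append(fields[i].getName()).append("\":");
            if (value instanceof String) {
                json.append('"').append(value).append('"');
            } else {
                json.append(value);  // numbers, booleans, null
            }
            if (i < fields.length - 1) json.append(',');
        }
        return json.append('}').toString().getBytes();
    }

    // Hypothetical PageView-like POJO for the demo.
    static class PageView {
        String user = "alice";
        long timestamp = 42L;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(new String(serialize(new PageView())));
    }
}
```

With a generic serializer like this, a matching deserializer, and a {{Serdes.serdeFrom(PageView.class)}}-style factory, the multi-line configuration above collapses to one call.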





[jira] [Updated] (KAFKA-3400) Topic stop working / can't describe topic

2016-11-26 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-3400:
-
Fix Version/s: (was: 0.10.1.0)

> Topic stop working / can't describe topic
> -
>
> Key: KAFKA-3400
> URL: https://issues.apache.org/jira/browse/KAFKA-3400
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Tobias
>Assignee: Ashish K Singh
>
> we are seeing an issue where we intermittently (every couple of hours) get an 
> error with certain topics. They stop working and producers give a 
> LeaderNotFoundException.
> When we then try to use kafka-topics.sh to describe the topic we get the 
> error below.
> Error while executing topic command : next on empty iterator
> {{
> [2016-03-15 17:30:26,231] ERROR java.util.NoSuchElementException: next on 
> empty iterator
>   at scala.collection.Iterator$$anon$2.next(Iterator.scala:39)
>   at scala.collection.Iterator$$anon$2.next(Iterator.scala:37)
>   at scala.collection.IterableLike$class.head(IterableLike.scala:91)
>   at scala.collection.AbstractIterable.head(Iterable.scala:54)
>   at 
> kafka.admin.TopicCommand$$anonfun$describeTopic$1.apply(TopicCommand.scala:198)
>   at 
> kafka.admin.TopicCommand$$anonfun$describeTopic$1.apply(TopicCommand.scala:188)
>   at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>   at kafka.admin.TopicCommand$.describeTopic(TopicCommand.scala:188)
>   at kafka.admin.TopicCommand$.main(TopicCommand.scala:66)
>   at kafka.admin.TopicCommand.main(TopicCommand.scala)
>  (kafka.admin.TopicCommand$)
> }}
> if we delete the topic, then it will start to work again for a while
> We can't see anything obvious in the logs but are happy to provide if needed





[jira] [Updated] (KAFKA-4063) Add support for infinite endpoints for range queries in Kafka Streams KV stores

2016-11-26 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4063:
-
Fix Version/s: (was: 0.10.1.0)

> Add support for infinite endpoints for range queries in Kafka Streams KV 
> stores
> ---
>
> Key: KAFKA-4063
> URL: https://issues.apache.org/jira/browse/KAFKA-4063
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Roger Hoover
>Assignee: Roger Hoover
>Priority: Minor
>
> In some applications, it's useful to iterate over the key-value store either:
> 1. from the beginning up to a certain key
> 2. from a certain key to the end
> We can add two new methods rangeUntil() and rangeFrom() easily to support this.
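The two iteration modes can be sketched against a NavigableMap stand-in for the KV store. The method names below follow the ticket's proposal but are not an existing Kafka Streams API:

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class OpenEndedRanges {
    // 1. from the beginning up to a certain key (inclusive)
    static NavigableMap<String, Integer> rangeUntil(
            NavigableMap<String, Integer> store, String to) {
        return store.headMap(to, true);
    }

    // 2. from a certain key (inclusive) to the end
    static NavigableMap<String, Integer> rangeFrom(
            NavigableMap<String, Integer> store, String from) {
        return store.tailMap(from, true);
    }

    public static void main(String[] args) {
        NavigableMap<String, Integer> store = new TreeMap<>();
        store.put("a", 1);
        store.put("b", 2);
        store.put("c", 3);
        System.out.println(rangeUntil(store, "b").keySet());
        System.out.println(rangeFrom(store, "b").keySet());
    }
}
```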





[jira] [Updated] (KAFKA-3775) Throttle maximum number of tasks assigned to a single KafkaStreams

2016-11-26 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-3775:
-
Fix Version/s: (was: 0.10.1.0)

> Throttle maximum number of tasks assigned to a single KafkaStreams
> --
>
> Key: KAFKA-3775
> URL: https://issues.apache.org/jira/browse/KAFKA-3775
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Yuto Kawamura
>Assignee: Yuto Kawamura
>
> As of today, if I start a Kafka Streams app on a single machine which 
> consists of single KafkaStreams instance, that instance gets all partitions 
> of the target topic assigned.
> As we're using it to process topics which has huge number of partitions and 
> message traffic, it is a problem that we don't have a way of throttling the 
> maximum amount of partitions assigned to a single instance.
> In fact, when we started a Kafka Streams app which consumes a topic with 
> more than 10MB/sec of traffic on each partition, we saw all partitions 
> assigned to the first instance, and the app soon died with an OOM.
> I know that there's some workarounds considerable here. for example:
> - Start multiple instances at once so the partitions distributed evenly.
>   => Maybe works, but as Kafka Streams is a library and not an execution 
> framework, there's no predefined procedure for starting Kafka Streams apps, so 
> some users might want the option to start a first single instance and 
> check if it works as expected with a lesser number of partitions (I want :p)
> - Adjust config parameters such as {{buffered.records.per.partition}}, 
> {{max.partition.fetch.bytes}} and {{max.poll.records}} to reduce the heap 
> pressure.
>   => Maybe works. but still have two problems IMO:
>   - Still leads to a traffic explosion with high-throughput processing as it 
> accepts all incoming messages from hundreds of partitions.
>   - In the first place, by distributed-system principles, it's weird that 
> users don't have a way to control the maximum "partitions" assigned to a single 
> shard (an instance of KafkaStreams here). Users should be allowed to provide 
> the maximum number of partitions that is considered possible to be 
> processed with a single instance (or host).
> Here, I'd like to introduce a new configuration parameter 
> {{max.tasks.assigned}}, which limits the number of tasks(a notion of 
> partition) assigned to the processId(which is the notion of single 
> KafkaStreams instance).
> At the same time we need to change StreamPartitionAssignor (TaskAssignor) to 
> tolerate incomplete assignment. That is, Kafka Streams should continue 
> working for its share of partitions even when some partitions are left 
> unassigned, in order to satisfy this: "user may want the option to 
> start a first single instance and check if it works as expected with 
> a lesser number of partitions (I want :p)".
> I've implemented a rough POC for this. PTAL, and if it makes sense I will 
> continue polishing it.





[jira] [Updated] (KAFKA-4449) Add Serializer/Deserializer for POJO

2016-11-26 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4449:
-
Fix Version/s: (was: 0.10.1.0)

> Add Serializer/Deserializer for POJO
> 
>
> Key: KAFKA-4449
> URL: https://issues.apache.org/jira/browse/KAFKA-4449
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.10.0.1
>Reporter: Jason Guo
>Priority: Minor
>  Labels: easyfix, features
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Currently, there are only built-in serializers/deserializers for basic data 
> types (String, Long, etc.). It would be better to also have a 
> serializer/deserializer for POJOs.
> If we had this, users could serialize/deserialize all of their POJOs with it. 
> Otherwise, users may need to create a pair of serializers and deserializers 
> for each kind of POJO, just like the implementation in the streams example 
> PageViewTypedDemo
> https://github.com/apache/kafka/blob/trunk/streams/examples/src/main/java/org/apache/kafka/streams/examples/pageview/PageViewTypedDemo.java
> Let's take the above streams example: a Serde was created for PageView as 
> below
> final Serializer<PageView> pageViewSerializer = new 
> JsonPOJOSerializer<>();
> serdeProps.put("JsonPOJOClass", PageView.class);
> pageViewSerializer.configure(serdeProps, false);
> final Deserializer<PageView> pageViewDeserializer = new 
> JsonPOJODeserializer<>();
> serdeProps.put("JsonPOJOClass", PageView.class);
> pageViewDeserializer.configure(serdeProps, false);
> final Serde<PageView> pageViewSerde = 
> Serdes.serdeFrom(pageViewSerializer, pageViewDeserializer);
> If we had this POJO serializer/deserializer, the Serde could be created with 
> only one line
> Serdes.serdeFrom(RegionCount.class)





[jira] [Updated] (KAFKA-4316) Kafka Streams 0.10.0.1 does not run on Windows x64

2016-11-26 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4316:
-
Fix Version/s: (was: 0.10.1.0)

> Kafka Streams 0.10.0.1 does not run on Windows x64
> --
>
> Key: KAFKA-4316
> URL: https://issues.apache.org/jira/browse/KAFKA-4316
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.1
>Reporter: Seweryn Habdank-Wojewodzki
>
> We encountered the problem that starting an application with Kafka Streams 
> 0.10.0.1 leads to a runtime exception because the rocksdb DLL is missing on a 
> Windows x64 machine. 
> Part of the stacktrace:
> {code}
> Caused by: java.lang.RuntimeException: librocksdbjni-win64.dll was not found 
> inside JAR.
> at 
> org.rocksdb.NativeLibraryLoader.loadLibraryFromJarToTemp(NativeLibraryLoader.java:106)
> {code}
> It is true, as Kafka 0.10.0.1 uses RocksDB 4.8.0. This RocksDB release has 
> broken Java API. 
> See: 
> https://github.com/facebook/rocksdb/issues/1177
> https://github.com/facebook/rocksdb/issues/1302
> This critical (for Windows) bug was fixed in RocksDB 4.9.0.
> Please update Kafka gradle\dependencies.gradle to use at least 4.9.0:
> So the line shall be rocksDB: "4.9.0".
> I tested some basic functionality of Kafka Streams with RocksDB 4.11.2, 
> and it was promising; the bug was definitely gone.





[jira] [Updated] (KAFKA-4156) Not able to download tests jar of kafka and kafka-streams from maven repo.

2016-11-26 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4156:
-
Fix Version/s: (was: 0.10.0.1)
   (was: 0.10.0.0)

> Not able to download tests jar of kafka and kafka-streams from maven repo.
> --
>
> Key: KAFKA-4156
> URL: https://issues.apache.org/jira/browse/KAFKA-4156
> Project: Kafka
>  Issue Type: Bug
>  Components: packaging
>Affects Versions: 0.10.0.0, 0.10.0.1
>Reporter: Satish Duggana
>
> The dependency below is added in one of our repos to use EmbeddedKafkaCluster, 
> but dependency installation fails with the error mentioned later.
> <dependency>
>   <groupId>org.apache.kafka</groupId>
>   <artifactId>kafka-streams</artifactId>
>   <version>0.10.0.0</version>
>   <type>test-jar</type>
>   <scope>test</scope>
> </dependency>
> This fails with the error below, as 
> https://repository.apache.org/content/repositories/snapshots/org/apache/kafka/kafka-streams/0.10.0.0/kafka-streams-0.10.0.0-tests.jar
>  is not available. But 
> https://repository.apache.org/content/repositories/snapshots/org/apache/kafka/kafka-streams/0.10.0.0/kafka-streams-0.10.0.0-test.jar
>  is available. You may need to fix the POM to install the right name, which is 
> kafka-streams-0.10.0.0-tests.jar instead of kafka-streams-0.10.0.0-test.jar.
> [ERROR] Failed to execute goal on project schema-registry-avro: Could not 
> resolve dependencies for project 
> com.hortonworks.registries:schema-registry-avro:jar:0.1.0-SNAPSHOT: The 
> following artifacts could not be resolved: 
> org.apache.kafka:kafka-clients:jar:tests:0.10.0.0, 
> org.apache.kafka:kafka-streams:jar:tests:0.10.0.0: Could not find artifact 
> org.apache.kafka:kafka-clients:jar:tests:0.10.0.0 in central 
> (http://repo1.maven.org/maven2/) -> [Help 1]





[jira] [Updated] (KAFKA-4403) Update KafkaBasedLog to use new endOffsets consumer API

2016-11-25 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4403:
-
Assignee: Balint Molnar  (was: Ewen Cheslack-Postava)

> Update KafkaBasedLog to use new endOffsets consumer API
> ---
>
> Key: KAFKA-4403
> URL: https://issues.apache.org/jira/browse/KAFKA-4403
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Balint Molnar
>Priority: Minor
>  Labels: newbie
>
> As of 0.10.1.0 and KIP-79 
> (https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868090) 
> KafkaConsumer can now fetch offset information about topic partitions. 
> Previously KafkaBasedLog had to use a seekToEnd + position approach to 
> determine end offsets. With the new APIs we can simplify this code.
> This isn't critical as the current code works fine, but would be a nice 
> cleanup and simplification.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4403) Update KafkaBasedLog to use new endOffsets consumer API

2016-11-25 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15696374#comment-15696374
 ] 

Ewen Cheslack-Postava commented on KAFKA-4403:
--

Absolutely, I've assigned the ticket to you.

> Update KafkaBasedLog to use new endOffsets consumer API
> ---
>
> Key: KAFKA-4403
> URL: https://issues.apache.org/jira/browse/KAFKA-4403
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Balint Molnar
>Priority: Minor
>  Labels: newbie
>
> As of 0.10.1.0 and KIP-79 
> (https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868090) 
> KafkaConsumer can now fetch offset information about topic partitions. 
> Previously KafkaBasedLog had to use a seekToEnd + position approach to 
> determine end offsets. With the new APIs we can simplify this code.
> This isn't critical as the current code works fine, but would be a nice 
> cleanup and simplification.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-4417) Update build dependencies for 0.10.2 cycle

2016-11-17 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4417.
--
Resolution: Fixed

Issue resolved by pull request 2144
[https://github.com/apache/kafka/pull/2144]

> Update build dependencies for 0.10.2 cycle
> --
>
> Key: KAFKA-4417
> URL: https://issues.apache.org/jira/browse/KAFKA-4417
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ismael Juma
> Fix For: 0.10.2.0
>
>
> We usually update build dependencies at the beginning of a new release cycle 
> to make it more likely that regressions are found before the release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4404) Add knowledge of sign to numeric schema types

2016-11-14 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664439#comment-15664439
 ] 

Ewen Cheslack-Postava commented on KAFKA-4404:
--

What is the motivation for this? Support for signed and unsigned types is 
spotty at best in serialization formats, see 
https://cwiki.apache.org/confluence/display/KAFKA/Copycat+Data+API for some of 
the survey done when initially working on the API (basically protobufs is the 
only format that has complete support). Being exhaustive in type support ends 
up being a burden for connector and converter developers as it requires more 
effort to support all the types, so I think it's important to make sure any 
additions to the set of supported types are well motivated.

> Add knowledge of sign to numeric schema types
> -
>
> Key: KAFKA-4404
> URL: https://issues.apache.org/jira/browse/KAFKA-4404
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 0.10.0.1
>Reporter: Andy Bryant
>Assignee: Ewen Cheslack-Postava
>Priority: Minor
>
> For KafkaConnect schemas there is currently no concept of whether a numeric 
> field is signed or unsigned. 
> Add an additional `signed` attribute (like optional) or make it explicit that 
> numeric types must be signed.
> You could encode this as a parameter on the schema but this would not be 
> standard across all connectors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3959) __consumer_offsets wrong number of replicas at startup

2016-11-13 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15662738#comment-15662738
 ] 

Ewen Cheslack-Postava commented on KAFKA-3959:
--

[~granthenke] [~onurkaraman] [~toddpalino] Any more thoughts on this? I think 
the main use case for handling < 3 brokers by default is when we start up a 
"cluster" locally for test purposes. Any real use case that wanted a lower 
replication factor could set it explicitly. Local testing is important enough 
that we don't want users jumping through hoops for it; that said, a dramatic 
warning wouldn't be the end of the world. Maybe even some combination of a low 
default plus a setting that gives unsafe warnings but allows unsafely low 
replication factors for this topic?

> __consumer_offsets wrong number of replicas at startup
> --
>
> Key: KAFKA-3959
> URL: https://issues.apache.org/jira/browse/KAFKA-3959
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, offset manager, replication
>Affects Versions: 0.9.0.1, 0.10.0.0
> Environment: 3 Kafka broker nodes running Red Hat Enterprise 
> Linux Server release 7.2 (Maipo)
>Reporter: Alban Hurtaud
>
> When creating a stack of 3 Kafka brokers, the consumer starts faster than 
> the Kafka nodes, and when it tries to read a topic only one Kafka node is 
> available.
> So the __consumer_offsets topic is created with a replication factor of 1 
> (instead of the configured 3):
> offsets.topic.replication.factor=3
> default.replication.factor=3
> min.insync.replicas=2
> Then the other Kafka nodes come up, and exceptions are thrown because the 
> replica count for __consumer_offsets is 1 while min.insync.replicas is 2.
> What I missed is: why is __consumer_offsets created with replication factor 
> 1 (when 1 broker is running) when server.properties sets it to 3?
> To reproduce: 
> - Prepare 3 Kafka nodes with the 3 lines above added to server.properties.
> - Run one Kafka node.
> - Run one consumer (__consumer_offsets is created with replicas=1).
> - Run 2 more Kafka nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-3967) Excessive Network IO between Kafka brokers

2016-11-13 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-3967.
--
Resolution: Invalid

[~Krishna82] Closing this for now since we haven't heard back. If there are 
some details that were missing in the initial report that show this is actually 
an unexpectedly high throughput for replication, please reopen and add some 
more details.

> Excessive Network IO between Kafka brokers 
> ---
>
> Key: KAFKA-3967
> URL: https://issues.apache.org/jira/browse/KAFKA-3967
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.2
>Reporter: Krishna
>
> Excessive network IO between Kafka brokers running on AWS in different AZs 
> compared to the actual message volume. 
> We are producing 2-5 MB/sec of messages, yet Kafka seems to be moving 20 
> GB/hr on the network. Each node holds around 12 GB of message log. Is this 
> natural behavior? I believe only new messages should be replicated to 
> non-leader nodes, but here it seems the entire log is re-synced.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4021) system tests need to enable trace level logging for controller and state-change log

2016-11-13 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15662381#comment-15662381
 ] 

Ewen Cheslack-Postava commented on KAFKA-4021:
--

[~junrao] Is this actually reasonable to do across the board for the Kafka 
service? Normally trace-level logs incur enough overhead that they can affect 
normal behavior and, e.g., could drastically affect anything trying to get 
performance stats.

> system tests need to enable trace level logging for controller and 
> state-change log
> ---
>
> Key: KAFKA-4021
> URL: https://issues.apache.org/jira/browse/KAFKA-4021
> Project: Kafka
>  Issue Type: Improvement
>  Components: system tests
>Affects Versions: 0.10.0.0
>Reporter: Jun Rao
>Assignee: Geoff Anderson
>
> We store detailed information about leader changes at trace level in the 
> controller and the state-change log. Currently, our system tests only collect 
> debug level logs. It would be useful to collect trace level logging for these 
> two logs and archive them if there are test failures, at least for 
> replication related tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4402) Kafka Producer's DefaultPartitioner is actually not round robin as said in the code comments "If no partition or key is present choose a partition in a round-robin fash

2016-11-13 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15662197#comment-15662197
 ] 

Ewen Cheslack-Postava commented on KAFKA-4402:
--

[~Jun Yao] Sorry, missed that update with the example code while I was 
commenting on the PR. I understand how you can encounter the imbalance when 
using multiple topics, see PR for questions around the solution and whether we 
want to change the default vs provide an alternative when this is a problem.

> Kafka Producer's DefaultPartitioner is actually not round robin as said in 
> the code comments "If no partition or key is present choose a partition in a 
> round-robin fashion"
> 
>
> Key: KAFKA-4402
> URL: https://issues.apache.org/jira/browse/KAFKA-4402
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jun Yao
>Priority: Minor
>
> From the code comments, the Kafka client Producer's DefaultPartitioner is 
> said to do round robin if "no partition or key is present": 
> https://github.com/apache/kafka/blob/41e676d29587042994a72baa5000a8861a075c8c/clients/src/main/java/org/apache/kafka/clients/producer/internals/DefaultPartitioner.java#L34
> The code does attempt round robin: it maintains a counter, increments it on 
> each send, and uses it to decide which partition to use. 
> However, the counter is global and shared by all topics, so the result is 
> not round robin per topic, which sometimes causes unbalanced routing across 
> partitions. 
> Although we can pass a custom implementation of the 
> "org.apache.kafka.clients.producer.Partitioner" interface, it would still be 
> good to make the default implementation truly round robin, as the comment 
> states. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Created] (KAFKA-4403) Update KafkaBasedLog to use new endOffsets consumer API

2016-11-13 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-4403:


 Summary: Update KafkaBasedLog to use new endOffsets consumer API
 Key: KAFKA-4403
 URL: https://issues.apache.org/jira/browse/KAFKA-4403
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 0.10.1.0
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
Priority: Minor


As of 0.10.1.0 and KIP-79 
(https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868090) 
KafkaConsumer can now fetch offset information about topic partitions. 
Previously KafkaBasedLog had to use a seekToEnd + position approach to 
determine end offsets. With the new APIs we can simplify this code.

This isn't critical as the current code works fine, but would be a nice cleanup 
and simplification.
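The simplification can be sketched against a toy consumer interface. Note this is an illustrative stand-in, not the real KafkaConsumer: the method names (seekToEnd, position, endOffsets) mirror the real API, but the stub, its String partitions, and the helper names are invented here so the sketch is self-contained.

```java
import java.util.*;

public class EndOffsetsSketch {
    // Minimal stand-in for the consumer API surface KafkaBasedLog uses.
    interface OffsetApi {
        void seekToEnd(Collection<String> partitions);
        long position(String partition);
        Map<String, Long> endOffsets(Collection<String> partitions);
    }

    // In-memory stub so the sketch runs without a broker.
    static class StubConsumer implements OffsetApi {
        final Map<String, Long> logEnd;                      // log-end offset per partition
        final Map<String, Long> positions = new HashMap<>();
        StubConsumer(Map<String, Long> logEnd) { this.logEnd = logEnd; }
        public void seekToEnd(Collection<String> ps) {
            for (String p : ps) positions.put(p, logEnd.get(p));
        }
        public long position(String p) { return positions.get(p); }
        public Map<String, Long> endOffsets(Collection<String> ps) {
            Map<String, Long> out = new HashMap<>();
            for (String p : ps) out.put(p, logEnd.get(p));
            return out;
        }
    }

    // Old approach (pre-KIP-79): seek to the end, then read positions back.
    // It moves the consumer's position as a side effect.
    static Map<String, Long> endOffsetsOld(OffsetApi c, Collection<String> ps) {
        c.seekToEnd(ps);
        Map<String, Long> out = new HashMap<>();
        for (String p : ps) out.put(p, c.position(p));
        return out;
    }

    // New approach: one call, no seeking required.
    static Map<String, Long> endOffsetsNew(OffsetApi c, Collection<String> ps) {
        return c.endOffsets(ps);
    }

    public static void main(String[] args) {
        StubConsumer c = new StubConsumer(Map.of("log-0", 42L, "log-1", 7L));
        List<String> ps = List.of("log-0", "log-1");
        System.out.println(endOffsetsOld(c, ps).equals(endOffsetsNew(c, ps))); // true
    }
}
```

Both helpers return the same offsets; the new API just avoids the seek side effect and the extra round trips.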



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4402) Kafka Producer's DefaultPartitioner is actually not round robin as said in the code comments "If no partition or key is present choose a partition in a round-robin fash

2016-11-13 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15662153#comment-15662153
 ] 

Ewen Cheslack-Postava commented on KAFKA-4402:
--

[~Jun Yao] Which code are you looking at for this? 
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/internals/DefaultPartitioner.java#L55-L64
 seems to have the right behavior since it uses a counter and modulo to select 
a partition when the bytes are null. It isn't perfect when the number of 
partitions change since it might have outdated metadata, but that is an edge 
case. Perhaps you were looking at the old DefaultPartitioner.scala for the old 
producer?
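The per-topic variant the reporter is asking for can be sketched by keying the counter on the topic instead of sharing one counter globally. This is an illustration of the idea only, not the actual DefaultPartitioner code; the class and method names are invented here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class PerTopicRoundRobin {
    // One counter per topic, so partition selection for one topic is not
    // perturbed by traffic on other topics (the issue with a shared counter).
    private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    int partition(String topic, int numPartitions) {
        AtomicInteger counter =
            counters.computeIfAbsent(topic, t -> new AtomicInteger(0));
        // Mask keeps the value non-negative after int overflow.
        return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        PerTopicRoundRobin p = new PerTopicRoundRobin();
        // Interleaved sends to two topics still cycle 0,1,2 within each topic.
        for (int i = 0; i < 3; i++)
            System.out.println("a=" + p.partition("a", 3) + " b=" + p.partition("b", 3));
    }
}
```

With a single shared counter, interleaved sends to topics with different partition counts can repeatedly land on the same partitions; the per-topic counter removes that coupling.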

> Kafka Producer's DefaultPartitioner is actually not round robin as said in 
> the code comments "If no partition or key is present choose a partition in a 
> round-robin fashion"
> 
>
> Key: KAFKA-4402
> URL: https://issues.apache.org/jira/browse/KAFKA-4402
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jun Yao
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4401) Change the KafkaServerTestHarness and IntegrationTestHarness from trait to abstract class.

2016-11-13 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15662147#comment-15662147
 ] 

Ewen Cheslack-Postava commented on KAFKA-4401:
--

[~becket_qin] We've already done much of this work for some of our Java-based 
projects, e.g. see 
https://github.com/confluentinc/schema-registry/blob/master/core/src/test/java/io/confluent/kafka/schemaregistry/ClusterTestHarness.java
 and with variants for security, e.g. 
https://github.com/confluentinc/schema-registry/blob/master/core/src/test/java/io/confluent/kafka/schemaregistry/SSLClusterTestHarness.java.
 These have actually caused a bit of pain because they rely on internals so can 
break unexpectedly due to changes in Kafka. Given that, it would be handy if 
they were just part of Kafka itself. We could probably lift most of these 
implementations directly (they include schema registry startup as well, but 
that should be trivial to strip out.)

That said, we've actually moved away from including integration tests like this 
in most of our projects in favor of putting tests like these into system tests. 
They remain in our schema registry and  REST proxy mainly for historical 
reasons, i.e. the cost of refactoring them hasn't become worth it in these 
cases since the tests can still run relatively quickly (compared to Kafka's 
tests which now have so many integration tests that they dominate the 15-20 
minute test runtime on a developer laptop). I'm a bit torn as to whether this 
would be a good addition; on the one hand people are doing this so 
standardizing it and avoiding 83 different implementations seems good, on the 
other hand I think it leads to people dumping too many tests that are actually 
system tests into tests that they call integration tests and run via unit 
tests...

> Change the KafkaServerTestHarness and IntegrationTestHarness from trait to 
> abstract class.
> --
>
> Key: KAFKA-4401
> URL: https://issues.apache.org/jira/browse/KAFKA-4401
> Project: Kafka
>  Issue Type: Task
>  Components: unit tests
>Affects Versions: 0.10.1.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.10.1.1
>
>
> The IntegrationTestHarness and KafkaServerTestHarness are useful not only in 
> Kafka's own unit tests, but also for unit tests in other products that 
> depend on Kafka.
> Currently there are two issues making these two test harness classes hard to 
> use from Java:
> 1. The two classes are Scala traits, which makes it difficult to write Java 
> unit test code. 
> 2. Some of the interfaces are Scala-only. 
> It will be good to expose those two classes for more general usage and make 
> them Java friendly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-3829) Warn that kafka-connect group.id must not conflict with connector names

2016-11-12 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-3829.
--
   Resolution: Fixed
Fix Version/s: (was: 0.10.1.1)
   0.10.2.0

Issue resolved by pull request 1911
[https://github.com/apache/kafka/pull/1911]

> Warn that kafka-connect group.id must not conflict with connector names
> ---
>
> Key: KAFKA-3829
> URL: https://issues.apache.org/jira/browse/KAFKA-3829
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 0.9.0.1
>Reporter: Barry Kaplan
>Assignee: Jason Gustafson
>  Labels: documentation
> Fix For: 0.10.2.0
>
>
> If the group.id value happens to be the same as a connector's name, the 
> following error will be issued:
> {quote}
> Attempt to join group connect-elasticsearch-indexer failed due to: The group 
> member's supported protocols are incompatible with those of existing members.
> {quote}
> Maybe the documentation for Distributed Worker Configuration group.id could 
> be worded:
> {quote}
> A unique string that identifies the Connect cluster group this worker belongs 
> to. This value must be different than all connector configuration 'name' 
> properties.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4400) Prefix for sink task consumer groups should be configurable

2016-11-11 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-4400:


 Summary: Prefix for sink task consumer groups should be 
configurable
 Key: KAFKA-4400
 URL: https://issues.apache.org/jira/browse/KAFKA-4400
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 0.10.1.0
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava


Currently the prefix for creating consumer groups is fixed. This means that if 
you run multiple Connect clusters using the same Kafka cluster and create 
connectors with the same name, sink tasks in different clusters will join the 
same group. Making this prefix configurable at the worker level would protect 
against this.

An alternative would be to define unique cluster IDs for each connect cluster, 
which would allow us to construct a unique name for the group without requiring 
yet another config (but presents something of a compatibility challenge).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-4364) Sink tasks expose secrets in DEBUG logging

2016-11-09 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4364.
--
   Resolution: Fixed
Fix Version/s: 0.10.2.0

Issue resolved by pull request 2115
[https://github.com/apache/kafka/pull/2115]

> Sink tasks expose secrets in DEBUG logging
> --
>
> Key: KAFKA-4364
> URL: https://issues.apache.org/jira/browse/KAFKA-4364
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Ryan P
>Assignee: Ryan P
> Fix For: 0.10.2.0
>
>
> As it stands today worker tasks print secrets such as Key/Trust store 
> passwords to their respective logs. 
> https://github.com/confluentinc/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L213-L214
> i.e.
> [2016-11-01 12:50:59,254] DEBUG Initializing connector test-sink with config 
> {consumer.ssl.truststore.password=password, 
> connector.class=io.confluent.connect.jdbc.JdbcSinkConnector, 
> connection.password=password, producer.security.protocol=SSL, 
> producer.ssl.truststore.password=password, topics=orders, tasks.max=1, 
> consumer.ssl.truststore.location=/tmp/truststore/kafka.trustore.jks, 
> producer.ssl.truststore.location=/tmp/truststore/kafka.trustore.jks, 
> connection.user=connect, name=test-sink, auto.create=true, 
> consumer.security.protocol=SSL, 
> connection.url=jdbc:postgresql://localhost/test} 
> (org.apache.kafka.connect.runtime.WorkerConnector:71)
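A common approach to this class of bug is to mask secret-looking values before logging the config map. The sketch below is illustrative only: the substring heuristic and helper names are invented here, not the actual Kafka patch (Kafka itself marks such settings as PASSWORD-typed configs, whose values print as "[hidden]").

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConfigMasker {
    // Assumption: keys containing these substrings are treated as secrets.
    // A simple heuristic for illustration, not what Kafka actually does.
    private static final String[] SECRET_HINTS = {"password", "secret"};

    static Map<String, String> masked(Map<String, String> config) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : config.entrySet())
            out.put(e.getKey(), isSecret(e.getKey()) ? "[hidden]" : e.getValue());
        return out;
    }

    private static boolean isSecret(String key) {
        String lower = key.toLowerCase();
        for (String hint : SECRET_HINTS)
            if (lower.contains(hint)) return true;
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> cfg = new LinkedHashMap<>();
        cfg.put("connection.password", "password");
        cfg.put("topics", "orders");
        // Log the masked view instead of the raw config.
        System.out.println(masked(cfg)); // {connection.password=[hidden], topics=orders}
    }
}
```

The key design point is masking at the logging boundary rather than scrubbing the config itself, so the task still receives the real credentials.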



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4381) Add per partition lag metric to KafkaConsumer.

2016-11-08 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15649280#comment-15649280
 ] 

Ewen Cheslack-Postava commented on KAFKA-4381:
--

Yeah, I don't want things to get bogged down in KIPs either. I just figure 
following the rule of "any public interface" makes it easy for everyone to know 
whether they need a KIP or not (e.g. monitoring is listed on the KIP page since 
they are user facing). We definitely play fast and loose with this elsewhere 
anyway though, e.g. command line tools often see user facing changes w/o KIPs, 
and most of them will be uncontroversial anyway. On the other hand, I've caught 
important backwards incompatible changes that almost made it through as well, 
so...

> Add per partition lag metric to KafkaConsumer.
> --
>
> Key: KAFKA-4381
> URL: https://issues.apache.org/jira/browse/KAFKA-4381
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Affects Versions: 0.10.1.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.10.2.0
>
>
> Currently KafkaConsumer only has a metric of max lag across all the 
> partitions. It would be useful to know per partition lag as well.
> I remember there was a ticket created before but did not find it. So I am 
> creating this ticket.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4284) Partitioner never closed by producer

2016-11-08 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4284:
-
   Resolution: Fixed
Fix Version/s: 0.10.2.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 2000
[https://github.com/apache/kafka/pull/2000]

> Partitioner never closed by producer
> 
>
> Key: KAFKA-4284
> URL: https://issues.apache.org/jira/browse/KAFKA-4284
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.10.0.1
>Reporter: Theo Hultberg
>Priority: Minor
> Fix For: 0.10.2.0
>
>
> Partitioners are never closed by the producer, even though the Partitioner 
> interface has a close method.
> I looked at KAFKA-2091 and it seems like the close method has been there from 
> the beginning, but never been used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4353) Add semantic types to Kafka Connect

2016-11-07 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15645840#comment-15645840
 ] 

Ewen Cheslack-Postava commented on KAFKA-4353:
--

[~rhauch] Some of these make sense to me, others don't as much. UUID is an 
example that I think most programming languages have as a built-in now, so 
probably makes more sense as a native type (although interestingly, I would 
have represented it as bytes, not in string form). JSON might be a good example 
of the opposite, where if you're really intent on not passing it through 
Connect (and it'd be painful for every Converter to have to also support JSON), 
then I agree just naming the type should be enough.

There's a bit more to my concern around a large # of logical types than just 
Converters having to support them. The good thing w/ Converters is that there 
are bound to be relatively few of them, so while adding more types is annoying, 
it's not the end of the world. But if there are 40 specialized types, do we 
actually think connectors are commonly going to be able to do something useful 
with them? I just worry about having 15 different types for time since most 
systems in practice only have a couple (the fact that you're looking at CDC is 
probably why you're seeing a lot more, but there it doesn't look to me like 
there's actually a lot of overlap).

I think this is just a matter of impedance mismatch between different systems 
and how far we think it makes sense to bend over backwards to preserve as much 
info as possible vs where reasonable compromises can be made that make the 
story for Converter/Connector developers sane (and, frankly, users since once 
the data exits connect, they presumably need to understand all the types that 
can be emitted as well).

I think the idea of semantic types makes sense -- we wanted to be able to name 
types for exactly this reason (beyond even these close-to-primitive types). You 
can of course do this already with your own names, I think you're just trying 
to get coordination between source and sink connectors (and maybe other 
applications if they maintain & know to look at the schema name) since you'd 
prefer not to do this with debezium-specific names? Will all of the ones you 
listed actually make sense for applications? Take MicroTime vs NanoTime as an 
example -- they end up taking the same storage anyway, so would it make sense 
to just do it all as NanoTime (whereas MilliTimestamp and MicroTimestamp cover 
different possible ranges of time)?

It might also make sense to try to get some feedback from the community as to 
which of these they'd use (and which might be missing, including logical 
types). It's a lot more compelling to hear that a dozen connectors are 
providing UUID as just a string because they don't have a named type.

> Add semantic types to Kafka Connect
> ---
>
> Key: KAFKA-4353
> URL: https://issues.apache.org/jira/browse/KAFKA-4353
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 0.10.0.1
>Reporter: Randall Hauch
>Assignee: Ewen Cheslack-Postava
>
> Kafka Connect's schema system defines several _core types_ that consist of:
> * STRUCT
> * ARRAY
> * MAP
> plus these _primitive types_:
> * INT8
> * INT16
> * INT32
> * INT64
> * FLOAT32
> * FLOAT64
> * BOOLEAN
> * STRING
> * BYTES
> The {{Schema}} for these core types define several attributes, but they do 
> not have a name.
> Kafka Connect also defines several _logical types_ that are specializations 
> of the primitive types and _do_ have schema names _and_ are automatically 
> mapped to/from Java objects:
> || Schema Name || Primitive Type || Java value class || Description ||
> | o.k.c.d.Decimal | {{BYTES}} | {{java.math.BigDecimal}} | An 
> arbitrary-precision signed decimal number. |
> | o.k.c.d.Date | {{INT32}} | {{java.util.Date}} | A date representing a 
> calendar day with no time of day or timezone. The {{java.util.Date}} value's 
> hours, minutes, seconds, milliseconds are set to 0. The underlying 
> representation is an integer representing the number of standardized days 
> (based on a number of milliseconds with 24 hours/day, 60 minutes/hour, 60 
> seconds/minute, 1000 milliseconds/second, with no leap seconds) since Unix 
> epoch. |
> | o.k.c.d.Time | {{INT32}} | {{java.util.Date}} | A time representing a 
> specific point in a day, not tied to any specific date. Only the 
> {{java.util.Date}} value's hours, minutes, seconds, and milliseconds can be 
> non-zero. This effectively makes it a point in time during the first day 
> after the Unix epoch. The underlying representation is an integer 
> representing the number of milliseconds after midnight. |
> | o.k.c.d.Timestamp | {{INT64}} | {{java.util.Date}} | A timestamp 
> representing an absolute time, without timezone 

[jira] [Created] (KAFKA-4388) Connect key and value converters are listed without default values

2016-11-07 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-4388:


 Summary: Connect key and value converters are listed without 
default values
 Key: KAFKA-4388
 URL: https://issues.apache.org/jira/browse/KAFKA-4388
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 0.10.1.0
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
Priority: Minor


KIP-75 added per connector converters. This exposes the settings on a 
per-connector basis via the validation API. However, the way this is specified 
for each connector is via a config value with no default value. This means the 
validation API implies there is no setting unless you provide one.

It would be much better to include the default value extracted from the 
WorkerConfig instead so it's clear you shouldn't need to override the default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4383) Update API design subsection to reflect the current implementation of Producer/Consumer

2016-11-07 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4383:
-
Fix Version/s: (was: 0.10.0.1)
   (was: 0.10.0.0)

> Update API design subsection to reflect the current implementation of 
> Producer/Consumer
> ---
>
> Key: KAFKA-4383
> URL: https://issues.apache.org/jira/browse/KAFKA-4383
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 0.10.0.1
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>
> After 0.9.0 and 0.10.0 the site docs were updated to reflect transition to 
> the new APIs for producer and consumer. Changes were made in sections such as 
> {{2. APIS}} and {{3. CONFIGURATION}}. 
> However, the related subsections under {{5.IMPLEMENTATION}} still describe 
> the implementation details of the old producer and consumer APIs. This 
> section needs to be re-written to reflect the implementation status of the 
> new APIs (possibly by retaining the description for the old APIs as well in a 
> separate subsection). 





[jira] [Updated] (KAFKA-4382) Fix broken fragments in site docs

2016-11-07 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4382:
-
Fix Version/s: (was: 0.10.0.1)

> Fix broken fragments in site docs
> -
>
> Key: KAFKA-4382
> URL: https://issues.apache.org/jira/browse/KAFKA-4382
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.10.0.1
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Minor
>  Labels: documentation
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> There are just a few broken fragments in the current version of site docs. 
> For instance, under documentation.html in 0.10.1 such fragments are: 
> {quote}
> http://kafka.apache.org/documentation.html#newconsumerapi
> http://kafka.apache.org/documentation#config_broker
> http://kafka.apache.org/documentation#security_kerberos_sasl_clientconfig
> {quote}
> A more thorough search in the previous versions of the documentation might 
> reveal a few more. 
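A thorough search for broken fragments can be automated. The sketch below is an illustrative Python checker (assumed for this example, not a tool the project uses): it collects every `id`/`name` anchor in a page and reports fragments with no matching target.

```python
from html.parser import HTMLParser

class AnchorCollector(HTMLParser):
    """Collect every id/name attribute that a #fragment link could target."""
    def __init__(self):
        super().__init__()
        self.anchors = set()

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if key in ("id", "name") and value:
                self.anchors.add(value)

def broken_fragments(html_text, fragments):
    """Return the fragments that have no matching anchor in the page."""
    collector = AnchorCollector()
    collector.feed(html_text)
    return [f for f in fragments if f.lstrip("#") not in collector.anchors]

page = '<h3 id="config_broker">Broker Configs</h3>'
print(broken_fragments(page, ["#config_broker", "#newconsumerapi"]))  # ['#newconsumerapi']
```

Running this against each archived documentation version would surface the remaining broken anchors mentioned above.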





[jira] [Commented] (KAFKA-4381) Add per partition lag metric to KafkaConsumer.

2016-11-06 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15642780#comment-15642780
 ] 

Ewen Cheslack-Postava commented on KAFKA-4381:
--

I suppose that since metrics can be considered a public interface, a KIP would 
be appropriate. But I would imagine it should be a very simple, uncontroversial 
one.

> Add per partition lag metric to KafkaConsumer.
> --
>
> Key: KAFKA-4381
> URL: https://issues.apache.org/jira/browse/KAFKA-4381
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Affects Versions: 0.10.1.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.10.2.0
>
>
> Currently KafkaConsumer only has a metric for the max lag across all 
> partitions. It would be useful to know the per-partition lag as well.
> I remember a ticket was created for this before, but I could not find it, so I 
> am creating this one.
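Conceptually, per-partition lag is just the log-end offset minus the consumer's position on each partition. A minimal Python sketch (the dict-based inputs are assumptions for illustration; in a real client they would come from calls like the consumer's end-offsets and position APIs):

```python
def per_partition_lag(end_offsets, positions):
    """Lag per partition: log-end offset minus the consumer's position.

    Both arguments are {(topic, partition): offset} dicts. A partition the
    consumer has no position for is treated as fully caught up (lag 0).
    """
    return {tp: end_offsets[tp] - positions.get(tp, end_offsets[tp])
            for tp in end_offsets}

lags = per_partition_lag({("t", 0): 100, ("t", 1): 50},
                         {("t", 0): 90,  ("t", 1): 50})
print(lags)  # {('t', 0): 10, ('t', 1): 0}
```

Exposing these values as one metric per partition, alongside the existing max-lag metric, is the change proposed here.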





[jira] [Resolved] (KAFKA-4372) Kafka Connect REST API does not handle DELETE of connector with slashes in their names

2016-11-04 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4372.
--
   Resolution: Fixed
 Reviewer: Ewen Cheslack-Postava
Fix Version/s: 0.10.2.0

> Kafka Connect REST API does not handle DELETE of connector with slashes in 
> their names
> --
>
> Key: KAFKA-4372
> URL: https://issues.apache.org/jira/browse/KAFKA-4372
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.0.0, 0.10.0.1
>Reporter: Olivier Girardot
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.10.2.0
>
>
> Currently there is nothing to prevent someone from registering a Kafka 
> Connector with slashes in its name; however, it's impossible to DELETE it 
> afterwards because the DELETE REST API endpoint uses a PathParam and does not 
> allow slashes.
> A few other API endpoints will have a tough time handling connectors with 
> slashes in their names.
> We should allow slashes in the DELETE endpoint so that current setups can be 
> cleaned up without having to drop all the other connectors, and disallow 
> creating any more connectors with slashes in their names.
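One standard way a client can work around a slash in a path segment is percent-encoding. A small Python sketch (illustrative only; the server must also be willing to decode `%2F` in path segments, which some HTTP stacks reject by default):

```python
from urllib.parse import quote, unquote

def connector_path(name):
    """Percent-encode the connector name so '/' can't be read as a path separator."""
    return "/connectors/" + quote(name, safe="")

print(connector_path("my/connector"))  # /connectors/my%2Fconnector

# The encoding round-trips, so the server can recover the original name:
assert unquote("my%2Fconnector") == "my/connector"
```

This does not remove the need for the server-side fix described above; it only shows why an unencoded slash is ambiguous in a REST path.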





[jira] [Commented] (KAFKA-4371) Sporadic ConnectException shuts down the whole connect process

2016-11-03 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634793#comment-15634793
 ] 

Ewen Cheslack-Postava commented on KAFKA-4371:
--

[~sagarrao] I'm a bit confused by the report. You say it takes down the whole 
connect process, but then talk about using the REST API after the failure. Does 
the process actually die, or are you just saying that it no longer continues to 
process data? If it's the former, it'd be a bug in the framework. If it's the 
latter, this is just an issue with the JDBC connector, which is why [~skyahead] 
was saying the bug should probably be filed in the JDBC connector's repository.


> Sporadic ConnectException shuts down the whole connect process
> --
>
> Key: KAFKA-4371
> URL: https://issues.apache.org/jira/browse/KAFKA-4371
> Project: Kafka
>  Issue Type: Bug
>Reporter: Sagar Rao
>Priority: Critical
>
> I had setup a 2 node distributed kafka-connect process. Everything went well 
> and I could see lot of data flowing into the relevant kafka topics.
> After some time, JDBCUtils.getCurrentTimeOnDB threw a ConnectException with 
> the following stacktrace:
> The last packet successfully received from the server was 792 milliseconds 
> ago.  The last packet sent successfully to the server was 286 milliseconds 
> ago. (io.confluent.connect.jdbc.source.JdbcSourceTask:234)
> [2016-11-02 12:42:06,116] ERROR Failed to get current time from DB using 
> query select CURRENT_TIMESTAMP; on database MySQL 
> (io.confluent.connect.jdbc.util.JdbcUtils:226)
> com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link 
> failure
> The last packet successfully received from the server was 1,855 milliseconds 
> ago.  The last packet sent successfully to the server was 557 milliseconds 
> ago.
>at sun.reflect.GeneratedConstructorAccessor51.newInstance(Unknown 
> Source)
>at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
>at 
> com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)
>at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3829)
>at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2449)
>at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2629)
>at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2719)
>at 
> com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2155)
>at 
> com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1379)
>at 
> com.mysql.jdbc.StatementImpl.createResultSetUsingServerFetch(StatementImpl.java:651)
>at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1527)
>at 
> io.confluent.connect.jdbc.util.JdbcUtils.getCurrentTimeOnDB(JdbcUtils.java:220)
>at 
> io.confluent.connect.jdbc.source.TimestampIncrementingTableQuerier.executeQuery(TimestampIncrementingTableQuerier.java:157)
>at 
> io.confluent.connect.jdbc.source.TableQuerier.maybeStartQuery(TableQuerier.java:78)
>at 
> io.confluent.connect.jdbc.source.TimestampIncrementingTableQuerier.maybeStartQuery(TimestampIncrementingTableQuerier.java:57)
>at 
> io.confluent.connect.jdbc.source.JdbcSourceTask.poll(JdbcSourceTask.java:207)
>at 
> org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:155)
>at 
> org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:140)
>at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:175)
>at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.SocketException: Broken pipe (Write failed)
>at java.net.SocketOutputStream.socketWrite0(Native Method)
>at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
>at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
>at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3810)
>... 20 more
> This was just a minor glitch in the connection, as the EC2 instances are able 
> to connect to the MySQL Aurora instances without any issues.
> But, after 

[jira] [Updated] (KAFKA-4343) Connect REST API should expose whether each connector is a source or sink

2016-10-25 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4343:
-
Labels: needs-kip  (was: )

> Connect REST API should expose whether each connector is a source or sink
> -
>
> Key: KAFKA-4343
> URL: https://issues.apache.org/jira/browse/KAFKA-4343
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Ewen Cheslack-Postava
>Assignee: Ewen Cheslack-Postava
>  Labels: needs-kip
>
> Currently we don't expose information about whether a connector is a source 
> or sink. This is useful when, e.g., categorizing connectors in a UI. Given the 
> naming conventions we try to encourage, you *might* be able to determine this 
> via the connector's class name, but that isn't reliable.
> This may be related to https://issues.apache.org/jira/browse/KAFKA-4279 as we 
> might just want to expose this information on a per-connector-plugin basis 
> and expect any users to tie the results from multiple API requests together. 
> An alternative that would probably be simpler for users would be to include 
> it in the /connectors//status output.
> Note that this will require a KIP as it adds to the public REST API.





[jira] [Created] (KAFKA-4343) Connect REST API should expose whether each connector is a source or sink

2016-10-25 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-4343:


 Summary: Connect REST API should expose whether each connector is 
a source or sink
 Key: KAFKA-4343
 URL: https://issues.apache.org/jira/browse/KAFKA-4343
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava


Currently we don't expose information about whether a connector is a source or 
sink. This is useful when, e.g., categorizing connectors in a UI. Given the 
naming conventions we try to encourage, you *might* be able to determine this 
via the connector's class name, but that isn't reliable.

This may be related to https://issues.apache.org/jira/browse/KAFKA-4279 as we 
might just want to expose this information on a per-connector-plugin basis and 
expect any users to tie the results from multiple API requests together. An 
alternative that would probably be simpler for users would be to include it in 
the /connectors//status output.

Note that this will require a KIP as it adds to the public REST API.
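A reliable classification can be derived from the class hierarchy rather than the class name. A minimal Python sketch of the idea (the class names mirror Connect's `SourceConnector`/`SinkConnector` base classes; the `connector_type` helper and `JdbcSourceConnector` stand-in are assumptions for illustration):

```python
class Connector:
    """Stand-in for Connect's abstract base connector class."""

class SourceConnector(Connector):
    """Stand-in for the source-side base class."""

class SinkConnector(Connector):
    """Stand-in for the sink-side base class."""

def connector_type(connector_class):
    """Classify by base class rather than by naming convention."""
    if issubclass(connector_class, SinkConnector):
        return "sink"
    if issubclass(connector_class, SourceConnector):
        return "source"
    return "unknown"

class JdbcSourceConnector(SourceConnector):
    """Example connector whose name alone wouldn't be a reliable signal."""

print(connector_type(JdbcSourceConnector))  # source
```

The REST layer could then include this value in, e.g., the status output, regardless of what the connector happens to be named.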





[jira] [Updated] (KAFKA-4161) Decouple flush and offset commits

2016-10-25 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4161:
-
Assignee: Shikhar Bhushan  (was: Ewen Cheslack-Postava)

> Decouple flush and offset commits
> -
>
> Key: KAFKA-4161
> URL: https://issues.apache.org/jira/browse/KAFKA-4161
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Shikhar Bhushan
>Assignee: Shikhar Bhushan
>  Labels: needs-kip
>
> It is desirable to have, in addition to the time-based flush interval, volume- 
> or size-based commits. E.g., a sink connector that buffers in terms of the 
> number of records may want to request a flush when the buffer is full, or when 
> a sufficient amount of data has been buffered in a file.
> Having a method like, say, {{requestFlush()}} on the {{SinkTaskContext}} would 
> allow connectors to have flexible policies around flushes. This would be in 
> addition to the time-interval-based flushes that are controlled with 
> {{offset.flush.interval.ms}}, for which the clock should be reset when any 
> kind of flush happens.
> We should probably also support requesting flushes via the 
> {{SourceTaskContext}} for consistency, though a use case doesn't come to mind 
> off the bat.
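The proposed volume-based trigger can be sketched in a few lines of Python. This is a hypothetical model of the API, not the actual Connect framework: `request_flush()` is the method this ticket proposes, and `BufferingSinkTask` is an invented example task.

```python
class SinkTaskContext:
    """Minimal stand-in for Connect's context, with the proposed requestFlush()."""
    def __init__(self):
        self.flush_requested = False

    def request_flush(self):
        self.flush_requested = True

class BufferingSinkTask:
    """Hypothetical sink task that buffers records and asks for a flush when full."""
    def __init__(self, context, max_buffered_records):
        self.context = context
        self.max_buffered_records = max_buffered_records
        self.buffer = []

    def put(self, records):
        self.buffer.extend(records)
        # Volume-based trigger, in addition to offset.flush.interval.ms.
        if len(self.buffer) >= self.max_buffered_records:
            self.context.request_flush()

ctx = SinkTaskContext()
task = BufferingSinkTask(ctx, max_buffered_records=3)
task.put(["r1", "r2"])
print(ctx.flush_requested)  # False
task.put(["r3"])
print(ctx.flush_requested)  # True
```

The framework would then perform the flush and offset commit on its next cycle and reset the interval clock, as described above.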





[jira] [Commented] (KAFKA-4335) FileStreamSource Connector not working for large files (~ 1GB)

2016-10-24 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15602851#comment-15602851
 ] 

Ewen Cheslack-Postava commented on KAFKA-4335:
--

Can you be more specific about what isn't working? Does it throw an exception 
or some other error?

> FileStreamSource Connector not working for large files (~ 1GB)
> --
>
> Key: KAFKA-4335
> URL: https://issues.apache.org/jira/browse/KAFKA-4335
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.0.0
>Reporter: Rahul Shukla
>Assignee: Ewen Cheslack-Postava
>
> I was trying to source a large file (about 1 GB). The FileStreamSource 
> connector is not working for it, though it works fine for small files.





[jira] [Resolved] (KAFKA-4334) HashCode in SinkRecord not handling null timestamp, checks on value

2016-10-23 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4334.
--
Resolution: Fixed

> HashCode in SinkRecord not handling null timestamp, checks on value
> ---
>
> Key: KAFKA-4334
> URL: https://issues.apache.org/jira/browse/KAFKA-4334
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.0.1
>Reporter: Andrew Stevenson
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.10.2.0
>
>
> The hashCode computation for the timestamp field performs its null check on 
> the record's value field instead of on the timestamp itself.
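The bug class here (guarding on one field while dereferencing another) is easy to show. A Python sketch under stated assumptions: `Timestamp`, `hash_component_buggy`, and `hash_component_fixed` are invented stand-ins that mirror the reported shape of the bug, not the actual SinkRecord code.

```python
class Timestamp:
    """Tiny stand-in for a record timestamp (illustrative only)."""
    def __init__(self, millis):
        self.millis = millis

def hash_component_buggy(value, timestamp):
    # Mirrors the reported bug shape: the null guard checks `value`,
    # but the dereference is on `timestamp`.
    return 0 if value is None else hash(timestamp.millis)

def hash_component_fixed(value, timestamp):
    # Guard on the same field that is dereferenced.
    return 0 if timestamp is None else hash(timestamp.millis)

# A record with a non-null value but a null timestamp blows up in the
# buggy version, while the fixed version degrades to 0:
try:
    hash_component_buggy("some-value", None)
except AttributeError as err:
    print("buggy version raised:", type(err).__name__)
print("fixed version:", hash_component_fixed("some-value", None))
```

In Java the same mismatch surfaces as a NullPointerException from `hashCode()` whenever the timestamp is null but the value is not.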





[jira] [Updated] (KAFKA-4334) HashCode in SinkRecord not handling null timestamp, checks on value

2016-10-23 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4334:
-
Fix Version/s: 0.10.2.0

> HashCode in SinkRecord not handling null timestamp, checks on value
> ---
>
> Key: KAFKA-4334
> URL: https://issues.apache.org/jira/browse/KAFKA-4334
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.0.1
>Reporter: Andrew Stevenson
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.10.2.0
>
>
> The hashCode computation for the timestamp field performs its null check on 
> the record's value field instead of on the timestamp itself.





[jira] [Updated] (KAFKA-4290) High CPU caused by timeout overflow in WorkerCoordinator

2016-10-11 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4290:
-
Fix Version/s: 0.10.2.0

> High CPU caused by timeout overflow in WorkerCoordinator
> 
>
> Key: KAFKA-4290
> URL: https://issues.apache.org/jira/browse/KAFKA-4290
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Blocker
> Fix For: 0.10.1.0, 0.10.2.0
>
>
> The timeout passed to {{WorkerCoordinator.poll()}} can overflow if large 
> enough because we add it to the current time in order to calculate the call's 
> deadline. This shortcuts the poll loop and results in a very tight event loop 
> which can saturate a CPU. We hit this case out of the box because Connect 
> uses a default timeout of {{Long.MAX_VALUE}}.
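The overflow is straightforward to demonstrate. Since Python integers don't wrap, the sketch below simulates Java's signed 64-bit arithmetic explicitly; `deadline_naive` models the buggy computation and `deadline_safe` one possible clamping fix (both function names are assumptions for illustration).

```python
INT64_MAX = 2**63 - 1

def to_java_long(x):
    """Wrap an arbitrary integer into Java's signed 64-bit range."""
    return ((x + 2**63) % 2**64) - 2**63

def deadline_naive(now_ms, timeout_ms):
    # What the buggy code effectively did: now + Long.MAX_VALUE wraps to a
    # negative value, so the deadline is already "in the past" and the poll
    # loop spins without ever blocking.
    return to_java_long(now_ms + timeout_ms)

def deadline_safe(now_ms, timeout_ms):
    # Clamp instead of letting the addition wrap.
    return min(now_ms + timeout_ms, INT64_MAX)

now = 1_476_000_000_000  # some current time in milliseconds
print(deadline_naive(now, INT64_MAX) < now)       # True: overflowed deadline
print(deadline_safe(now, INT64_MAX) == INT64_MAX) # True: clamped deadline
```

With the clamped deadline the loop blocks as intended even when the configured timeout is `Long.MAX_VALUE`.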





[jira] [Resolved] (KAFKA-4290) High CPU caused by timeout overflow in WorkerCoordinator

2016-10-11 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4290.
--
Resolution: Fixed
  Reviewer: Ewen Cheslack-Postava

> High CPU caused by timeout overflow in WorkerCoordinator
> 
>
> Key: KAFKA-4290
> URL: https://issues.apache.org/jira/browse/KAFKA-4290
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Blocker
> Fix For: 0.10.1.0, 0.10.2.0
>
>
> The timeout passed to {{WorkerCoordinator.poll()}} can overflow if large 
> enough because we add it to the current time in order to calculate the call's 
> deadline. This shortcuts the poll loop and results in a very tight event loop 
> which can saturate a CPU. We hit this case out of the box because Connect 
> uses a default timeout of {{Long.MAX_VALUE}}.





[jira] [Resolved] (KAFKA-4010) ConfigDef.toRst() should create sections for each group

2016-10-10 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4010.
--
   Resolution: Fixed
Fix Version/s: 0.10.2.0

Issue resolved by pull request 1964
[https://github.com/apache/kafka/pull/1964]

> ConfigDef.toRst() should create sections for each group
> ---
>
> Key: KAFKA-4010
> URL: https://issues.apache.org/jira/browse/KAFKA-4010
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Shikhar Bhushan
>Assignee: Rekha Joshi
>Priority: Minor
> Fix For: 0.10.2.0
>
>
> Currently the ordering seems a bit arbitrary. There is a logical grouping 
> that connectors are now able to specify with the 'group' field, which we 
> should use as section headers. Also it would be good to generate {{:ref:}} 
> for each section.
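The grouping-with-sections idea can be sketched as follows. This is an illustrative Python rendering, not ConfigDef's actual Java code; the `{"name", "doc", "group"}` dict shape and the `to_rst` helper are assumptions for the example.

```python
from itertools import groupby

def to_rst(configs):
    """Render configs grouped into RST sections, each with a ``:ref:`` target.

    `configs` is a list of {"name", "doc", "group"} dicts (illustrative shape).
    """
    out = []
    for group, items in groupby(sorted(configs, key=lambda c: c["group"]),
                                key=lambda c: c["group"]):
        # Emit a :ref: label so other docs can link to the section.
        out.append(f".. _{group.lower().replace(' ', '-')}:")
        out.append("")
        out.append(group)
        out.append("^" * len(group))  # RST section underline
        for cfg in items:
            out.append(f"``{cfg['name']}``")
            out.append(f"  {cfg['doc']}")
        out.append("")
    return "\n".join(out)

print(to_rst([{"name": "topic", "doc": "Topic to use.", "group": "Common"}]))
```

Sorting by group gives a deterministic order instead of the arbitrary one noted above, and the emitted labels make each section cross-referenceable.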





[jira] [Commented] (KAFKA-4280) Add REST resource for showing available connector plugin configs

2016-10-10 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562889#comment-15562889
 ] 

Ewen Cheslack-Postava commented on KAFKA-4280:
--

[~gwenshap] You can get the list of configs by submitting an empty config to 
the validation endpoint.

> Add REST resource for showing available connector plugin configs
> 
>
> Key: KAFKA-4280
> URL: https://issues.apache.org/jira/browse/KAFKA-4280
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Gwen Shapira
>Assignee: Ewen Cheslack-Postava
>
> The connector-plugins resource allows listing the plugins and validating 
> configs, but we have nothing (I think?) for listing the available 
> configuration properties.
> If this doesn't exist, it would be good for usability to add it. If it does 
> exist, perhaps document it?





[jira] [Commented] (KAFKA-3808) Transient failure in ReplicaVerificationToolTest

2016-10-10 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562803#comment-15562803
 ] 

Ewen Cheslack-Postava commented on KAFKA-3808:
--

Seen again running against trunk commit 44d18d2:
Module: kafkatest.tests.tools.replica_verification_test
Class: ReplicaVerificationToolTest
Method: test_replica_lags
Test run report: 
http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-10-09--001.1476047058--apache--trunk--44d18d2/report.html
Archive with test run info: 
http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-10-09--001.1476047058--apache--trunk--44d18d2/ReplicaVerificationToolTest/test_replica_lags.tgz
The following is the exception that causes the failure:

test_id: 
2016-10-09--001.kafkatest.tests.tools.replica_verification_test.ReplicaVerificationToolTest.test_replica_lags
status: FAIL
run time: 1 minute 12.266 seconds
Timed out waiting to reach non-zero number of replica lags.
Traceback (most recent call last):
File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
 line 106, in run_all_tests
data = self.run_single_test()
File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
 line 162, in run_single_test
return self.current_test_context.function(self.current_test)
File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/tools/replica_verification_test.py",
 line 88, in test_replica_lags
err_msg="Timed out waiting to reach non-zero number of replica lags.")
File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape/utils/util.py",
 line 36, in wait_until
raise TimeoutError(err_msg)
TimeoutError: Timed out waiting to reach non-zero number of replica lags.
I am not seeing anything in the recent commit history that looks related, so 
this may be an existing issue with the test. Nothing else obviously wrong 
popped out at me in the logs, but I don't know the details of this test.
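For context on the failure mode, the test's wait loop is essentially a poll-until-deadline helper. A simplified Python version of ducktape-style `wait_until` (this is a sketch, not ducktape's exact implementation): if the condition never becomes true within the timeout, the loop raises `TimeoutError` with the message seen in the traceback above.

```python
import time

def wait_until(condition, timeout_sec, backoff_sec=0.1, err_msg=""):
    """Poll `condition` until it holds or the deadline passes.

    Raises TimeoutError(err_msg) on expiry, mirroring the failure in the
    system-test traceback.
    """
    deadline = time.monotonic() + timeout_sec
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(backoff_sec)
    raise TimeoutError(err_msg)

calls = {"n": 0}
def lag_appeared():
    # Simulated condition that only becomes true on the third poll.
    calls["n"] += 1
    return calls["n"] >= 3

print(wait_until(lag_appeared, timeout_sec=5, backoff_sec=0.01))  # True
```

A transient failure like this one means the condition (a non-zero replica lag) simply never held within the window, which can be a timing issue in the test itself rather than a product bug.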

> Transient failure in ReplicaVerificationToolTest
> 
>
> Key: KAFKA-3808
> URL: https://issues.apache.org/jira/browse/KAFKA-3808
> Project: Kafka
>  Issue Type: Bug
>  Components: system tests
>Reporter: Geoff Anderson
>
> {code}
> test_id:
> 2016-05-29--001.kafkatest.tests.tools.replica_verification_test.ReplicaVerificationToolTest.test_replica_lags
> status: FAIL
> run time:   1 minute 9.231 seconds
> Timed out waiting to reach non-zero number of replica lags.
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.5.1-py2.7.egg/ducktape/tests/runner.py",
>  line 106, in run_all_tests
> data = self.run_single_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.5.1-py2.7.egg/ducktape/tests/runner.py",
>  line 162, in run_single_test
> return self.current_test_context.function(self.current_test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/tools/replica_verification_test.py",
>  line 88, in test_replica_lags
> err_msg="Timed out waiting to reach non-zero number of replica lags.")
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.5.1-py2.7.egg/ducktape/utils/util.py",
>  line 36, in wait_until
> raise TimeoutError(err_msg)
> TimeoutError: Timed out waiting to reach non-zero number of replica lags.
> {code}
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-05-29--001.1464540508--apache--trunk--404b696/report.html





[jira] [Resolved] (KAFKA-4285) ReplicaVerificationToolTest.test_replica_lags: Timed out waiting to reach non-zero number of replica lags.

2016-10-10 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4285.
--
Resolution: Duplicate

> ReplicaVerificationToolTest.test_replica_lags: Timed out waiting to reach 
> non-zero number of replica lags.
> --
>
> Key: KAFKA-4285
> URL: https://issues.apache.org/jira/browse/KAFKA-4285
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ewen Cheslack-Postava
>Assignee: Flavio Junqueira
>  Labels: system-test-failure
>
> This failure happened on trunk against commit 44d18d2:
> Module: kafkatest.tests.tools.replica_verification_test
> Class:  ReplicaVerificationToolTest
> Method: test_replica_lags
> Test run report: 
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-10-09--001.1476047058--apache--trunk--44d18d2/report.html
> Archive with test run info: 
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-10-09--001.1476047058--apache--trunk--44d18d2/ReplicaVerificationToolTest/test_replica_lags.tgz
> The following is the exception that causes the failure:
> {quote}
> 
> test_id:
> 2016-10-09--001.kafkatest.tests.tools.replica_verification_test.ReplicaVerificationToolTest.test_replica_lags
> status: FAIL
> run time:   1 minute 12.266 seconds
> Timed out waiting to reach non-zero number of replica lags.
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 106, in run_all_tests
> data = self.run_single_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
>  line 162, in run_single_test
> return self.current_test_context.function(self.current_test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/tools/replica_verification_test.py",
>  line 88, in test_replica_lags
> err_msg="Timed out waiting to reach non-zero number of replica lags.")
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape/utils/util.py",
>  line 36, in wait_until
> raise TimeoutError(err_msg)
> TimeoutError: Timed out waiting to reach non-zero number of replica lags.
> {quote}
> I am not seeing anything in the recent commit history that looks related, so 
> this may be an existing issue with the test. Nothing else obviously wrong 
> popped out at me in the logs, but I don't know the details of this test.





[jira] [Created] (KAFKA-4285) ReplicaVerificationToolTest.test_replica_lags: Timed out waiting to reach non-zero number of replica lags.

2016-10-10 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-4285:


 Summary: ReplicaVerificationToolTest.test_replica_lags: Timed out 
waiting to reach non-zero number of replica lags.
 Key: KAFKA-4285
 URL: https://issues.apache.org/jira/browse/KAFKA-4285
 Project: Kafka
  Issue Type: Bug
Reporter: Ewen Cheslack-Postava
Assignee: Flavio Junqueira


This failure happened on trunk against commit 44d18d2:

Module: kafkatest.tests.tools.replica_verification_test
Class:  ReplicaVerificationToolTest
Method: test_replica_lags

Test run report: 
http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-10-09--001.1476047058--apache--trunk--44d18d2/report.html

Archive with test run info: 
http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-10-09--001.1476047058--apache--trunk--44d18d2/ReplicaVerificationToolTest/test_replica_lags.tgz

The following is the exception that causes the failure:
{quote}

test_id:
2016-10-09--001.kafkatest.tests.tools.replica_verification_test.ReplicaVerificationToolTest.test_replica_lags
status: FAIL
run time:   1 minute 12.266 seconds


Timed out waiting to reach non-zero number of replica lags.
Traceback (most recent call last):
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
 line 106, in run_all_tests
data = self.run_single_test()
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape/tests/runner.py",
 line 162, in run_single_test
return self.current_test_context.function(self.current_test)
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/tools/replica_verification_test.py",
 line 88, in test_replica_lags
err_msg="Timed out waiting to reach non-zero number of replica lags.")
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape/utils/util.py",
 line 36, in wait_until
raise TimeoutError(err_msg)
TimeoutError: Timed out waiting to reach non-zero number of replica lags.
{quote}

I am not seeing anything in the recent commit history that looks related, so 
this may be an existing issue with the test. Nothing else obviously wrong 
popped out at me in the logs, but I don't know the details of this test.





[jira] [Resolved] (KAFKA-4251) Test driver not launching in Vagrant 1.8.6

2016-10-04 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-4251.
--
   Resolution: Fixed
Fix Version/s: 0.10.2.0
   0.10.1.0

Issue resolved by pull request 1962
[https://github.com/apache/kafka/pull/1962]

> Test driver not launching in Vagrant 1.8.6
> --
>
> Key: KAFKA-4251
> URL: https://issues.apache.org/jira/browse/KAFKA-4251
> Project: Kafka
>  Issue Type: Bug
>  Components: system tests
>Reporter: Xavier Léauté
> Fix For: 0.10.1.0, 0.10.2.0
>
>
> The custom IP resolver in the test driver makes an incorrect assumption when 
> calling vm.communicate.execute, causing the driver to fail to launch with 
> Vagrant 1.8.6.





[jira] [Commented] (KAFKA-4202) Facing error while trying to create the Producer.

2016-09-22 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15514380#comment-15514380
 ] 

Ewen Cheslack-Postava commented on KAFKA-4202:
--

It looks like you may have mismatched versions of Kafka jars on your classpath 
since there is a reference to a method that doesn't exist. You should check 
what's on your classpath and verify there aren't multiple versions.
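Checking for mismatched jars can be partly automated. A small Python sketch (the `duplicate_artifacts` helper and the `artifact-version.jar` naming assumption are mine, for illustration): group jar file names by artifact and flag any artifact present in more than one version.

```python
import re
from collections import defaultdict

def duplicate_artifacts(jar_names):
    """Group names like 'artifact-1.2.3.jar' by artifact and report any
    artifact that appears with more than one version on the classpath."""
    versions = defaultdict(set)
    pattern = re.compile(r"^(.+)-(\d[\w.\-]*)\.jar$")
    for jar in jar_names:
        match = pattern.match(jar)
        if match:
            versions[match.group(1)].add(match.group(2))
    return {a: sorted(v) for a, v in versions.items() if len(v) > 1}

jars = ["kafka_2.11-0.9.0.1.jar", "kafka_2.11-0.10.1.0.jar", "slf4j-api-1.7.21.jar"]
print(duplicate_artifacts(jars))
```

Any artifact flagged here is a candidate for the kind of `NoSuchMethodError` in the report, since the JVM may load the method signature from the wrong version.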

> Facing error while trying to create the Producer.
> -
>
> Key: KAFKA-4202
> URL: https://issues.apache.org/jira/browse/KAFKA-4202
> Project: Kafka
>  Issue Type: Bug
>Reporter: Rohan
>
> While trying to run the command 
> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic first-topic
> I am facing the below error.
> ERROR StatusLogger No log4j2 configuration file found. Using default 
> configuration: logging only errors to the console.
> Exception in thread "main" java.lang.NoSuchMethodError: 
> kafka.utils.CommandLineUtils$.parseKeyValueArgs(Lscala/collection/Iterable;)Ljava/util/Properties;
>   at 
> kafka.tools.ConsoleProducer$ProducerConfig.(ConsoleProducer.scala:279)
>   at kafka.tools.ConsoleProducer$.main(ConsoleProducer.scala:38)
>   at kafka.tools.ConsoleProducer.main(ConsoleProducer.scala)





[jira] [Updated] (KAFKA-4202) Facing error while trying to create the Producer.

2016-09-22 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4202:
-
Assignee: (was: Ewen Cheslack-Postava)

> Facing error while trying to create the Producer.
> -
>
> Key: KAFKA-4202
> URL: https://issues.apache.org/jira/browse/KAFKA-4202
> Project: Kafka
>  Issue Type: Bug
>Reporter: Rohan
>
> While trying to run the command 
> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic first-topic
> I am facing the below error.
> ERROR StatusLogger No log4j2 configuration file found. Using default 
> configuration: logging only errors to the console.
> Exception in thread "main" java.lang.NoSuchMethodError: 
> kafka.utils.CommandLineUtils$.parseKeyValueArgs(Lscala/collection/Iterable;)Ljava/util/Properties;
>   at 
> kafka.tools.ConsoleProducer$ProducerConfig.<init>(ConsoleProducer.scala:279)
>   at kafka.tools.ConsoleProducer$.main(ConsoleProducer.scala:38)
>   at kafka.tools.ConsoleProducer.main(ConsoleProducer.scala)





[jira] [Updated] (KAFKA-4202) Facing error while trying to create the Producer.

2016-09-22 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4202:
-
Component/s: (was: KafkaConnect)

> Facing error while trying to create the Producer.
> -
>
> Key: KAFKA-4202
> URL: https://issues.apache.org/jira/browse/KAFKA-4202
> Project: Kafka
>  Issue Type: Bug
>Reporter: Rohan
>Assignee: Ewen Cheslack-Postava
>
> While trying to run the command 
> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic first-topic
> I am facing the below error.
> ERROR StatusLogger No log4j2 configuration file found. Using default 
> configuration: logging only errors to the console.
> Exception in thread "main" java.lang.NoSuchMethodError: 
> kafka.utils.CommandLineUtils$.parseKeyValueArgs(Lscala/collection/Iterable;)Ljava/util/Properties;
>   at 
> kafka.tools.ConsoleProducer$ProducerConfig.<init>(ConsoleProducer.scala:279)
>   at kafka.tools.ConsoleProducer$.main(ConsoleProducer.scala:38)
>   at kafka.tools.ConsoleProducer.main(ConsoleProducer.scala)




