Build failed in Jenkins: kafka-trunk-jdk8 #1135

2016-12-29 Thread Apache Jenkins Server
See 

Changes:

[me] MINOR: Mx4jLoader always returns false even if mx4j is loaded & started

[me] MINOR: Rephrase Javadoc summary for ConsumerRecord

--
[...truncated 32958 lines...]

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetAllTasksWithStoreWhenNotRunning STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetAllTasksWithStoreWhenNotRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartOnceClosed STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartOnceClosed PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup PASSED

org.apache.kafka.streams.KafkaStreamsTest > testStartAndClose STARTED

org.apache.kafka.streams.KafkaStreamsTest > testStartAndClose PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning 
STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice PASSED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegion[0] STARTED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegion[0] PASSED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegion[1] STARTED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegion[1] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryState[0] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryState[0] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable[0] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable[0] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[0] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[0] FAILED
java.lang.AssertionError: Condition not met within timeout 6. Did not 
receive 1 number of records
at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:259)
at 
org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinValuesRecordsReceived(IntegrationTestUtils.java:251)
at 
org.apache.kafka.streams.integration.QueryableStateIntegrationTest.waitUntilAtLeastNumRecordProcessed(QueryableStateIntegrationTest.java:669)
at 
org.apache.kafka.streams.integration.QueryableStateIntegrationTest.queryOnRebalance(QueryableStateIntegrationTest.java:350)

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses[0] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses[0] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryState[1] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryState[1] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable[1] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable[1] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[1] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[1] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses[1] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses[1] PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testShouldReadFromRegexAndNamedTopics STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testShouldReadFromRegexAndNamedTopics PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenCreated STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenCreated PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testMultipleConsumersCanReadFromPartitionedTopic STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testMultipleConsumersCanReadFromPartitionedTopic PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenDeleted STARTED


Jenkins build is back to normal : kafka-trunk-jdk7 #1788

2016-12-29 Thread Apache Jenkins Server
See 



Re: custom offsets in ProduceRequest

2016-12-29 Thread radai
Or even better: if topic creation is done dynamically by the replicator,
setting the initial offsets for partitions could be made part of topic
creation? That way there are even fewer API changes.

On Thu, Dec 29, 2016 at 10:49 PM, radai  wrote:

> ah, I didnt realize we are limiting the discussion to master --> slave.
>
> but - if we're talking about master-slave replication, and under the
> conditions i outlined earlier (src and dest match in #partitions, no
> foreign writes to dest) it "just works", seems to me the only thing youre
> really missing is not an explicit desired offset param on each and every
> request, but just the ability to "reset" the starting offset on the dest
> cluster at topic creation.
>
> let me try and run through a more detailed scenario:
>
> 1. suppose i set up the original cluster (src). no remote cluster yet.
> lets say over some period of time i produce 1 million msgs to topic X on
> this src cluster.
> 2. company grows, 2nd site is opened, dest cluster is created, topic X is
> created on (brand new) dest cluster.
> 3. offsets are manually set on every partition of X on the dest cluster to
> match either the oldest retained or current offset of the matching
> partition of X in src. in pseudo code:
>
> for (partI in numPartitions) {
>   partIOffset
>   if (replicateAllRetainedHistory) {
>     partIOffset = src.getOldestRetained(partI)
>   } else {
>     partIOffset = src.getCurrent(partI)  // will not copy over history
>   }
>   dest.resetStartingOffset(partI, partIOffset)  // <-- new mgmt API
> }
>
> 4. now you are free to start replicating. under master --> slave
> assumptions offsets will match from this point forward
>
> seems to me something like this could be made part of the replicator
> component (mirror maker, or whatever else you want to use) - if topic X
> does not exist in destination, create it, reset initial offsets to match
> source, start replication
>
> On Thu, Dec 29, 2016 at 12:41 PM, Andrey L. Neporada <
> anepor...@yandex-team.ru> wrote:
>
>>
>> > On 29 Dec 2016, at 20:43, radai  wrote:
>> >
>> > so, if i follow your suggested logic correctly, there would be some
>> sort of
>> > :
>> >
>> > produce(partition, msg, requestedOffset)
>> >
>>
>> > which would fail if requestedOffset is already taken (by another
>> previous
>> > such explicit call or by another regular call that just happened to get
>> > assigned that offset by the partition leader on the target cluster).
>> >
>>
>> Yes. More formally, my proposal is to extend ProduceRequest by adding
>> MessageSetStartOffset:
>>
>> ProduceRequest => RequiredAcks Timeout [TopicName [Partition
>> MessageSetStartOffset MessageSetSize MessageSet]]
>>   RequiredAcks => int16
>>   Timeout => int32
>>   Partition => int32
>>   MessageSetSize => int32
>>   MessageSetStartOffset => int64
>>
>> If MessageSetStartOffset is -1, ProduceRequest should work exactly as
>> before - i.e. assign next available offset to given MessageSet.
>>
>>
>> > how would you meaningfully handle this failure?
>> >
>> > suppose this happens to some cross-cluster replicator (like mirror
>> maker).
>> > there is no use in retrying. the options would be:
>> >
>> > 1. get the next available offset - which would violate what youre
>> trying to
>> > achieve
>> > 2. skip msgs - so replication is incomplete, any offset "already taken"
>> on
>> > the destination is not replicated from source
>> > 3. stop replication for this partition completely - because starting
>> from
>> > now _ALL_ offsets will be taken - 1 foreign msg ruins everything for the
>> > entire partition.
>> >
>> > none of these options look good to me.
>> >
>> >
>>
>> Since we are discussing master-slave replication, the only client writing
>> to slave cluster is the replicator itself.
>> In this case ProduceRequest failure is some kind of replication logic
>> error - for example when two replication instances are somehow launched for
>> single partition.
>> The best option here is just to stop replication process.
>>
>> So the answer to your question is (3), but this scenario should never
>> happen.
>>
>>
>> >
>> > On Thu, Dec 29, 2016 at 3:22 AM, Andrey L. Neporada <
>> > anepor...@yandex-team.ru> wrote:
>> >
>> >> Hi!
>> >>
>> >>> On 27 Dec 2016, at 19:35, radai  wrote:
>> >>>
>> >>> IIUC if you replicate from a single source cluster to a single target
>> >>> cluster, the topic has the same number of partitions on both, and no
>> one
>> >>> writes directly to the target cluster (so master --> slave) the
>> offsets
>> >>> would be preserved.
>> >>>
>> >>
>> >> Yes, exactly. When you
>> >> 1) create topic with the same number of partitions on both master and
>> >> slave clusters
>> >> 2) write only to master
>> >> 3) replicate partition to partition from master to slave
>> >> - in this case the offsets will be preserved.
>> >>
>> >> However, you usually 

Re: custom offsets in ProduceRequest

2016-12-29 Thread radai
Ah, I didn't realize we were limiting the discussion to master --> slave.

But if we're talking about master-slave replication, and under the
conditions I outlined earlier (src and dest match in #partitions, no
foreign writes to dest) it "just works", then it seems to me the only thing
you're really missing is not an explicit desired-offset param on each and
every request, but just the ability to "reset" the starting offset on the
dest cluster at topic creation.

Let me try and run through a more detailed scenario:

1. Suppose I set up the original cluster (src); no remote cluster yet. Let's
say that over some period of time I produce 1 million messages to topic X on
this src cluster.
2. The company grows, a 2nd site is opened, a dest cluster is created, and
topic X is created on the (brand new) dest cluster.
3. Offsets are manually set on every partition of X on the dest cluster to
match either the oldest retained or the current offset of the matching
partition of X in src. In pseudocode:

for (partI in numPartitions) {
  partIOffset
  if (replicateAllRetainedHistory) {
    partIOffset = src.getOldestRetained(partI)
  } else {
    partIOffset = src.getCurrent(partI)  // will not copy over history
  }
  dest.resetStartingOffset(partI, partIOffset)  // <-- new mgmt API
}

4. Now you are free to start replicating. Under master --> slave
assumptions, offsets will match from this point forward.

It seems to me something like this could be made part of the replicator
component (MirrorMaker, or whatever else you want to use): if topic X does
not exist in the destination, create it, reset the initial offsets to match
the source, and start replication.

On Thu, Dec 29, 2016 at 12:41 PM, Andrey L. Neporada <
anepor...@yandex-team.ru> wrote:

>
> > On 29 Dec 2016, at 20:43, radai  wrote:
> >
> > so, if i follow your suggested logic correctly, there would be some sort
> of
> > :
> >
> > produce(partition, msg, requestedOffset)
> >
>
> > which would fail if requestedOffset is already taken (by another previous
> > such explicit call or by another regular call that just happened to get
> > assigned that offset by the partition leader on the target cluster).
> >
>
> Yes. More formally, my proposal is to extend ProduceRequest by adding
> MessageSetStartOffset:
>
> ProduceRequest => RequiredAcks Timeout [TopicName [Partition
> MessageSetStartOffset MessageSetSize MessageSet]]
>   RequiredAcks => int16
>   Timeout => int32
>   Partition => int32
>   MessageSetSize => int32
>   MessageSetStartOffset => int64
>
> If MessageSetStartOffset is -1, ProduceRequest should work exactly as
> before - i.e. assign next available offset to given MessageSet.
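
A rough client-side sketch of the semantics proposed above. This is hypothetical: no such
API exists in Kafka, and the names OffsetAwareProducer, sendWithOffset, and
OffsetMismatchException are invented purely to illustrate the MessageSetStartOffset idea.

    // Hypothetical client API mirroring the proposed MessageSetStartOffset field.
    interface OffsetAwareProducer {
        // startOffset = -1 keeps today's behaviour (the broker assigns the next
        // available offset); any other value must equal the partition's next offset,
        // otherwise the produce request fails instead of appending.
        long sendWithOffset(String topic, int partition, long startOffset,
                            byte[] key, byte[] value) throws OffsetMismatchException;
    }
    class OffsetMismatchException extends Exception {}

    // Replicator usage under the master --> slave assumptions discussed above:
    //   destProducer.sendWithOffset(topic, partition, srcRecord.offset(), key, value);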
>
>
> > how would you meaningfully handle this failure?
> >
> > suppose this happens to some cross-cluster replicator (like mirror
> maker).
> > there is no use in retrying. the options would be:
> >
> > 1. get the next available offset - which would violate what youre trying
> to
> > achieve
> > 2. skip msgs - so replication is incomplete, any offset "already taken"
> on
> > the destination is not replicated from source
> > 3. stop replication for this partition completely - because starting from
> > now _ALL_ offsets will be taken - 1 foreign msg ruins everything for the
> > entire partition.
> >
> > none of these options look good to me.
> >
> >
>
> Since we are discussing master-slave replication, the only client writing
> to slave cluster is the replicator itself.
> In this case ProduceRequest failure is some kind of replication logic
> error - for example when two replication instances are somehow launched for
> single partition.
> The best option here is just to stop replication process.
>
> So the answer to your question is (3), but this scenario should never
> happen.
>
>
> >
> > On Thu, Dec 29, 2016 at 3:22 AM, Andrey L. Neporada <
> > anepor...@yandex-team.ru> wrote:
> >
> >> Hi!
> >>
> >>> On 27 Dec 2016, at 19:35, radai  wrote:
> >>>
> >>> IIUC if you replicate from a single source cluster to a single target
> >>> cluster, the topic has the same number of partitions on both, and no
> one
> >>> writes directly to the target cluster (so master --> slave) the offsets
> >>> would be preserved.
> >>>
> >>
> >> Yes, exactly. When you
> >> 1) create topic with the same number of partitions on both master and
> >> slave clusters
> >> 2) write only to master
> >> 3) replicate partition to partition from master to slave
> >> - in this case the offsets will be preserved.
> >>
> >> However, you usually already have cluster that works and want to
> replicate
> >> some topics to another one.
> >> IMHO, in this scenario there should be a way to make message offsets
> equal
> >> on both clusters.
> >>
> >>> but in the general case - how would you handle the case where multiple
> >>> producers "claim" the same offset ?
> >>
> >> The same way as Kafka handles concurrent produce requests for the same
> >> partition - produce requests for 

[jira] [Commented] (KAFKA-3297) More optimally balanced partition assignment strategy (new consumer)

2016-12-29 Thread Jeff Widman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786986#comment-15786986
 ] 

Jeff Widman commented on KAFKA-3297:


[~ewencp] KIP-54 tackles both sticky re-assignments and a more "fair" initial 
assignment. Search the wiki page for "fair yet sticky" for a section that 
provides a bit more context.

> More optimally balanced partition assignment strategy (new consumer)
> 
>
> Key: KAFKA-3297
> URL: https://issues.apache.org/jira/browse/KAFKA-3297
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Reporter: Andrew Olson
>Assignee: Andrew Olson
> Fix For: 0.10.2.0
>
>
> While the roundrobin partition assignment strategy is an improvement over the 
> range strategy, when the consumer topic subscriptions are not identical 
> (previously disallowed but will be possible as of KAFKA-2172) it can produce 
> heavily skewed assignments. As suggested 
> [here|https://issues.apache.org/jira/browse/KAFKA-2172?focusedCommentId=14530767=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14530767]
>  it would be nice to have a strategy that attempts to assign an equal number 
> of partitions to each consumer in a group, regardless of how similar their 
> individual topic subscriptions are. We can accomplish this by tracking the 
> number of partitions assigned to each consumer, and having the partition 
> assignment loop assign each partition to a consumer interested in that topic 
> with the least number of partitions assigned. 
> Additionally, we can optimize the distribution fairness by adjusting the 
> partition assignment order:
> * Topics with fewer consumers are assigned first.
> * In the event of a tie for least consumers, the topic with more partitions 
> is assigned first.
> The general idea behind these two rules is to keep the most flexible 
> assignment choices available as long as possible by starting with the most 
> constrained partitions/consumers.
> This JIRA addresses the new consumer. For the original high-level consumer, 
> see KAFKA-2435.
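
As a rough illustration of the assignment loop described above: a simplified sketch, not
the actual KIP-49 implementation. The input maps, class name, and method signature are
assumptions made for the example.

{code}
import java.util.*;
import org.apache.kafka.common.TopicPartition;

// Illustrative sketch of the proposed "fair" assignment, not the real KIP-49 code.
public class FairAssignorSketch {
    public static Map<String, List<TopicPartition>> assign(
            Map<String, List<String>> consumersPerTopic,   // topic -> subscribed consumers
            Map<String, Integer> partitionsPerTopic) {     // topic -> partition count
        Map<String, List<TopicPartition>> assignment = new HashMap<>();
        for (List<String> consumers : consumersPerTopic.values())
            for (String c : consumers)
                assignment.putIfAbsent(c, new ArrayList<>());

        // Topics with fewer consumers are assigned first; ties go to the topic
        // with more partitions, keeping the most constrained choices early.
        List<String> topics = new ArrayList<>(consumersPerTopic.keySet());
        topics.sort(Comparator.comparingInt((String t) -> consumersPerTopic.get(t).size())
                              .thenComparing(t -> -partitionsPerTopic.get(t)));

        for (String topic : topics)
            for (int p = 0; p < partitionsPerTopic.get(topic); p++) {
                // Give each partition to the interested consumer with the fewest
                // partitions assigned so far.
                String least = consumersPerTopic.get(topic).stream()
                        .min(Comparator.comparingInt(c -> assignment.get(c).size()))
                        .get();
                assignment.get(least).add(new TopicPartition(topic, p));
            }
        return assignment;
    }
}
{code}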





Re: Some questions about providing corrections to documentation.

2016-12-29 Thread Guozhang Wang
Hello Dhwani,

Yes, since it is a hotfix to the website, you need to submit a patch to
kafka-site covering all the affected version directories, instead of to the
kafka repo's docs folder.

Details about web docs contribution can be found here:

https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Website+Documentation+Changes


Guozhang


On Thu, Dec 29, 2016 at 9:46 AM, Dhwani Katagade <
dhwani_katag...@persistent.co.in> wrote:

> Hi,
>
> _Observations:_
>
> While going through the design documentation I noticed a minor error of
> duplication in points 4 and 5 here:
> https://kafka.apache.org/documentation/#design_compactionguarantees
> When I looked up the files in git,
> I see the duplication in all of the below files in kafka-site repo
>
> https://github.com/apache/kafka-site/blob/asf-site/081/design.html
> https://github.com/apache/kafka-site/blob/asf-site/082/design.html
> https://github.com/apache/kafka-site/blob/asf-site/090/design.html
> https://github.com/apache/kafka-site/blob/asf-site/0100/design.html
> https://github.com/apache/kafka-site/blob/asf-site/0101/design.html
>
> And also in the following branches under kafka repo
> https://github.com/apache/kafka/blob/0.9.0/docs/design.html
> https://github.com/apache/kafka/blob/0.10.0/docs/design.html
> https://github.com/apache/kafka/blob/0.10.1/docs/design.html
>
> But the same is corrected under trunk here:
> https://github.com/apache/kafka/blob/trunk/docs/design.html
>
> _Questions:_
>
> 1. If I have to provide a patch/PR to cleanup this documentation error,
>should I provide the fix in files corresponding to all the versions
>under kafka-site?
> 2. When the next release happens, as I understand, a new directory will
>be added under kafka-site:asf-site. Since this would come from
>kafka:trunk it will have the correction. Is my understanding correct?
> 3. As I understand, older branches under kafka repo are release
>branches, and hence we should not make any new changes under the
>docs directory on these branches. Is my understanding correct?
> 4. As I understand, this fix does not require a JIRA issue to be
>logged. Is my understanding correct?
>
> Thanks in advance for the clarifications.
>
> -dhwani
>
>


-- 
-- Guozhang


[jira] [Updated] (KAFKA-2196) remove roundrobin identical topic constraint in consumer coordinator

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-2196:
-
Fix Version/s: 0.9.0.0

> remove roundrobin identical topic constraint in consumer coordinator
> 
>
> Key: KAFKA-2196
> URL: https://issues.apache.org/jira/browse/KAFKA-2196
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Onur Karaman
>Assignee: Onur Karaman
> Fix For: 0.9.0.0
>
> Attachments: KAFKA-2196.patch
>
>
> roundrobin doesn't need to make all consumers have identical topic 
> subscriptions.





[jira] [Commented] (KAFKA-2196) remove roundrobin identical topic constraint in consumer coordinator

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786960#comment-15786960
 ] 

Ewen Cheslack-Postava commented on KAFKA-2196:
--

Based on git history, it looks like 0.9.0.0; I'll update the fix version 
accordingly.

> remove roundrobin identical topic constraint in consumer coordinator
> 
>
> Key: KAFKA-2196
> URL: https://issues.apache.org/jira/browse/KAFKA-2196
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Onur Karaman
>Assignee: Onur Karaman
> Fix For: 0.9.0.0
>
> Attachments: KAFKA-2196.patch
>
>
> roundrobin doesn't need to make all consumers have identical topic 
> subscriptions.





[jira] [Commented] (KAFKA-3297) More optimally balanced partition assignment strategy (new consumer)

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786948#comment-15786948
 ] 

Ewen Cheslack-Postava commented on KAFKA-3297:
--

[~jeffwidman] I don't think KIP-54 supersedes this. KIP-54 is about maintaining 
stability across multiple reassignments. KIP-49 would apply to the *first* 
assignment. The two may interact, but they seem to have different goals.

If this KIP got through the vote thread, it seems like it might be easy to get 
included -- since new consumer assignments are performed by one group member, 
there aren't any real compatibility concerns.

> More optimally balanced partition assignment strategy (new consumer)
> 
>
> Key: KAFKA-3297
> URL: https://issues.apache.org/jira/browse/KAFKA-3297
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Reporter: Andrew Olson
>Assignee: Andrew Olson
> Fix For: 0.10.2.0
>
>
> While the roundrobin partition assignment strategy is an improvement over the 
> range strategy, when the consumer topic subscriptions are not identical 
> (previously disallowed but will be possible as of KAFKA-2172) it can produce 
> heavily skewed assignments. As suggested 
> [here|https://issues.apache.org/jira/browse/KAFKA-2172?focusedCommentId=14530767=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14530767]
>  it would be nice to have a strategy that attempts to assign an equal number 
> of partitions to each consumer in a group, regardless of how similar their 
> individual topic subscriptions are. We can accomplish this by tracking the 
> number of partitions assigned to each consumer, and having the partition 
> assignment loop assign each partition to a consumer interested in that topic 
> with the least number of partitions assigned. 
> Additionally, we can optimize the distribution fairness by adjusting the 
> partition assignment order:
> * Topics with fewer consumers are assigned first.
> * In the event of a tie for least consumers, the topic with more partitions 
> is assigned first.
> The general idea behind these two rules is to keep the most flexible 
> assignment choices available as long as possible by starting with the most 
> constrained partitions/consumers.
> This JIRA addresses the new consumer. For the original high-level consumer, 
> see KAFKA-2435.





[jira] [Updated] (KAFKA-3297) More optimally balanced partition assignment strategy (new consumer)

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-3297:
-
Component/s: consumer

> More optimally balanced partition assignment strategy (new consumer)
> 
>
> Key: KAFKA-3297
> URL: https://issues.apache.org/jira/browse/KAFKA-3297
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Reporter: Andrew Olson
>Assignee: Andrew Olson
> Fix For: 0.10.2.0
>
>
> While the roundrobin partition assignment strategy is an improvement over the 
> range strategy, when the consumer topic subscriptions are not identical 
> (previously disallowed but will be possible as of KAFKA-2172) it can produce 
> heavily skewed assignments. As suggested 
> [here|https://issues.apache.org/jira/browse/KAFKA-2172?focusedCommentId=14530767=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14530767]
>  it would be nice to have a strategy that attempts to assign an equal number 
> of partitions to each consumer in a group, regardless of how similar their 
> individual topic subscriptions are. We can accomplish this by tracking the 
> number of partitions assigned to each consumer, and having the partition 
> assignment loop assign each partition to a consumer interested in that topic 
> with the least number of partitions assigned. 
> Additionally, we can optimize the distribution fairness by adjusting the 
> partition assignment order:
> * Topics with fewer consumers are assigned first.
> * In the event of a tie for least consumers, the topic with more partitions 
> is assigned first.
> The general idea behind these two rules is to keep the most flexible 
> assignment choices available as long as possible by starting with the most 
> constrained partitions/consumers.
> This JIRA addresses the new consumer. For the original high-level consumer, 
> see KAFKA-2435.





[jira] [Commented] (KAFKA-3502) Build is killed during kafka streams tests due to `pure virtual method called` error

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786941#comment-15786941
 ] 

Ewen Cheslack-Postava commented on KAFKA-3502:
--

I've seen this in a number of Jenkins builds recently too, e.g.

{quote}
org.apache.kafka.streams.KeyValueTest > shouldHaveSaneEqualsAndHashCode STARTED

org.apache.kafka.streams.KeyValueTest > shouldHaveSaneEqualsAndHashCode PASSED
pure virtual method called
{quote}
from https://builds.apache.org/job/kafka-pr-jdk7-scala2.10/399/console

Transient integration test failures still seem to dominate, but this seems to 
be increasing the frequency of test failures for PRs recently.

> Build is killed during kafka streams tests due to `pure virtual method 
> called` error
> 
>
> Key: KAFKA-3502
> URL: https://issues.apache.org/jira/browse/KAFKA-3502
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ashish K Singh
>  Labels: transient-unit-test-failure
>
> Build failed due to failure in streams' test. Not clear which test led to 
> this.
> Jenkins console: 
> https://builds.apache.org/job/kafka-trunk-git-pr-jdk7/3210/console
> {code}
> org.apache.kafka.streams.kstream.internals.KTableFilterTest > testValueGetter 
> PASSED
> org.apache.kafka.streams.kstream.internals.KStreamFlatMapTest > testFlatMap 
> PASSED
> org.apache.kafka.streams.kstream.internals.KTableAggregateTest > testAggBasic 
> PASSED
> org.apache.kafka.streams.kstream.internals.KStreamFlatMapValuesTest > 
> testFlatMapValues PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testMerge PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testFrom PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testNewName PASSED
> pure virtual method called
> terminate called without an active exception
> :streams:test FAILED
> FAILURE: Build failed with an exception.
> * What went wrong:
> Execution failed for task ':streams:test'.
> > Process 'Gradle Test Executor 4' finished with non-zero exit value 134
> {code}
> Tried reproducing the issue locally, but could not.





[jira] [Commented] (KAFKA-2019) Old consumer RoundRobinAssignor clusters by consumer

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786931#comment-15786931
 ] 

Ewen Cheslack-Postava commented on KAFKA-2019:
--

[~jeffwidman] I updated the title to clarify. It may make sense to start 
tracking new consumer stuff via a different component so they can be easily 
differentiated (although I don't like the idea of permanently labelling it "new 
consumer").

re: merging, agreed, it seems unlikely at this point since nobody has pushed 
for it to get merged for the past 18 months, although it would be completely 
reasonable as a new class to avoid compatibility concerns.

> Old consumer RoundRobinAssignor clusters by consumer
> 
>
> Key: KAFKA-2019
> URL: https://issues.apache.org/jira/browse/KAFKA-2019
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Reporter: Joseph Holsten
>Assignee: Neha Narkhede
>Priority: Minor
> Attachments: 0001-sort-consumer-thread-ids-by-hashcode.patch, 
> KAFKA-2019.patch
>
>
> When rolling out a change today, I noticed that some of my consumers are 
> "greedy", taking far more partitions than others.
> The cause is that the RoundRobinAssignor is using a list of ConsumerThreadIds 
> sorted by toString, which is {{ "%s-%d".format(consumer, threadId)}}. This 
> causes each consumer's threads to be adjacent to each other.
> One possible fix would be to define ConsumerThreadId.hashCode, and sort by 
> that.





[jira] [Updated] (KAFKA-2019) Old consumer RoundRobinAssignor clusters by consumer

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-2019:
-
Summary: Old consumer RoundRobinAssignor clusters by consumer  (was: 
RoundRobinAssignor clusters by consumer)

> Old consumer RoundRobinAssignor clusters by consumer
> 
>
> Key: KAFKA-2019
> URL: https://issues.apache.org/jira/browse/KAFKA-2019
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Reporter: Joseph Holsten
>Assignee: Neha Narkhede
>Priority: Minor
> Attachments: 0001-sort-consumer-thread-ids-by-hashcode.patch, 
> KAFKA-2019.patch
>
>
> When rolling out a change today, I noticed that some of my consumers are 
> "greedy", taking far more partitions than others.
> The cause is that the RoundRobinAssignor is using a list of ConsumerThreadIds 
> sorted by toString, which is {{ "%s-%d".format(consumer, threadId)}}. This 
> causes each consumer's threads to be adjacent to each other.
> One possible fix would be to define ConsumerThreadId.hashCode, and sort by 
> that.





[jira] [Commented] (KAFKA-2331) Kafka does not spread partitions in a topic among all consumers evenly

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786920#comment-15786920
 ] 

Ewen Cheslack-Postava commented on KAFKA-2331:
--

This seems like a separate issue -- it sounds like the consumers are joining 
the group but not getting assigned any partitions when there is only 1 topic 
and 10 partitions. Both range and round-robin assignment should have the same 
behavior in that case. But the split that is shown (3,2,1,1,1,1,1) doesn't seem 
likely either. It seems more likely that is an artifact of just aggregating all 
partitions each thread saw messages for without taking into account that when 
the first couple of instances join there will be a period when they are trying 
to fetch data for *all* partitions and can validly see data for more than 1 
partition.

This is quite old and is for the old consumer, but if someone wanted to tackle 
it, one strategy might be to use a ConsumerRebalanceListener to get more info 
about how partitions are being assigned. That might also reveal other issues 
such as some members not successfully joining the group.
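
For reference, a minimal sketch of that debugging approach using the Java consumer's
ConsumerRebalanceListener. The bootstrap servers, group id, and topic name are
placeholders, and this uses the new consumer rather than the old high-level consumer
the report above is about.

{code}
import java.util.*;
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.TopicPartition;

// Sketch: log every rebalance so you can see exactly which partitions each member gets.
public class RebalanceLogger {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder
        props.put("group.id", "rebalance-debug");            // placeholder
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("mytopic"), new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                System.out.println("revoked:  " + partitions);
            }
            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                System.out.println("assigned: " + partitions);
            }
        });
        while (true)
            consumer.poll(100);   // polling drives the rebalance callbacks
    }
}
{code}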

> Kafka does not spread partitions in a topic among all consumers evenly
> --
>
> Key: KAFKA-2331
> URL: https://issues.apache.org/jira/browse/KAFKA-2331
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Affects Versions: 0.8.1.1
>Reporter: Stefan Miklosovic
>
> I want to have 1 topic with 10 partitions. I am using default configuration 
> of Kafka. I create 1 topic with 10 partitions by that helper script and now I 
> am about to produce messages to it.
> The thing is that even all partitions are indeed consumed, some consumers 
> have more then 1 partition assigned even I have number of consumer threads 
> equal to partitions in a topic hence some threads are idle.
> Let's describe it in more detail.
> I know that common stuff that you need one consumer thread per partition. I 
> want to be able to commit offsets per partition and this is possible only 
> when I have 1 thread per consumer connector per partition (I am using high 
> level consumer).
> So I create 10 threads, in each thread I am calling 
> Consumer.createJavaConsumerConnector() where I am doing this
> topicCountMap.put("mytopic", 1);
> and in the end I have 1 iterator which consumes messages from 1 partition.
> When I do this 10 times, I have 10 consumers, consumer per thread per 
> partition where I can commit offsets independently per partition because if I 
> put different number from 1 in topic map, I would end up with more then 1 
> consumer thread for that topic for given consumer instance so if I am about 
> to commit offsets with created consumer instance, it would commit them for 
> all threads which is not desired.
> But the thing is that when I use consumers, only 7 consumers are involved and 
> it seems that other consumer threads are idle but I do not know why.
> The thing is that I am creating these consumer threads in a loop. So I start 
> first thread (submit to executor service), then another, then another and so 
> on.
> So the scenario is that first consumer gets all 10 partitions, then 2nd 
> connects so it is splits between these two to 5 and 5 (or something similar), 
> then other threads are connecting.
> I understand this as a partition rebalancing among all consumers so it 
> behaves well in such sense that if more consumers are being created, 
> partition rebalancing occurs between these consumers so every consumer should 
> have some partitions to operate upon.
> But from the results I see that there is only 7 consumers and according to 
> consumed messages it seems they are split like 3,2,1,1,1,1,1 partition-wise. 
> Yes, these 7 consumers covered all 10 partitions, but why consumers with more 
> then 1 partition do no split and give partitions to remaining 3 consumers?
> I am pretty much wondering what is happening with remaining 3 threads and why 
> they do not "grab" partitions from consumers which have more then 1 partition 
> assigned.





[jira] [Updated] (KAFKA-2331) Kafka does not spread partitions in a topic among all consumers evenly

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-2331:
-
Component/s: (was: core)
 consumer
 clients

> Kafka does not spread partitions in a topic among all consumers evenly
> --
>
> Key: KAFKA-2331
> URL: https://issues.apache.org/jira/browse/KAFKA-2331
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Affects Versions: 0.8.1.1
>Reporter: Stefan Miklosovic
>
> I want to have 1 topic with 10 partitions. I am using default configuration 
> of Kafka. I create 1 topic with 10 partitions by that helper script and now I 
> am about to produce messages to it.
> The thing is that even all partitions are indeed consumed, some consumers 
> have more then 1 partition assigned even I have number of consumer threads 
> equal to partitions in a topic hence some threads are idle.
> Let's describe it in more detail.
> I know that common stuff that you need one consumer thread per partition. I 
> want to be able to commit offsets per partition and this is possible only 
> when I have 1 thread per consumer connector per partition (I am using high 
> level consumer).
> So I create 10 threads, in each thread I am calling 
> Consumer.createJavaConsumerConnector() where I am doing this
> topicCountMap.put("mytopic", 1);
> and in the end I have 1 iterator which consumes messages from 1 partition.
> When I do this 10 times, I have 10 consumers, consumer per thread per 
> partition where I can commit offsets independently per partition because if I 
> put different number from 1 in topic map, I would end up with more then 1 
> consumer thread for that topic for given consumer instance so if I am about 
> to commit offsets with created consumer instance, it would commit them for 
> all threads which is not desired.
> But the thing is that when I use consumers, only 7 consumers are involved and 
> it seems that other consumer threads are idle but I do not know why.
> The thing is that I am creating these consumer threads in a loop. So I start 
> first thread (submit to executor service), then another, then another and so 
> on.
> So the scenario is that first consumer gets all 10 partitions, then 2nd 
> connects so it is splits between these two to 5 and 5 (or something similar), 
> then other threads are connecting.
> I understand this as a partition rebalancing among all consumers so it 
> behaves well in such sense that if more consumers are being created, 
> partition rebalancing occurs between these consumers so every consumer should 
> have some partitions to operate upon.
> But from the results I see that there is only 7 consumers and according to 
> consumed messages it seems they are split like 3,2,1,1,1,1,1 partition-wise. 
> Yes, these 7 consumers covered all 10 partitions, but why consumers with more 
> then 1 partition do no split and give partitions to remaining 3 consumers?
> I am pretty much wondering what is happening with remaining 3 threads and why 
> they do not "grab" partitions from consumers which have more then 1 partition 
> assigned.





[GitHub] kafka pull request #2290: MINOR: Rephrase Javadoc summary for ConsumerRecord

2016-12-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2290




[GitHub] kafka pull request #2295: MINOR: Mx4jLoader always returns false even if mx4...

2016-12-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2295




[jira] [Comment Edited] (KAFKA-4567) Connect Producer and Consumer ignore ssl parameters configured for worker

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786745#comment-15786745
 ] 

Ewen Cheslack-Postava edited comment on KAFKA-4567 at 12/30/16 3:13 AM:


Given some future security features we may want to support (e.g. supporting 
different identities for different connectors), we probably don't want to just 
include the worker-level security configs into the producer & consumer. It's 
annoying to have to duplicate them now, but we probably want to support more 
flexible combinations in the future, such as having unique credentials for the 
workers (limiting the ability to, e.g., write to the config/offsets/status 
topics) than those used by producers and consumers (where we may want both 
unique credentials to apply ACLs and maybe support things like delegation 
tokens in the future).

So I think the short term solution is probably to just update the docs to 
clarify that you'll currently need the settings both at the worker level and 
prefixed by {{producer.}} and {{consumer.}} if you're trying to use the same 
credentials for worker, producer, and consumer.


was (Author: ewencp):
Given some future security features we may want to support (e.g. supporting 
different identities for different connectors), we probably don't want to just 
include the worker-level security configs into the producer & consumer. It's 
annoying to have to duplicate them now, but we probably want to support more 
flexible combinations in the future, such as having unique credentials for the 
workers (limiting the ability to, e.g., write to the config/offsets/status 
topics) than those used by producers and consumers (where we may want both 
unique credentials to apply ACLs and maybe support things like delegation 
tokens in the future).

So I think the short term solution is probably to just update the docs to 
clarify that you'll currently need the settings both at the worker level and 
prefixed by {{producer.} and {{consumer.}} if you're trying to use the same 
credentials for worker, producer, and consumer.

> Connect Producer and Consumer ignore ssl parameters configured for worker
> -
>
> Key: KAFKA-4567
> URL: https://issues.apache.org/jira/browse/KAFKA-4567
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.1
>Reporter: Sönke Liebau
>Assignee: Ewen Cheslack-Postava
>Priority: Minor
>
> When using Connect with a SSL enabled Kafka cluster, the configuration 
> options are either documented a bit misleading, or handled in an incorrect 
> way.
> The documentation states the usual available SSL options 
> (ssl.keystore.location, ssl.truststore.location, ...) and these are picked up 
> and used for the producers and consumers that are used to communicate with 
> the status, offset and configs topics.
> For the producers and consumers that are used for the actual data, these 
> parameters are ignored as can be seen 
> [here|https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L98],
>  which results in plaintext communication on an SSL port, leading to an OOM 
> exception ([KAFKA-4493|https://issues.apache.org/jira/browse/KAFKA-4493]).
> So in order to get Connect to communicate with a secured cluster you need to 
> override all SSL configs with the prefixes _consumer._ and _producer._ and 
> duplicate the values already set at a global level.
> The documentation states: 
> bq. The most critical site-specific options, such as the Kafka bootstrap 
> servers, are already exposed via the standard worker configuration.
> Since the address for the cluster is exposed here, I would propose that there 
> is no reason not to also pass the SSL parameters through to the consumers and 
> producers, as it is clearly intended that communication happens with the same 
> cluster. 
> In fringe cases, these can still be overridden manually to achieve different 
> behavior.
> I am happy to create a pull request to address this or clarify the docs, 
> after we decide which one is the appropriate course of action.





[jira] [Commented] (KAFKA-4567) Connect Producer and Consumer ignore ssl parameters configured for worker

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786745#comment-15786745
 ] 

Ewen Cheslack-Postava commented on KAFKA-4567:
--

Given some future security features we may want to support (e.g. supporting 
different identities for different connectors), we probably don't want to just 
include the worker-level security configs into the producer & consumer. It's 
annoying to have to duplicate them now, but we probably want to support more 
flexible combinations in the future, such as having unique credentials for the 
workers (limiting the ability to, e.g., write to the config/offsets/status 
topics) than those used by producers and consumers (where we may want both 
unique credentials to apply ACLs and maybe support things like delegation 
tokens in the future).

So I think the short term solution is probably to just update the docs to 
clarify that you'll currently need the settings both at the worker level and 
prefixed by {{producer.}} and {{consumer.}} if you're trying to use the same 
credentials for worker, producer, and consumer.
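
As an illustration of that duplication, a sketch of the relevant part of a worker
properties file. The broker address, file paths, and password are placeholders, and only
the truststore-related settings are shown.

{code}
# Worker-level settings (used by the internal config/offset/status clients)
bootstrap.servers=broker1:9093
security.protocol=SSL
ssl.truststore.location=/etc/kafka/secrets/truststore.jks
ssl.truststore.password=changeit

# Currently these must be repeated with the producer. and consumer. prefixes
# so the connectors' data producers and consumers also use SSL.
producer.security.protocol=SSL
producer.ssl.truststore.location=/etc/kafka/secrets/truststore.jks
producer.ssl.truststore.password=changeit

consumer.security.protocol=SSL
consumer.ssl.truststore.location=/etc/kafka/secrets/truststore.jks
consumer.ssl.truststore.password=changeit
{code}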

> Connect Producer and Consumer ignore ssl parameters configured for worker
> -
>
> Key: KAFKA-4567
> URL: https://issues.apache.org/jira/browse/KAFKA-4567
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.1
>Reporter: Sönke Liebau
>Assignee: Ewen Cheslack-Postava
>Priority: Minor
>
> When using Connect with a SSL enabled Kafka cluster, the configuration 
> options are either documented a bit misleading, or handled in an incorrect 
> way.
> The documentation states the usual available SSL options 
> (ssl.keystore.location, ssl.truststore.location, ...) and these are picked up 
> and used for the producers and consumers that are used to communicate with 
> the status, offset and configs topics.
> For the producers and consumers that are used for the actual data, these 
> parameters are ignored as can be seen 
> [here|https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L98],
>  which results in plaintext communication on an SSL port, leading to an OOM 
> exception ([KAFKA-4493|https://issues.apache.org/jira/browse/KAFKA-4493]).
> So in order to get Connect to communicate with a secured cluster you need to 
> override all SSL configs with the prefixes _consumer._ and _producer._ and 
> duplicate the values already set at a global level.
> The documentation states: 
> bq. The most critical site-specific options, such as the Kafka bootstrap 
> servers, are already exposed via the standard worker configuration.
> Since the address for the cluster is exposed here, I would propose that there 
> is no reason not to also pass the SSL parameters through to the consumers and 
> producers, as it is clearly intended that communication happens with the same 
> cluster. 
> In fringe cases, these can still be overridden manually to achieve different 
> behavior.
> I am happy to create a pull request to address this or clarify the docs, 
> after we decide which one is the appropriate course of action.





[jira] [Commented] (KAFKA-2260) Allow specifying expected offset on produce

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786719#comment-15786719
 ] 

Ewen Cheslack-Postava commented on KAFKA-2260:
--

[~wwarshaw] That sounds right -- the epoch for the PID would ensure a single 
writer and then the actual offset wouldn't matter.

KIP-98 hasn't been voted on yet, so it'd be difficult to give a timeline now, 
but it seems unlikely to happen before the June release timeframe.

> Allow specifying expected offset on produce
> ---
>
> Key: KAFKA-2260
> URL: https://issues.apache.org/jira/browse/KAFKA-2260
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ben Kirwin
>Assignee: Ewen Cheslack-Postava
>Priority: Minor
> Attachments: KAFKA-2260.patch, expected-offsets.patch
>
>
> I'd like to propose a change that adds a simple CAS-like mechanism to the 
> Kafka producer. This update has a small footprint, but enables a bunch of 
> interesting uses in stream processing or as a commit log for process state.
> h4. Proposed Change
> In short:
> - Allow the user to attach a specific offset to each message produced.
> - The server assigns offsets to messages in the usual way. However, if the 
> expected offset doesn't match the actual offset, the server should fail the 
> produce request instead of completing the write.
> This is a form of optimistic concurrency control, like the ubiquitous 
> check-and-set -- but instead of checking the current value of some state, it 
> checks the current offset of the log.
> h4. Motivation
> Much like check-and-set, this feature is only useful when there's very low 
> contention. Happily, when Kafka is used as a commit log or as a 
> stream-processing transport, it's common to have just one producer (or a 
> small number) for a given partition -- and in many of these cases, predicting 
> offsets turns out to be quite useful.
> - We get the same benefits as the 'idempotent producer' proposal: a producer 
> can retry a write indefinitely and be sure that at most one of those attempts 
> will succeed; and if two producers accidentally write to the end of the 
> partition at once, we can be certain that at least one of them will fail.
> - It's possible to 'bulk load' Kafka this way -- you can write a list of n 
> messages consecutively to a partition, even if the list is much larger than 
> the buffer size or the producer has to be restarted.
> - If a process is using Kafka as a commit log -- reading from a partition to 
> bootstrap, then writing any updates to that same partition -- it can be sure 
> that it's seen all of the messages in that partition at the moment it does 
> its first (successful) write.
> There's a bunch of other similar use-cases here, but they all have roughly 
> the same flavour.
> h4. Implementation
> The major advantage of this proposal over other suggested transaction / 
> idempotency mechanisms is its minimality: it gives the 'obvious' meaning to a 
> currently-unused field, adds no new APIs, and requires very little new code 
> or additional work from the server.
> - Produced messages already carry an offset field, which is currently ignored 
> by the server. This field could be used for the 'expected offset', with a 
> sigil value for the current behaviour. (-1 is a natural choice, since it's 
> already used to mean 'next available offset'.)
> - We'd need a new error and error code for a 'CAS failure'.
> - The server assigns offsets to produced messages in 
> {{ByteBufferMessageSet.validateMessagesAndAssignOffsets}}. After this 
> changed, this method would assign offsets in the same way -- but if they 
> don't match the offset in the message, we'd return an error instead of 
> completing the write.
> - To avoid breaking existing clients, this behaviour would need to live 
> behind some config flag. (Possibly global, but probably more useful 
> per-topic?)
> I understand all this is unsolicited and possibly strange: happy to answer 
> questions, and if this seems interesting, I'd be glad to flesh this out into 
> a full KIP or patch. (And apologies if this is the wrong venue for this sort 
> of thing!)
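
A rough sketch of the broker-side check being proposed, written in Java for brevity even
though the real code path (ByteBufferMessageSet.validateMessagesAndAssignOffsets) is
Scala. The types and names here are invented for illustration; this is not actual Kafka
code.

{code}
import java.util.*;

// Illustrative sketch only: assign offsets as usual, but fail the append if an
// explicitly requested offset does not match what the log would assign.
class ExpectedOffsetCheckSketch {
    static class ProposedMessage { long expectedOffset; long assignedOffset; }
    static class OffsetMismatchException extends RuntimeException {
        OffsetMismatchException(String msg) { super(msg); }
    }

    // -1 means "don't care" and preserves today's behaviour.
    static long assignOffsets(List<ProposedMessage> messages, long nextOffset) {
        for (ProposedMessage m : messages) {
            if (m.expectedOffset != -1 && m.expectedOffset != nextOffset)
                throw new OffsetMismatchException(
                        "expected " + m.expectedOffset + " but next offset is " + nextOffset);
            m.assignedOffset = nextOffset++;
        }
        return nextOffset;   // new log end offset after the append
    }
}
{code}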





[jira] [Commented] (KAFKA-4570) How to transfer extended fields in producing or consuming requests.

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786713#comment-15786713
 ] 

Ewen Cheslack-Postava commented on KAFKA-4570:
--

[~zander] You're correct that Kafka does not handle extra user metadata today. 
However, there is an ongoing discussion to [add headers to Kafka 
messages|https://cwiki.apache.org/confluence/display/KAFKA/KIP-82+-+Add+Record+Headers].

> How to transfer extended fields in producing or consuming requests.
> ---
>
> Key: KAFKA-4570
> URL: https://issues.apache.org/jira/browse/KAFKA-4570
> Project: Kafka
>  Issue Type: Wish
>  Components: clients
>Affects Versions: 0.10.1.1
>Reporter: zander
>Priority: Critical
>  Labels: features
>
> We encounter a problem that  we can not transfer extended fields for 
> producing or consuming requests to the broker.
> We want to validate  the producers or consumers in a custom way other than 
> using SSL.
> In general, such as JMS, it is possible to transfer user-related fields to 
> server.
> But it seems that Kafka dose not support this, its protocol is very tight and 
>  unable to add user-defined fields.
> So is there any way  achieving this goal ?





[jira] [Commented] (KAFKA-4571) Consumer fails to retrieve messages if started before producer

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786707#comment-15786707
 ] 

Ewen Cheslack-Postava commented on KAFKA-4571:
--

Since another consumer group ID consumes the message, I'm assuming you're using 
auto.offset.reset = earliest?

Does the consumer *never* consume messages, or do you just need to wait long 
enough? If you start the consumer before the topic exists, it will get metadata 
indicating there are no partitions. I think you'd need to wait for the metadata 
timeout before it would try to refresh the metadata and finally see the created 
topic partitions (default metadata.max.age.ms is 5 minutes).
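
For reference, a minimal consumer configuration touching the two settings mentioned
above. The broker address and group id are placeholders, and the lowered
metadata.max.age.ms value is only an illustration, not a recommendation.

{code}
# consumer.properties (placeholders for broker list and group id)
bootstrap.servers=localhost:9092
group.id=measurements-consumer

# Read from the beginning of the topic when the group has no committed offsets.
auto.offset.reset=earliest

# Metadata refresh interval; the default is 300000 (5 minutes), so a consumer started
# before the topic exists can take up to that long to notice the new partitions.
# Lowering it (illustration only) makes newly created topics visible sooner.
metadata.max.age.ms=30000
{code}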

> Consumer fails to retrieve messages if started before producer
> --
>
> Key: KAFKA-4571
> URL: https://issues.apache.org/jira/browse/KAFKA-4571
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.1.1
> Environment: Ubuntu Desktop 16.04 LTS, Oracle Java 8 1.8.0_101, Core 
> i7 4770K
>Reporter: Sergiu Hlihor
>
> In a configuration where topic was never created before, starting the 
> consumer before the producer leads to no message being consumed 
> (KafkaConsumer.poll() always returns an instance of ConsumerRecords with 0 
> count). 
> Starting another consumer on the same group, same topic after messages were 
> produced is still not consuming them. Starting another consumer with another 
> groupId appears to be working.
> In the consumer logs I see: WARN  NetworkClient - Error while fetching 
> metadata with correlation id 1 : {measurements021=LEADER_NOT_AVAILABLE} 
> Both producer and consumer were launched from inside same JVM. 
> The configuration used is the standard one found in Kafka distribution. If 
> this is a configuration issue, please suggest any change that I should do.
> Thank you





Re: [VOTE] KIP-99: Add Global Tables to Kafka Streams

2016-12-29 Thread Guozhang Wang
Besides my comments on the other DISCUSS email thread, I'm +1.


On Mon, Dec 12, 2016 at 9:32 AM, Bill Bejeck  wrote:

> +1
>
> On Mon, Dec 12, 2016 at 12:29 PM, Matthias J. Sax 
> wrote:
>
> > +1
> >
> > On 12/12/16 3:45 AM, Damian Guy wrote:
> > > Hi all,
> > >
> > > I'd like to start the vote for KIP-99:
> > > https://cwiki.apache.org/confluence/pages/viewpage.
> > action?pageId=67633649
> > >
> > > There is a PR for it here: https://github.com/apache/kafka/pull/2244
> > >
> > > Thanks,
> > > Damian
> > >
> >
> >
>



-- 
-- Guozhang


Re: [DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2016-12-29 Thread Guozhang Wang
1/2: Sounds good, let's remove the joins within KGlobalTable for now.

3. I see, makes sense.

Unfortunately, since TopologyBuilder is a public class we cannot separate
its internal-usage-only functions like build / buildWithGlobalTables / etc.
from user-facing functions like stream / table / etc. We need to consider
refactoring this interface sooner rather than later.

4/6. OK.


Guozhang


On Tue, Dec 20, 2016 at 2:16 PM, Damian Guy  wrote:

> Hi Guozhang,
>
> Thanks for your input. Answers below, but i'm thinking we should remove
> joins from GlobalKTables for the time being and re-visit if necessary in
> the future.
>
> 1. with a global table the joins are never really materialized (at least
> how i see it), rather they are just views on the existing global tables.
> I've deliberately taken this approach so we don't have to create yet
> another State Store and changelog topic etc. These all consume resources
> that i believe are unnecessary. So, i don't really see the point of having
> a materialize method. Further, one of the major benefits of joining two
> global tables is being able to query them via Interactive Queries. For this
> you need the name, so i think it makes sense to provide it with the join
> (see the interactive-query sketch after this reply).
>
> 2. This has been discussed already in this thread (with Michael), and
> outerJoin is deliberately not part of the KIP. To be able to join both
> ways, as you suggest, requires that both inputs are able to map to the same
> key. This is not always going to be possible, i.e., relationships can be
> one way, so for that reason i felt it was best to not go down that path as
> we'd not be able to resolve it at the time that
> globalTable.join(otherGlobalTable,...) was called, and this would result
> in
> possible confusion. Also, to support this we'd need to physically
> materialize a StateStore that represents the join (which i think is a waste
> of resources), or, we'd need to provide another interface where we can map
> from the key of the resulting global table to the keys of both of the
> joined tables.
>
> 3. The intention is that the GlobalKTables are in a single topology that is
> owned and updated by a single thread. So yes it is necessary that they can
> be created separately.
>
> 4. Bootstrapping and maintaining of the state of GlobalKTables are done on
> a single thread. This thread will run simultaneously with the current
> StreamThreads. It doesn't make sense to move the bootstrapping of the
> StandbyTasks to this thread as they are logically part of a StreamThread,
> they are 'assigned' to the StreamThread. With GlobalKTables there is no
> assignment as such, the thread just maintains all of them.
>
> 5. Yes i'll update the KIP - the state directory will be under the same
> path as StreamsConfig.STATE_DIR_CONFIG, but it will be a specific
> directory, i.e., global_state, rather than being a task directory.
>
> 6. The whole point of GlobalKTables is to have a copy of ALL of the data on
> each node. I don't think it makes sense to be able to reset the starting
> position.
>
> Thanks,
> Damian
>
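
(A minimal sketch of the interactive-query usage point 1 above refers to, 
assuming a view was registered under the hypothetical name "global-view"; the 
store name and types are illustrative, and the exact KIP-99 join signature was 
still under discussion at this point in the thread.)

    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.state.QueryableStoreTypes;
    import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

    // Hypothetical helper: looks up a key in a queryable view named "global-view".
    // The point is that a named view can be read locally via Interactive Queries,
    // without creating an extra state store or changelog topic.
    public class GlobalViewLookup {
        public static String lookup(KafkaStreams streams, String key) {
            ReadOnlyKeyValueStore<String, String> view =
                    streams.store("global-view",
                            QueryableStoreTypes.<String, String>keyValueStore());
            return view.get(key);
        }
    }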
> On Tue, 20 Dec 2016 at 20:00 Guozhang Wang  wrote:
>
> > One more thing to add:
> >
> > 6. For KGlobalTable, it is always bootstrapped from the beginning while for
> > other KTables, we are enabling users to override their resetting position
> > as in
> >
> > https://github.com/apache/kafka/pull/2007
> >
> > Should we consider doing the same for KGlobalTable as well?
> >
> >
> > Guozhang
> >
> >
> > On Tue, Dec 20, 2016 at 11:39 AM, Guozhang Wang 
> > wrote:
> >
> > > Thanks for the very well written proposal, and sorry for the very-late
> > > review. I have a few comments here:
> > >
> > > 1. We are introducing a "queryableViewName" in the GlobalTable join
> > > results, while I'm wondering if we should just add a more general function
> > > like "materialize" to KTable and KGlobalTable with the name to be used in
> > > queries?
> > >
> > > 2. For KGlobalTable's own "join" and "leftJoin": since we are only passing
> > > the KeyValueMapper keyMapper, it seems that for either case only
> > > the left hand side will logically "trigger" the join, which is different to
> > > KTable's join semantics. I'm wondering if it would be more consistent to
> > > have them as:
> > >
> > >  GlobalKTable join(final GlobalKTable other,
> > >                    final KeyValueMapper leftkeyMapper,
> > >                    final KeyValueMapper rightkeyMapper,
> > >                    final ValueJoiner joiner,
> > >                    final String queryableViewName);
> > >
> > >  GlobalKTable outerJoin(final GlobalKTable other,
> > >                         final KeyValueMapper leftkeyMapper,
> > >

[jira] [Updated] (KAFKA-4115) Grow default heap settings for distributed Connect from 256M to 1G

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4115:
-
Fix Version/s: 0.10.2.0

> Grow default heap settings for distributed Connect from 256M to 1G
> --
>
> Key: KAFKA-4115
> URL: https://issues.apache.org/jira/browse/KAFKA-4115
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Shikhar Bhushan
>Assignee: Shikhar Bhushan
> Fix For: 0.10.2.0
>
>
> Currently, both {{connect-standalone.sh}} and {{connect-distributed.sh}} 
> start the Connect JVM with the default heap settings from 
> {{kafka-run-class.sh}} of {{-Xmx256M}}.
> At least for distributed Connect, we should default to a much higher limit 
> like 1G. While the 'correct' sizing is workload-dependent, for a system 
> where you can run arbitrary connector plugins which may buffer 
> data, we should provide for more headroom.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3947) kafka-reassign-partitions.sh should support dumping current assignment

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786244#comment-15786244
 ] 

Ewen Cheslack-Postava commented on KAFKA-3947:
--

[~kawamuray] Any progress on this? It got bumped to 0.10.2.0 since the patch 
was still in flight, and now we're getting closer to 0.10.2 feature freeze (and 
this may need some public discussion which can take some time). Any follow up 
based on the comments above?

> kafka-reassign-partitions.sh should support dumping current assignment
> --
>
> Key: KAFKA-3947
> URL: https://issues.apache.org/jira/browse/KAFKA-3947
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 0.10.0.0
>Reporter: Yuto Kawamura
>Assignee: Yuto Kawamura
>Priority: Minor
> Fix For: 0.10.2.0
>
>
> When I was building my own tool to perform reassignment of partitions, I realized 
> that there's no way to dump the current partition assignment in a 
> machine-parsable format such as JSON.
> Giving the {{\-\-generate}} option to the kafka-reassign-partitions.sh 
> script does dump the current assignment of the topics given by 
> {{\-\-topics-to-assign-json-file}}, but it's very inconvenient because:
> - I want the dump to contain all topics. That is, I want to skip generating the 
> list of current topics just to pass it to the generate command.
> - The output is concatenated with the result of the reassignment, so you can't 
> simply do something like: {{kafka-reassign-partitions.sh --generate ... > 
> current-assignment.json}}
> - There's no need to ask Kafka to generate a reassignment just to get the current 
> assignment in the first place.
> Here I'd like to add a {{\-\-dump}} option to kafka-reassign-partitions.sh.
> I was wondering whether this functionality should be provided by 
> {{kafka-reassign-partitions.sh}} or {{kafka-topics.sh}}, but now I think 
> {{kafka-reassign-partitions.sh}} is the more proper home, as the resulting JSON 
> should be in the format of {{\-\-reassignment-json-file}}, which belongs to 
> this command.
> Will follow up shortly with a patch that implements this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4404) Add knowledge of sign to numeric schema types

2016-12-29 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-4404:
-
Fix Version/s: 0.10.2.0
   Status: Patch Available  (was: Open)

> Add knowledge of sign to numeric schema types
> -
>
> Key: KAFKA-4404
> URL: https://issues.apache.org/jira/browse/KAFKA-4404
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 0.10.0.1
>Reporter: Andy Bryant
>Assignee: Ewen Cheslack-Postava
>Priority: Minor
> Fix For: 0.10.2.0
>
>
> For KafkaConnect schemas there is currently no concept of whether a numeric 
> field is signed or unsigned. 
> Add an additional `signed` attribute (like optional) or make it explicit that 
> numeric types must be signed.
> You could encode this as a parameter on the schema but this would not be 
> standard across all connectors.
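
As a point of reference, a hedged sketch of the widening workaround available 
today, given that Connect's integer schema types are signed; the schema and 
field names are made up for this example.

    import org.apache.kafka.connect.data.Schema;
    import org.apache.kafka.connect.data.SchemaBuilder;
    import org.apache.kafka.connect.data.Struct;

    // Illustrative only: because Connect's integer types are signed, a connector can
    // widen an unsigned source column instead of relying on a 'signed' attribute.
    public class UnsignedWidening {
        public static Struct reading(int unsignedShortValue) {
            Schema schema = SchemaBuilder.struct().name("example.SensorReading")
                    .field("counter", Schema.INT32_SCHEMA)  // holds the full 0..65535 range
                    .build();
            return new Struct(schema).put("counter", unsignedShortValue);
        }
    }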



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4404) Add knowledge of sign to numeric schema types

2016-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786227#comment-15786227
 ] 

ASF GitHub Bot commented on KAFKA-4404:
---

GitHub user ewencp opened a pull request:

https://github.com/apache/kafka/pull/2296

KAFKA-4404: Add javadocs to document core Connect types, especially that 
integer types are signed



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ewencp/kafka 
kafka-4404-document-connect-signed-integer-types

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2296.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2296


commit 0cc67cf094bdbf4d3dffc863a88e9c19d62d8887
Author: Ewen Cheslack-Postava 
Date:   2016-12-29T22:00:33Z

KAFKA-4404: Add javadocs to document core Connect types, especially that 
integer types are signed




> Add knowledge of sign to numeric schema types
> -
>
> Key: KAFKA-4404
> URL: https://issues.apache.org/jira/browse/KAFKA-4404
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 0.10.0.1
>Reporter: Andy Bryant
>Assignee: Ewen Cheslack-Postava
>Priority: Minor
>
> For KafkaConnect schemas there is currently no concept of whether a numeric 
> field is signed or unsigned. 
> Add an additional `signed` attribute (like optional) or make it explicit that 
> numeric types must be signed.
> You could encode this as a parameter on the schema but this would not be 
> standard across all connectors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2296: KAFKA-4404: Add javadocs to document core Connect ...

2016-12-29 Thread ewencp
GitHub user ewencp opened a pull request:

https://github.com/apache/kafka/pull/2296

KAFKA-4404: Add javadocs to document core Connect types, especially that 
integer types are signed



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ewencp/kafka 
kafka-4404-document-connect-signed-integer-types

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2296.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2296


commit 0cc67cf094bdbf4d3dffc863a88e9c19d62d8887
Author: Ewen Cheslack-Postava 
Date:   2016-12-29T22:00:33Z

KAFKA-4404: Add javadocs to document core Connect types, especially that 
integer types are signed




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4497) log cleaner breaks on timeindex

2016-12-29 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786213#comment-15786213
 ] 

Jiangjie Qin commented on KAFKA-4497:
-

[~sandieg] I cannot think of a good workaround for this without deploying new 
code. If the log cleaner thread is dead, bouncing the broker seems the best we 
can do, although the log cleaner thread may die again after the bounce. If 
possible, you can try reducing log.cleaner.io.buffer.size a little to reduce 
the likelihood of hitting this issue, but note that the IO buffer size should 
be at least 2x the max message size.

> log cleaner breaks on timeindex
> ---
>
> Key: KAFKA-4497
> URL: https://issues.apache.org/jira/browse/KAFKA-4497
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.1.0
> Environment: Debian Jessie, Oracle Java 8u92, kafka_2.11-0.10.1.0
>Reporter: Robert Schumann
>Assignee: Jiangjie Qin
>Priority: Critical
> Fix For: 0.10.1.1
>
>
> _created from ML entry by request of [~ijuma]_
> Hi all,
> we are facing an issue with the latest kafka 0.10.1 and the log cleaner thread 
> with regard to the timeindex files. From the log of the log-cleaner we see 
> after startup that it tries to clean up a topic called xdc_listing-status-v2 
> [1]. The topic is set up with log compaction [2] and the kafka cluster 
> configuration has log.cleaner enabled [3]. Looking at the log and the newly 
> created file [4], the cleaner seems to refer to tombstones prior to 
> epoch_time=0 - maybe because it finds messages which don’t have a timestamp 
> at all (?). All producers and consumers are using 0.10.1 and the topics have 
> been created completely new, so I’m not sure where this issue would come 
> from. The original timeindex file [5] seems to show only valid timestamps for 
> the mentioned offsets. I would also like to mention that the issue happened 
> in two independent datacenters at the same time, so I would rather expect an 
> application/producer issue instead of random disk failures. We didn’t have 
> the problem with 0.10.0 for around half a year, it appeared shortly after the 
> upgrade to 0.10.1.
> The puzzling message from the cleaner “cleaning prior to Fri Dec 02 16:35:50 
> CET 2016, discarding tombstones prior to Thu Jan 01 01:00:00 CET 1970” also 
> confuses me a bit. Does that mean it does not find any log segments which 
> can be cleaned up, or that the last timestamp of the last log segment is somehow 
> broken/missing?
> I’m also a bit wondering why the log cleaner thread stops completely after 
> an error with one topic. I would at least expect that it keeps on cleaning up 
> other topics, but apparently it doesn’t do that, e.g. it’s not even cleaning 
> the __consumer_offsets anymore.
> Does anybody have the same issues or can explain, what’s going on? Thanks for 
> any help or suggestions.
> Cheers
> Robert
> [1]
> {noformat}
> [2016-12-06 12:49:17,885] INFO Starting the log cleaner (kafka.log.LogCleaner)
> [2016-12-06 12:49:17,895] INFO [kafka-log-cleaner-thread-0], Starting  
> (kafka.log.LogCleaner)
> [2016-12-06 12:49:17,947] INFO Cleaner 0: Beginning cleaning of log 
> xdc_listing-status-v2-1. (kafka.log.LogCleaner)
> [2016-12-06 12:49:17,948] INFO Cleaner 0: Building offset map for 
> xdc_listing-status-v2-1... (kafka.log.LogCleaner)
> [2016-12-06 12:49:17,989] INFO Cleaner 0: Building offset map for log 
> xdc_listing-status-v2-1 for 1 segments in offset range [0, 194991). 
> (kafka.log.LogCleaner)
> [2016-12-06 12:49:24,572] INFO Cleaner 0: Offset map for log 
> xdc_listing-status-v2-1 complete. (kafka.log.LogCleaner)
> [2016-12-06 12:49:24,577] INFO Cleaner 0: Cleaning log 
> xdc_listing-status-v2-1 (cleaning prior to Fri Dec 02 16:35:50 CET 2016, 
> discarding tombstones prior to Thu Jan 01 01:00:00 CET 1970)... 
> (kafka.log.LogCleaner)
> [2016-12-06 12:49:24,580] INFO Cleaner 0: Cleaning segment 0 in log 
> xdc_listing-status-v2-1 (largest timestamp Fri Dec 02 16:35:50 CET 2016) into 
> 0, retaining deletes. (kafka.log.LogCleaner)
> [2016-12-06 12:49:24,968] ERROR [kafka-log-cleaner-thread-0], Error due to  
> (kafka.log.LogCleaner)
> kafka.common.InvalidOffsetException: Attempt to append an offset (-1) to slot 
> 9 no larger than the last offset appended (11832) to 
> /var/lib/kafka/xdc_listing-status-v2-1/.timeindex.cleaned.
> at 
> kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply$mcV$sp(TimeIndex.scala:117)
> at 
> kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply(TimeIndex.scala:107)
> at 
> kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply(TimeIndex.scala:107)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:234)
> at kafka.log.TimeIndex.maybeAppend(TimeIndex.scala:107)
> at 

[GitHub] kafka pull request #2295: MINOR: Mx4jLoader always returns false even if mx4...

2016-12-29 Thread eribeiro
GitHub user eribeiro opened a pull request:

https://github.com/apache/kafka/pull/2295

MINOR: Mx4jLoader always returns false even if mx4j is loaded & started

Mx4jLoader.scala should explicitly `return true` if the class is 
successfully loaded and started; otherwise the method returns false even when the 
class is loaded.
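
For illustration only, a minimal Java sketch of the pattern the fix targets; the 
real Mx4jLoader is Scala and does more (config checks, MBean registration), so 
this is not its actual code. The point is the explicit success return.

    public class Mx4jPatternSketch {
        public static boolean maybeLoad() {
            try {
                // mx4j is loaded reflectively so it remains an optional dependency.
                Class<?> adaptorClass = Class.forName("mx4j.tools.adaptor.http.HttpAdaptor");
                Object adaptor = adaptorClass.getDeclaredConstructor().newInstance();
                adaptorClass.getMethod("start").invoke(adaptor);
                return true;   // the explicit success path the patch adds
            } catch (ClassNotFoundException e) {
                return false;  // mx4j not on the classpath
            } catch (Exception e) {
                return false;  // loaded but failed to start
            }
        }
    }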

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/eribeiro/kafka mx4jloader-bug

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2295.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2295


commit e6344181b4719fd8ba0de59862f39871d787c62a
Author: Edward Ribeiro 
Date:   2016-12-29T21:10:14Z

MINOR: Mx4jLoader always returns false even if mx4j is loaded




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: custom offsets in ProduceRequest

2016-12-29 Thread Andrey L. Neporada

> On 29 Dec 2016, at 20:43, radai  wrote:
> 
> so, if i follow your suggested logic correctly, there would be some sort of
> :
> 
> produce(partition, msg, requestedOffset)
> 

> which would fail if requestedOffset is already taken (by another previous
> such explicit call or by another regular call that just happened to get
> assigned that offset by the partition leader on the target cluster).
> 

Yes. More formally, my proposal is to extend ProduceRequest by adding 
MessageSetStartOffset:

ProduceRequest => RequiredAcks Timeout [TopicName [Partition 
MessageSetStartOffset MessageSetSize MessageSet]]
  RequiredAcks => int16
  Timeout => int32
  Partition => int32
  MessageSetSize => int32
  MessageSetStartOffset => int64

If MessageSetStartOffset is -1, ProduceRequest should work exactly as before - 
i.e. assign the next available offset to the given MessageSet.
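
To make the overlap check concrete, a purely hypothetical sketch of the 
leader-side bookkeeping the proposal implies; none of these names correspond to 
actual Kafka broker code.

    // The leader accepts a requested start offset only if it does not overlap
    // offsets it has already assigned; -1 keeps today's behaviour.
    final class OffsetPreservingAppendSketch {
        private long logEndOffset = 0L;   // next offset the leader would assign

        /** Returns the base offset used for the message set, or throws on conflict. */
        synchronized long append(int messageCount, long messageSetStartOffset) {
            if (messageSetStartOffset == -1L) {
                long base = logEndOffset;          // behave exactly as today
                logEndOffset += messageCount;
                return base;
            }
            if (messageSetStartOffset < logEndOffset) {
                throw new IllegalStateException("Offset " + messageSetStartOffset
                        + " already taken; log end offset is " + logEndOffset);
            }
            logEndOffset = messageSetStartOffset + messageCount;
            return messageSetStartOffset;
        }
    }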


> how would you meaningfully handle this failure?
> 
> suppose this happens to some cross-cluster replicator (like mirror maker).
> there is no use in retrying. the options would be:
> 
> 1. get the next available offset - which would violate what youre trying to
> achieve
> 2. skip msgs - so replication is incomplete, any offset "already taken" on
> the destination is not replicated from source
> 3. stop replication for this partition completely - because starting from
> now _ALL_ offsets will be taken - 1 foreign msg ruins everything for the
> entire partition.
> 
> none of these options look good to me.
> 
> 

Since we are discussing master-slave replication, the only client writing to 
the slave cluster is the replicator itself.
In this case a ProduceRequest failure is some kind of replication logic error - 
for example, when two replication instances are somehow launched for a single 
partition.
The best option here is just to stop the replication process.

So the answer to your question is (3), but this scenario should never happen.


> 
> On Thu, Dec 29, 2016 at 3:22 AM, Andrey L. Neporada <
> anepor...@yandex-team.ru> wrote:
> 
>> Hi!
>> 
>>> On 27 Dec 2016, at 19:35, radai  wrote:
>>> 
>>> IIUC if you replicate from a single source cluster to a single target
>>> cluster, the topic has the same number of partitions on both, and no one
>>> writes directly to the target cluster (so master --> slave) the offsets
>>> would be preserved.
>>> 
>> 
>> Yes, exactly. When you
>> 1) create topic with the same number of partitions on both master and
>> slave clusters
>> 2) write only to master
>> 3) replicate partition to partition from master to slave
>> - in this case the offsets will be preserved.
>> 
>> However, you usually already have cluster that works and want to replicate
>> some topics to another one.
>> IMHO, in this scenario there should be a way to make message offsets equal
>> on both clusters.
>> 
>>> but in the general case - how would you handle the case where multiple
>>> producers "claim" the same offset ?
>> 
>> The same way as Kafka handles concurrent produce requests for the same
>> partition - produce requests for partition are serialized.
>> If the next produce request “overlaps” with previous one, it fails.
>> 
>>> 
>>> 
>>> On Mon, Dec 26, 2016 at 4:52 AM, Andrey L. Neporada <
>>> anepor...@yandex-team.ru> wrote:
>>> 
 Hi all!
 
 Suppose you have two Kafka clusters and want to replicate topics from
 primary cluster to secondary one.
 It would be very convenient for readers if the message offsets for
 replicated topics would be the same as for primary topics.
 
 As far as I know, currently there is no way to achieve this.
 I wonder is it possible/reasonable to add message offset to
>> ProduceRequest?
 
 
 —
 Andrey Neporada
 
 
 
 
>> 
>> 



Build failed in Jenkins: kafka-trunk-jdk7 #1787

2016-12-29 Thread Apache Jenkins Server
See 

Changes:

[me] MINOR: Sync up 'kafka-run-class.bat' with 'kafka-run-class.sh'

--
[...truncated 17594 lines...]
org.apache.kafka.streams.processor.internals.StreamsMetadataStateTest > 
shouldGetInstanceWithKeyAndCustomPartitioner PASSED

org.apache.kafka.streams.processor.internals.StreamsMetadataStateTest > 
shouldThrowIfStoreNameIsNullOnGetAllInstancesWithStore STARTED

org.apache.kafka.streams.processor.internals.StreamsMetadataStateTest > 
shouldThrowIfStoreNameIsNullOnGetAllInstancesWithStore PASSED

org.apache.kafka.streams.processor.internals.AbstractTaskTest > 
shouldThrowProcessorStateExceptionOnInitializeOffsetsWhenAuthorizationException 
STARTED

org.apache.kafka.streams.processor.internals.AbstractTaskTest > 
shouldThrowProcessorStateExceptionOnInitializeOffsetsWhenAuthorizationException 
PASSED

org.apache.kafka.streams.processor.internals.AbstractTaskTest > 
shouldThrowProcessorStateExceptionOnInitializeOffsetsWhenKafkaException STARTED

org.apache.kafka.streams.processor.internals.AbstractTaskTest > 
shouldThrowProcessorStateExceptionOnInitializeOffsetsWhenKafkaException PASSED

org.apache.kafka.streams.processor.internals.AbstractTaskTest > 
shouldThrowWakeupExceptionOnInitializeOffsetsWhenWakeupException STARTED

org.apache.kafka.streams.processor.internals.AbstractTaskTest > 
shouldThrowWakeupExceptionOnInitializeOffsetsWhenWakeupException PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldHaveCompactionPropSetIfSupplied STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldHaveCompactionPropSetIfSupplied PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldThrowIfNameIsNull STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldThrowIfNameIsNull PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldConfigureRetentionMsWithAdditionalRetentionWhenCompactAndDelete STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldConfigureRetentionMsWithAdditionalRetentionWhenCompactAndDelete PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldBeCompactedIfCleanupPolicyCompactOrCompactAndDelete STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldBeCompactedIfCleanupPolicyCompactOrCompactAndDelete PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldNotBeCompactedWhenCleanupPolicyIsDelete STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldNotBeCompactedWhenCleanupPolicyIsDelete PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldHavePropertiesSuppliedByUser STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldHavePropertiesSuppliedByUser PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldUseCleanupPolicyFromConfigIfSupplied STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldUseCleanupPolicyFromConfigIfSupplied PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldNotConfigureRetentionMsWhenCompact STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldNotConfigureRetentionMsWhenCompact PASSED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldNotConfigureRetentionMsWhenDelete STARTED

org.apache.kafka.streams.processor.internals.InternalTopicConfigTest > 
shouldNotConfigureRetentionMsWhenDelete PASSED

org.apache.kafka.streams.processor.internals.StandbyTaskTest > 
shouldNotThrowUnsupportedOperationExceptionWhenInitializingStateStores STARTED

org.apache.kafka.streams.processor.internals.StandbyTaskTest > 
shouldNotThrowUnsupportedOperationExceptionWhenInitializingStateStores PASSED

org.apache.kafka.streams.processor.internals.StandbyTaskTest > 
testStorePartitions STARTED

org.apache.kafka.streams.processor.internals.StandbyTaskTest > 
testStorePartitions PASSED

org.apache.kafka.streams.processor.internals.StandbyTaskTest > testUpdateKTable 
STARTED

org.apache.kafka.streams.processor.internals.StandbyTaskTest > testUpdateKTable 
PASSED

org.apache.kafka.streams.processor.internals.StandbyTaskTest > 
testUpdateNonPersistentStore STARTED

org.apache.kafka.streams.processor.internals.StandbyTaskTest > 
testUpdateNonPersistentStore PASSED

org.apache.kafka.streams.processor.internals.StandbyTaskTest > testUpdate 
STARTED

org.apache.kafka.streams.processor.internals.StandbyTaskTest > testUpdate PASSED

org.apache.kafka.streams.processor.internals.RecordCollectorTest > 
shouldThrowStreamsExceptionOnFlushIfASendFailed STARTED

org.apache.kafka.streams.processor.internals.RecordCollectorTest > 
shouldThrowStreamsExceptionOnFlushIfASendFailed 

Build failed in Jenkins: kafka-trunk-jdk8 #1134

2016-12-29 Thread Apache Jenkins Server
See 

Changes:

[me] MINOR: Sync up 'kafka-run-class.bat' with 'kafka-run-class.sh'

--
[...truncated 17618 lines...]

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetAllTasksWithStoreWhenNotRunning STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetAllTasksWithStoreWhenNotRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartOnceClosed STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartOnceClosed PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup PASSED

org.apache.kafka.streams.KafkaStreamsTest > testStartAndClose STARTED

org.apache.kafka.streams.KafkaStreamsTest > testStartAndClose PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning 
STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice PASSED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegion[0] STARTED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegion[0] PASSED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegion[1] STARTED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegion[1] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryState[0] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryState[0] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable[0] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable[0] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[0] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[0] FAILED
java.lang.AssertionError: Condition not met within timeout 6. Did not 
receive 1 number of records
at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:259)
at 
org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinValuesRecordsReceived(IntegrationTestUtils.java:251)
at 
org.apache.kafka.streams.integration.QueryableStateIntegrationTest.waitUntilAtLeastNumRecordProcessed(QueryableStateIntegrationTest.java:669)
at 
org.apache.kafka.streams.integration.QueryableStateIntegrationTest.queryOnRebalance(QueryableStateIntegrationTest.java:350)

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses[0] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses[0] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryState[1] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryState[1] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable[1] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable[1] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[1] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[1] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses[1] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses[1] PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testShouldReadFromRegexAndNamedTopics STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testShouldReadFromRegexAndNamedTopics PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenCreated STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenCreated PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testMultipleConsumersCanReadFromPartitionedTopic STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testMultipleConsumersCanReadFromPartitionedTopic PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenDeleted STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 

[jira] [Assigned] (KAFKA-4575) Transient failure in ConnectDistributedTest.test_pause_and_resume_sink in consuming messages after resuming source connector

2016-12-29 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan reassigned KAFKA-4575:
--

Assignee: Shikhar Bhushan

> Transient failure in ConnectDistributedTest.test_pause_and_resume_sink in 
> consuming messages after resuming source connector
> 
>
> Key: KAFKA-4575
> URL: https://issues.apache.org/jira/browse/KAFKA-4575
> Project: Kafka
>  Issue Type: Test
>  Components: KafkaConnect, system tests
>Reporter: Shikhar Bhushan
>Assignee: Shikhar Bhushan
>
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-12-29--001.1483003056--apache--trunk--dc55025/report.html
> {noformat}
> [INFO  - 2016-12-29 08:56:23,050 - runner_client - log - lineno:221]: 
> RunnerClient: 
> kafkatest.tests.connect.connect_distributed_test.ConnectDistributedTest.test_pause_and_resume_sink:
>  Summary: Failed to consume messages after resuming source connector
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 123, in run
> data = self.run_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 176, in run_test
> return self.test_context.function(self.test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/connect/connect_distributed_test.py",
>  line 267, in test_pause_and_resume_sink
> err_msg="Failed to consume messages after resuming source connector")
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/utils/util.py",
>  line 36, in wait_until
> raise TimeoutError(err_msg)
> TimeoutError: Failed to consume messages after resuming source connector
> {noformat}
> We recently fixed KAFKA-4527 and this is a new kind of failure in the same 
> test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4575) Transient failure in ConnectDistributedTest.test_pause_and_resume_sink in consuming messages after resuming source connector

2016-12-29 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan updated KAFKA-4575:
---
Component/s: system tests
 KafkaConnect

> Transient failure in ConnectDistributedTest.test_pause_and_resume_sink in 
> consuming messages after resuming source connector
> 
>
> Key: KAFKA-4575
> URL: https://issues.apache.org/jira/browse/KAFKA-4575
> Project: Kafka
>  Issue Type: Test
>  Components: KafkaConnect, system tests
>Reporter: Shikhar Bhushan
>
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-12-29--001.1483003056--apache--trunk--dc55025/report.html
> {noformat}
> [INFO  - 2016-12-29 08:56:23,050 - runner_client - log - lineno:221]: 
> RunnerClient: 
> kafkatest.tests.connect.connect_distributed_test.ConnectDistributedTest.test_pause_and_resume_sink:
>  Summary: Failed to consume messages after resuming source connector
> Traceback (most recent call last):
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 123, in run
> data = self.run_test()
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
>  line 176, in run_test
> return self.test_context.function(self.test)
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/connect/connect_distributed_test.py",
>  line 267, in test_pause_and_resume_sink
> err_msg="Failed to consume messages after resuming source connector")
>   File 
> "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/utils/util.py",
>  line 36, in wait_until
> raise TimeoutError(err_msg)
> TimeoutError: Failed to consume messages after resuming source connector
> {noformat}
> We recently fixed KAFKA-4527 and this is a new kind of failure in the same 
> test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4575) Transient failure in ConnectDistributedTest.test_pause_and_resume_sink in consuming messages after resuming source connector

2016-12-29 Thread Shikhar Bhushan (JIRA)
Shikhar Bhushan created KAFKA-4575:
--

 Summary: Transient failure in 
ConnectDistributedTest.test_pause_and_resume_sink in consuming messages after 
resuming source connector
 Key: KAFKA-4575
 URL: https://issues.apache.org/jira/browse/KAFKA-4575
 Project: Kafka
  Issue Type: Test
Reporter: Shikhar Bhushan


http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-12-29--001.1483003056--apache--trunk--dc55025/report.html

{noformat}
[INFO  - 2016-12-29 08:56:23,050 - runner_client - log - lineno:221]: 
RunnerClient: 
kafkatest.tests.connect.connect_distributed_test.ConnectDistributedTest.test_pause_and_resume_sink:
 Summary: Failed to consume messages after resuming source connector
Traceback (most recent call last):
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
 line 123, in run
data = self.run_test()
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
 line 176, in run_test
return self.test_context.function(self.test)
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/connect/connect_distributed_test.py",
 line 267, in test_pause_and_resume_sink
err_msg="Failed to consume messages after resuming source connector")
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/utils/util.py",
 line 36, in wait_until
raise TimeoutError(err_msg)
TimeoutError: Failed to consume messages after resuming source connector
{noformat}

We recently fixed KAFKA-4527 and this is a new kind of failure in the same test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4574) Transient failure in ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade with security_protocol = SASL_PLAINTEXT, SSL

2016-12-29 Thread Shikhar Bhushan (JIRA)
Shikhar Bhushan created KAFKA-4574:
--

 Summary: Transient failure in 
ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade with security_protocol = 
SASL_PLAINTEXT, SSL
 Key: KAFKA-4574
 URL: https://issues.apache.org/jira/browse/KAFKA-4574
 Project: Kafka
  Issue Type: Test
  Components: system tests
Reporter: Shikhar Bhushan


http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-12-29--001.1483003056--apache--trunk--dc55025/report.html

{{ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade}} failed with these 
{{security_protocol}} parameters 

{noformat}

test_id:
kafkatest.tests.core.zookeeper_security_upgrade_test.ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol=SASL_PLAINTEXT
status: FAIL
run time:   3 minutes 44.094 seconds


1 acked message did not make it to the Consumer. They are: [5076]. We 
validated that the first 1 of these missing messages correctly made it into 
Kafka's data files. This suggests they were lost on their way to the consumer.
Traceback (most recent call last):
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
 line 123, in run
data = self.run_test()
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
 line 176, in run_test
return self.test_context.function(self.test)
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/mark/_mark.py",
 line 321, in wrapper
return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/core/zookeeper_security_upgrade_test.py",
 line 117, in test_zk_security_upgrade
self.run_produce_consume_validate(self.run_zk_migration)
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/produce_consume_validate.py",
 line 101, in run_produce_consume_validate
self.validate()
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/produce_consume_validate.py",
 line 163, in validate
assert success, msg
AssertionError: 1 acked message did not make it to the Consumer. They are: 
[5076]. We validated that the first 1 of these missing messages correctly made 
it into Kafka's data files. This suggests they were lost on their way to the 
consumer.
{noformat}

{noformat}

test_id:
kafkatest.tests.core.zookeeper_security_upgrade_test.ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol=SSL
status: FAIL
run time:   3 minutes 50.578 seconds


1 acked message did not make it to the Consumer. They are: [3559]. We 
validated that the first 1 of these missing messages correctly made it into 
Kafka's data files. This suggests they were lost on their way to the consumer.
Traceback (most recent call last):
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
 line 123, in run
data = self.run_test()
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py",
 line 176, in run_test
return self.test_context.function(self.test)
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/mark/_mark.py",
 line 321, in wrapper
return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/core/zookeeper_security_upgrade_test.py",
 line 117, in test_zk_security_upgrade
self.run_produce_consume_validate(self.run_zk_migration)
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/produce_consume_validate.py",
 line 101, in run_produce_consume_validate
self.validate()
  File 
"/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/produce_consume_validate.py",
 line 163, in validate
assert success, msg
AssertionError: 1 acked message did not make it to the Consumer. They are: 
[3559]. We validated that the first 1 of these missing messages correctly made 
it into Kafka's data files. This suggests they were lost on their way to the 
consumer.
{noformat}

Previously: KAFKA-3985



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Some questions about providing corrections to documentation.

2016-12-29 Thread Dhwani Katagade

Hi,

_Observations:_

While going through the design documentation I noticed a minor duplication 
error in points 4 and 5 here 
https://kafka.apache.org/documentation/#design_compactionguarantees. 
When I looked up the files in git, I saw the duplication in all of the 
below files in the kafka-site repo


https://github.com/apache/kafka-site/blob/asf-site/081/design.html
https://github.com/apache/kafka-site/blob/asf-site/082/design.html
https://github.com/apache/kafka-site/blob/asf-site/090/design.html
https://github.com/apache/kafka-site/blob/asf-site/0100/design.html
https://github.com/apache/kafka-site/blob/asf-site/0101/design.html

And also in the following branches under kafka repo
https://github.com/apache/kafka/blob/0.9.0/docs/design.html
https://github.com/apache/kafka/blob/0.10.0/docs/design.html
https://github.com/apache/kafka/blob/0.10.1/docs/design.html

But the same is corrected under trunk here 
https://github.com/apache/kafka/blob/trunk/docs/design.html


_Questions:_

1. If I have to provide a patch/PR to cleanup this documentation error,
   should I provide the fix in files corresponding to all the versions
   under kafka-site?
2. When the next release happens, as I understand, a new directory will
   be added under kafka-site:asf-site. Since this would come from
   kafka:trunk it will have the correction. Is my understanding correct?
3. As I understand, older branches under kafka repo are release
   branches, and hence we should not make any new changes under the
   docs directory on these branches. Is my understanding correct?
4. As I understand, this fix does not require a JIRA issue to be
   logged. Is my understanding correct?

Thanks in advance for the clarifications.

-dhwani





Re: custom offsets in ProduceRequest

2016-12-29 Thread radai
so, if i follow your suggested logic correctly, there would be some sort of
:

produce(partition, msg, requestedOffset)

which would fail if requestedOffset is already taken (by another previous
such explicit call or by another regular call that just happened to get
assigned that offset by the partition leader on the target cluster).

how would you meaningfully handle this failure?

suppose this happens to some cross-cluster replicator (like mirror maker).
there is no use in retrying. the options would be:

1. get the next available offset - which would violate what you're trying to
achieve
2. skip msgs - so replication is incomplete, any offset "already taken" on
the destination is not replicated from source
3. stop replication for this partition completely - because starting from
now _ALL_ offsets will be taken - 1 foreign msg ruins everything for the
entire partition.

none of these options look good to me.



On Thu, Dec 29, 2016 at 3:22 AM, Andrey L. Neporada <
anepor...@yandex-team.ru> wrote:

> Hi!
>
> > On 27 Dec 2016, at 19:35, radai  wrote:
> >
> > IIUC if you replicate from a single source cluster to a single target
> > cluster, the topic has the same number of partitions on both, and no one
> > writes directly to the target cluster (so master --> slave) the offsets
> > would be preserved.
> >
>
> Yes, exactly. When you
> 1) create topic with the same number of partitions on both master and
> slave clusters
> 2) write only to master
> 3) replicate partition to partition from master to slave
> - in this case the offsets will be preserved.
>
> However, you usually already have cluster that works and want to replicate
> some topics to another one.
> IMHO, in this scenario there should be a way to make message offsets equal
> on both clusters.
>
> > but in the general case - how would you handle the case where multiple
> > producers "claim" the same offset ?
>
> The same way as Kafka handles concurrent produce requests for the same
> partition - produce requests for partition are serialized.
> If the next produce request “overlaps” with previous one, it fails.
>
> >
> >
> > On Mon, Dec 26, 2016 at 4:52 AM, Andrey L. Neporada <
> > anepor...@yandex-team.ru> wrote:
> >
> >> Hi all!
> >>
> >> Suppose you have two Kafka clusters and want to replicate topics from
> >> primary cluster to secondary one.
> >> It would be very convenient for readers if the message offsets for
> >> replicated topics would be the same as for primary topics.
> >>
> >> As far as I know, currently there is no way to achieve this.
> >> I wonder is it possible/reasonable to add message offset to
> ProduceRequest?
> >>
> >>
> >> —
> >> Andrey Neporada
> >>
> >>
> >>
> >>
>
>


[GitHub] kafka pull request #2238: MINOR: Sync up 'kafka-run-class.bat' with 'kafka-r...

2016-12-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2238


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: custom offsets in ProduceRequest

2016-12-29 Thread Andrey L . Neporada
Hi!

> On 27 Dec 2016, at 19:35, radai  wrote:
> 
> IIUC if you replicate from a single source cluster to a single target
> cluster, the topic has the same number of partitions on both, and no one
> writes directly to the target cluster (so master --> slave) the offsets
> would be preserved.
> 

Yes, exactly. When you
1) create topic with the same number of partitions on both master and slave 
clusters
2) write only to master
3) replicate partition to partition from master to slave
- in this case the offsets will be preserved.

However, you usually already have a cluster that works and want to replicate some 
topics to another one.
IMHO, in this scenario there should be a way to make message offsets equal on 
both clusters.

> but in the general case - how would you handle the case where multiple
> producers "claim" the same offset ?

The same way as Kafka handles concurrent produce requests for the same 
partition - produce requests for a partition are serialized.
If the next produce request “overlaps” with the previous one, it fails.

> 
> 
> On Mon, Dec 26, 2016 at 4:52 AM, Andrey L. Neporada <
> anepor...@yandex-team.ru> wrote:
> 
>> Hi all!
>> 
>> Suppose you have two Kafka clusters and want to replicate topics from
>> primary cluster to secondary one.
>> It would be very convenient for readers if the message offsets for
>> replicated topics would be the same as for primary topics.
>> 
>> As far as I know, currently there is no way to achieve this.
>> I wonder is it possible/reasonable to add message offset to ProduceRequest?
>> 
>> 
>> —
>> Andrey Neporada
>> 
>> 
>> 
>> 



Re: Experiencing trouble with KafkaCSVMetricsReporter

2016-12-29 Thread Ismael Juma
Hi Dongjin,

1. I'm not familiar with `KafkaCSVMetricsReporter`, but it seems to delete
and recreate the csv dir before starting `CsvReporter` which then creates
the csv files. Assuming the permissions are correct, not sure why it would
fail to create the files. It may be worth filing a JIRA (and possibly a PR
with a fix if you figure out the reason).

2. You could check BytesPerSec via JMX to confirm that the problem is
indeed in the CSV reporter (and/or related classes). Again, it's worth
filing a JIRA for this (and a PR with a fix would be even better).

Thanks,
Ismael
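
For the JMX check in (2), a small sketch that reads the broker-wide bytes-in 
meter; it assumes the broker exposes JMX on localhost:9999 (e.g. via JMX_PORT) 
and uses the standard BrokerTopicMetrics MBean name (a per-topic variant adds a 
,topic=<name> key).

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class BytesInPerSecCheck {
        public static void main(String[] args) throws Exception {
            // Assumes the broker was started with JMX enabled on port 9999.
            JMXServiceURL url =
                    new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection mbsc = connector.getMBeanServerConnection();
                // Broker-wide bytes-in rate; compare against what the CSV files report.
                ObjectName bytesIn =
                        new ObjectName("kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec");
                Object rate = mbsc.getAttribute(bytesIn, "OneMinuteRate");
                System.out.println("BytesInPerSec OneMinuteRate = " + rate);
            }
        }
    }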

On Tue, Dec 27, 2016 at 2:31 PM, Dongjin Lee  wrote:

> In short: the resulting csv files from the brokers are filled with 0 only,
> although the broker cluster is running correctly.
>
> Hello. I am trying some benchmarks with KAFKA-4514[^1]. However, my
> KafkaCSVMetricsReporter is not working properly. I would like to ask if
> someone on this mailing list has experienced similar cases.
> Let me explain. I configured a Kafka cluster with 3 zookeeper instances
> and 3 Kafka broker instances. After confirming all is working correctly, I
> added the following properties to the server.properties file and restarted the
> brokers.
> > kafka.metrics.polling.interval.secs=5
> > kafka.metrics.reporters=kafka.metrics.KafkaCSVMetricsReporter
> > kafka.csv.metrics.reporter.enabled=true
> By re-running the brokers and the producer, I acquired some csv files from
> the kafka_metrics directory. But there are two weird things:
> 1. Kafka brokers repeatedly try to create csv files and fail with an IO
> Exception. Why do they try to create files when the resulting csv files
> already exist?
> 2. All cells of the resulting csv files are filled with 0, except their
> headers. Since the producer generated so many messages, they (e.g.,
> BytesPerSec) cannot be 0.
> Any advice or comments are welcome. Thanks in advance.
> Regards,
> Dongjin
> [^1]: https://issues.apache.org/jira/browse/KAFKA-4514