[jira] [Commented] (KAFKA-5358) Consumer perf tool should count rebalance time separately

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137766#comment-16137766
 ] 

ASF GitHub Bot commented on KAFKA-5358:
---

Github user huxihx closed the pull request at:

https://github.com/apache/kafka/pull/3188


> Consumer perf tool should count rebalance time separately
> -
>
> Key: KAFKA-5358
> URL: https://issues.apache.org/jira/browse/KAFKA-5358
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: huxihx
>
> It would be helpful to measure rebalance time separately in the performance 
> tool so that throughput between different versions can be compared more 
> easily in spite of improvements such as 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-134%3A+Delay+initial+consumer+group+rebalance.
>  At the moment, running the perf tool on 0.11.0 or trunk for a short amount 
> of time will present a severely skewed picture since the overall time will be 
> dominated by the join group delay.
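One way to realize the separation is sketched below. This is a simplified illustration, not the actual ConsumerPerformance code: it approximates the rebalance delay as the time until the first records arrive, and all class and method names are hypothetical.

```java
// Track time-to-first-record separately from fetch time, so throughput is
// reported over the fetch window only and the join-group delay is excluded.
class RebalanceAwareStats {
    long startMs = -1, firstRecordMs = -1, endMs = -1;
    long records = 0;

    void start(long nowMs) { startMs = nowMs; }

    // Called after each poll() with the number of records it returned.
    void onPoll(long nowMs, int recordCount) {
        if (recordCount > 0 && firstRecordMs < 0) firstRecordMs = nowMs;
        records += recordCount;
        endMs = nowMs;
    }

    // Approximate rebalance/join time: elapsed before the first records arrived.
    long rebalanceTimeMs() { return firstRecordMs - startMs; }

    // Throughput excluding the rebalance delay.
    double recordsPerSec() {
        long fetchMs = endMs - firstRecordMs;
        return fetchMs > 0 ? records * 1000.0 / fetchMs : 0.0;
    }
}
```

With this split, a short run against a broker using the KIP-134 join delay would report a large `rebalanceTimeMs()` but an undistorted `recordsPerSec()`.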



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5358) Consumer perf tool should count rebalance time separately

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137764#comment-16137764
 ] 

ASF GitHub Bot commented on KAFKA-5358:
---

GitHub user huxihx opened a pull request:

https://github.com/apache/kafka/pull/3723

KAFKA-5358: Consumer perf tool should count rebalance time.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/huxihx/kafka KAFKA-5358

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3723.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3723


commit 934f24e32479453ce437fef27a796c9e2b1b2514
Author: huxihx 
Date:   2017-08-23T02:07:18Z

As per Jason's comments, refined  type from  to a naive




> Consumer perf tool should count rebalance time separately
> -
>
> Key: KAFKA-5358
> URL: https://issues.apache.org/jira/browse/KAFKA-5358
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: huxihx
>
> It would be helpful to measure rebalance time separately in the performance 
> tool so that throughput between different versions can be compared more 
> easily in spite of improvements such as 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-134%3A+Delay+initial+consumer+group+rebalance.
>  At the moment, running the perf tool on 0.11.0 or trunk for a short amount 
> of time will present a severely skewed picture since the overall time will be 
> dominated by the join group delay.





[jira] [Commented] (KAFKA-5731) Connect WorkerSinkTask out of order offset commit can lead to inconsistent state

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137736#comment-16137736
 ] 

ASF GitHub Bot commented on KAFKA-5731:
---

Github user rhauch closed the pull request at:

https://github.com/apache/kafka/pull/3717


> Connect WorkerSinkTask out of order offset commit can lead to inconsistent 
> state
> 
>
> Key: KAFKA-5731
> URL: https://issues.apache.org/jira/browse/KAFKA-5731
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Jason Gustafson
>Assignee: Randall Hauch
> Fix For: 0.10.2.2, 0.11.0.1
>
>
> In Connect's WorkerSinkTask, we do sequence number validation to ensure that 
> offset commits are handled in the right order 
> (https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L199).
>  
> Unfortunately, for asynchronous commits, the {{lastCommittedOffsets}} field 
> is overridden regardless of this sequence check as long as the response had 
> no error 
> (https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L284):
> {code:java}
> OffsetCommitCallback cb = new OffsetCommitCallback() {
>     @Override
>     public void onComplete(Map<TopicPartition, OffsetAndMetadata> offsets, Exception error) {
>         if (error == null) {
>             lastCommittedOffsets = offsets;
>         }
>         onCommitCompleted(error, seqno);
>     }
> };
> {code}
> Hence if we get an out of order commit, then the internal state will be 
> inconsistent. To fix this, we should only override {{lastCommittedOffsets}} 
> after sequence validation as part of the {{onCommitCompleted(...)}} method.
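The fix described above can be sketched in isolation. This is a minimal illustration, not the actual WorkerSinkTask code: {{String}} and {{Long}} stand in for {{TopicPartition}} and {{OffsetAndMetadata}}, and all names other than {{onCommitCompleted}} are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Track a commit sequence number and let onCommitCompleted() decide whether a
// completed request is still the most recent one BEFORE the committed offsets
// are updated, so an out-of-order completion cannot overwrite newer state.
class CommitTracker {
    private int commitSeqno = 0;          // next sequence number to hand out
    private int lastCommittedSeqno = -1;  // highest seqno seen completing
    private Map<String, Long> lastCommittedOffsets = new HashMap<>();

    int nextSeqno() { return commitSeqno++; }

    // Called from the commit callback; updates state only if this completion
    // is newer than anything seen so far (the sequence validation).
    void onCommitCompleted(int seqno, Map<String, Long> offsets, Exception error) {
        if (error == null && seqno > lastCommittedSeqno) {
            lastCommittedSeqno = seqno;
            lastCommittedOffsets = new HashMap<>(offsets);
        }
    }

    Map<String, Long> lastCommittedOffsets() { return lastCommittedOffsets; }
}
```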





[jira] [Commented] (KAFKA-5603) Streams should not abort transaction when closing zombie task

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137706#comment-16137706
 ] 

ASF GitHub Bot commented on KAFKA-5603:
---

GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/3722

KAFKA-5603: Don't abort TX for zombie tasks



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka 
kafka-5603-dont-abort-tx-for-zombie-tasks-01101

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3722.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3722


commit 72cff12f2274eb16f44fde85dcc7d5ff30614a3f
Author: Matthias J. Sax 
Date:   2017-08-23T01:06:32Z

KAFKA-5603: Don't abort TX for zombie tasks




> Streams should not abort transaction when closing zombie task
> -
>
> Key: KAFKA-5603
> URL: https://issues.apache.org/jira/browse/KAFKA-5603
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Critical
> Fix For: 0.11.0.1
>
>
> The contract of the transactional producer API is to not call any 
> transactional method after a {{ProducerFenced}} exception was thrown.
> Streams however, does an unconditional call within {{StreamTask#close()}} to 
> {{abortTransaction()}} in case of unclean shutdown. We need to distinguish 
> between a {{ProducerFenced}} and other unclean shutdown cases.
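The distinction the ticket asks for can be sketched as follows. This is a simplified illustration with hypothetical names, not the actual {{StreamTask#close()}} code.

```java
// On unclean shutdown, abort the transaction only if the producer has NOT
// already been fenced: calling abortTransaction() after a ProducerFenced
// exception violates the transactional-producer contract.
class ZombieSafeClose {

    interface TxProducer { void abortTransaction(); }

    // Returns true if abortTransaction() was invoked.
    static boolean close(TxProducer producer, boolean cleanShutdown, boolean producerFenced) {
        if (!cleanShutdown && !producerFenced) {
            producer.abortTransaction();  // safe: this producer still owns the transaction
            return true;
        }
        return false;  // clean shutdown, or a fenced zombie: skip transactional calls
    }
}
```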





[jira] [Commented] (KAFKA-5768) Upgrade ducktape version to 0.7.0, and use new kill_java_processes

2017-08-22 Thread Colin P. McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137689#comment-16137689
 ] 

Colin P. McCabe commented on KAFKA-5768:


https://github.com/apache/kafka/pull/3721

> Upgrade ducktape version to 0.7.0, and use new kill_java_processes
> --
>
> Key: KAFKA-5768
> URL: https://issues.apache.org/jira/browse/KAFKA-5768
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>
> Upgrade the ducktape version to 0.7.0.  Use the new {{kill_java_processes}} 
> function in ducktape to kill only the processes that are part of a service 
> when starting or stopping a service, rather than killing all java processes 
> (in some cases)





[jira] [Commented] (KAFKA-3705) Support non-key joining in KTable

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137686#comment-16137686
 ] 

ASF GitHub Bot commented on KAFKA-3705:
---

GitHub user guozhangwang opened a pull request:

https://github.com/apache/kafka/pull/3720

[DO NOT MERGE] KAFKA-3705: non-key joins

This is just for reviewing the diff easily to see how it is done by 
@jfillipiak.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Kaiserchen/kafka KAFKA3705

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3720.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3720


commit 3da2b8f787a5d30dee2de71cf0f125ab3e57d89b
Author: jfilipiak 
Date:   2017-06-30T09:00:39Z

onetomany join signature to show on mailing list

commit cc9c6f4a68170fb829adb46a6de40ec0fc75716f
Author: jfilipiak 
Date:   2017-07-12T14:49:43Z

stores

commit 807e90aac82d7659310ce92066ac1df6e339068a
Author: jfilipiak 
Date:   2017-07-26T06:06:58Z

just throw in most of the processors, wont build

commit 1a6ff7b01ad35dd7eedf4c69aa534043ab1a8eb8
Author: jfilipiak 
Date:   2017-08-18T10:07:34Z

random clean up

commit ffe9b9496afbdad73bfcb9c014b6045b8ca95e79
Author: jfilipiak 
Date:   2017-08-19T19:22:02Z

clean up as much as possible




> Support non-key joining in KTable
> -
>
> Key: KAFKA-3705
> URL: https://issues.apache.org/jira/browse/KAFKA-3705
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>  Labels: api
>
> Today in Kafka Streams DSL, KTable joins are only based on keys. If users 
> want to join a KTable A by key {{a}} with another KTable B by key {{b}} but 
> with a "foreign key" {{a}}, and assuming they are read from two topics which 
> are partitioned on {{a}} and {{b}} respectively, they need to do the 
> following pattern:
> {code}
> tableB' = tableB.groupBy(/* select on field "a" */).agg(...); // now tableB' 
> is partitioned on "a"
> tableA.join(tableB', joiner);
> {code}
> Even if these two tables are read from two topics which are already 
> partitioned on {{a}}, users still need to do the pre-aggregation in order to 
> make the two joining streams be on the same key. This is a drawback in 
> programmability, and we should fix it.
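Independent of any proposed Streams API, the computation a non-key (foreign-key) join performs can be illustrated with plain in-memory maps; all names here are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Every record in tableB carries a foreign key pointing into tableA; the join
// pairs each B value with the A value found under that foreign key.
class ForeignKeyJoin {
    // tableA: a -> valueA; tableB: b -> {foreignKey a, valueB}
    static Map<String, String> join(Map<String, String> tableA,
                                    Map<String, String[]> tableB) {
        Map<String, String> result = new HashMap<>();
        for (Map.Entry<String, String[]> e : tableB.entrySet()) {
            String foreignKey = e.getValue()[0];
            String valueA = tableA.get(foreignKey);
            if (valueA != null)  // inner join: drop B rows with no matching A row
                result.put(e.getKey(), valueA + "," + e.getValue()[1]);
        }
        return result;
    }
}
```

In Streams terms, the pre-aggregation pattern quoted above repartitions tableB on the foreign key to make this lookup co-partitioned with tableA.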





[jira] [Updated] (KAFKA-2590) KIP-28: Kafka Streams Checklist

2017-08-22 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-2590:
-
Fix Version/s: 0.11.0.0

> KIP-28: Kafka Streams Checklist
> ---
>
> Key: KAFKA-2590
> URL: https://issues.apache.org/jira/browse/KAFKA-2590
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.11.0.0
>
>
> This is an umbrella story for the processor client and Kafka Streams feature 
> implementation.





[jira] [Updated] (KAFKA-4125) Provide low-level Processor API meta data in DSL layer

2017-08-22 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-4125:
-
Issue Type: New Feature  (was: Sub-task)
Parent: (was: KAFKA-2590)

> Provide low-level Processor API meta data in DSL layer
> --
>
> Key: KAFKA-4125
> URL: https://issues.apache.org/jira/browse/KAFKA-4125
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Jeyhun Karimov
>Priority: Minor
>  Labels: kip
> Fix For: 1.0.0
>
>
> For the Processor API, users can get metadata like record offset, timestamp, etc. 
> via the provided {{Context}} object. It might be useful to allow users to 
> access this information in the DSL layer, too.
> The idea would be to do it "the Flink way", i.e., by providing
> RichFunctions. {{mapValues()}}, for example,
> takes a {{ValueMapper}} that only has the method
> {noformat}
> V2 apply(V1 value);
> {noformat}
> Thus, you cannot get any metadata within {{apply()}} (it's completely "blind").
> We would add two more interfaces: {{RichFunction}} with a method
> {{open(Context context)}}, and
> {noformat}
> RichValueMapper extends ValueMapper, RichFunction
> {noformat}
> This way, the user can choose to implement a Rich- or Standard-function, and
> we do not need to change existing APIs. Both can be handed into
> {{KStream.mapValues()}}, for example. Internally, we check if a Rich
> function is provided, and if yes, hand in the {{Context}} object once, to
> make it available to the user, who can then access it within {{apply()}} -- of
> course, the user must set a member variable in {{open()}} to hold the
> reference to the Context object.
> KIP: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-159%3A+Introducing+Rich+functions+to+Streams
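The proposed interfaces can be sketched as plain Java. Everything below follows the ticket's proposal with stand-in types; it is not a shipped Kafka API.

```java
// A RichFunction receives the Context once via open(); apply() can then read
// record metadata from a member field, exactly as the description suggests.
class RichFunctionSketch {
    static class Context {               // stand-in for the processor context
        final long offset;
        Context(long offset) { this.offset = offset; }
    }

    interface ValueMapper<V1, V2> { V2 apply(V1 value); }

    interface RichFunction { void open(Context context); }

    interface RichValueMapper<V1, V2> extends ValueMapper<V1, V2>, RichFunction {}

    // Example implementation: tags each value with the record's offset.
    static class OffsetTagger implements RichValueMapper<String, String> {
        private Context context;  // set in open() to keep the reference
        public void open(Context context) { this.context = context; }
        public String apply(String value) { return value + "@" + context.offset; }
    }
}
```

Because `RichValueMapper` extends `ValueMapper`, a DSL method accepting a `ValueMapper` could check `instanceof RichFunction` and call `open()` once, without changing the existing signature.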





[jira] [Updated] (KAFKA-5731) Connect WorkerSinkTask out of order offset commit can lead to inconsistent state

2017-08-22 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-5731:
-
Fix Version/s: 0.10.2.2

> Connect WorkerSinkTask out of order offset commit can lead to inconsistent 
> state
> 
>
> Key: KAFKA-5731
> URL: https://issues.apache.org/jira/browse/KAFKA-5731
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Jason Gustafson
>Assignee: Randall Hauch
> Fix For: 0.10.2.2, 0.11.0.1
>
>
> In Connect's WorkerSinkTask, we do sequence number validation to ensure that 
> offset commits are handled in the right order 
> (https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L199).
>  
> Unfortunately, for asynchronous commits, the {{lastCommittedOffsets}} field 
> is overridden regardless of this sequence check as long as the response had 
> no error 
> (https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L284):
> {code:java}
> OffsetCommitCallback cb = new OffsetCommitCallback() {
>     @Override
>     public void onComplete(Map<TopicPartition, OffsetAndMetadata> offsets, Exception error) {
>         if (error == null) {
>             lastCommittedOffsets = offsets;
>         }
>         onCommitCompleted(error, seqno);
>     }
> };
> {code}
> Hence if we get an out of order commit, then the internal state will be 
> inconsistent. To fix this, we should only override {{lastCommittedOffsets}} 
> after sequence validation as part of the {{onCommitCompleted(...)}} method.





[jira] [Resolved] (KAFKA-2590) KIP-28: Kafka Streams Checklist

2017-08-22 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-2590.
--
Resolution: Fixed
  Assignee: Guozhang Wang

> KIP-28: Kafka Streams Checklist
> ---
>
> Key: KAFKA-2590
> URL: https://issues.apache.org/jira/browse/KAFKA-2590
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>
> This is an umbrella story for the processor client and Kafka Streams feature 
> implementation.





[jira] [Updated] (KAFKA-3429) Remove Serdes needed for repartitioning in KTable stateful operations

2017-08-22 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3429:
-
Issue Type: Improvement  (was: Sub-task)
Parent: (was: KAFKA-2590)

> Remove Serdes needed for repartitioning in KTable stateful operations
> -
>
> Key: KAFKA-3429
> URL: https://issues.apache.org/jira/browse/KAFKA-3429
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Guozhang Wang
>  Labels: api
>
> Currently in KTable aggregate operations where a repartition is possibly 
> needed since the aggregation key may not be the same as the original primary 
> key, we require the users to provide serdes (default to configured ones) for 
> read / write to the internally created re-partition topic. However, these are 
> not necessary since for all KTable instances either generated from the topics 
> directly:
> {code}table = builder.table(...){code}
> or from aggregation operations:
> {code}table = stream.aggregate(...){code}
> There are already serdes provided for materializing the data, and hence the 
> same serde can be re-used when the resulting KTable is involved in future 
> aggregation operations. For example:
> {code}
> table1 = stream.aggregateByKey(serde);
> table2 = table1.aggregate(aggregator, selector, originalSerde, 
> aggregateSerde);
> {code}
> We would not need to require users to specify the "originalSerde" in 
> table1.aggregate since it could always reuse the "serde" from 
> stream.aggregateByKey, which is used to materialize the table1 object.
> In order to get rid of it, implementation-wise we need to carry the serde 
> information along with the KTableImpl instance in order to re-use it in a 
> future operation that requires repartitioning.





[jira] [Updated] (KAFKA-4113) Allow KTable bootstrap

2017-08-22 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-4113:
-
Issue Type: New Feature  (was: Sub-task)
Parent: (was: KAFKA-2590)

> Allow KTable bootstrap
> --
>
> Key: KAFKA-4113
> URL: https://issues.apache.org/jira/browse/KAFKA-4113
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Guozhang Wang
>
> On the mailing list, there are multiple requests about the possibility to 
> "fully populate" a KTable before actual stream processing starts.
> Even if it is somewhat difficult to define when the initial populating phase 
> should end, there are multiple possibilities:
> The main idea is that there is a rarely updated topic that contains the 
> data. Only after this topic has been read completely and the KTable is ready 
> should the application start processing. This would mean that, on startup, 
> the current partition sizes must be fetched and stored, and after the KTable 
> has been populated up to those offsets, stream processing can start.
> Other discussed ideas are:
> 1) an initial fixed time period for populating
> (it might be hard for a user to estimate the correct value)
> 2) an "idle" period, i.e., if no update to a KTable happens for a certain 
> time, we consider it as populated
> 3) a timestamp cut-off point, i.e., all records with an older timestamp
> belong to the initial populating phase
> The API change is not decided yet, and the API design is part of this JIRA.
> One suggestion (for option (4)) was:
> {noformat}
> KTable table = builder.table("topic", 1000); // populate the table without 
> reading any other topics until seeing one record with timestamp 1000.
> {noformat}
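The offset-based variant described above (fetch the end offsets at startup, consume until they are reached) can be sketched as follows; the helper and its names are hypothetical, not a Streams API.

```java
import java.util.Map;

// Capture the partition end offsets at startup, then treat the KTable as
// populated once consumption has reached all of them.
class BootstrapGate {
    private final Map<Integer, Long> endOffsetsAtStartup;  // partition -> end offset

    BootstrapGate(Map<Integer, Long> endOffsetsAtStartup) {
        this.endOffsetsAtStartup = endOffsetsAtStartup;
    }

    // True once every partition has been consumed up to its startup end offset.
    boolean populated(Map<Integer, Long> consumedOffsets) {
        for (Map.Entry<Integer, Long> e : endOffsetsAtStartup.entrySet()) {
            Long consumed = consumedOffsets.get(e.getKey());
            if (consumed == null || consumed < e.getValue()) return false;
        }
        return true;
    }
}
```

Records appended after startup do not delay the gate, since only the offsets snapshotted at startup must be reached.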





[jira] [Updated] (KAFKA-3429) Remove Serdes needed for repartitioning in KTable stateful operations

2017-08-22 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3429:
-
Labels: api  (was: api newbie++)

> Remove Serdes needed for repartitioning in KTable stateful operations
> -
>
> Key: KAFKA-3429
> URL: https://issues.apache.org/jira/browse/KAFKA-3429
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Guozhang Wang
>  Labels: api
>
> Currently in KTable aggregate operations where a repartition is possibly 
> needed since the aggregation key may not be the same as the original primary 
> key, we require the users to provide serdes (default to configured ones) for 
> read / write to the internally created re-partition topic. However, these are 
> not necessary since for all KTable instances either generated from the topics 
> directly:
> {code}table = builder.table(...){code}
> or from aggregation operations:
> {code}table = stream.aggregate(...){code}
> There are already serdes provided for materializing the data, and hence the 
> same serde can be re-used when the resulting KTable is involved in future 
> aggregation operations. For example:
> {code}
> table1 = stream.aggregateByKey(serde);
> table2 = table1.aggregate(aggregator, selector, originalSerde, 
> aggregateSerde);
> {code}
> We would not need to require users to specify the "originalSerde" in 
> table1.aggregate since it could always reuse the "serde" from 
> stream.aggregateByKey, which is used to materialize the table1 object.
> In order to get rid of it, implementation-wise we need to carry the serde 
> information along with the KTableImpl instance in order to re-use it in a 
> future operation that requires repartitioning.





[jira] [Updated] (KAFKA-5768) Upgrade ducktape version to 0.7.0, and use new kill_java_processes

2017-08-22 Thread Colin P. McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin P. McCabe updated KAFKA-5768:
---
Description: Upgrade the ducktape version to 0.7.0.  Use the new 
{{kill_java_processes}} function in ducktape to kill only the processes that 
are part of a service when starting or stopping a service, rather than killing 
all java processes (in some cases)  (was: Upgrade the ducktape version to 
0.7.0.  Use the new `kill_java_processes` function in ducktape to kill only the 
processes that are part of a service when starting or stopping a service, 
rather than killing all java processes (in some cases))

> Upgrade ducktape version to 0.7.0, and use new kill_java_processes
> --
>
> Key: KAFKA-5768
> URL: https://issues.apache.org/jira/browse/KAFKA-5768
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>
> Upgrade the ducktape version to 0.7.0.  Use the new {{kill_java_processes}} 
> function in ducktape to kill only the processes that are part of a service 
> when starting or stopping a service, rather than killing all java processes 
> (in some cases)





[jira] [Created] (KAFKA-5768) Upgrade ducktape version to 0.7.0, and use new kill_java_processes

2017-08-22 Thread Colin P. McCabe (JIRA)
Colin P. McCabe created KAFKA-5768:
--

 Summary: Upgrade ducktape version to 0.7.0, and use new 
kill_java_processes
 Key: KAFKA-5768
 URL: https://issues.apache.org/jira/browse/KAFKA-5768
 Project: Kafka
  Issue Type: Improvement
Reporter: Colin P. McCabe
Assignee: Colin P. McCabe


Upgrade the ducktape version to 0.7.0.  Use the new `kill_java_processes` 
function in ducktape to kill only the processes that are part of a service when 
starting or stopping a service, rather than killing all java processes (in some 
cases)





[jira] [Commented] (KAFKA-5603) Streams should not abort transaction when closing zombie task

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137647#comment-16137647
 ] 

ASF GitHub Bot commented on KAFKA-5603:
---

GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/3719

KAFKA-5603: Don't abort TX for zombie tasks



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka 
kafka-5603-dont-abort-tx-for-zombie-tasks-2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3719.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3719


commit d1814d2e657dea4fb37b98305cc0f960119a123f
Author: Matthias J. Sax 
Date:   2017-08-23T00:09:52Z

KAFKA-5603: Don't abort TX for zombie tasks




> Streams should not abort transaction when closing zombie task
> -
>
> Key: KAFKA-5603
> URL: https://issues.apache.org/jira/browse/KAFKA-5603
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Critical
> Fix For: 0.11.0.1
>
>
> The contract of the transactional producer API is to not call any 
> transactional method after a {{ProducerFenced}} exception was thrown.
> Streams however, does an unconditional call within {{StreamTask#close()}} to 
> {{abortTransaction()}} in case of unclean shutdown. We need to distinguish 
> between a {{ProducerFenced}} and other unclean shutdown cases.





[jira] [Commented] (KAFKA-5767) Kafka server should halt if IBP < 1.0.0 and there is log directory failure

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137619#comment-16137619
 ] 

ASF GitHub Bot commented on KAFKA-5767:
---

GitHub user lindong28 opened a pull request:

https://github.com/apache/kafka/pull/3718

KAFKA-5767; Kafka server should halt if IBP < 1.0.0 and there is log 
directory failure



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-5767

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3718.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3718


commit e3ca5fa9b5d78b987652bb2d1b7600e2df992109
Author: Dong Lin 
Date:   2017-08-22T23:40:17Z

KAFKA-5767; Kafka server should halt if IBP < 1.0.0 and there is log 
directory failure




> Kafka server should halt if IBP < 1.0.0 and there is log directory failure
> --
>
> Key: KAFKA-5767
> URL: https://issues.apache.org/jira/browse/KAFKA-5767
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
>






[jira] [Commented] (KAFKA-5731) Connect WorkerSinkTask out of order offset commit can lead to inconsistent state

2017-08-22 Thread Randall Hauch (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137610#comment-16137610
 ] 

Randall Hauch commented on KAFKA-5731:
--

[~hachikuji] or [~ewencp]: I've reopened the issue so this can be backported to 
the {{0.10.2}} branch, and added [this pull 
request|https://github.com/apache/kafka/pull/3717] for that branch. Thanks!

> Connect WorkerSinkTask out of order offset commit can lead to inconsistent 
> state
> 
>
> Key: KAFKA-5731
> URL: https://issues.apache.org/jira/browse/KAFKA-5731
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Jason Gustafson
>Assignee: Randall Hauch
> Fix For: 0.11.0.1
>
>
> In Connect's WorkerSinkTask, we do sequence number validation to ensure that 
> offset commits are handled in the right order 
> (https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L199).
>  
> Unfortunately, for asynchronous commits, the {{lastCommittedOffsets}} field 
> is overridden regardless of this sequence check as long as the response had 
> no error 
> (https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L284):
> {code:java}
> OffsetCommitCallback cb = new OffsetCommitCallback() {
>     @Override
>     public void onComplete(Map<TopicPartition, OffsetAndMetadata> offsets, Exception error) {
>         if (error == null) {
>             lastCommittedOffsets = offsets;
>         }
>         onCommitCompleted(error, seqno);
>     }
> };
> {code}
> Hence if we get an out of order commit, then the internal state will be 
> inconsistent. To fix this, we should only override {{lastCommittedOffsets}} 
> after sequence validation as part of the {{onCommitCompleted(...)}} method.





[jira] [Commented] (KAFKA-5731) Connect WorkerSinkTask out of order offset commit can lead to inconsistent state

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137605#comment-16137605
 ] 

ASF GitHub Bot commented on KAFKA-5731:
---

GitHub user rhauch opened a pull request:

https://github.com/apache/kafka/pull/3717

KAFKA-5731 Corrected how the sink task worker updates the last committed 
offsets (0.10.2)

Prior to this change, it was possible for the synchronous consumer commit 
request to be handled before previously-submitted asynchronous commit requests. 
If that happened, the out-of-order handlers improperly set the last committed 
offsets, which then became inconsistent with the offsets the connector task is 
working with.

This change ensures that the last committed offsets are updated only for 
the most recent commit request, even if the consumer reorders the calls to the 
callbacks.

This change also backports the minimal fix for KAFKA-4942, without which the 
new tests fail.

**This is for the `0.10.2` branch; see #3662 for the equivalent and 
already-approved PR for `trunk` and #3672 for the equivalent and 
already-approved PR for the `0.11.0` branch.**

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rhauch/kafka kafka-5731-0.10.2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3717.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3717


commit 4e39118ae302980e6d8b75cb09534172d99d5b7a
Author: Randall Hauch 
Date:   2017-08-12T00:42:06Z

KAFKA-5731 Corrected how the sink task worker updates the last committed 
offsets

Prior to this change, it was possible for the synchronous consumer commit 
request to be handled before previously-submitted asynchronous commit requests. 
If that happened, the out-of-order handlers improperly set the last committed 
offsets, which then became inconsistent with the offsets the connector task is 
working with.

This change ensures that the last committed offsets are updated only for 
the most recent commit request, even if the consumer reorders the calls to the 
callbacks.

commit 6f74ef8e89f1193a5deb66870022591d51eb6580
Author: Randall Hauch 
Date:   2017-08-14T19:11:08Z

KAFKA-5731 Corrected mock consumer behavior during rebalance

Corrects the test case added in the previous commit to properly revoke the 
existing partition assignments before adding new partition assignments.

commit 3c2531b1abdaf3cdaac3781a45597a616652ff1c
Author: Randall Hauch 
Date:   2017-08-14T19:11:45Z

KAFKA-5731 Added expected call that was missing in another test

commit 05567b1677e7f5a39ca0f20d86773c872193da0b
Author: Randall Hauch 
Date:   2017-08-14T22:24:35Z

KAFKA-5731 Improved log messages related to offset commits

# Conflicts:
#   
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java

commit 60eba0ce024f3a211f33bf20bc04febbebb7d1c4
Author: Randall Hauch 
Date:   2017-08-15T14:47:05Z

KAFKA-5731 More cleanup of log messages related to offset commits

commit ff123bfb910742e3a5c320fff6b23ff645ef62a2
Author: Randall Hauch 
Date:   2017-08-15T16:21:52Z

KAFKA-5731 More improvements to the log messages in WorkerSinkTask

# Conflicts:
#   
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java

commit d5f1b29a4cb41c139094cbed9b78cf51594f861c
Author: Randall Hauch 
Date:   2017-08-15T16:31:28Z

KAFKA-5731 Removed unnecessary log message

commit f2b02cda83876b7d55f331cf89d9a306ab2b467f
Author: Randall Hauch 
Date:   2017-08-15T17:54:16Z

KAFKA-5731 Additional tweaks to debug and trace log messages to ensure 
clarity and usefulness

commit fa427b7557b93a600e0007e1a4adfb4aa38f526b
Author: Randall Hauch 
Date:   2017-08-15T19:30:09Z

KAFKA-5731 Use the correct value in trace messages

commit 957f0acac6154fda6b522e89c00df3dcb299
Author: Randall Hauch 
Date:   2017-08-22T23:29:52Z

KAFKA-4942 Fix commitTimeoutMs being set before the commit actually started

Backported the fix for this issue, which was fixed in 0.11.0.0




> Connect WorkerSinkTask out of order offset commit can lead to inconsistent 
> state
> 
>
> Key: KAFKA-5731
> URL: https://issues.apache.org/jira/browse/KAFKA-5731
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Jason Gustafson
>Assignee: Randall Hauch
> Fix For: 

[jira] [Created] (KAFKA-5767) Kafka server should halt if IBP < 1.0.0 and there is log directory failure

2017-08-22 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-5767:
---

 Summary: Kafka server should halt if IBP < 1.0.0 and there is log 
directory failure
 Key: KAFKA-5767
 URL: https://issues.apache.org/jira/browse/KAFKA-5767
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin
Assignee: Dong Lin






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (KAFKA-5156) Options for handling exceptions in streams

2017-08-22 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax reassigned KAFKA-5156:
--

Assignee: Matthias J. Sax  (was: Eno Thereska)

> Options for handling exceptions in streams
> --
>
> Key: KAFKA-5156
> URL: https://issues.apache.org/jira/browse/KAFKA-5156
> Project: Kafka
>  Issue Type: Task
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Eno Thereska
>Assignee: Matthias J. Sax
>  Labels: user-experience
> Fix For: 1.0.0
>
>
> This is a task around options for handling exceptions in streams. It focuses 
> on options for dealing with corrupt data (keep going, stop streams, log, 
> retry, etc.).





[jira] [Commented] (KAFKA-3473) Add controller channel manager request queue time metric.

2017-08-22 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137320#comment-16137320
 ] 

Ismael Juma commented on KAFKA-3473:


[~omkreddy], we added metrics for the controller channel queue size, but not a 
queue time metric.

> Add controller channel manager request queue time metric.
> -
>
> Key: KAFKA-3473
> URL: https://issues.apache.org/jira/browse/KAFKA-3473
> Project: Kafka
>  Issue Type: Improvement
>  Components: controller
>Affects Versions: 0.10.0.0
>Reporter: Jiangjie Qin
>Assignee: Dong Lin
>
> Currently the controller appends requests to brokers into the controller channel 
> manager queue during state transitions, i.e. the state transitions are 
> propagated asynchronously. We need to track the request queue time on the 
> controller side to see how long state propagation is delayed after the 
> state transition finishes on the controller.
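The queue-time measurement the description asks for can be sketched like this. This is an illustrative sketch only, not the actual ControllerChannelManager code: each request is stamped with its enqueue time, and the elapsed queue time is reported to a metric recorder when the request is dequeued.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.TimeUnit;
import java.util.function.LongConsumer;

// Sketch: wrap each queued request with its enqueue timestamp so the
// dequeue path can compute how long it sat in the queue.
class TimedRequestQueue<T> {
    private static final class Entry<T> {
        final T request;
        final long enqueuedNanos;
        Entry(T request, long enqueuedNanos) {
            this.request = request;
            this.enqueuedNanos = enqueuedNanos;
        }
    }

    private final Deque<Entry<T>> queue = new ArrayDeque<>();

    void add(T request) {
        queue.add(new Entry<>(request, System.nanoTime()));
    }

    // In the real broker the recorder would update a queue-time histogram;
    // here it is just a callback receiving the elapsed time in milliseconds.
    T poll(LongConsumer queueTimeMsRecorder) {
        Entry<T> e = queue.poll();
        if (e == null) return null;
        queueTimeMsRecorder.accept(
                TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - e.enqueuedNanos));
        return e.request;
    }
}
```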





[jira] [Commented] (KAFKA-3473) Add controller channel manager request queue time metric.

2017-08-22 Thread Manikumar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137281#comment-16137281
 ] 

Manikumar commented on KAFKA-3473:
--

@ijuma Is this covered in KAFKA-5135/KIP-143?

> Add controller channel manager request queue time metric.
> -
>
> Key: KAFKA-3473
> URL: https://issues.apache.org/jira/browse/KAFKA-3473
> Project: Kafka
>  Issue Type: Improvement
>  Components: controller
>Affects Versions: 0.10.0.0
>Reporter: Jiangjie Qin
>Assignee: Dong Lin
>
> Currently the controller appends requests to brokers into the controller channel 
> manager queue during state transitions, i.e. the state transitions are 
> propagated asynchronously. We need to track the request queue time on the 
> controller side to see how long state propagation is delayed after the 
> state transition finishes on the controller.





[jira] [Resolved] (KAFKA-3653) expose the queue size in ControllerChannelManager

2017-08-22 Thread Manikumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-3653.
--
Resolution: Fixed

Fixed in KAFKA-5135/KIP-143

> expose the queue size in ControllerChannelManager
> -
>
> Key: KAFKA-3653
> URL: https://issues.apache.org/jira/browse/KAFKA-3653
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jun Rao
>Assignee: Gwen Shapira
>
> Currently, ControllerChannelManager maintains a queue per broker. If the 
> queue fills up, metadata propagation to the broker is delayed. It would be 
> useful to expose a metric on the size of the queue for monitoring.





[jira] [Resolved] (KAFKA-3800) java client can`t poll msg

2017-08-22 Thread Manikumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-3800.
--
Resolution: Cannot Reproduce

 Please reopen if the issue still exists. 


> java client can`t poll msg
> --
>
> Key: KAFKA-3800
> URL: https://issues.apache.org/jira/browse/KAFKA-3800
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.1
> Environment: java8,win7 64
>Reporter: frank
>Assignee: Neha Narkhede
>
> I use a mixed-case topic name (e.g. Test_4), and after poll the messages are 
> null. Why? All-lowercase topic names work. I also tried Node.js, and 
> kafka-console-consumers.bat works.





[jira] [Resolved] (KAFKA-3927) kafka broker config docs issue

2017-08-22 Thread Manikumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-3927.
--
Resolution: Later

Yes, these changes were done in KAFKA-615. Please reopen if the issue still 
exists. 


> kafka broker config docs issue
> --
>
> Key: KAFKA-3927
> URL: https://issues.apache.org/jira/browse/KAFKA-3927
> Project: Kafka
>  Issue Type: Bug
>  Components: website
>Affects Versions: 0.10.0.0
>Reporter: Shawn Guo
>Priority: Minor
>
> https://kafka.apache.org/documentation.html#brokerconfigs
> log.flush.interval.messages 
> default value is "9223372036854775807"
> log.flush.interval.ms 
> default value is null
> log.flush.scheduler.interval.ms 
> default value is "9223372036854775807"
> etc. Obviously these default values are incorrect. How are these docs 
> generated? It looks confusing. 





[jira] [Comment Edited] (KAFKA-5749) Refactor SessionStore hierarchy

2017-08-22 Thread Andrey Dyachkov (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137226#comment-16137226
 ] 

Andrey Dyachkov edited comment on KAFKA-5749 at 8/22/17 7:13 PM:
-

[~miguno] It would be very nice if someone could guide me through at this point, 
because I do not understand how things work here. 
I can explain more: I have already made a couple of contributions 
([KAFKA-4643|https://issues.apache.org/jira/browse/KAFKA-4643], 
[KAFKA-4657|https://issues.apache.org/jira/browse/KAFKA-4657]). I decided I 
could move further with it and took 
[KAFKA-5723|https://issues.apache.org/jira/browse/KAFKA-5723], created a 
[pr|https://github.com/apache/kafka/pull/3671], and asked committers to help me 
with it, but I have not gotten any response yet (7 days). I even wrote an email 
to the dev list asking to be added to the contributors list and for help with 
[KAFKA-5723|https://issues.apache.org/jira/browse/KAFKA-5723], but no response. 
I think it is related to the priority of the tickets, which I do not know, and 
maybe the number of people supporting them. That is why I ended up commenting on 
tickets that are subsets of improvements being developed by committers.

I am ready to spend my time learning and contributing to client-side 
code (clients, admin, streams, etc.), since my main tech stack is around Java. 
Sorry for writing it here; I hope you guys can help me with this.


was (Author: andrey.dyach...@gmail.com):
[~miguno] It would be very nice if someone will guide me through at that point, 
becasue I do not understand how it works here. 
I can explain more: I have already made a couple contributions 
([KAFKA-4643|https://issues.apache.org/jira/browse/KAFKA-4643], 
[KAFKA-4657|https://issues.apache.org/jira/browse/KAFKA-4657]). I had decided I 
could move futher with it and took 
[KAFKA-5723|https://issues.apache.org/jira/browse/KAFKA-5723], created 
[pr|https://github.com/apache/kafka/pull/3671] and asked commiters to help me 
with that, but I have not got any response yet(7 days). I even wrote email to 
dev list asking adding me to contributors list and helping with 
[KAFKA-5723|https://issues.apache.org/jira/browse/KAFKA-5723], but no reponse. 
I think it is related to the priority of the tickets, which I do not know and 
maybe number of people supporting it. That is why I ended up commenting on the 
tickets which are subsets of improvments, which is being developed by commiters.

I am ready to spent my time on learning and contributing to client side 
code(clients, admin, streams etc.), since my main tech stack is around Java. 
Sorry for writting it here, I hope you guys can help me with that.

> Refactor SessionStore hierarchy
> ---
>
> Key: KAFKA-5749
> URL: https://issues.apache.org/jira/browse/KAFKA-5749
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Damian Guy
>Assignee: Damian Guy
> Fix For: 1.0.0
>
>
> In order to support a bytes store we need to create a MeteredSessionStore and 
> ChangeloggingSessionStore. We then need to refactor the current SessionStore 
> implementations to use these. All inner stores should be of type <Bytes, byte[]>
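The layering in the description above can be sketched with simplified, illustrative interfaces (the real Kafka Streams state-store API is different): outer stores such as a metered store wrap and delegate to an inner store keyed on raw bytes.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Illustrative store interface; real Streams stores key on Bytes wrappers.
interface BytesStore {
    void put(byte[] key, byte[] value);
    byte[] get(byte[] key);
}

// Inner store: byte-array keys wrapped in ByteBuffer for content equality.
final class InMemoryBytesStore implements BytesStore {
    private final Map<ByteBuffer, byte[]> map = new HashMap<>();
    public void put(byte[] key, byte[] value) { map.put(ByteBuffer.wrap(key.clone()), value); }
    public byte[] get(byte[] key) { return map.get(ByteBuffer.wrap(key)); }
}

// Outer layer: counts calls (standing in for real latency/rate metrics)
// and delegates every operation to the inner bytes store.
final class MeteredBytesStore implements BytesStore {
    private final BytesStore inner;
    private long putCount = 0;

    MeteredBytesStore(BytesStore inner) { this.inner = inner; }
    public void put(byte[] key, byte[] value) { putCount++; inner.put(key, value); }
    public byte[] get(byte[] key) { return inner.get(key); }
    long puts() { return putCount; }
}
```

A change-logging layer would wrap the same interface in the same way, writing each put to a changelog topic before delegating.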





[jira] [Commented] (KAFKA-5749) Refactor SessionStore hierarchy

2017-08-22 Thread Andrey Dyachkov (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137226#comment-16137226
 ] 

Andrey Dyachkov commented on KAFKA-5749:


[~miguno] It would be very nice if someone could guide me through at this point, 
because I do not understand how things work here. 
I can explain more: I have already made a couple of contributions 
([KAFKA-4643|https://issues.apache.org/jira/browse/KAFKA-4643], 
[KAFKA-4657|https://issues.apache.org/jira/browse/KAFKA-4657]). I decided I 
could move further with it and took 
[KAFKA-5723|https://issues.apache.org/jira/browse/KAFKA-5723], created a 
[pr|https://github.com/apache/kafka/pull/3671], and asked committers to help me 
with it, but I have not gotten any response yet (7 days). I even wrote an email 
to the dev list asking to be added to the contributors list and for help with 
[KAFKA-5723|https://issues.apache.org/jira/browse/KAFKA-5723], but no response. 
I think it is related to the priority of the tickets, which I do not know, and 
maybe the number of people supporting them. That is why I ended up commenting on 
tickets that are subsets of improvements being developed by committers.

I am ready to spend my time learning and contributing to client-side 
code (clients, admin, streams, etc.), since my main tech stack is around Java. 
Sorry for writing it here; I hope you guys can help me with this.

> Refactor SessionStore hierarchy
> ---
>
> Key: KAFKA-5749
> URL: https://issues.apache.org/jira/browse/KAFKA-5749
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Damian Guy
>Assignee: Damian Guy
> Fix For: 1.0.0
>
>
> In order to support a bytes store we need to create a MeteredSessionStore and 
> ChangeloggingSessionStore. We then need to refactor the current SessionStore 
> implementations to use these. All inner stores should be of type <Bytes, byte[]>





[jira] [Commented] (KAFKA-5732) Kafka 0.11 Consumer.Poll() hangs for consumer.subscribe()

2017-08-22 Thread Ramkumar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137222#comment-16137222
 ] 

Ramkumar commented on KAFKA-5732:
-

I found that this issue happens if the broker points to the old data log 
directory (the one Kafka 0.8 was using). If I point to a new directory, it 
appears to work fine. Any pointers on why it does not work when referring to 
Kafka 0.8 data log folders, and how to make them compatible? I had set the 
message format and protocol versions as below:
inter.broker.protocol.version=0.11.0
log.message.format.version=0.11.0

> Kafka 0.11 Consumer.Poll() hangs for consumer.subscribe() 
> --
>
> Key: KAFKA-5732
> URL: https://issues.apache.org/jira/browse/KAFKA-5732
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.11.0.0
> Environment: Linux
>Reporter: Ramkumar
> Attachments: dumptest5
>
>
> Hi,
> I upgraded my 3-node Kafka cluster from 0.8 to 0.11 brokers. I am trying to 
> test the new consumer APIs.
> Below is the code extract. The consumer.poll() method hangs (thread 
> dump attached) when using consumer.subscribe(); poll returns values if I 
> use the consumer.seek() methods. Please let me know what I am doing 
> incorrectly. I have advertised.host and listeners updated okay in 
> server.properties. Thread dump attached.
> Properties props1 = new Properties();
> props1.put("bootstrap.servers", "localhost:9092");
> props1.put("group.id", "test3");
> props1.put("enable.auto.commit", "false");
> props1.put("auto_offset_reset", "earliest");
> props1.put("request.timeout.ms", 3);
> props1.put("key.deserializer", 
> "org.apache.kafka.common.serialization.StringDeserializer");
> props1.put("value.deserializer", 
> "org.apache.kafka.common.serialization.StringDeserializer");
> String TestTopic = "T3";
> KafkaConsumer<String, String> consumer1 = new KafkaConsumer<>(props1);
> consumer1.subscribe(Arrays.asList(TestTopic));
> int j = 0;
> while (j < 10) {
>     j++;
>     ConsumerRecords<String, String> records1 = consumer1.poll(100);
>     for (ConsumerRecord<String, String> record1 : records1) {
>         System.out.printf("offset = %d, key = %s, value = %s", record1.offset(),
>                 record1.key(), record1.value());
>         String t = record1.value();
>         out.write(t.getBytes());
>     }
> }
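One detail worth noting in the snippet above: the Kafka consumer's recognized key is auto.offset.reset (dots, not underscores), so the auto_offset_reset entry is not applied and the default of latest stays in effect. Whether or not that explains the hang, a config sketch with the dotted key names would look like this; the request-timeout value here is an assumption, since the original post shows a truncated number.

```java
import java.util.Properties;

// Sketch of the same config with the dotted key names the consumer recognizes.
final class ConsumerProps {
    static Properties build() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "test3");
        props.setProperty("enable.auto.commit", "false");
        props.setProperty("auto.offset.reset", "earliest"); // not auto_offset_reset
        props.setProperty("request.timeout.ms", "30000");   // assumed value; original is truncated
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }
}
```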





[jira] [Created] (KAFKA-5766) Very high CPU-load of consumer when broker is down

2017-08-22 Thread Sebastian Bernauer (JIRA)
Sebastian Bernauer created KAFKA-5766:
-

 Summary: Very high CPU-load of consumer when broker is down
 Key: KAFKA-5766
 URL: https://issues.apache.org/jira/browse/KAFKA-5766
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Reporter: Sebastian Bernauer


Hi,
i have a single broker instance at localhost.
I set up a Consumer with the following code:
{code:java}
Map<String, Object> configs = new HashMap<>();
configs.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
configs.put(ConsumerConfig.GROUP_ID_CONFIG, "gh399");
configs.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
configs.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(configs);
consumer.assign(Collections.singletonList(new TopicPartition("foo", 0)));
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(1000);
    System.out.println(records.count());
}
{code}
This all works fine until I shut down the broker. Doing so causes 100% CPU 
load in my application. After starting the broker again, the usage decreases 
back to a normal level. 

It would be very nice if you could help me!
Thanks,
Sebastian

Spring-Kafka: 2.0.0.M3
Kafka: 0.10.2.0
JDK: 1.8.0_121
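A client-side mitigation can be sketched as follows. This is only an application-level workaround, not a fix for the underlying client busy-loop: if poll() returns empty well before its timeout, as can happen when the broker is unreachable, sleep out the remainder of a minimum loop interval instead of immediately polling again.

```java
// Illustrative backoff helper: computes how long to sleep before the next
// poll, given how long the last poll took and whether it returned records.
final class PollBackoff {
    private final long minLoopMs;

    PollBackoff(long minLoopMs) { this.minLoopMs = minLoopMs; }

    // Zero means "poll again immediately"; otherwise sleep the returned
    // number of milliseconds so an empty, fast-returning poll cannot spin.
    long nextSleepMs(boolean gotRecords, long elapsedMs) {
        if (gotRecords || elapsedMs >= minLoopMs) return 0;
        return minLoopMs - elapsedMs;
    }
}
```

In the loop above, the caller would measure the elapsed time around poll() and call Thread.sleep(backoff.nextSleepMs(!records.isEmpty(), elapsed)) when the result is nonzero.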





[jira] [Resolved] (KAFKA-4823) Creating Kafka Producer on application running on Java older version

2017-08-22 Thread Manikumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-4823.
--
Resolution: Won't Fix

Kafka dropped support for Java 1.6 from the 0.9 release. You can try the REST 
Proxy or other language libraries. Please reopen if you think otherwise.

> Creating Kafka Producer on application running on Java older version
> 
>
> Key: KAFKA-4823
> URL: https://issues.apache.org/jira/browse/KAFKA-4823
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.1
> Environment: Application running on Java 1.6
>Reporter: live2code
>
> I have an application running on Java 1.6 which cannot be upgraded. This 
> application needs interfaces to post (produce) messages to a Kafka server on a 
> remote box, and also to receive messages as a consumer. The code runs fine from 
> my local environment, which is on Java 1.7, but the same producer and consumer 
> fail when executed within the application with the error:
> Caused by: java.lang.UnsupportedClassVersionError: JVMCFRE003 bad major 
> version; class=org/apache/kafka/clients/producer/KafkaProducer, offset=6
> Is there some way I can still do this?





[jira] [Updated] (KAFKA-5765) Move merge() from StreamsBuilder to KStream

2017-08-22 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-5765:
-
Fix Version/s: 1.0.0

> Move merge() from StreamsBuilder to KStream
> ---
>
> Key: KAFKA-5765
> URL: https://issues.apache.org/jira/browse/KAFKA-5765
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>  Labels: needs-kip
> Fix For: 1.0.0
>
>
> Merging multiple {{KStream}} is done via {{StreamsBuilder#merge()}} (formerly 
> {{KStreamBuilder#merge()}}). This is quite unnatural and should be done via 
> {{KStream#merge()}}.
> As {{StreamsBuilder}} is not released yet, this is not a backward 
> incompatible change (and KStreamBuilder is already deprecated). We still need 
> a KIP as we add a new method to a public {{KStreams}} API.





[jira] [Commented] (KAFKA-5765) Move merge() from StreamsBuilder to KStream

2017-08-22 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137146#comment-16137146
 ] 

Guozhang Wang commented on KAFKA-5765:
--

Sounds reasonable to me.

> Move merge() from StreamsBuilder to KStream
> ---
>
> Key: KAFKA-5765
> URL: https://issues.apache.org/jira/browse/KAFKA-5765
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>  Labels: needs-kip
> Fix For: 1.0.0
>
>
> Merging multiple {{KStream}} is done via {{StreamsBuilder#merge()}} (formerly 
> {{KStreamBuilder#merge()}}). This is quite unnatural and should be done via 
> {{KStream#merge()}}.
> As {{StreamsBuilder}} is not released yet, this is not a backward 
> incompatible change (and KStreamBuilder is already deprecated). We still need 
> a KIP as we add a new method to a public {{KStreams}} API.





[jira] [Commented] (KAFKA-5152) Kafka Streams keeps restoring state after shutdown is initiated during startup

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137135#comment-16137135
 ] 

ASF GitHub Bot commented on KAFKA-5152:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3675


> Kafka Streams keeps restoring state after shutdown is initiated during startup
> --
>
> Key: KAFKA-5152
> URL: https://issues.apache.org/jira/browse/KAFKA-5152
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1
>Reporter: Xavier Léauté
>Assignee: Damian Guy
>Priority: Blocker
> Fix For: 0.11.0.1, 1.0.0
>
>
> If streams shutdown is initiated during state restore (e.g. an uncaught 
> exception is thrown) streams will not shut down until all stores are first 
> finished restoring.
> As restore progresses, stream threads appear to be taken out of service as 
> part of the shutdown sequence, causing rebalancing of tasks. This compounds 
> the problem by slowing down the restore process even further, since the 
> remaining threads now have to also restore the reassigned tasks before they 
> can shut down.
> A more severe issue is that if a new rebalance is triggered near the 
> end of the waitingSync phase (e.g. due to a new member joining the group, or 
> some members timing out the SyncGroup response), then some consumer clients of 
> the group may already proceed with {{onPartitionsAssigned}} and become blocked 
> trying to grab the file dir lock not yet released by other clients, 
> while the clients holding the lock keep re-sending 
> {{JoinGroup}} requests. The rebalance cannot complete because the 
> clients blocked on the file dir lock will not be kicked out of the group, as 
> their heartbeat threads keep sending heartbeat requests. Hence this is a 
> deadlock caused by not releasing the file dir locks during task suspension.





[jira] [Comment Edited] (KAFKA-5696) SourceConnector does not commit offset on rebalance

2017-08-22 Thread Oleg Kuznetsov (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137100#comment-16137100
 ] 

Oleg Kuznetsov edited comment on KAFKA-5696 at 8/22/17 5:56 PM:


[~rhauch] Is it possible to wait for all the tasks to finish their current 
loop (i.e. let the producer finish writing records and commit their offsets) 
before rebalancing?


was (Author: olkuznsmith):
[~rhauch] Is it possible to add waiting all the task to finish their current 
loop (i.e. let producer to finish writing records, commit their offsets) before 
rebalancing?

> SourceConnector does not commit offset on rebalance
> ---
>
> Key: KAFKA-5696
> URL: https://issues.apache.org/jira/browse/KAFKA-5696
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Oleg Kuznetsov
>Priority: Critical
>  Labels: newbie
>
> I'm running a SourceConnector that reads files from storage and puts data in 
> Kafka. In case of reconfiguration, I want offsets to be flushed. 
> Say a file is completely processed, but its source records are not yet 
> committed; in case of reconfiguration their offsets might be missing from the 
> store. Is it possible to force committing offsets on reconfiguration?





[jira] [Commented] (KAFKA-5696) SourceConnector does not commit offset on rebalance

2017-08-22 Thread Oleg Kuznetsov (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137100#comment-16137100
 ] 

Oleg Kuznetsov commented on KAFKA-5696:
---

[~rhauch] Is it possible to wait for all the tasks to finish their current 
loop (i.e. let the producer finish writing records and commit their offsets) 
before rebalancing?

> SourceConnector does not commit offset on rebalance
> ---
>
> Key: KAFKA-5696
> URL: https://issues.apache.org/jira/browse/KAFKA-5696
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Oleg Kuznetsov
>Priority: Critical
>  Labels: newbie
>
> I'm running a SourceConnector that reads files from storage and puts data in 
> Kafka. In case of reconfiguration, I want offsets to be flushed. 
> Say a file is completely processed, but its source records are not yet 
> committed; in case of reconfiguration their offsets might be missing from the 
> store. Is it possible to force committing offsets on reconfiguration?





[jira] [Assigned] (KAFKA-5547) Return topic authorization failed if no topic describe access

2017-08-22 Thread Manikumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar reassigned KAFKA-5547:


Assignee: Manikumar

> Return topic authorization failed if no topic describe access
> -
>
> Key: KAFKA-5547
> URL: https://issues.apache.org/jira/browse/KAFKA-5547
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: Manikumar
>  Labels: security, usability
> Fix For: 1.0.0
>
>
> We previously made a change to several of the request APIs to return 
> UNKNOWN_TOPIC_OR_PARTITION if the principal does not have Describe access to 
> the topic. The thought was to avoid leaking information about which topics 
> exist. The problem with this is that a client which sees this error will just 
> keep retrying because it is usually treated as retriable. It seems, however, 
> that we could return TOPIC_AUTHORIZATION_FAILED instead and still avoid 
> leaking information as long as we ensure that the Describe authorization 
> check comes before the topic existence check. This would avoid the ambiguity 
> on the client.





[jira] [Updated] (KAFKA-5603) Streams should not abort transaction when closing zombie task

2017-08-22 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5603:
---
Fix Version/s: (was: 0.11.0.2)
   0.11.0.1

> Streams should not abort transaction when closing zombie task
> -
>
> Key: KAFKA-5603
> URL: https://issues.apache.org/jira/browse/KAFKA-5603
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Critical
> Fix For: 0.11.0.1
>
>
> The contract of the transactional producer API is to not call any 
> transactional method after a {{ProducerFenced}} exception was thrown.
> Streams however, does an unconditional call within {{StreamTask#close()}} to 
> {{abortTransaction()}} in case of unclean shutdown. We need to distinguish 
> between a {{ProducerFenced}} and other unclean shutdown cases.





[jira] [Commented] (KAFKA-5749) Refactor SessionStore hierarchy

2017-08-22 Thread Michael Noll (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137085#comment-16137085
 ] 

Michael Noll commented on KAFKA-5749:
-

Thanks for wanting to contribute, Andrey!  Perhaps you could work on some of 
the other open JIRAs?  (Feel free to reach out to us if you don't know which 
would be good starting points.)

> Refactor SessionStore hierarchy
> ---
>
> Key: KAFKA-5749
> URL: https://issues.apache.org/jira/browse/KAFKA-5749
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Damian Guy
>Assignee: Damian Guy
> Fix For: 1.0.0
>
>
> In order to support a bytes store we need to create a MeteredSessionStore and 
> ChangeloggingSessionStore. We then need to refactor the current SessionStore 
> implementations to use these. All inner stores should be of type <Bytes, byte[]>





[jira] [Resolved] (KAFKA-5401) Kafka Over TLS Error - Failed to send SSL Close message - Broken Pipe

2017-08-22 Thread Manikumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-5401.
--
Resolution: Duplicate

> Kafka Over TLS Error - Failed to send SSL Close message - Broken Pipe
> -
>
> Key: KAFKA-5401
> URL: https://issues.apache.org/jira/browse/KAFKA-5401
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.1
> Environment: SLES 11 , Kakaf Over TLS 
>Reporter: PaVan
>  Labels: security
>
> SLES 11 
> WARN Failed to send SSL Close message 
> (org.apache.kafka.common.network.SslTransportLayer)
> java.io.IOException: Broken pipe
>  at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>  at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>  at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>  at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>  at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
>  at 
> org.apache.kafka.common.network.SslTransportLayer.flush(SslTransportLayer.java:194)
>  at 
> org.apache.kafka.common.network.SslTransportLayer.close(SslTransportLayer.java:148)
>  at 
> org.apache.kafka.common.network.KafkaChannel.close(KafkaChannel.java:45)
>  at org.apache.kafka.common.network.Selector.close(Selector.java:442)
>  at org.apache.kafka.common.network.Selector.poll(Selector.java:310)
>  at 
> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:256)
>  at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216)
>  at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128)
>  at java.lang.Thread.run(Thread.java:745)





[jira] [Commented] (KAFKA-5765) Move merge() from StreamsBuilder to KStream

2017-08-22 Thread JIRA

[ 
https://issues.apache.org/jira/browse/KAFKA-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137081#comment-16137081
 ] 

Xavier Léauté commented on KAFKA-5765:
--

I have a small request when it comes to merge(). The current varargs form 
generates a lot of compiler warnings that need to be suppressed using 
{{@SuppressWarnings("unchecked")}}.

Given that the typical merge use-case involves only a handful of streams, 
I think it would be useful to provide a couple of overloads that take a fixed 
number of arguments, similar to what Guava does in 
[ImmutableList.of(...)|https://google.github.io/guava/releases/21.0/api/docs/com/google/common/collect/ImmutableList.html#of-E-E-E-E-E-E-E-E-E-E-E-]
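A sketch of what those fixed-arity overloads could look like, using a hypothetical class rather than the real KStream API: the varargs form remains, but one- and two-argument overloads let the common cases compile without unchecked-generic-array warnings.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for a merge-capable stream; not the real KStream.
final class MergeableStream<V> {
    final List<V> values;

    MergeableStream(List<V> values) { this.values = values; }

    // Varargs form kept for arbitrary arity; @SafeVarargs is allowed here
    // because the method is final.
    @SafeVarargs
    final MergeableStream<V> merge(MergeableStream<V>... others) {
        List<V> merged = new ArrayList<>(values);
        for (MergeableStream<V> o : others) merged.addAll(o.values);
        return new MergeableStream<>(merged);
    }

    // Fixed-arity overloads, as in Guava's ImmutableList.of(...): overload
    // resolution prefers these, so small merges stay warning-free.
    MergeableStream<V> merge(MergeableStream<V> other) {
        List<V> merged = new ArrayList<>(values);
        merged.addAll(other.values);
        return new MergeableStream<>(merged);
    }

    MergeableStream<V> merge(MergeableStream<V> first, MergeableStream<V> second) {
        return merge(first).merge(second);
    }
}
```

Java's overload resolution tries fixed-arity methods before varargs, which is what makes this pattern work without ambiguity.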


> Move merge() from StreamsBuilder to KStream
> ---
>
> Key: KAFKA-5765
> URL: https://issues.apache.org/jira/browse/KAFKA-5765
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>  Labels: needs-kip
>
> Merging multiple {{KStream}} is done via {{StreamsBuilder#merge()}} (formerly 
> {{KStreamBuilder#merge()}}). This is quite unnatural and should be done via 
> {{KStream#merge()}}.
> As {{StreamsBuilder}} is not released yet, this is not a backward 
> incompatible change (and KStreamBuilder is already deprecated). We still need 
> a KIP as we add a new method to a public {{KStreams}} API.





[jira] [Created] (KAFKA-5765) Move merge() from StreamsBuilder to KStream

2017-08-22 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created KAFKA-5765:
--

 Summary: Move merge() from StreamsBuilder to KStream
 Key: KAFKA-5765
 URL: https://issues.apache.org/jira/browse/KAFKA-5765
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 0.11.0.0
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax


Merging multiple {{KStream}}s is done via {{StreamsBuilder#merge()}} (formerly 
{{KStreamBuilder#merge()}}). This is quite unnatural and should be done via 
{{KStream#merge()}}.

As {{StreamsBuilder}} is not released yet, this is not a backward incompatible 
change (and {{KStreamBuilder}} is already deprecated). We still need a KIP as we 
add a new method to the public {{KStream}} API.
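As a rough sketch of the proposed API shape (toy classes for illustration, not the actual Kafka Streams types), merging would move from the builder onto the stream instance itself, so `streamA.merge(streamB)` replaces `builder.merge(streamA, streamB)`:

```java
import java.util.ArrayList;
import java.util.List;

public class MergeOnStream {
    // Toy stand-in for KStream: just a list of records.
    static class ToyStream<T> {
        final List<T> records;
        ToyStream(List<T> records) { this.records = new ArrayList<>(records); }

        // Proposed style: merge is an instance method on the stream.
        ToyStream<T> merge(ToyStream<T> other) {
            List<T> all = new ArrayList<>(records);
            all.addAll(other.records);
            return new ToyStream<>(all);
        }
    }

    public static void main(String[] args) {
        ToyStream<String> a = new ToyStream<>(List.of("x"));
        ToyStream<String> b = new ToyStream<>(List.of("y", "z"));
        System.out.println(a.merge(b).records); // [x, y, z]
    }
}
```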





[jira] [Commented] (KAFKA-5758) Reassigning a topic's partitions can adversely impact other topics

2017-08-22 Thread David van Geest (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136984#comment-16136984
 ] 

David van Geest commented on KAFKA-5758:


Awesome, Option 3 makes sense to me. Sorry for not communicating my ideas very 
clearly there.

> Reassigning a topic's partitions can adversely impact other topics
> --
>
> Key: KAFKA-5758
> URL: https://issues.apache.org/jira/browse/KAFKA-5758
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.1
>Reporter: David van Geest
>  Labels: reliability
> Fix For: 1.0.0
>
>
> We've noticed that reassigning a topic's partitions seems to adversely impact 
> other topics. Specifically, followers for other topics fall out of the ISR.
> While I'm not 100% sure about why this happens, the scenario seems to be as 
> follows:
> 1. Reassignment is manually triggered on topic-partition X-Y, and broker A 
> (which used to be a follower for X-Y) is no longer a follower.
> 2. Broker A makes `FetchRequest` including topic-partition X-Y to broker B, 
> just after the reassignment.
> 3. Broker B can fulfill the `FetchRequest`, but while trying to do so it 
> tries to record the position of "follower" A. This fails, because broker A is 
> no longer a follower for X-Y (see exception below).
> 4. The entire `FetchRequest` fails, and broker A's other followed 
> topics start falling behind.
> 5. Depending on the length of the reassignment, this sequence repeats.
> In step 3, we see exceptions like:
> {noformat}
> Error when handling request Name: FetchRequest; Version: 3; CorrelationId: 
> 46781859; ClientId: ReplicaFetcherThread-0-1001; ReplicaId: 1006; MaxWait: 
> 500 ms; MinBytes: 1 bytes; MaxBytes:10485760 bytes; RequestInfo: 
> 
> kafka.common.NotAssignedReplicaException: Leader 1001 failed to record 
> follower 1006's position -1 since the replica is not recognized to be one of 
> the assigned replicas 1001,1004,1005 for partition [topic_being_reassigned,5].
> at kafka.cluster.Partition.updateReplicaLogReadResult(Partition.scala:249)
>   at 
> kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:923)
>   at 
> kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:920)
>   at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at 
> kafka.server.ReplicaManager.updateFollowerLogReadResults(ReplicaManager.scala:920)
>   at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:481)
>   at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:534)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:79)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Does my assessment make sense? If so, this behaviour seems problematic. A few 
> changes that might improve matters (assuming I'm on the right track):
> 1. `FetchRequest` should be able to return partial results
> 2. The broker fulfilling the `FetchRequest` could ignore the 
> `NotAssignedReplicaException`, and return results without recording the 
> not-any-longer-follower position.
> This behaviour was experienced with 0.10.1.1, although looking at the 
> changelogs and the code in question, I don't see any reason why it would have 
> changed in later versions.
> Am very interested to have some discussion on this. Thanks!





[jira] [Assigned] (KAFKA-5762) Refactor AdminClient to use LogContext

2017-08-22 Thread Kamal Chandraprakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamal Chandraprakash reassigned KAFKA-5762:
---

Assignee: Kamal Chandraprakash

> Refactor AdminClient to use LogContext
> --
>
> Key: KAFKA-5762
> URL: https://issues.apache.org/jira/browse/KAFKA-5762
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Kamal Chandraprakash
>
> We added a LogContext object which automatically adds a log prefix to every 
> message written by loggers constructed from it (much like the Logging mixin 
> available in the server code). We use this in the consumer to ensure that 
> messages always contain the consumer group and client ids, which is very 
> helpful when multiple consumers are run on the same instance. We should do 
> something similar for the AdminClient. We should always include the client id.
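A minimal sketch of the LogContext idea described above (illustrative only; the real class is org.apache.kafka.common.utils.LogContext, whose API may differ): every logger built from the context automatically prepends a fixed prefix, such as the client id.

```java
public class MiniLogContext {
    private final String prefix;

    public MiniLogContext(String prefix) { this.prefix = prefix; }

    // Tiny logger abstraction for the sketch.
    public interface Logger { void info(String msg); }

    // Loggers created from this context carry the prefix automatically,
    // so individual call sites never have to repeat the client id.
    public Logger logger() {
        return msg -> System.out.println(prefix + msg);
    }

    public static void main(String[] args) {
        MiniLogContext ctx = new MiniLogContext("[AdminClient clientId=admin-1] ");
        Logger log = ctx.logger();
        log.info("Initiating close operation");
    }
}
```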





[jira] [Commented] (KAFKA-5758) Reassigning a topic's partitions can adversely impact other topics

2017-08-22 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136959#comment-16136959
 ] 

Ismael Juma commented on KAFKA-5758:


Option 3 is a normal pattern for the Kafka protocol (since APIs often involve 
batches) and we don't consider it a partial result. If that's what you meant 
when you described option 1, then great.
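A toy Java simulation of that per-partition pattern (illustrative only, not broker code): the fetch response carries data for the healthy partitions and a per-partition error code for the no-longer-assigned one, instead of failing the whole request.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PerPartitionFetch {
    static final String NOT_ASSIGNED = "NOT_ASSIGNED_REPLICA";

    // Build a response entry for every requested partition: either its
    // records, or an error marker if the fetcher is not an assigned replica.
    static Map<String, Object> fetch(Map<String, List<String>> log,
                                     List<String> requested,
                                     List<String> assigned) {
        Map<String, Object> response = new LinkedHashMap<>();
        for (String partition : requested) {
            if (!assigned.contains(partition)) {
                response.put(partition, NOT_ASSIGNED); // error for this one only
            } else {
                response.put(partition, log.getOrDefault(partition, List.of()));
            }
        }
        return response;
    }

    public static void main(String[] args) {
        Map<String, List<String>> log = Map.of(
            "other-0", List.of("m1", "m2"),
            "reassigned-5", List.of("m3"));
        // The fetcher is still assigned for other-0 but not reassigned-5.
        Map<String, Object> resp =
            fetch(log, List.of("other-0", "reassigned-5"), List.of("other-0"));
        System.out.println(resp);
        // {other-0=[m1, m2], reassigned-5=NOT_ASSIGNED_REPLICA}
    }
}
```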

> Reassigning a topic's partitions can adversely impact other topics
> --
>
> Key: KAFKA-5758
> URL: https://issues.apache.org/jira/browse/KAFKA-5758
>





[jira] [Commented] (KAFKA-5758) Reassigning a topic's partitions can adversely impact other topics

2017-08-22 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136980#comment-16136980
 ] 

Ismael Juma commented on KAFKA-5758:


Yes.

> Reassigning a topic's partitions can adversely impact other topics
> --
>
> Key: KAFKA-5758
> URL: https://issues.apache.org/jira/browse/KAFKA-5758
>





[jira] [Commented] (KAFKA-5696) SourceConnector does not commit offset on rebalance

2017-08-22 Thread Randall Hauch (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136976#comment-16136976
 ] 

Randall Hauch commented on KAFKA-5696:
--

[~olkuznsmith], adding, updating, or removing a connector configuration will 
cause the Kafka Connect worker cluster to perform a _rebalance_ that stops, 
redistributes, and restarts all of the remaining connectors.

> SourceConnector does not commit offset on rebalance
> ---
>
> Key: KAFKA-5696
> URL: https://issues.apache.org/jira/browse/KAFKA-5696
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Oleg Kuznetsov
>Priority: Critical
>  Labels: newbie
>
> I'm running a SourceConnector that reads files from storage and puts data in 
> Kafka. In case of reconfiguration, I want offsets to be flushed. 
> Say a file is completely processed, but its source records are not yet committed; 
> in case of reconfiguration their offsets might be missing from the offset store.
> Is it possible to force committing offsets on reconfiguration?





[jira] [Commented] (KAFKA-5758) Reassigning a topic's partitions can adversely impact other topics

2017-08-22 Thread David van Geest (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136974#comment-16136974
 ] 

David van Geest commented on KAFKA-5758:


Ah OK. So Option 3 still returns data for the other partitions then? If so, 
that's what I meant.

> Reassigning a topic's partitions can adversely impact other topics
> --
>
> Key: KAFKA-5758
> URL: https://issues.apache.org/jira/browse/KAFKA-5758
>





[jira] [Updated] (KAFKA-5748) Fix console producer to set timestamp and partition

2017-08-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5748:
---
Fix Version/s: (was: 0.11.0.1)
   1.0.0

> Fix console producer to set timestamp and partition
> ---
>
> Key: KAFKA-5748
> URL: https://issues.apache.org/jira/browse/KAFKA-5748
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ran Ma
> Fix For: 1.0.0
>
>
> https://github.com/apache/kafka/pull/3689





[jira] [Commented] (KAFKA-5758) Reassigning a topic's partitions can adversely impact other topics

2017-08-22 Thread David van Geest (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136933#comment-16136933
 ] 

David van Geest commented on KAFKA-5758:


[~ijuma], thanks for the response!

I'm not sure I understand the distinction between my option 1 and your option 
3. In both, we're talking about returning partial results (along with an error 
of sorts for the partition that is no longer being followed) in the response to 
`FetchRequest`, right? 

> Reassigning a topic's partitions can adversely impact other topics
> --
>
> Key: KAFKA-5758
> URL: https://issues.apache.org/jira/browse/KAFKA-5758
>





[jira] [Updated] (KAFKA-5465) FetchResponse v0 does not return any messages when max_bytes smaller than v2 message set

2017-08-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5465:
---
Fix Version/s: (was: 0.11.0.1)

> FetchResponse v0 does not return any messages when max_bytes smaller than v2 
> message set 
> -
>
> Key: KAFKA-5465
> URL: https://issues.apache.org/jira/browse/KAFKA-5465
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.0.0
>Reporter: Dana Powers
>Assignee: Jason Gustafson
>Priority: Blocker
>
> In prior releases, when consuming uncompressed messages, FetchResponse v0 
> will return a message if it is smaller than the max_bytes sent in the 
> FetchRequest. In 0.11.0.0 RC0, when messages are stored as v2 internally, the 
> response will be empty unless the full MessageSet is smaller than max_bytes. 
> In some configurations, this may cause old consumers to get stuck on large 
> messages where previously they were able to make progress one message at a 
> time.
> For example, when I produce 10 5KB messages using ProduceRequest v0 and then 
> attempt FetchRequest v0 with partition max bytes = 6KB (larger than a single 
> message but smaller than all 10 messages together), I get an empty message 
> set from 0.11.0.0. Previous brokers would have returned a single message.
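The size arithmetic above can be sketched as follows (a toy model with illustrative byte counts, not the real wire formats): old-style truncation at a message boundary returns one 5 KB message under a 6 KB limit, while all-or-nothing batch semantics return nothing.

```java
import java.util.Collections;
import java.util.List;

public class MaxBytesRegression {
    // v0/v1 style: accumulate whole messages until the next one won't fit.
    static int oldStyleReturned(List<Integer> messageSizes, int maxBytes) {
        int used = 0, count = 0;
        for (int size : messageSizes) {
            if (used + size > maxBytes) break; // stop at a message boundary
            used += size;
            count++;
        }
        return count;
    }

    // v2-backed store: the batch is a single unit, so it is all or nothing.
    static int batchStyleReturned(List<Integer> messageSizes, int maxBytes) {
        int batchSize = messageSizes.stream().mapToInt(Integer::intValue).sum();
        return batchSize <= maxBytes ? messageSizes.size() : 0;
    }

    public static void main(String[] args) {
        List<Integer> tenMessages = Collections.nCopies(10, 5 * 1024); // 10 x 5 KB
        int maxBytes = 6 * 1024;                                       // 6 KB limit
        System.out.println(oldStyleReturned(tenMessages, maxBytes));   // 1
        System.out.println(batchStyleReturned(tenMessages, maxBytes)); // 0
    }
}
```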





[jira] [Updated] (KAFKA-5681) jarAll does not build all scala versions anymore.

2017-08-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5681:
---
Fix Version/s: (was: 0.11.0.1)

> jarAll does not build all scala versions anymore.
> -
>
> Key: KAFKA-5681
> URL: https://issues.apache.org/jira/browse/KAFKA-5681
> Project: Kafka
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.11.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
>
> ./gradlew jarAll no longer builds jars for all Scala versions. We should use 
> {{availableScalaVersions}} instead of {{defaultScalaVersions}} when building. We 
> should probably consider backporting the fix to 0.11.0.0.





[jira] [Created] (KAFKA-5764) KafkaShortnamer should allow for case insensitive matches

2017-08-22 Thread Ryan P (JIRA)
Ryan P created KAFKA-5764:
-

 Summary: KafkaShortnamer should allow for case insensitive matches 
 Key: KAFKA-5764
 URL: https://issues.apache.org/jira/browse/KAFKA-5764
 Project: Kafka
  Issue Type: Improvement
  Components: security
Affects Versions: 0.11.0.0
Reporter: Ryan P


Currently it does not appear that the KafkaShortnamer allows for case 
insensitive search and replace rules. 

It would be good to match the functionality provided by HDFS, as operators are 
familiar with it. This also makes it easier to port auth_to_local rules from 
your existing HDFS configurations to your new Kafka configuration. 

HWX auth_to_local guide for reference

https://community.hortonworks.com/articles/14463/auth-to-local-rules-syntax.html






[jira] [Commented] (KAFKA-5752) Delete topic and re-create topic immediately will delete the new topic's timeindex

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136846#comment-16136846
 ] 

ASF GitHub Bot commented on KAFKA-5752:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3700


> Delete topic and re-create topic immediately will delete the new topic's 
> timeindex 
> -
>
> Key: KAFKA-5752
> URL: https://issues.apache.org/jira/browse/KAFKA-5752
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0, 0.11.0.0
>Reporter: Pengwei
>Assignee: Manikumar
>Priority: Critical
>  Labels: reliability
> Fix For: 0.11.0.1, 1.0.0
>
>
> When we delete a topic and immediately re-create a topic with the same name, we 
> find that once the async delete of the old topic finishes, it also removes the 
> newly created topic's time index.
> This is because LogManager's asyncDelete changes the log and index file 
> pointers to the renamed log and index files, but misses the time index, which 
> causes this issue.





[jira] [Resolved] (KAFKA-4856) Calling KafkaProducer.close() from multiple threads may cause spurious error

2017-08-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-4856.

   Resolution: Fixed
Fix Version/s: 1.0.0
   0.11.0.1

> Calling KafkaProducer.close() from multiple threads may cause spurious error
> 
>
> Key: KAFKA-4856
> URL: https://issues.apache.org/jira/browse/KAFKA-4856
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.0, 0.10.0.0, 0.10.2.0
>Reporter: Xavier Léauté
>Assignee: Manikumar
>Priority: Minor
> Fix For: 0.11.0.1, 1.0.0
>
>
> Calling {{KafkaProducer.close()}} from multiple threads simultaneously may 
> cause the following harmless error message to be logged. There appears to be 
> a race-condition in {{AppInfoParser.unregisterAppInfo}} that we don't guard 
> against. 
> {noformat}
> WARN Error unregistering AppInfo mbean 
> (org.apache.kafka.common.utils.AppInfoParser:71)
> javax.management.InstanceNotFoundException: 
> kafka.producer:type=app-info,id=
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
> at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
> at 
> org.apache.kafka.common.utils.AppInfoParser.unregisterAppInfo(AppInfoParser.java:69)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:735)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:686)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:665)
> {noformat}
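One plausible shape for the fix (a sketch under assumptions, not the actual Kafka patch): make the MBean unregistration idempotent by treating InstanceNotFoundException as "another thread already unregistered it", so a second concurrent close() does not log a spurious warning.

```java
import java.lang.management.ManagementFactory;
import javax.management.InstanceNotFoundException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class SafeUnregister {
    // Trivial standard MBean: class Noop with management interface NoopMBean.
    public interface NoopMBean {}
    public static class Noop implements NoopMBean {}

    // Unregister, silently ignoring "already gone" races.
    static void unregisterQuietly(MBeanServer server, ObjectName name) {
        try {
            server.unregisterMBean(name);
        } catch (InstanceNotFoundException e) {
            // Another thread already unregistered it; nothing to do.
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("demo:type=app-info,id=producer-1");
        server.registerMBean(new Noop(), name);
        unregisterQuietly(server, name); // first close() wins
        unregisterQuietly(server, name); // second close() is a silent no-op
        System.out.println("no spurious error");
    }
}
```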





[jira] [Commented] (KAFKA-4856) Calling KafkaProducer.close() from multiple threads may cause spurious error

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136837#comment-16136837
 ] 

ASF GitHub Bot commented on KAFKA-4856:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3696


> Calling KafkaProducer.close() from multiple threads may cause spurious error
> 
>
> Key: KAFKA-4856
> URL: https://issues.apache.org/jira/browse/KAFKA-4856
>





[jira] [Comment Edited] (KAFKA-5153) KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting

2017-08-22 Thread Arpan (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136769#comment-16136769
 ] 

Arpan edited comment on KAFKA-5153 at 8/22/17 1:10 PM:
---

Hi [~arthurk] - We are not sure yet what the solution is; we are also stuck, and 
it is quite strange as well.

You may also want to have a look at 
https://issues.apache.org/jira/browse/KAFKA-2729 once. This looks to be similar 
to the problem we are facing.

Regards,
Arpan Khagram
+91 8308993200


was (Author: arpan.khagram0...@gmail.com):
Hi [~arthurk] - Not sure yet what is the solution and we are also stuck and it 
is quite strange as well.

You may also want to have a look at 
https://issues.apache.org/jira/browse/KAFKA-2729 once. This looks to be similar 
to the problem we are facing.

Regards,
Arpan Khagram
+91 8308993200

> KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting
> ---
>
> Key: KAFKA-5153
> URL: https://issues.apache.org/jira/browse/KAFKA-5153
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0, 0.11.0.0
> Environment: RHEL 6
> Java Version  1.8.0_91-b14
>Reporter: Arpan
>Priority: Critical
> Attachments: server_1_72server.log, server_2_73_server.log, 
> server_3_74Server.log, server.properties, ThreadDump_1493564142.dump, 
> ThreadDump_1493564177.dump, ThreadDump_1493564249.dump
>
>
> Hi Team, 
> I was earlier referring to issue KAFKA-4477 because the problem I am facing 
> is similar. I tried to find the same reference in the release docs as well, 
> but did not see anything for 0.10.1.1 or 0.10.2.0. I am currently using 
> 2.11_0.10.2.0.
> I have a 3-node cluster for Kafka, and a ZooKeeper cluster as well, on the 
> same set of servers. We have around 240 GB of data transferred through Kafka 
> every day. What we are observing is a server disconnecting from the cluster 
> and the ISR shrinking, which starts impacting the service.
> I have also observed the file descriptor count increasing somewhat; in normal 
> circumstances we have not observed an FD count above 500, but when the issue 
> started we were observing it in the range of 650-700 on all 3 servers. 
> Attaching thread dumps of all 3 servers from when we recently started facing 
> the issue.
> The issue vanishes once you bounce the nodes, and the setup does not run more 
> than 5 days without it recurring. Attaching server logs as well.
> Kindly let me know if you need any additional information. Attaching 
> server.properties for one of the servers as well (it's similar on all 3 
> servers).





[jira] [Commented] (KAFKA-5153) KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting

2017-08-22 Thread Arpan (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136769#comment-16136769
 ] 

Arpan commented on KAFKA-5153:
--

Hi [~arthurk] - Not sure yet what the solution is; we are also stuck, and it is 
quite strange as well.

You may also want to have a look at 
https://issues.apache.org/jira/browse/KAFKA-2729 once. This looks to be similar 
to the problem we are facing.

Regards,
Arpan Khagram
+91 8308993200

> KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting
> ---
>
> Key: KAFKA-5153
> URL: https://issues.apache.org/jira/browse/KAFKA-5153
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0, 0.11.0.0
> Environment: RHEL 6
> Java Version  1.8.0_91-b14
>Reporter: Arpan
>Priority: Critical
> Attachments: server_1_72server.log, server_2_73_server.log, 
> server_3_74Server.log, server.properties, ThreadDump_1493564142.dump, 
> ThreadDump_1493564177.dump, ThreadDump_1493564249.dump
>
>
> Hi Team, 
> I was earlier referring to issue KAFKA-4477 because the problem I am facing 
> is similar. I tried to find the same reference in the release docs as well, 
> but did not see anything for 0.10.1.1 or 0.10.2.0. I am currently using 
> 2.11_0.10.2.0.
> I have a 3-node cluster for Kafka, and a ZooKeeper cluster as well, on the 
> same set of servers. We have around 240 GB of data transferred through Kafka 
> every day. What we are observing is a server disconnecting from the cluster 
> and the ISR shrinking, which starts impacting the service.
> I have also observed the file descriptor count increasing somewhat; in normal 
> circumstances we have not observed an FD count above 500, but when the issue 
> started we were observing it in the range of 650-700 on all 3 servers. 
> Attaching thread dumps of all 3 servers from when we recently started facing 
> the issue.
> The issue vanishes once you bounce the nodes, and the setup does not run more 
> than 5 days without it recurring. Attaching server logs as well.
> Kindly let me know if you need any additional information. Attaching 
> server.properties for one of the servers as well (it's similar on all 3 
> servers).





[jira] [Assigned] (KAFKA-2254) The shell script should be optimized , even kafka-run-class.sh has a syntax error.

2017-08-22 Thread Manikumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar reassigned KAFKA-2254:


Assignee: Manikumar

> The shell script should be optimized , even kafka-run-class.sh has a syntax 
> error.
> --
>
> Key: KAFKA-2254
> URL: https://issues.apache.org/jira/browse/KAFKA-2254
> Project: Kafka
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.8.2.1
> Environment: linux
>Reporter: Bo Wang
>Assignee: Manikumar
>  Labels: client-script, kafka-run-class.sh, shell-script
> Attachments: kafka-shell-script.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
>  kafka-run-class.sh line 128 has a syntax error (missing a space before "]"):
> {noformat}
> 127   -loggc)
> 128     if [ -z "$KAFKA_GC_LOG_OPTS"] ; then
> 129       GC_LOG_ENABLED="true"
> 130     fi
> {noformat}
> Also, checking the shell scripts with ShellCheck reports some errors, 
> warnings and notes:
> https://github.com/koalaman/shellcheck/wiki/SC2068
> https://github.com/koalaman/shellcheck/wiki/Sc2046
> https://github.com/koalaman/shellcheck/wiki/Sc2086
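The fix is the missing space before the closing bracket: because `]` is glued to the quoted variable, `[` never sees its closing argument and fails at runtime. A minimal sketch of the corrected test, wrapped in a function so it can be exercised standalone (the function name is hypothetical, not from the actual script):

```shell
# Hypothetical wrapper around the corrected line-128 logic.
# Note the space before "]": with "$KAFKA_GC_LOG_OPTS"] the bracket is
# part of the last argument and the test fails at runtime.
check_gc_opts() {
  KAFKA_GC_LOG_OPTS="$1"
  if [ -z "$KAFKA_GC_LOG_OPTS" ] ; then
    GC_LOG_ENABLED="true"
  else
    GC_LOG_ENABLED="false"
  fi
  echo "$GC_LOG_ENABLED"
}
```

With an empty KAFKA_GC_LOG_OPTS this prints `true`, matching the intent of the original block.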





[jira] [Assigned] (KAFKA-4856) Calling KafkaProducer.close() from multiple threads may cause spurious error

2017-08-22 Thread Manikumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar reassigned KAFKA-4856:


Assignee: Manikumar

> Calling KafkaProducer.close() from multiple threads may cause spurious error
> 
>
> Key: KAFKA-4856
> URL: https://issues.apache.org/jira/browse/KAFKA-4856
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.0, 0.10.0.0, 0.10.2.0
>Reporter: Xavier Léauté
>Assignee: Manikumar
>Priority: Minor
>
> Calling {{KafkaProducer.close()}} from multiple threads simultaneously may 
> cause the following harmless error message to be logged. There appears to be 
> a race condition in {{AppInfoParser.unregisterAppInfo}} that we don't guard 
> against. 
> {noformat}
> WARN Error unregistering AppInfo mbean 
> (org.apache.kafka.common.utils.AppInfoParser:71)
> javax.management.InstanceNotFoundException: 
> kafka.producer:type=app-info,id=
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
> at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
> at 
> org.apache.kafka.common.utils.AppInfoParser.unregisterAppInfo(AppInfoParser.java:69)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:735)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:686)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:665)
> {noformat}
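A common guard for this kind of race is a compare-and-set flag so that only the first of several concurrent close() calls attempts the JMX unregistration. A minimal sketch of the idea (the class and method names are hypothetical, not the actual AppInfoParser code):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: the first thread to flip the flag performs the
// unregister; racing callers see false and skip it, so the MBean server
// is never asked to remove the same MBean twice.
class AppInfoGuard {
    private static final AtomicBoolean registered = new AtomicBoolean(true);

    static boolean tryUnregister() {
        // compareAndSet is atomic, so exactly one caller wins the race.
        return registered.compareAndSet(true, false);
    }
}
```

In the producer, this check would wrap the call to the unregistration logic; threads that lose the race simply return without touching JMX.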





[jira] [Created] (KAFKA-5763) Refactor NetworkClient to use LogContext

2017-08-22 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-5763:
--

 Summary: Refactor NetworkClient to use LogContext
 Key: KAFKA-5763
 URL: https://issues.apache.org/jira/browse/KAFKA-5763
 Project: Kafka
  Issue Type: Improvement
Reporter: Ismael Juma


We added a LogContext object which automatically adds a log prefix to every 
message written by loggers constructed from it (much like the Logging mixin 
available in the server code). We use this in the consumer to ensure that 
messages always contain the consumer group and client ids, which is very 
helpful when multiple consumers are run on the same instance. We should do 
something similar for the NetworkClient. We should always include the client id.
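As a rough illustration of the pattern (a simplified sketch, not the actual org.apache.kafka.common.utils.LogContext API), the context bakes a fixed prefix into every message produced through it:

```java
// Hypothetical miniature of the LogContext idea: construct it once with
// the fixed context (e.g. the client id) and every message formatted
// through it carries that prefix automatically.
class MiniLogContext {
    private final String prefix;

    MiniLogContext(String prefix) {
        this.prefix = prefix;
    }

    // The real class hands out wrapped SLF4J loggers; prefixing the
    // message string is enough to show the behaviour here.
    String format(String message) {
        return prefix + message;
    }
}
```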





[jira] [Created] (KAFKA-5762) Refactor AdminClient to use LogContext

2017-08-22 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-5762:
--

 Summary: Refactor AdminClient to use LogContext
 Key: KAFKA-5762
 URL: https://issues.apache.org/jira/browse/KAFKA-5762
 Project: Kafka
  Issue Type: Improvement
Reporter: Ismael Juma


We added a LogContext object which automatically adds a log prefix to every 
message written by loggers constructed from it (much like the Logging mixin 
available in the server code). We use this in the consumer to ensure that 
messages always contain the consumer group and client ids, which is very 
helpful when multiple consumers are run on the same instance. We should do 
something similar for the AdminClient. We should always include the client id.





[jira] [Commented] (KAFKA-5758) Reassigning a topic's partitions can adversely impact other topics

2017-08-22 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136708#comment-16136708
 ] 

Ismael Juma commented on KAFKA-5758:


[~junrao], what do you think we should do in this case? Option 1 doesn't seem 
right. Option 2 could work, although it's a bit wasteful. An additional option:

3. Return an error for that partition (`NOT_FOLLOWER` would be appropriate, but 
we don't have that error code, so we'd have to reuse an existing one).

> Reassigning a topic's partitions can adversely impact other topics
> --
>
> Key: KAFKA-5758
> URL: https://issues.apache.org/jira/browse/KAFKA-5758
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.1
>Reporter: David van Geest
>  Labels: reliability
> Fix For: 1.0.0
>
>
> We've noticed that reassigning a topic's partitions seems to adversely impact 
> other topics. Specifically, followers for other topics fall out of the ISR.
> While I'm not 100% sure about why this happens, the scenario seems to be as 
> follows:
> 1. Reassignment is manually triggered on topic-partition X-Y, and broker A 
> (which used to be a follower for X-Y) is no longer a follower.
> 2. Broker A makes `FetchRequest` including topic-partition X-Y to broker B, 
> just after the reassignment.
> 3. Broker B can fulfill the `FetchRequest`, but while trying to do so it 
> tries to record the position of "follower" A. This fails, because broker A is 
> no longer a follower for X-Y (see exception below).
> 4. The entire `FetchRequest` request fails, and broker A's other followed 
> topics start falling behind.
> 5. Depending on the length of the reassignment, this sequence repeats.
> In step 3, we see exceptions like:
> {noformat}
> Error when handling request Name: FetchRequest; Version: 3; CorrelationId: 
> 46781859; ClientId: ReplicaFetcherThread-0-1001; ReplicaId: 1006; MaxWait: 
> 500 ms; MinBytes: 1 bytes; MaxBytes:10485760 bytes; RequestInfo: 
> 
> kafka.common.NotAssignedReplicaException: Leader 1001 failed to record 
> follower 1006's position -1 since the replica is not recognized to be one of 
> the assigned replicas 1001,1004,1005 for partition [topic_being_reassigned,5].
> at kafka.cluster.Partition.updateReplicaLogReadResult(Partition.scala:249)
>   at 
> kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:923)
>   at 
> kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:920)
>   at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at 
> kafka.server.ReplicaManager.updateFollowerLogReadResults(ReplicaManager.scala:920)
>   at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:481)
>   at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:534)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:79)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Does my assessment make sense? If so, this behaviour seems problematic. A few 
> changes that might improve matters (assuming I'm on the right track):
> 1. `FetchRequest` should be able to return partial results
> 2. The broker fulfilling the `FetchRequest` could ignore the 
> `NotAssignedReplicaException`, and return results without recording the 
> not-any-longer-follower position.
> This behaviour was experienced with 0.10.1.1, although looking at the 
> changelogs and the code in question, I don't see any reason why it would have 
> changed in later versions.
> Am very interested to have some discussion on this. Thanks!
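Suggested change 1 (partial results) boils down to evaluating each partition of the fetch independently, so a replica that is no longer assigned yields an error for that partition only. A simplified sketch of that shape (the method, partition representation, and error strings are all hypothetical, not broker code):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical per-partition fetch handling: instead of one
// NotAssignedReplicaException failing the whole request, each partition
// is resolved to either data ("OK") or its own error code.
class PerPartitionFetch {
    static Map<String, String> handle(List<String> requested, List<String> assignedReplicas) {
        Map<String, String> results = new LinkedHashMap<>();
        for (String tp : requested) {
            if (assignedReplicas.contains(tp)) {
                results.put(tp, "OK"); // serve the fetch for this partition
            } else {
                // Only this partition reports an error; the rest of the
                // response is unaffected.
                results.put(tp, "NOT_ASSIGNED_REPLICA");
            }
        }
        return results;
    }
}
```

Under this shape, the reassigned partition X-Y from the scenario above would return an error while broker A's other followed topics keep fetching normally.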





[jira] [Updated] (KAFKA-5758) Reassigning a topic's partitions can adversely impact other topics

2017-08-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5758:
---
Fix Version/s: (was: 0.11.0.1)

> Reassigning a topic's partitions can adversely impact other topics
> --
>
> Key: KAFKA-5758
> URL: https://issues.apache.org/jira/browse/KAFKA-5758
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.1
>Reporter: David van Geest
>  Labels: reliability
> Fix For: 1.0.0
>
>
> We've noticed that reassigning a topic's partitions seems to adversely impact 
> other topics. Specifically, followers for other topics fall out of the ISR.
> While I'm not 100% sure about why this happens, the scenario seems to be as 
> follows:
> 1. Reassignment is manually triggered on topic-partition X-Y, and broker A 
> (which used to be a follower for X-Y) is no longer a follower.
> 2. Broker A makes `FetchRequest` including topic-partition X-Y to broker B, 
> just after the reassignment.
> 3. Broker B can fulfill the `FetchRequest`, but while trying to do so it 
> tries to record the position of "follower" A. This fails, because broker A is 
> no longer a follower for X-Y (see exception below).
> 4. The entire `FetchRequest` request fails, and broker A's other followed 
> topics start falling behind.
> 5. Depending on the length of the reassignment, this sequence repeats.
> In step 3, we see exceptions like:
> {noformat}
> Error when handling request Name: FetchRequest; Version: 3; CorrelationId: 
> 46781859; ClientId: ReplicaFetcherThread-0-1001; ReplicaId: 1006; MaxWait: 
> 500 ms; MinBytes: 1 bytes; MaxBytes:10485760 bytes; RequestInfo: 
> 
> kafka.common.NotAssignedReplicaException: Leader 1001 failed to record 
> follower 1006's position -1 since the replica is not recognized to be one of 
> the assigned replicas 1001,1004,1005 for partition [topic_being_reassigned,5].
> at kafka.cluster.Partition.updateReplicaLogReadResult(Partition.scala:249)
>   at 
> kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:923)
>   at 
> kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:920)
>   at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at 
> kafka.server.ReplicaManager.updateFollowerLogReadResults(ReplicaManager.scala:920)
>   at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:481)
>   at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:534)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:79)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Does my assessment make sense? If so, this behaviour seems problematic. A few 
> changes that might improve matters (assuming I'm on the right track):
> 1. `FetchRequest` should be able to return partial results
> 2. The broker fulfilling the `FetchRequest` could ignore the 
> `NotAssignedReplicaException`, and return results without recording the 
> not-any-longer-follower position.
> This behaviour was experienced with 0.10.1.1, although looking at the 
> changelogs and the code in question, I don't see any reason why it would have 
> changed in later versions.
> Am very interested to have some discussion on this. Thanks!





[jira] [Comment Edited] (KAFKA-5758) Reassigning a topic's partitions can adversely impact other topics

2017-08-22 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136702#comment-16136702
 ] 

Ismael Juma edited comment on KAFKA-5758 at 8/22/17 12:28 PM:
--

[~dwvangeest], looks like a good find. We should not be failing the whole fetch 
request; we should only fail the relevant partition.


was (Author: ijuma):
[~dwvangeest], looks like a good fine. We should not be failing the whole fetch 
request, we should only fail the relevant partition.

> Reassigning a topic's partitions can adversely impact other topics
> --
>
> Key: KAFKA-5758
> URL: https://issues.apache.org/jira/browse/KAFKA-5758
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.1
>Reporter: David van Geest
>  Labels: reliability
> Fix For: 0.11.0.1, 1.0.0
>
>
> We've noticed that reassigning a topic's partitions seems to adversely impact 
> other topics. Specifically, followers for other topics fall out of the ISR.
> While I'm not 100% sure about why this happens, the scenario seems to be as 
> follows:
> 1. Reassignment is manually triggered on topic-partition X-Y, and broker A 
> (which used to be a follower for X-Y) is no longer a follower.
> 2. Broker A makes `FetchRequest` including topic-partition X-Y to broker B, 
> just after the reassignment.
> 3. Broker B can fulfill the `FetchRequest`, but while trying to do so it 
> tries to record the position of "follower" A. This fails, because broker A is 
> no longer a follower for X-Y (see exception below).
> 4. The entire `FetchRequest` request fails, and broker A's other followed 
> topics start falling behind.
> 5. Depending on the length of the reassignment, this sequence repeats.
> In step 3, we see exceptions like:
> {noformat}
> Error when handling request Name: FetchRequest; Version: 3; CorrelationId: 
> 46781859; ClientId: ReplicaFetcherThread-0-1001; ReplicaId: 1006; MaxWait: 
> 500 ms; MinBytes: 1 bytes; MaxBytes:10485760 bytes; RequestInfo: 
> 
> kafka.common.NotAssignedReplicaException: Leader 1001 failed to record 
> follower 1006's position -1 since the replica is not recognized to be one of 
> the assigned replicas 1001,1004,1005 for partition [topic_being_reassigned,5].
> at kafka.cluster.Partition.updateReplicaLogReadResult(Partition.scala:249)
>   at 
> kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:923)
>   at 
> kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:920)
>   at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at 
> kafka.server.ReplicaManager.updateFollowerLogReadResults(ReplicaManager.scala:920)
>   at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:481)
>   at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:534)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:79)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Does my assessment make sense? If so, this behaviour seems problematic. A few 
> changes that might improve matters (assuming I'm on the right track):
> 1. `FetchRequest` should be able to return partial results
> 2. The broker fulfilling the `FetchRequest` could ignore the 
> `NotAssignedReplicaException`, and return results without recording the 
> not-any-longer-follower position.
> This behaviour was experienced with 0.10.1.1, although looking at the 
> changelogs and the code in question, I don't see any reason why it would have 
> changed in later versions.
> Am very interested to have some discussion on this. Thanks!





[jira] [Commented] (KAFKA-5758) Reassigning a topic's partitions can adversely impact other topics

2017-08-22 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136702#comment-16136702
 ] 

Ismael Juma commented on KAFKA-5758:


[~dwvangeest], looks like a good find. We should not be failing the whole fetch 
request; we should only fail the relevant partition.

> Reassigning a topic's partitions can adversely impact other topics
> --
>
> Key: KAFKA-5758
> URL: https://issues.apache.org/jira/browse/KAFKA-5758
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.1
>Reporter: David van Geest
>  Labels: reliability
> Fix For: 0.11.0.1, 1.0.0
>
>
> We've noticed that reassigning a topic's partitions seems to adversely impact 
> other topics. Specifically, followers for other topics fall out of the ISR.
> While I'm not 100% sure about why this happens, the scenario seems to be as 
> follows:
> 1. Reassignment is manually triggered on topic-partition X-Y, and broker A 
> (which used to be a follower for X-Y) is no longer a follower.
> 2. Broker A makes `FetchRequest` including topic-partition X-Y to broker B, 
> just after the reassignment.
> 3. Broker B can fulfill the `FetchRequest`, but while trying to do so it 
> tries to record the position of "follower" A. This fails, because broker A is 
> no longer a follower for X-Y (see exception below).
> 4. The entire `FetchRequest` request fails, and broker A's other followed 
> topics start falling behind.
> 5. Depending on the length of the reassignment, this sequence repeats.
> In step 3, we see exceptions like:
> {noformat}
> Error when handling request Name: FetchRequest; Version: 3; CorrelationId: 
> 46781859; ClientId: ReplicaFetcherThread-0-1001; ReplicaId: 1006; MaxWait: 
> 500 ms; MinBytes: 1 bytes; MaxBytes:10485760 bytes; RequestInfo: 
> 
> kafka.common.NotAssignedReplicaException: Leader 1001 failed to record 
> follower 1006's position -1 since the replica is not recognized to be one of 
> the assigned replicas 1001,1004,1005 for partition [topic_being_reassigned,5].
> at kafka.cluster.Partition.updateReplicaLogReadResult(Partition.scala:249)
>   at 
> kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:923)
>   at 
> kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:920)
>   at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at 
> kafka.server.ReplicaManager.updateFollowerLogReadResults(ReplicaManager.scala:920)
>   at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:481)
>   at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:534)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:79)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Does my assessment make sense? If so, this behaviour seems problematic. A few 
> changes that might improve matters (assuming I'm on the right track):
> 1. `FetchRequest` should be able to return partial results
> 2. The broker fulfilling the `FetchRequest` could ignore the 
> `NotAssignedReplicaException`, and return results without recording the 
> not-any-longer-follower position.
> This behaviour was experienced with 0.10.1.1, although looking at the 
> changelogs and the code in question, I don't see any reason why it would have 
> changed in later versions.
> Am very interested to have some discussion on this. Thanks!





[jira] [Updated] (KAFKA-5758) Reassigning a topic's partitions can adversely impact other topics

2017-08-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5758:
---
Fix Version/s: 1.0.0
   0.11.0.1

> Reassigning a topic's partitions can adversely impact other topics
> --
>
> Key: KAFKA-5758
> URL: https://issues.apache.org/jira/browse/KAFKA-5758
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.1
>Reporter: David van Geest
>  Labels: reliability
> Fix For: 0.11.0.1, 1.0.0
>
>
> We've noticed that reassigning a topic's partitions seems to adversely impact 
> other topics. Specifically, followers for other topics fall out of the ISR.
> While I'm not 100% sure about why this happens, the scenario seems to be as 
> follows:
> 1. Reassignment is manually triggered on topic-partition X-Y, and broker A 
> (which used to be a follower for X-Y) is no longer a follower.
> 2. Broker A makes `FetchRequest` including topic-partition X-Y to broker B, 
> just after the reassignment.
> 3. Broker B can fulfill the `FetchRequest`, but while trying to do so it 
> tries to record the position of "follower" A. This fails, because broker A is 
> no longer a follower for X-Y (see exception below).
> 4. The entire `FetchRequest` request fails, and broker A's other followed 
> topics start falling behind.
> 5. Depending on the length of the reassignment, this sequence repeats.
> In step 3, we see exceptions like:
> {noformat}
> Error when handling request Name: FetchRequest; Version: 3; CorrelationId: 
> 46781859; ClientId: ReplicaFetcherThread-0-1001; ReplicaId: 1006; MaxWait: 
> 500 ms; MinBytes: 1 bytes; MaxBytes:10485760 bytes; RequestInfo: 
> 
> kafka.common.NotAssignedReplicaException: Leader 1001 failed to record 
> follower 1006's position -1 since the replica is not recognized to be one of 
> the assigned replicas 1001,1004,1005 for partition [topic_being_reassigned,5].
> at kafka.cluster.Partition.updateReplicaLogReadResult(Partition.scala:249)
>   at 
> kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:923)
>   at 
> kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:920)
>   at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at 
> kafka.server.ReplicaManager.updateFollowerLogReadResults(ReplicaManager.scala:920)
>   at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:481)
>   at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:534)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:79)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Does my assessment make sense? If so, this behaviour seems problematic. A few 
> changes that might improve matters (assuming I'm on the right track):
> 1. `FetchRequest` should be able to return partial results
> 2. The broker fulfilling the `FetchRequest` could ignore the 
> `NotAssignedReplicaException`, and return results without recording the 
> not-any-longer-follower position.
> This behaviour was experienced with 0.10.1.1, although looking at the 
> changelogs and the code in question, I don't see any reason why it would have 
> changed in later versions.
> Am very interested to have some discussion on this. Thanks!





[jira] [Updated] (KAFKA-5758) Reassigning a topic's partitions can adversely impact other topics

2017-08-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5758:
---
Labels: reliability  (was: )

> Reassigning a topic's partitions can adversely impact other topics
> --
>
> Key: KAFKA-5758
> URL: https://issues.apache.org/jira/browse/KAFKA-5758
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.1
>Reporter: David van Geest
>  Labels: reliability
> Fix For: 0.11.0.1, 1.0.0
>
>
> We've noticed that reassigning a topic's partitions seems to adversely impact 
> other topics. Specifically, followers for other topics fall out of the ISR.
> While I'm not 100% sure about why this happens, the scenario seems to be as 
> follows:
> 1. Reassignment is manually triggered on topic-partition X-Y, and broker A 
> (which used to be a follower for X-Y) is no longer a follower.
> 2. Broker A makes `FetchRequest` including topic-partition X-Y to broker B, 
> just after the reassignment.
> 3. Broker B can fulfill the `FetchRequest`, but while trying to do so it 
> tries to record the position of "follower" A. This fails, because broker A is 
> no longer a follower for X-Y (see exception below).
> 4. The entire `FetchRequest` request fails, and broker A's other followed 
> topics start falling behind.
> 5. Depending on the length of the reassignment, this sequence repeats.
> In step 3, we see exceptions like:
> {noformat}
> Error when handling request Name: FetchRequest; Version: 3; CorrelationId: 
> 46781859; ClientId: ReplicaFetcherThread-0-1001; ReplicaId: 1006; MaxWait: 
> 500 ms; MinBytes: 1 bytes; MaxBytes:10485760 bytes; RequestInfo: 
> 
> kafka.common.NotAssignedReplicaException: Leader 1001 failed to record 
> follower 1006's position -1 since the replica is not recognized to be one of 
> the assigned replicas 1001,1004,1005 for partition [topic_being_reassigned,5].
> at kafka.cluster.Partition.updateReplicaLogReadResult(Partition.scala:249)
>   at 
> kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:923)
>   at 
> kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:920)
>   at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at 
> kafka.server.ReplicaManager.updateFollowerLogReadResults(ReplicaManager.scala:920)
>   at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:481)
>   at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:534)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:79)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Does my assessment make sense? If so, this behaviour seems problematic. A few 
> changes that might improve matters (assuming I'm on the right track):
> 1. `FetchRequest` should be able to return partial results
> 2. The broker fulfilling the `FetchRequest` could ignore the 
> `NotAssignedReplicaException`, and return results without recording the 
> not-any-longer-follower position.
> This behaviour was experienced with 0.10.1.1, although looking at the 
> changelogs and the code in question, I don't see any reason why it would have 
> changed in later versions.
> Am very interested to have some discussion on this. Thanks!





[jira] [Updated] (KAFKA-5720) In Jenkins, kafka.api.SaslSslAdminClientIntegrationTest failed with org.apache.kafka.common.errors.TimeoutException

2017-08-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5720:
---
Fix Version/s: 1.0.0

> In Jenkins, kafka.api.SaslSslAdminClientIntegrationTest failed with 
> org.apache.kafka.common.errors.TimeoutException
> ---
>
> Key: KAFKA-5720
> URL: https://issues.apache.org/jira/browse/KAFKA-5720
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin P. McCabe
>Priority: Minor
> Fix For: 1.0.0
>
>
> In Jenkins, kafka.api.SaslSslAdminClientIntegrationTest failed with 
> org.apache.kafka.common.errors.TimeoutException.
> {code}
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.TimeoutException: Aborted due to timeout.
>   at 
> org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
>   at 
> org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
>   at 
> org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
>   at 
> org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:213)
>   at 
> kafka.api.AdminClientIntegrationTest.testCallInFlightTimeouts(AdminClientIntegrationTest.scala:399)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kafka.common.errors.TimeoutException: Aborted due to 
> timeout.
> {code}
> It's unclear whether this was an environment error or test bug.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5342) Distinguish abortable failures in transactional producer

2017-08-22 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136686#comment-16136686
 ] 

Ismael Juma commented on KAFKA-5342:


[~hachikuji], do you think you can submit the PR today?

> Distinguish abortable failures in transactional producer
> 
>
> Key: KAFKA-5342
> URL: https://issues.apache.org/jira/browse/KAFKA-5342
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
> Fix For: 0.11.0.1
>
>
> The transactional producer distinguishes two classes of user-visible errors:
> 1. Abortable errors: these are errors which are fatal to the ongoing 
> transaction, but which can be successfully aborted. Essentially any error in 
> which the producer can still expect to successfully send EndTxn to the 
> transaction coordinator is abortable.
> 2. Fatal errors: any error which is not abortable is fatal. For example, a 
> transactionalId authorization error is fatal because it would also prevent 
> the TC from receiving the EndTxn request.
> At the moment, it's not clear how the user would know how they should handle 
> a given failure. One option is to add an exception type to indicate which 
> errors are abortable (e.g. AbortableKafkaException). Then any other exception 
> could be considered fatal.
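The exception-type option above could surface to applications roughly as follows. This is a sketch of the proposed contract only; `AbortableKafkaException` is the hypothetical name floated in the description, not an existing class:

```java
// Hypothetical exception type marking errors that are fatal to the current
// transaction but let the producer still abort it cleanly.
class AbortableKafkaException extends RuntimeException {
    AbortableKafkaException(String message) { super(message); }
}

public class TxnErrorHandling {
    // Proposed dispatch: abortable errors -> abortTransaction() and retry;
    // any other error is fatal and the producer should be closed.
    static String handle(RuntimeException e) {
        if (e instanceof AbortableKafkaException)
            return "abortTransaction-and-retry";
        return "close-producer";
    }

    public static void main(String[] args) {
        System.out.println(handle(new AbortableKafkaException("batch expired")));
        System.out.println(handle(new IllegalStateException("transactionalId not authorized")));
    }
}
```

With such a marker type, user code needs only one catch clause to decide between abort-and-retry and shutdown.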



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5749) Refactor SessionStore hierarchy

2017-08-22 Thread Damian Guy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136644#comment-16136644
 ] 

Damian Guy commented on KAFKA-5749:
---

Sorry [~adyachkov], I've assigned these tasks to myself as they are part of the 
KIP I'm working on. In most cases I already have a plan and/or have done part 
of the task.

> Refactor SessionStore hierarchy
> ---
>
> Key: KAFKA-5749
> URL: https://issues.apache.org/jira/browse/KAFKA-5749
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Damian Guy
>Assignee: Damian Guy
> Fix For: 1.0.0
>
>
> In order to support a bytes store we need to create a MeteredSessionStore and 
> ChangeloggingSessionStore. We then need to refactor the current SessionStore 
> implementations to use this. All inner stores should be of type <Bytes, byte[]>.
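The intended layering can be sketched with simplified stand-in interfaces. These are not the real Kafka Streams types (the real inner stores operate on `<Bytes, byte[]>`); the sketch only shows the wrapping order, metered store outermost:

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a bytes-keyed session store, NOT the Streams API.
interface BytesSessionStore {
    void put(byte[] key, byte[] value);
    byte[] get(byte[] key);
}

// Innermost layer: the raw storage engine.
class InnerSessionStore implements BytesSessionStore {
    private final Map<String, byte[]> data = new HashMap<>();
    public void put(byte[] k, byte[] v) { data.put(new String(k, StandardCharsets.UTF_8), v); }
    public byte[] get(byte[] k) { return data.get(new String(k, StandardCharsets.UTF_8)); }
}

// Middle layer: records every write to the changelog before delegating.
class ChangeLoggingSessionStore implements BytesSessionStore {
    private final BytesSessionStore inner;
    int changelogged = 0;  // stand-in for writing to the changelog topic
    ChangeLoggingSessionStore(BytesSessionStore inner) { this.inner = inner; }
    public void put(byte[] k, byte[] v) { changelogged++; inner.put(k, v); }
    public byte[] get(byte[] k) { return inner.get(k); }
}

// Outermost layer: records metrics, then delegates.
public class MeteredSessionStore implements BytesSessionStore {
    private final BytesSessionStore inner;
    int puts = 0;  // stand-in for latency/throughput metrics
    public MeteredSessionStore(BytesSessionStore inner) { this.inner = inner; }
    public void put(byte[] k, byte[] v) { puts++; inner.put(k, v); }
    public byte[] get(byte[] k) { return inner.get(k); }

    public static void main(String[] args) {
        MeteredSessionStore store = new MeteredSessionStore(
                new ChangeLoggingSessionStore(new InnerSessionStore()));
        store.put("key".getBytes(), "value".getBytes());
        System.out.println(store.puts + " " + new String(store.get("key".getBytes())));
    }
}
```

Because every layer shares one bytes-keyed interface, a custom storage engine only has to implement the innermost layer.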



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (KAFKA-5686) Documentation inconsistency on the "Compression"

2017-08-22 Thread Manikumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar reassigned KAFKA-5686:


Assignee: Manikumar

> Documentation inconsistency on the "Compression"
> 
>
> Key: KAFKA-5686
> URL: https://issues.apache.org/jira/browse/KAFKA-5686
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.11.0.0
>Reporter: Seweryn Habdank-Wojewodzki
>Assignee: Manikumar
>Priority: Minor
>
> At the page:
> https://kafka.apache.org/documentation/
> There is a sentence:
> {{Kafka supports GZIP, Snappy and LZ4 compression protocols. More details on 
> compression can be found here.}}
> In particular, the link under the word *here* describes very old compression 
> settings, which are incorrect for version 0.11.x.y.
> JAVA API:
> Also, it would be nice to clearly state whether *compression.type* accepts only a 
> case-sensitive String as a value, or whether it is recommended to use e.g. 
> {{CompressionType.GZIP.name}} in the Java API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-5751) Kafka cannot start; corrupted index file(s)

2017-08-22 Thread Manikumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-5751.
--
Resolution: Duplicate

> Kafka cannot start; corrupted index file(s)
> ---
>
> Key: KAFKA-5751
> URL: https://issues.apache.org/jira/browse/KAFKA-5751
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.11.0.0
> Environment: Linux (RedHat 7)
>Reporter: Martin M
>Priority: Critical
>
> A system was running Kafka 0.11.0 and some applications that produce and 
> consume events.
> During the runtime, a power outage was experienced. Upon restart, Kafka did 
> not recover.
> Logs show repeatedly the messages below:
> *server.log*
> {noformat}
> [2017-08-15 15:02:26,374] FATAL [Kafka Server 1001], Fatal error during 
> KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
> org.apache.kafka.common.protocol.types.SchemaException: Error reading field 
> 'version': java.nio.BufferUnderflowException
>   at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:75)
>   at 
> kafka.log.ProducerStateManager$.readSnapshot(ProducerStateManager.scala:289)
>   at 
> kafka.log.ProducerStateManager.loadFromSnapshot(ProducerStateManager.scala:440)
>   at 
> kafka.log.ProducerStateManager.truncateAndReload(ProducerStateManager.scala:499)
>   at kafka.log.Log.kafka$log$Log$$recoverSegment(Log.scala:327)
>   at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:314)
>   at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:272)
>   at 
> scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
>   at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>   at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
>   at 
> scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
>   at kafka.log.Log.loadSegmentFiles(Log.scala:272)
>   at kafka.log.Log.loadSegments(Log.scala:376)
>   at kafka.log.Log.<init>(Log.scala:179)
>   at kafka.log.Log$.apply(Log.scala:1580)
>   at 
> kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$5$$anonfun$apply$12$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:172)
>   at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> *kafkaServer.out*
> {noformat}
> [2017-08-15 16:03:50,927] WARN Found a corrupted index file due to 
> requirement failed: Corrupt index found, index file 
> (/opt/nsp/os/kafka/data/session-manager.revoke_token_topic-7/.index)
>  has non-zero size but the last offset is 0 which is no larger than the base 
> offset 0.}. deleting 
> /opt/nsp/os/kafka/data/session-manager.revoke_token_topic-7/.timeindex,
>  
> /opt/nsp/os/kafka/data/session-manager.revoke_token_topic-7/.index,
>  and 
> /opt/nsp/os/kafka/data/session-manager.revoke_token_topic-7/.txnindex
>  and rebuilding index... (kafka.log.Log)
> [2017-08-15 16:03:50,931] INFO [Kafka Server 1001], shutting down 
> (kafka.server.KafkaServer)
> [2017-08-15 16:03:50,932] INFO Recovering unflushed segment 0 in log 
> session-manager.revoke_token_topic-7. (kafka.log.Log)
> [2017-08-15 16:03:50,935] INFO Terminate ZkClient event thread. 
> (org.I0Itec.zkclient.ZkEventThread)
> [2017-08-15 16:03:50,936] INFO Loading producer state from offset 0 for 
> partition session-manager.revoke_token_topic-7 with message format version 2 
> (kafka.log.Log)
> [2017-08-15 16:03:50,937] INFO Completed load of log 
> session-manager.revoke_token_topic-7 with 1 log segments, log start offset 0 
> and log end offset 0 in 10 ms (kafka.log.Log)
> [2017-08-15 16:03:50,938] INFO Session: 0x1000f772d26063b closed 
> (org.apache.zookeeper.ZooKeeper)
> [2017-08-15 16:03:50,938] INFO EventThread shut down for session: 
> 0x1000f772d26063b (org.apache.zookeeper.ClientCnxn)
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (KAFKA-5714) Allow whitespaces in the principal name

2017-08-22 Thread Manikumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar reassigned KAFKA-5714:


Assignee: (was: Manikumar)

> Allow whitespaces in the principal name
> ---
>
> Key: KAFKA-5714
> URL: https://issues.apache.org/jira/browse/KAFKA-5714
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 0.10.2.1
>Reporter: Alla Tumarkin
>
> Request
> Improve parser behavior to allow whitespaces in the principal name in the 
> config file, as in:
> {code}
> super.users=User:CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, 
> C=Unknown
> {code}
> Background
> Current implementation requires that there are no whitespaces after commas, 
> i.e.
> {code}
> super.users=User:CN=Unknown,OU=Unknown,O=Unknown,L=Unknown,ST=Unknown,C=Unknown
> {code}
> Note: having a semicolon at the end doesn't help, i.e. this does not work 
> either
> {code}
> super.users=User:CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, 
> C=Unknown;
> {code}
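A parser tolerant of that whitespace could normalize each principal before comparison. The sketch below is illustrative only (it is not Kafka's actual parser): it splits the `super.users` value on `;` between principals, then strips spaces after the commas inside each DN so both spellings above become equivalent:

```java
import java.util.ArrayList;
import java.util.List;

public class SuperUsersParser {
    /** Illustrative sketch, not Kafka's real implementation: returns the list
     *  of principals with intra-DN whitespace after commas normalized away,
     *  so "CN=Unknown, OU=Unknown" matches "CN=Unknown,OU=Unknown". */
    static List<String> parse(String value) {
        List<String> principals = new ArrayList<>();
        for (String entry : value.split(";")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty())
                principals.add(trimmed.replaceAll(",\\s+", ","));
        }
        return principals;
    }

    public static void main(String[] args) {
        System.out.println(parse(
            "User:CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown"));
    }
}
```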



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5689) Refactor WindowStore hierarchy so that Metered Store is the outermost store

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136607#comment-16136607
 ] 

ASF GitHub Bot commented on KAFKA-5689:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3692


> Refactor  WindowStore hierarchy so that Metered Store is the outermost store
> 
>
> Key: KAFKA-5689
> URL: https://issues.apache.org/jira/browse/KAFKA-5689
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Damian Guy
>Assignee: Damian Guy
> Fix For: 1.0.0
>
>
> MeteredWinowStore is currently not the outermost store. Further it needs to 
> have the inner store as  to allow easy plugability of custom 
> storage engines.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5503) Idempotent producer ignores shutdown while fetching ProducerId

2017-08-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5503:
---
Fix Version/s: (was: 0.11.0.2, 1.0.0)
   0.11.0.2
   1.0.0

> Idempotent producer ignores shutdown while fetching ProducerId
> --
>
> Key: KAFKA-5503
> URL: https://issues.apache.org/jira/browse/KAFKA-5503
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Affects Versions: 0.11.0.0
>Reporter: Jason Gustafson
>Assignee: Evgeny Veretennikov
> Fix For: 1.0.0, 0.11.0.2
>
>
> When using the idempotent producer, we initially block the sender thread 
> while we attempt to get the ProducerId. During this time, a concurrent call 
> to close() will be ignored.
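One direction for a fix is to make the blocking wait also watch a shutdown flag. The class below is a minimal stand-in for the sender loop, not Kafka's actual Sender code, just to show the pattern:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal stand-in for the sender loop (NOT Kafka's Sender class): the wait
// for a ProducerId also checks a running flag, so close() is honored
// instead of ignored.
public class SenderSketch {
    final AtomicBoolean running = new AtomicBoolean(true);
    volatile Long producerId = null;  // set by a (simulated) InitProducerId response

    /** Returns true if an id arrived, false if close() or the timeout ended the wait. */
    boolean awaitProducerId(long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (producerId == null && System.currentTimeMillis() < deadline) {
            if (!running.get())
                return false;  // shutdown requested: stop blocking immediately
            try { Thread.sleep(10); } catch (InterruptedException e) { return false; }
        }
        return producerId != null;
    }

    void close() { running.set(false); }

    public static void main(String[] args) {
        SenderSketch sender = new SenderSketch();
        sender.close();  // close() arrives before any id...
        System.out.println(sender.awaitProducerId(5_000));  // ...so the wait exits: false
    }
}
```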



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5503) Idempotent producer ignores shutdown while fetching ProducerId

2017-08-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5503:
---
Fix Version/s: (was: 0.11.0.1)
   0.11.0.2, 1.0.0

> Idempotent producer ignores shutdown while fetching ProducerId
> --
>
> Key: KAFKA-5503
> URL: https://issues.apache.org/jira/browse/KAFKA-5503
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Affects Versions: 0.11.0.0
>Reporter: Jason Gustafson
>Assignee: Evgeny Veretennikov
> Fix For: 0.11.0.2, 1.0.0
>
>
> When using the idempotent producer, we initially block the sender thread 
> while we attempt to get the ProducerId. During this time, a concurrent call 
> to close() will be ignored.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (KAFKA-5752) Delete topic and re-create topic immediate will delete the new topic's timeindex

2017-08-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reassigned KAFKA-5752:
--

Assignee: Manikumar

> Delete topic and re-create topic immediate will delete the new topic's 
> timeindex 
> -
>
> Key: KAFKA-5752
> URL: https://issues.apache.org/jira/browse/KAFKA-5752
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0, 0.11.0.0
>Reporter: Pengwei
>Assignee: Manikumar
>Priority: Critical
>  Labels: reliability
> Fix For: 0.11.0.1, 1.0.0
>
>
> When we delete a topic and immediately re-create a topic with the same name, we 
> find that after the async delete of the old topic finishes, it also removes the 
> newly created topic's time index.
> This is because LogManager's asyncDelete changes the log and index file 
> pointers to the renamed log and index, but misses the time index, which causes 
> this issue.
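The fix direction can be sketched as: when a segment is renamed for async delete, every suffix must be moved together, including the `.timeindex` the report says is missed. File names and the helper below are hypothetical, not Kafka's actual code:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch: rename ALL of a segment's files for async delete,
// including .timeindex (the one the report says LogManager.asyncDelete misses).
public class AsyncDeleteSketch {
    static final String[] SUFFIXES = {".log", ".index", ".timeindex", ".txnindex"};

    static void renameForDelete(Path dir, String base, String deletedSuffix) throws IOException {
        for (String ext : SUFFIXES) {
            Path src = dir.resolve(base + ext);
            if (Files.exists(src))
                Files.move(src, dir.resolve(base + ext + deletedSuffix));
        }
    }

    /** Creates a fake segment, renames it, and reports whether the time index followed. */
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("segments");
            for (String ext : new String[] {".log", ".index", ".timeindex"})
                Files.createFile(dir.resolve("00000000000000000000" + ext));
            renameForDelete(dir, "00000000000000000000", ".deleted");
            return Files.exists(dir.resolve("00000000000000000000.timeindex.deleted"))
                    && !Files.exists(dir.resolve("00000000000000000000.timeindex"));
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());  // true: the time index was renamed with the rest
    }
}
```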



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5761) Serializer API should support ByteBuffer

2017-08-22 Thread Bhaskar Gollapudi (JIRA)
Bhaskar Gollapudi created KAFKA-5761:


 Summary: Serializer API should support ByteBuffer
 Key: KAFKA-5761
 URL: https://issues.apache.org/jira/browse/KAFKA-5761
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Affects Versions: 0.11.0.0
Reporter: Bhaskar Gollapudi


Consider the Serializer: its main method is:

byte[] serialize(String topic, T data);

Producer applications create an implementation that takes in an instance (
of T ) and converts it to a byte[]. This byte array is allocated anew for
each message. The byte array is then handed over to the Kafka Producer API
internals, which write the bytes to the buffer/network socket. When the next
message arrives, the serializer, instead of creating a new byte[], should
try to reuse the existing byte[] for the new message. This requires two
things:

1. The process of handing off the bytes to the buffer/socket and reusing
the byte[] must happen on the same thread.

2. There should be a way of marking the end of the available bytes in the
byte[].

The first is reasonably simple to understand. If this does not happen, and
without other necessary synchronization, the byte[] gets corrupted and so
does the message written to the buffer/socket. However, this requirement is
easy to meet for a producer application, because it controls the threads on
which the serializer is invoked.

The second is where the problem lies with the current API. It does not
allow a variable number of bytes to be read from a container; it is limited
by the byte[]'s length. This forces the producer to

1. either create a new byte[] for a message that is bigger than the previous
one,
OR
2. decide on a max size and use padding.

Both are cumbersome and error prone, and may waste network bandwidth.

Instead, if there were a Serializer with this method:

ByteBuffer serialize(String topic, T data);

this would help clients implement a reusable bytes container and avoid
allocations for each message.
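A minimal sketch of the proposal, assuming a hypothetical `BufferSerializer` interface (this is not an existing Kafka API): the serializer keeps one growable buffer and uses the ByteBuffer's limit to mark where each message's valid bytes end.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical ByteBuffer-returning variant of the Serializer interface
// (a sketch of the proposal, not an existing Kafka API).
interface BufferSerializer<T> {
    ByteBuffer serialize(String topic, T data);
}

// Reuses one growable buffer across messages. Safe only when serialize()
// and the hand-off to the network layer happen on the same thread.
class ReusableStringSerializer implements BufferSerializer<String> {
    private ByteBuffer buffer = ByteBuffer.allocate(64);

    @Override
    public ByteBuffer serialize(String topic, String data) {
        byte[] encoded = data.getBytes(StandardCharsets.UTF_8);
        if (encoded.length > buffer.capacity())  // grow only when needed
            buffer = ByteBuffer.allocate(Math.max(encoded.length, buffer.capacity() * 2));
        buffer.clear();
        buffer.put(encoded);
        buffer.flip();  // limit now marks the end of the valid bytes
        return buffer;
    }
}

public class SerializerSketch {
    public static void main(String[] args) {
        ReusableStringSerializer s = new ReusableStringSerializer();
        ByteBuffer first = s.serialize("topic", "hello");
        ByteBuffer second = s.serialize("topic", "hi");
        // Same buffer instance was reused; limit marks each message's end.
        System.out.println((first == second) + " " + second.limit());
    }
}
```

The `flip()` call is what solves requirement 2: the consumer of the buffer reads from position 0 up to the limit, so no padding or length convention is needed.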



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-3866) KerberosLogin refresh time bug and other improvements

2017-08-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3866:
---
Fix Version/s: 1.0.0

> KerberosLogin refresh time bug and other improvements
> -
>
> Key: KAFKA-3866
> URL: https://issues.apache.org/jira/browse/KAFKA-3866
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.10.0.0
>Reporter: Ismael Juma
>Assignee: Ismael Juma
> Fix For: 1.0.0, 0.11.0.2
>
>
> ZOOKEEPER-2295 describes a bug in the Kerberos refresh time logic that is 
> also present in our KerberosLogin class. While looking at the code, I found a 
> number of things that could be improved. More details in the PR.
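The class of bug ZOOKEEPER-2295 describes is a computed refresh time landing outside the ticket's validity window. An illustrative clamp (not Kafka's actual algorithm) looks like this:

```java
public class RefreshTimeSketch {
    /** Illustrative only, not KerberosLogin's real logic: a computed ticket
     *  refresh time must never land before "now" nor after the ticket expiry. */
    static long nextRefreshMs(long nowMs, long expiryMs, double windowFactor) {
        long proposed = nowMs + (long) ((expiryMs - nowMs) * windowFactor);
        return Math.max(nowMs, Math.min(proposed, expiryMs));
    }

    public static void main(String[] args) {
        // Refresh 80% of the way through a 1-hour ticket lifetime.
        System.out.println(nextRefreshMs(0, 3_600_000, 0.8));  // 2880000
    }
}
```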



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-3866) KerberosLogin refresh time bug and other improvements

2017-08-22 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3866:
---
Fix Version/s: (was: 0.11.0.1)
   0.11.0.2

> KerberosLogin refresh time bug and other improvements
> -
>
> Key: KAFKA-3866
> URL: https://issues.apache.org/jira/browse/KAFKA-3866
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.10.0.0
>Reporter: Ismael Juma
>Assignee: Ismael Juma
> Fix For: 1.0.0, 0.11.0.2
>
>
> ZOOKEEPER-2295 describes a bug in the Kerberos refresh time logic that is 
> also present in our KerberosLogin class. While looking at the code, I found a 
> number of things that could be improved. More details in the PR.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4741) Memory leak in RecordAccumulator.append

2017-08-22 Thread JIRA

[ 
https://issues.apache.org/jira/browse/KAFKA-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136549#comment-16136549
 ] 

Julius Žaromskis commented on KAFKA-4741:
-

I'm having this problem: 
https://stackoverflow.com/questions/45813477/kafka-off-heap-memory-leak

My question is this: would it cause a leak in the producer or on the server?

> Memory leak in RecordAccumulator.append
> ---
>
> Key: KAFKA-4741
> URL: https://issues.apache.org/jira/browse/KAFKA-4741
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Reporter: Satish Duggana
>Assignee: Satish Duggana
> Fix For: 0.11.0.0
>
>
> RecordAccumulator creates a `ByteBuffer` from free memory pool. This should 
> be deallocated when invocations encounter an exception or throwing any 
> exceptions. 
> I added todo comment lines in the below code for cases to deallocate that 
> buffer.
> {code:title=RecordProducer.java|borderStyle=solid}
> ByteBuffer buffer = free.allocate(size, maxTimeToBlock);
> synchronized (dq) {
> // Need to check if producer is closed again after grabbing 
> the dequeue lock.
> if (closed)
>// todo buffer should be cleared.
> throw new IllegalStateException("Cannot send after the 
> producer is closed.");
> // todo buffer should be cleared up when tryAppend throws an 
> Exception
> RecordAppendResult appendResult = tryAppend(timestamp, key, 
> value, callback, dq);
> if (appendResult != null) {
> // Somebody else found us a batch, return the one we 
> waited for! Hopefully this doesn't happen often...
> free.deallocate(buffer);
> return appendResult;
> }
> {code}
> I will raise PR for the same soon.
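The fix pattern the todos describe can be sketched with a minimal stand-in pool and append method (these are simplified stand-ins, not the real RecordAccumulator): any exception between `allocate()` and the buffer's hand-off must return the buffer to the pool.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Minimal stand-in for the producer's buffer pool.
class BufferPool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();
    ByteBuffer allocate(int size) {
        ByteBuffer b = free.poll();
        return (b != null && b.capacity() >= size) ? b : ByteBuffer.allocate(size);
    }
    void deallocate(ByteBuffer b) { b.clear(); free.push(b); }
    int freeCount() { return free.size(); }
}

public class AppendSketch {
    // Stand-in for RecordAccumulator.append: if anything throws before the
    // buffer is handed off to a batch, the finally block returns it to the
    // pool -- the deallocation the bug report asks for.
    static void append(BufferPool pool, boolean closed) {
        ByteBuffer buffer = pool.allocate(1024);
        boolean handedOff = false;
        try {
            if (closed)
                throw new IllegalStateException("Cannot send after the producer is closed.");
            handedOff = true;  // buffer is now owned by the batch
        } finally {
            if (!handedOff)
                pool.deallocate(buffer);  // no leak on the exception path
        }
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool();
        try { append(pool, true); } catch (IllegalStateException expected) { }
        System.out.println(pool.freeCount());  // 1: the buffer came back
    }
}
```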



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)