[jira] [Updated] (KAFKA-9210) kafka stream loss data

2019-11-18 Thread panpan.liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

panpan.liu updated KAFKA-9210:
--
Description: 
kafka broker: 2.0.1

kafka stream client: 2.1.0
 # two applications run at the same time
 # after some days, I stopped one application (in k8s)
 # The following log occurred; I checked the data and found the value was less than what I expected.

 

```
Partitions [flash-app-xmc-worker-share-store-minute-repartition-1]Partitions 
[flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 05:50:49.816|WARN 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
 [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
StreamTasks stores to recreate from 
scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
out of range with no configured reset policy for partitions: 
{flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at 
org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)2019-11-19
 05:50:49.817|INFO 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread
 [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 
ProcessorTopology: KSTREAM-SOURCE-70: topics: 
[flash-app-xmc-worker-share-store-minute-repartition] children: 
[KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: 
[worker-share-store-minute] children: [KTABLE-TOSTREAM-71] 
KTABLE-TOSTREAM-71: children: [KSTREAM-SINK-72] 
KSTREAM-SINK-72: topic: 
StaticTopicNameExtractor(xmc-worker-share-minute)Partitions 
[flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 05:50:49.842|WARN 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
 [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
StreamTasks stores to recreate from 
scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
out of range with no configured reset policy for partitions: 
{flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at 
org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)2019-11-19
 05:50:49.842|INFO 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread
 [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 
ProcessorTopology: KSTREAM-SOURCE-70: topics: 
[flash-app-xmc-worker-share-store-minute-repartition] children: 
[KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: 
[worker-share-store-minute] children: [KTABLE-TOSTREAM-71] 
KTABLE-TOSTREAM-71: children: [KSTREAM-SINK-72] 
KSTREAM-SINK-72: topic: 
StaticTopicNameExtractor(xmc-worker-share-minute)Partitions 
[flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 05:50:49.905|WARN 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
 [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
StreamTasks stores to recreate from 
scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
out of range with no configured reset policy for partitions: 
{flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at 
org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)2019-11-19
 05:50:49.906|INFO 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread
 [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 
ProcessorTopology: KSTREAM-SOURCE-70: topics: 
[flash-app-xmc-worker-share-store-minute-repartition] children: 
[KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: 
[worker-share-store-minute]

``` 
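The repeated OffsetOutOfRangeException above indicates that the restore consumer asked for changelog offset 6128684, which was no longer inside the partition's valid offset range (for instance because the changelog had been truncated or re-created). Since the restore consumer runs with no configured reset policy, Streams handles the error by deleting the store and restoring from scratch. A minimal sketch of the range check behind the exception (illustrative names, not Kafka's actual implementation):

```python
# Hedged sketch (not Kafka's real code): the fetch-offset validation that
# leads to OffsetOutOfRangeException during state-store restoration.
class OffsetOutOfRangeError(Exception):
    """Raised when a fetch offset falls outside [log_start, log_end]."""

def validate_fetch_offset(requested: int, log_start: int, log_end: int) -> int:
    # With no reset policy configured (as the log above says), the caller
    # must handle this error itself; Streams reacts by wiping the local
    # store and replaying the changelog from its current start offset.
    if requested < log_start or requested > log_end:
        raise OffsetOutOfRangeError(
            f"offset {requested} outside [{log_start}, {log_end}]")
    return requested
```

If part of the changelog range holding aggregate state is gone, rebuilding from the remaining range would be consistent with the reporter seeing smaller values than expected.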

 

  was:
kafka broker: 2.0.1

kafka stream client: 2.1.0
 # two applications run at the same time
 # after some days, I stopped one application (in k8s)
 # The following log occurred; I checked the data and found the value was less than what I expected.

 
{noformat}
Partitions [flash-app-xmc-worker-share-store-minute-repartition-1]Partitions 
[flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 05:50:49.816|WARN 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
 [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
StreamTasks stores to recreate from 
scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
out of range with no configured reset policy for partitions: 
{flash-app-xmc-worker-share-store-minute-changelog-1=6128684} 

[jira] [Updated] (KAFKA-9210) kafka stream loss data

2019-11-18 Thread panpan.liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

panpan.liu updated KAFKA-9210:
--
Attachment: screenshot-1.png

> kafka stream loss data
> --
>
> Key: KAFKA-9210
> URL: https://issues.apache.org/jira/browse/KAFKA-9210
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.0.1
>Reporter: panpan.liu
>Priority: Major
> Attachments: app.log, screenshot-1.png
>
>
> kafka broker: 2.0.1
> kafka stream client: 2.1.0
>  # two applications run at the same time
>  # after some days, I stopped one application (in k8s)
>  # The following log occurred; I checked the data and found the value was
> less than what I expected.
>  
> {noformat}
> Partitions [flash-app-xmc-worker-share-store-minute-repartition-1]Partitions 
> [flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
> flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 
> 05:50:49.816|WARN 
> |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
>  [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
> StreamTasks stores to recreate from 
> scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
> out of range with no configured reset policy for partitions: 
> {flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at 
> org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)2019-11-19
>  05:50:49.817|INFO 
> |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread
>  [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 
> ProcessorTopology: KSTREAM-SOURCE-70: topics: 
> [flash-app-xmc-worker-share-store-minute-repartition] children: 
> [KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: 
> [worker-share-store-minute] children: [KTABLE-TOSTREAM-71] 
> KTABLE-TOSTREAM-71: children: [KSTREAM-SINK-72] 
> KSTREAM-SINK-72: topic: 
> StaticTopicNameExtractor(xmc-worker-share-minute)Partitions 
> [flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
> flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 
> 05:50:49.842|WARN 
> |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
>  [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
> StreamTasks stores to recreate from 
> scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
> out of range with no configured reset policy for partitions: 
> {flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at 
> org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)2019-11-19
>  05:50:49.842|INFO 
> |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread
>  [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 
> ProcessorTopology: KSTREAM-SOURCE-70: topics: 
> [flash-app-xmc-worker-share-store-minute-repartition] children: 
> [KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: 
> [worker-share-store-minute] children: [KTABLE-TOSTREAM-71] 
> KTABLE-TOSTREAM-71: children: [KSTREAM-SINK-72] 
> KSTREAM-SINK-72: topic: 
> StaticTopicNameExtractor(xmc-worker-share-minute)Partitions 
> [flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
> flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 
> 05:50:49.905|WARN 
> |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
>  [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
> StreamTasks stores to recreate from 
> scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
> out of range with no configured reset policy for partitions: 
> {flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at 
> org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)2019-11-19
>  05:50:49.906|INFO 
> |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread
>  [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 
> ProcessorTopology: KSTREAM-SOURCE-70: topics: 
> [flash-app-xmc-worker-share-store-minute-repartition] children: 
> [KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: 
> [worker-share-store-minute]
>  {noformat}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9210) kafka stream loss data

2019-11-18 Thread panpan.liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

panpan.liu updated KAFKA-9210:
--
Attachment: app.log

> kafka stream loss data
> --
>
> Key: KAFKA-9210
> URL: https://issues.apache.org/jira/browse/KAFKA-9210
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.0.1
>Reporter: panpan.liu
>Priority: Major
> Attachments: app.log
>
>
> kafka broker: 2.0.1
> kafka stream client: 2.1.0
>  # two applications run at the same time
>  # after some days, I stopped one application (in k8s)
>  # The following log occurred; I checked the data and found the value was
> less than what I expected.
>  
> {noformat}
> Partitions [flash-app-xmc-worker-share-store-minute-repartition-1]Partitions 
> [flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
> flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 
> 05:50:49.816|WARN 
> |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
>  [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
> StreamTasks stores to recreate from 
> scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
> out of range with no configured reset policy for partitions: 
> {flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at 
> org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)2019-11-19
>  05:50:49.817|INFO 
> |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread
>  [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 
> ProcessorTopology: KSTREAM-SOURCE-70: topics: 
> [flash-app-xmc-worker-share-store-minute-repartition] children: 
> [KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: 
> [worker-share-store-minute] children: [KTABLE-TOSTREAM-71] 
> KTABLE-TOSTREAM-71: children: [KSTREAM-SINK-72] 
> KSTREAM-SINK-72: topic: 
> StaticTopicNameExtractor(xmc-worker-share-minute)Partitions 
> [flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
> flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 
> 05:50:49.842|WARN 
> |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
>  [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
> StreamTasks stores to recreate from 
> scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
> out of range with no configured reset policy for partitions: 
> {flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at 
> org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)2019-11-19
>  05:50:49.842|INFO 
> |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread
>  [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 
> ProcessorTopology: KSTREAM-SOURCE-70: topics: 
> [flash-app-xmc-worker-share-store-minute-repartition] children: 
> [KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: 
> [worker-share-store-minute] children: [KTABLE-TOSTREAM-71] 
> KTABLE-TOSTREAM-71: children: [KSTREAM-SINK-72] 
> KSTREAM-SINK-72: topic: 
> StaticTopicNameExtractor(xmc-worker-share-minute)Partitions 
> [flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
> flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 
> 05:50:49.905|WARN 
> |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
>  [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
> StreamTasks stores to recreate from 
> scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
> out of range with no configured reset policy for partitions: 
> {flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at 
> org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)2019-11-19
>  05:50:49.906|INFO 
> |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread
>  [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 
> ProcessorTopology: KSTREAM-SOURCE-70: topics: 
> [flash-app-xmc-worker-share-store-minute-repartition] children: 
> [KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: 
> [worker-share-store-minute]
>  {noformat}
>  





[jira] [Updated] (KAFKA-9210) kafka stream loss data

2019-11-18 Thread panpan.liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

panpan.liu updated KAFKA-9210:
--
Description: 
kafka broker: 2.0.1

kafka stream client: 2.1.0
 # two applications run at the same time
 # after some days, I stopped one application (in k8s)
 # The following log occurred; I checked the data and found the value was less than what I expected.

 
{noformat}
Partitions [flash-app-xmc-worker-share-store-minute-repartition-1]Partitions 
[flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 05:50:49.816|WARN 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
 [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
StreamTasks stores to recreate from 
scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
out of range with no configured reset policy for partitions: 
{flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at 
org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)2019-11-19
 05:50:49.817|INFO 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread
 [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 
ProcessorTopology: KSTREAM-SOURCE-70: topics: 
[flash-app-xmc-worker-share-store-minute-repartition] children: 
[KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: 
[worker-share-store-minute] children: [KTABLE-TOSTREAM-71] 
KTABLE-TOSTREAM-71: children: [KSTREAM-SINK-72] 
KSTREAM-SINK-72: topic: 
StaticTopicNameExtractor(xmc-worker-share-minute)Partitions 
[flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 05:50:49.842|WARN 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
 [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
StreamTasks stores to recreate from 
scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
out of range with no configured reset policy for partitions: 
{flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at 
org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)2019-11-19
 05:50:49.842|INFO 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread
 [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 
ProcessorTopology: KSTREAM-SOURCE-70: topics: 
[flash-app-xmc-worker-share-store-minute-repartition] children: 
[KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: 
[worker-share-store-minute] children: [KTABLE-TOSTREAM-71] 
KTABLE-TOSTREAM-71: children: [KSTREAM-SINK-72] 
KSTREAM-SINK-72: topic: 
StaticTopicNameExtractor(xmc-worker-share-minute)Partitions 
[flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 05:50:49.905|WARN 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
 [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
StreamTasks stores to recreate from 
scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
out of range with no configured reset policy for partitions: 
{flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at 
org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)2019-11-19
 05:50:49.906|INFO 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread
 [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 
ProcessorTopology: KSTREAM-SOURCE-70: topics: 
[flash-app-xmc-worker-share-store-minute-repartition] children: 
[KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: 
[worker-share-store-minute]
 {noformat}
 

  was:
```

Partitions [flash-app-xmc-worker-share-store-minute-repartition-1]Partitions 
[flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 05:50:49.816|WARN 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
 [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
StreamTasks stores to recreate from 
scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
out of range with no configured reset policy for partitions: 
{flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at 
org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)2019-11-19
 05:50:49.817|INFO 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread
 [flash-client-xmc-StreamThread-3] 

[jira] [Created] (KAFKA-9210) kafka stream loss data

2019-11-18 Thread panpan.liu (Jira)
panpan.liu created KAFKA-9210:
-

 Summary: kafka stream loss data
 Key: KAFKA-9210
 URL: https://issues.apache.org/jira/browse/KAFKA-9210
 Project: Kafka
  Issue Type: Bug
Affects Versions: 2.0.1
Reporter: panpan.liu


```

Partitions [flash-app-xmc-worker-share-store-minute-repartition-1]Partitions 
[flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 05:50:49.816|WARN 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
 [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
StreamTasks stores to recreate from 
scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
out of range with no configured reset policy for partitions: 
{flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at 
org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)2019-11-19
 05:50:49.817|INFO 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread
 [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 
ProcessorTopology: KSTREAM-SOURCE-70: topics: 
[flash-app-xmc-worker-share-store-minute-repartition] children: 
[KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: 
[worker-share-store-minute] children: [KTABLE-TOSTREAM-71] 
KTABLE-TOSTREAM-71: children: [KSTREAM-SINK-72] 
KSTREAM-SINK-72: topic: 
StaticTopicNameExtractor(xmc-worker-share-minute)Partitions 
[flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 05:50:49.842|WARN 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
 [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
StreamTasks stores to recreate from 
scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
out of range with no configured reset policy for partitions: 
{flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at 
org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)2019-11-19
 05:50:49.842|INFO 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread
 [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 
ProcessorTopology: KSTREAM-SOURCE-70: topics: 
[flash-app-xmc-worker-share-store-minute-repartition] children: 
[KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: 
[worker-share-store-minute] children: [KTABLE-TOSTREAM-71] 
KTABLE-TOSTREAM-71: children: [KSTREAM-SINK-72] 
KSTREAM-SINK-72: topic: 
StaticTopicNameExtractor(xmc-worker-share-minute)Partitions 
[flash-app-xmc-worker-share-store-minute-repartition-1] for changelog 
flash-app-xmc-worker-share-store-minute-changelog-12019-11-19 05:50:49.905|WARN 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread
 [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting 
StreamTasks stores to recreate from 
scratch.org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets 
out of range with no configured reset policy for partitions: 
{flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at 
org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)2019-11-19
 05:50:49.906|INFO 
|flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread
 [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 
ProcessorTopology: KSTREAM-SOURCE-70: topics: 
[flash-app-xmc-worker-share-store-minute-repartition] children: 
[KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: 
[worker-share-store-minute]

```





[jira] [Commented] (KAFKA-9205) Add an option to enforce rack-aware partition reassignment

2019-11-18 Thread sats (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977155#comment-16977155
 ] 

sats commented on KAFKA-9205:
-

[~vahid] could you please create the KIP? (Sorry, I am a newbie and not aware of the 
process.) I can work on the code part.

> Add an option to enforce rack-aware partition reassignment
> --
>
> Key: KAFKA-9205
> URL: https://issues.apache.org/jira/browse/KAFKA-9205
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin, tools
>Reporter: Vahid Hashemian
>Priority: Minor
>
> One regularly used healing operation on Kafka clusters is replica 
> reassignment for topic partitions. For example, when there is a skew in the 
> inbound/outbound traffic of a broker, replica reassignment can be used to move 
> some leaders/followers off that broker; or if there is a skew in the disk usage 
> of brokers, replica reassignment can move some partitions to other brokers 
> that have more disk space available.
> In Kafka clusters that span multiple data centers (or availability 
> zones), high availability is a priority, in the sense that when a data center 
> goes offline the cluster should be able to resume normal operation by 
> guaranteeing partition replicas in all data centers.
> This guarantee is currently the responsibility of the on-call engineer who 
> performs the reassignment, or of the tool that automatically generates the 
> reassignment plan for improving cluster health (e.g. by considering the 
> rack configuration value of each broker in the cluster). The former is quite 
> error-prone, and the latter would lead to duplicate code in all such admin 
> tools (which are not error-free either). Not all use cases can make use of the 
> default assignment strategy used by the --generate option, and the current 
> rack-awareness enforcement applies to that option only.
> It would be great for the built-in replica assignment API and tool provided 
> by Kafka to support a rack-awareness verification option for the --execute 
> scenario that would simply return an error when [some] brokers in any replica 
> set share a common rack. 
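The verification proposed above could be sketched as a check that no replica set maps two brokers onto the same rack. All names below are illustrative, not Kafka's actual API:

```python
# Hedged sketch of the proposed --execute rack-awareness check.
# broker_rack: broker id -> rack id;
# assignment: (topic, partition) -> list of replica broker ids.
def rack_violations(assignment, broker_rack):
    """Return the topic-partitions whose replica set shares a common rack."""
    violations = []
    for tp, replicas in assignment.items():
        racks = [broker_rack[broker] for broker in replicas]
        if len(set(racks)) < len(racks):  # some rack appears more than once
            violations.append(tp)
    return violations
```

A tool could then refuse to execute a reassignment plan whenever this list is non-empty, instead of relying on the operator to spot the violation.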





[jira] [Commented] (KAFKA-7356) Add gradle task for dependency listing

2019-11-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977133#comment-16977133
 ] 

ASF GitHub Bot commented on KAFKA-7356:
---

omkreddy commented on pull request #5589: KAFKA-7356 Added allDeps task to 
generate complete dependency report.
URL: https://github.com/apache/kafka/pull/5589
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add gradle task for dependency listing
> --
>
> Key: KAFKA-7356
> URL: https://issues.apache.org/jira/browse/KAFKA-7356
> Project: Kafka
>  Issue Type: New Feature
>  Components: build, packaging
>Reporter: Andy LoPresto
>Priority: Minor
>
> I needed to examine the dependency list to confirm/deny use of a specific 
> dependency. Running {{gradle -q dependencies}} in the root directory only 
> lists the {{rat}} dependencies. Adding a custom section to *build.gradle* 
> allows for a complete listing of the dependencies from the command line. 
> {code}
> subprojects {
> task allDeps(type: DependencyReportTask) {}
> }
> {code}
> To invoke: {{gradle allDeps}}





[jira] [Assigned] (KAFKA-9157) logcleaner could generate empty segment files after cleaning

2019-11-18 Thread huxihx (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huxihx reassigned KAFKA-9157:
-

Assignee: huxihx

> logcleaner could generate empty segment files after cleaning
> 
>
> Key: KAFKA-9157
> URL: https://issues.apache.org/jira/browse/KAFKA-9157
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.3.0
>Reporter: Jun Rao
>Assignee: huxihx
>Priority: Major
>
> Currently, the log cleaner could only combine segments within a 2-billion 
> offset range. If all records in that range are deleted, an empty segment 
> could be generated. It would be useful to avoid generating such empty 
> segments.
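The improvement suggested above amounts to not emitting a cleaned segment when every record in it was removed. A simplified sketch, using lists of records as a stand-in for real segment files:

```python
# Hedged sketch: drop segments that become empty after log cleaning.
def clean_segments(segments, is_retained):
    """segments: list of record lists. Keep retained records and skip
    any segment that ends up with zero records after cleaning."""
    cleaned = []
    for segment in segments:
        kept = [record for record in segment if is_retained(record)]
        if kept:  # avoid generating an empty segment file
            cleaned.append(kept)
    return cleaned
```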





[jira] [Created] (KAFKA-9209) Avoid sending unnecessary offset updates from consumer after KIP-211

2019-11-18 Thread Michael Bingham (Jira)
Michael Bingham created KAFKA-9209:
--

 Summary: Avoid sending unnecessary offset updates from consumer 
after KIP-211
 Key: KAFKA-9209
 URL: https://issues.apache.org/jira/browse/KAFKA-9209
 Project: Kafka
  Issue Type: Improvement
  Components: consumer
Affects Versions: 2.3.0
Reporter: Michael Bingham


With KIP-211 
([https://cwiki.apache.org/confluence/display/KAFKA/KIP-211%3A+Revise+Expiration+Semantics+of+Consumer+Group+Offsets]),
 offsets will no longer expire as long as the consumer group is active.

If the consumer has {{enable.auto.commit=true}}, and if no new events are 
arriving on subscribed partition(s), the consumer still sends offsets 
(unchanged) to the group coordinator just to keep them from expiring. This is 
no longer necessary, and an optimization could potentially be implemented to 
only send offsets with auto commit when there are actual updates to be made 
(i.e., when new events have been processed). 

This would require detecting whether the broker supports the new expiration 
semantics of KIP-211, and applying the optimization only when it does.
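The optimization described above can be sketched as skipping the auto-commit round trip when the offsets are unchanged, guarded by a broker-capability check. Function and parameter names here are illustrative, not the consumer's real internals:

```python
# Hedged sketch of the proposed auto-commit optimization after KIP-211.
def maybe_auto_commit(current, last_committed, broker_supports_kip211, commit):
    # Offsets of an active group no longer expire under KIP-211, so an
    # unchanged commit would be pure overhead against a KIP-211 broker.
    if broker_supports_kip211 and current == last_committed:
        return False          # nothing new processed; skip the request
    commit(current)           # otherwise commit as before
    return True
```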





[jira] [Commented] (KAFKA-8981) Add bandwidth limits to DegradedNetworkFault

2019-11-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977051#comment-16977051
 ] 

ASF GitHub Bot commented on KAFKA-8981:
---

mumrah commented on pull request #7446: KAFKA-8981 Add rate limiting to 
NetworkDegradeSpec
URL: https://github.com/apache/kafka/pull/7446
 
 
   
 



> Add bandwidth limits to DegradedNetworkFault
> 
>
> Key: KAFKA-8981
> URL: https://issues.apache.org/jira/browse/KAFKA-8981
> Project: Kafka
>  Issue Type: Improvement
>Reporter: David Arthur
>Priority: Minor
>
> We are currently only using Traffic Control (tc) to introduce packet latency. 
> It also has the ability to limit the egress bandwidth. It would be nice to 
> include this as an option when running system tests that need to simulate 
> long/slow links.
>  
> {{tc}} references
> * [https://netbeez.net/blog/how-to-use-the-linux-traffic-control/]
> * [https://www.badunetworks.com/traffic-shaping-with-tc/]
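For the egress-bandwidth part, tc's token bucket filter (tbf) qdisc is the usual mechanism. A small helper that composes such a command; the device name and rate values are illustrative, not part of the ticket:

```python
# Hedged sketch: build a `tc` token-bucket-filter command line that caps
# egress bandwidth, complementing the latency-only usage mentioned above.
def tbf_command(dev: str, rate: str, burst: str = "32kbit",
                latency: str = "400ms") -> list:
    return ["tc", "qdisc", "add", "dev", dev, "root",
            "tbf", "rate", rate, "burst", burst, "latency", latency]
```

For example, `" ".join(tbf_command("eth0", "1mbit"))` yields a shaping command runnable on Linux hosts with iproute2 installed (root privileges required).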





[jira] [Commented] (KAFKA-9188) Flaky Test SslAdminClientIntegrationTest.testSynchronousAuthorizerAclUpdatesBlockRequestThreads

2019-11-18 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977025#comment-16977025
 ] 

Matthias J. Sax commented on KAFKA-9188:


[https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/3444/testReport/junit/kafka.api/SslAdminIntegrationTest/testSynchronousAuthorizerAclUpdatesBlockRequestThreads/]

> Flaky Test 
> SslAdminClientIntegrationTest.testSynchronousAuthorizerAclUpdatesBlockRequestThreads
> ---
>
> Key: KAFKA-9188
> URL: https://issues.apache.org/jira/browse/KAFKA-9188
> Project: Kafka
>  Issue Type: Test
>  Components: core
>Reporter: Bill Bejeck
>Priority: Major
>  Labels: flaky-test
>
> Failed in 
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/9373/testReport/junit/kafka.api/SslAdminClientIntegrationTest/testSynchronousAuthorizerAclUpdatesBlockRequestThreads/]
>  
> {noformat}
> Error Messagejava.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.TimeoutException: Aborted due to 
> timeout.Stacktracejava.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.TimeoutException: Aborted due to timeout.
>   at 
> org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
>   at 
> org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
>   at 
> org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
>   at 
> org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
>   at 
> kafka.api.SslAdminClientIntegrationTest.$anonfun$testSynchronousAuthorizerAclUpdatesBlockRequestThreads$1(SslAdminClientIntegrationTest.scala:201)
>   at 
> scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>   at 
> scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>   at 
> kafka.api.SslAdminClientIntegrationTest.testSynchronousAuthorizerAclUpdatesBlockRequestThreads(SslAdminClientIntegrationTest.scala:201)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: org.apache.kafka.common.errors.TimeoutException: Aborted due to 
> timeout.
> Standard Output: [2019-11-14 15:13:51,489] ERROR [ReplicaFetcher replicaId=1, 
> leaderId=2, fetcherId=0] Error for partition mytopic1-1 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-11-14 15:13:51,490] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition mytopic1-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-11-14 15:14:04,686] ERROR [KafkaApi-2] Error when handling request: 
> clientId=adminclient-644, correlationId=4, api=CREATE_ACLS, version=1, 
> body={creations=[{resource_type=2,resource_name=foobar,resource_pattern_type=3,principal=User:ANONYMOUS,host=*,operation=3,permission_type=3},{resource_type=5,resource_name=transactional_id,resource_pattern_type=3,principal=User:ANONYMOUS,host=*,operation=4,permission_type=3}]}
>  (kafka.server.KafkaApis:76)
> org.apache.kafka.common.errors.ClusterAuthorizationException: Request 
> Request(processor=1, connectionId=127.0.0.1:41993-127.0.0.1:34770-0, 
> 

[jira] [Comment Edited] (KAFKA-9207) Replica Out of Sync as creating ReplicaFetcher thread failed with connection to leader

2019-11-18 Thread Xue Liu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976957#comment-16976957
 ] 

Xue Liu edited comment on KAFKA-9207 at 11/18/19 11:38 PM:
---

We have some further discovery:

When creating that thread, the follower had a connection error to the leader 
(see attachment error-connection.jpg).

I feel we could at least add a retry for these transient network errors, and 
also add logging so the error is surfaced.  
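The suggested retry could be sketched roughly as below. The wrapper, its name, and the logging are illustrative assumptions, not the broker's actual ReplicaFetcher code:

```java
import java.util.function.Supplier;

// Sketch: wrap the fetcher's connection attempt in a bounded retry with
// backoff, logging each transient failure instead of silently leaving a
// dead fetcher thread behind. All names here are hypothetical.
public class RetryOnConnect {
    public static <T> T withRetries(Supplier<T> attempt, int maxAttempts,
                                    long backoffMs) {
        RuntimeException last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.get();
            } catch (RuntimeException e) {
                last = e;
                // Log the transient failure so it is visible in server logs.
                System.err.println("connect attempt " + i + " failed: " + e.getMessage());
                if (i < maxAttempts) {
                    try {
                        Thread.sleep(backoffMs);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        break;
                    }
                }
            }
        }
        throw last; // surface the error instead of leaving the replica stuck
    }
}
```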

 

 


was (Author: xuel1):
We have some further discovery:

When creating that thread, the follower had connection error to the leader. See 
attachment error-connection.jpg

 

 

> Replica Out of Sync as creating ReplicaFetcher thread failed with connection 
> to leader
> --
>
> Key: KAFKA-9207
> URL: https://issues.apache.org/jira/browse/KAFKA-9207
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 2.3.0
>Reporter: Xue Liu
>Priority: Major
> Attachments: Capture.PNG, error-connection.jpg
>
>
> We sometimes see that a replica for a partition is out of sync. When the 
> issue happens, it seems that we have simply lost that replica (it would never 
> catch up) unless we restart that broker.
> It appears that the ReplicaFetcher thread for that partition is dead and the 
> broker will not restart that thread. We didn't see any exception in server or 
> controller logs.
> The screen capture is taken from the broker that has that replica. The leader 
> is 2017.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-8571) Not complete delayed produce requests when processing StopReplicaRequest causing high produce latency for acks=all

2019-11-18 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-8571.

Resolution: Fixed

> Not complete delayed produce requests when processing StopReplicaRequest 
> causing high produce latency for acks=all
> --
>
> Key: KAFKA-8571
> URL: https://issues.apache.org/jira/browse/KAFKA-8571
> Project: Kafka
>  Issue Type: Bug
>Reporter: Zhanxiang (Patrick) Huang
>Assignee: Zhanxiang (Patrick) Huang
>Priority: Major
>
> Currently a broker will only attempt to complete delayed requests upon 
> high-watermark changes and upon receiving a LeaderAndIsrRequest. When a broker 
> receives a StopReplicaRequest, it does not try to complete delayed operations, 
> including delayed produce for acks=all, which can cause the producer to time 
> out even though it could have reached the new leader faster had a 
> NotLeaderForPartition error been sent.
> This can happen during partition reassignment when the controller is trying 
> to kick the previous leader out of the replica set. In this case, the 
> controller will only send a StopReplicaRequest (not a LeaderAndIsrRequest) to 
> the previous leader in the replica-set shrink phase. Here is an example:
> {noformat}
> During reassignment of the replica set of partition A from [B1, B2] to [B2, B3]:
> t0: Controller expands the replica set to [B1, B2, B3]
> t1: B1 receives produce request PR on partition A with acks=all and timeout 
> T. B1 puts PR into the DelayedProducePurgatory with timeout T.
> t2: Controller elects B2 as the new leader and shrinks the replica set to 
> [B2, B3]. LeaderAndIsrRequests are sent to B2 and B3. StopReplicaRequest is 
> sent to B1.
> t3: B1 receives StopReplicaRequest but doesn't try to complete PR.
> If PR cannot be fulfilled by t3, and t1 + T > t3, PR will eventually time 
> out in the purgatory and the producer will eventually time out the produce 
> request.{noformat}
> Since it is possible for the leader to receive only a StopReplicaRequest 
> (without receiving any LeaderAndIsrRequest) to leave the replica set, a fix 
> for this issue is to also try to complete delayed operations when processing 
> StopReplicaRequest.
>  
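The fix described in the ticket can be illustrated with a toy purgatory model: delayed produce operations are force-completed with a retriable error when a StopReplica arrives, instead of being left to time out. All names below are illustrative, not Kafka's actual DelayedOperationPurgatory API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of delayed-produce handling on StopReplica. Hypothetical names.
public class StopReplicaPurgatory {
    public enum Result { PENDING, ACKED, NOT_LEADER }

    public static class DelayedProduce {
        public Result result = Result.PENDING;
        void complete(Result r) { if (result == Result.PENDING) result = r; }
    }

    private final Map<String, List<DelayedProduce>> purgatory = new HashMap<>();

    // Register a delayed produce that is waiting on acks=all for a partition.
    public DelayedProduce watch(String partition) {
        DelayedProduce op = new DelayedProduce();
        purgatory.computeIfAbsent(partition, p -> new ArrayList<>()).add(op);
        return op;
    }

    // The proposed fix: on StopReplica, complete pending operations right away
    // with NOT_LEADER so the producer retries against the new leader instead
    // of waiting for the request timeout.
    public void onStopReplica(String partition) {
        List<DelayedProduce> ops = purgatory.remove(partition);
        if (ops != null) {
            for (DelayedProduce op : ops) op.complete(Result.NOT_LEADER);
        }
    }
}
```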



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-9198) StopReplica handler should complete pending purgatory operations

2019-11-18 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-9198.

Fix Version/s: 2.4.0
   Resolution: Fixed

The PR has been merged. I meant to change the title to reference KAFKA-8571, 
but forgot about it. So instead I'll just resolve this one as fixed and mark 
the other as a dup. Sorry for the noise.

> StopReplica handler should complete pending purgatory operations
> 
>
> Key: KAFKA-9198
> URL: https://issues.apache.org/jira/browse/KAFKA-9198
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Major
> Fix For: 2.4.0
>
>
> When a reassignment completes, the current leader may need to be shut down 
> with a StopReplica request. It may still have fetch/produce requests in 
> purgatory when this happens. We do not currently have logic to force 
> completion of these requests, which means they are doomed to eventually time 
> out. This is mostly an issue for produce requests, which use the default 
> request timeout of 30s.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (KAFKA-9198) StopReplica handler should complete pending purgatory operations

2019-11-18 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson reopened KAFKA-9198:


> StopReplica handler should complete pending purgatory operations
> 
>
> Key: KAFKA-9198
> URL: https://issues.apache.org/jira/browse/KAFKA-9198
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Major
>
> When a reassignment completes, the current leader may need to be shut down 
> with a StopReplica request. It may still have fetch/produce requests in 
> purgatory when this happens. We do not currently have logic to force 
> completion of these requests, which means they are doomed to eventually time 
> out. This is mostly an issue for produce requests, which use the default 
> request timeout of 30s.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-8571) Not complete delayed produce requests when processing StopReplicaRequest causing high produce latency for acks=all

2019-11-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976965#comment-16976965
 ] 

ASF GitHub Bot commented on KAFKA-8571:
---

hachikuji commented on pull request #7069: KAFKA-8571: Clean up purgatory when 
leader replica is kicked out of replica list.
URL: https://github.com/apache/kafka/pull/7069
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Not complete delayed produce requests when processing StopReplicaRequest 
> causing high produce latency for acks=all
> --
>
> Key: KAFKA-8571
> URL: https://issues.apache.org/jira/browse/KAFKA-8571
> Project: Kafka
>  Issue Type: Bug
>Reporter: Zhanxiang (Patrick) Huang
>Assignee: Zhanxiang (Patrick) Huang
>Priority: Major
>
> Currently a broker will only attempt to complete delayed requests upon 
> high-watermark changes and upon receiving a LeaderAndIsrRequest. When a broker 
> receives a StopReplicaRequest, it does not try to complete delayed operations, 
> including delayed produce for acks=all, which can cause the producer to time 
> out even though it could have reached the new leader faster had a 
> NotLeaderForPartition error been sent.
> This can happen during partition reassignment when the controller is trying 
> to kick the previous leader out of the replica set. In this case, the 
> controller will only send a StopReplicaRequest (not a LeaderAndIsrRequest) to 
> the previous leader in the replica-set shrink phase. Here is an example:
> {noformat}
> During reassignment of the replica set of partition A from [B1, B2] to [B2, B3]:
> t0: Controller expands the replica set to [B1, B2, B3]
> t1: B1 receives produce request PR on partition A with acks=all and timeout 
> T. B1 puts PR into the DelayedProducePurgatory with timeout T.
> t2: Controller elects B2 as the new leader and shrinks the replica set to 
> [B2, B3]. LeaderAndIsrRequests are sent to B2 and B3. StopReplicaRequest is 
> sent to B1.
> t3: B1 receives StopReplicaRequest but doesn't try to complete PR.
> If PR cannot be fulfilled by t3, and t1 + T > t3, PR will eventually time 
> out in the purgatory and the producer will eventually time out the produce 
> request.{noformat}
> Since it is possible for the leader to receive only a StopReplicaRequest 
> (without receiving any LeaderAndIsrRequest) to leave the replica set, a fix 
> for this issue is to also try to complete delayed operations when processing 
> StopReplicaRequest.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9198) StopReplica handler should complete pending purgatory operations

2019-11-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976964#comment-16976964
 ] 

ASF GitHub Bot commented on KAFKA-9198:
---

hachikuji commented on pull request #7701: KAFKA-9198; Complete purgatory 
operations on receiving StopReplica
URL: https://github.com/apache/kafka/pull/7701
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> StopReplica handler should complete pending purgatory operations
> 
>
> Key: KAFKA-9198
> URL: https://issues.apache.org/jira/browse/KAFKA-9198
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Major
>
> When a reassignment completes, the current leader may need to be shut down 
> with a StopReplica request. It may still have fetch/produce requests in 
> purgatory when this happens. We do not currently have logic to force 
> completion of these requests, which means they are doomed to eventually time 
> out. This is mostly an issue for produce requests, which use the default 
> request timeout of 30s.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KAFKA-9207) Replica Out of Sync as creating ReplicaFetcher thread failed with connection to leader

2019-11-18 Thread Xue Liu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976957#comment-16976957
 ] 

Xue Liu edited comment on KAFKA-9207 at 11/18/19 10:50 PM:
---

We have some further discovery:

When creating that thread, the follower had connection error to the leader. See 
attachment error-connection.jpg

 

 


was (Author: xuel1):
We have some further discovery:

 

When creating that thread, the follower had connection error to the leader. See 
attachment error-connection.jpg

 

 

> Replica Out of Sync as creating ReplicaFetcher thread failed with connection 
> to leader
> --
>
> Key: KAFKA-9207
> URL: https://issues.apache.org/jira/browse/KAFKA-9207
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 2.3.0
>Reporter: Xue Liu
>Priority: Major
> Attachments: Capture.PNG, error-connection.jpg
>
>
> We sometimes see that a replica for a partition is out of sync. When the 
> issue happens, it seems that we have simply lost that replica (it would never 
> catch up) unless we restart that broker.
> It appears that the ReplicaFetcher thread for that partition is dead and the 
> broker will not restart that thread. We didn't see any exception in server or 
> controller logs.
> The screen capture is taken from the broker that has that replica. The leader 
> is 2017.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9207) Replica Out of Sync as creating ReplicaFetcher thread failed with connection to leader

2019-11-18 Thread Xue Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xue Liu updated KAFKA-9207:
---
Attachment: error-connection.jpg

> Replica Out of Sync as creating ReplicaFetcher thread failed with connection 
> to leader
> --
>
> Key: KAFKA-9207
> URL: https://issues.apache.org/jira/browse/KAFKA-9207
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 2.3.0
>Reporter: Xue Liu
>Priority: Major
> Attachments: Capture.PNG, error-connection.jpg
>
>
> We sometimes see that a replica for a partition is out of sync. When the 
> issue happens, it seems that we have simply lost that replica (it would never 
> catch up) unless we restart that broker.
> It appears that the ReplicaFetcher thread for that partition is dead and the 
> broker will not restart that thread. We didn't see any exception in server or 
> controller logs.
> The screen capture is taken from the broker that has that replica. The leader 
> is 2017.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9207) Replica Out of Sync as creating ReplicaFetcher thread failed with connection to leader

2019-11-18 Thread Xue Liu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976957#comment-16976957
 ] 

Xue Liu commented on KAFKA-9207:


We have some further discovery:

 

When creating that thread, the follower had connection error to the leader. See 
attachment error-connection.jpg

 

 

> Replica Out of Sync as creating ReplicaFetcher thread failed with connection 
> to leader
> --
>
> Key: KAFKA-9207
> URL: https://issues.apache.org/jira/browse/KAFKA-9207
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 2.3.0
>Reporter: Xue Liu
>Priority: Major
> Attachments: Capture.PNG
>
>
> We sometimes see that a replica for a partition is out of sync. When the 
> issue happens, it seems that we have simply lost that replica (it would never 
> catch up) unless we restart that broker.
> It appears that the ReplicaFetcher thread for that partition is dead and the 
> broker will not restart that thread. We didn't see any exception in server or 
> controller logs.
> The screen capture is taken from the broker that has that replica. The leader 
> is 2017.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9207) Replica Out of Sync as creating ReplicaFetcher thread failed with connection to leader

2019-11-18 Thread Xue Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xue Liu updated KAFKA-9207:
---
Summary: Replica Out of Sync as creating ReplicaFetcher thread failed with 
connection to leader  (was: Replica Out of Sync as ReplicaFetcher thread is 
dead)

> Replica Out of Sync as creating ReplicaFetcher thread failed with connection 
> to leader
> --
>
> Key: KAFKA-9207
> URL: https://issues.apache.org/jira/browse/KAFKA-9207
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 2.3.0
>Reporter: Xue Liu
>Priority: Major
> Attachments: Capture.PNG
>
>
> We sometimes see that a replica for a partition is out of sync. When the 
> issue happens, it seems that we have simply lost that replica (it would never 
> catch up) unless we restart that broker.
> It appears that the ReplicaFetcher thread for that partition is dead and the 
> broker will not restart that thread. We didn't see any exception in server or 
> controller logs.
> The screen capture is taken from the broker that has that replica. The leader 
> is 2017.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9205) Add an option to enforce rack-aware partition reassignment

2019-11-18 Thread Vahid Hashemian (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976895#comment-16976895
 ] 

Vahid Hashemian commented on KAFKA-9205:


This will still likely require a KIP since the default behavior could change.

> Add an option to enforce rack-aware partition reassignment
> --
>
> Key: KAFKA-9205
> URL: https://issues.apache.org/jira/browse/KAFKA-9205
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin, tools
>Reporter: Vahid Hashemian
>Priority: Minor
>
> One regularly used healing operation on Kafka clusters is replica 
> reassignment for topic partitions. For example, when there is a skew in the 
> inbound/outbound traffic of a broker, replica reassignment can be used to 
> move some leaders/followers off the broker; or if there is a skew in the disk 
> usage of brokers, replica reassignment can move some partitions to other 
> brokers that have more disk space available.
> In Kafka clusters that span multiple data centers (or availability zones), 
> high availability is a priority, in the sense that when a data center goes 
> offline the cluster should be able to resume normal operation by guaranteeing 
> partition replicas in all data centers.
> This guarantee is currently the responsibility of the on-call engineer who 
> performs the reassignment, or of the tool that automatically generates the 
> reassignment plan for improving cluster health (e.g. by considering the 
> rack configuration value of each broker in the cluster). The former is quite 
> error-prone, and the latter would lead to duplicate code in all such admin 
> tools (which are not error-free either). Not all use cases can make use of 
> the default assignment strategy used by the --generate option, and the 
> current rack-aware enforcement applies to that option only.
> It would be great for the built-in replica assignment API and tool provided 
> by Kafka to support a rack-aware verification option for the --execute 
> scenario that would simply return an error when [some] brokers in any replica 
> set share a common rack. 
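The proposed --execute verification boils down to rejecting any plan where a replica set places two replicas on the same rack. A minimal sketch follows; the method name and data shapes are hypothetical, not the actual reassignment tool's API:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of a rack-awareness check for a reassignment plan: for each
// partition, verify that no two replicas land on brokers in the same rack.
public class RackAwareCheck {
    /** Returns the partitions whose replica sets share a common rack. */
    public static List<String> violations(Map<String, List<Integer>> plan,
                                          Map<Integer, String> brokerRacks) {
        List<String> bad = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> e : plan.entrySet()) {
            Set<String> racks = new HashSet<>();
            for (int broker : e.getValue()) {
                // A duplicate rack within one replica set is a violation.
                if (!racks.add(brokerRacks.get(broker))) {
                    bad.add(e.getKey());
                    break;
                }
            }
        }
        return bad;
    }
}
```

A tool could run this before submitting the plan and fail fast with an error listing the offending partitions.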



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-8843) Zookeeper migration tool support for TLS

2019-11-18 Thread Kelly Schoenhofen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976893#comment-16976893
 ] 

Kelly Schoenhofen commented on KAFKA-8843:
--

Question: does ZK 3.5.6 allow SSL (TLS, but let's say SSL to keep in line 
with the documentation) from Kafka? Not SASL_SSL, just plain SSL. Is that what 
this Jira is for? I have quorum TLS working in ZK 3.5.6 and I added a 
TLS-secured listener, but so far I can't quite get Kafka to connect to it:

{{[2019-11-18 15:03:11,545] INFO Opening socket connection to server 
xxx/x.x.x.x:2182. Will not attempt to authenticate using SASL (unknown error) 
(org.apache.zookeeper.ClientCnxn)}}

is the closest I have come. But I didn't want to do SASL_SSL; I just want to 
secure the traffic between Kafka and ZooKeeper using TLS 1.2 and a specific 
class of cipher, like TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, and enforce that 
the CN on each side matches the other's cert and trusted cert stores (like how 
ZooKeeper quorum TLS works). 
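For reference, ZooKeeper 3.5's client side is configured for TLS via the Netty socket and `zookeeper.ssl.*` system properties. Assuming those properties also apply to the ZooKeeper client bundled with Kafka (property names follow the ZooKeeper 3.5 admin guide and may differ by version), a JVM could be pointed at a TLS listener like this:

```java
// Hedged sketch: set the ZooKeeper 3.5 client TLS system properties before
// the ZooKeeper client is created. Paths and names are illustrative.
public class ZkTlsProps {
    public static void configure(String keyStore, String trustStore) {
        System.setProperty("zookeeper.client.secure", "true");
        // TLS requires the Netty client connection socket, not the NIO one.
        System.setProperty("zookeeper.clientCnxnSocket",
                "org.apache.zookeeper.ClientCnxnSocketNetty");
        System.setProperty("zookeeper.ssl.keyStore.location", keyStore);
        System.setProperty("zookeeper.ssl.trustStore.location", trustStore);
    }
}
```

In practice these would be passed as -D flags (e.g. via KAFKA_OPTS) rather than set programmatically.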

> Zookeeper migration tool support for TLS
> 
>
> Key: KAFKA-8843
> URL: https://issues.apache.org/jira/browse/KAFKA-8843
> Project: Kafka
>  Issue Type: Bug
>Reporter: Pere Urbon-Bayes
>Assignee: Pere Urbon-Bayes
>Priority: Minor
>
> Currently the zookeeper-migration tool works based on SASL authentication, 
> which means only digest and Kerberos authentication are supported.
>  
> With the introduction of ZK 3.5, TLS is added, including a new X509 
> authentication provider. 
>  
> To support this great feature and utilise the TLS principals, the 
> zookeeper-migration-tool script should support X509 authentication as 
> well.
>  
> In my newbie view, this should mean adding a new parameter to allow other 
> ways of authentication around 
> [https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/ZkSecurityMigrator.scala#L65]
>  
> If I understand the process correctly, this will require a KIP, right?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8326) Add Serde<List<Inner>> support

2019-11-18 Thread Daniyar Yeralin (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniyar Yeralin updated KAFKA-8326:
---
Description: 
_This ticket proposes adding new {color:#4c9aff}ListSerializer{color} and 
{color:#4c9aff}ListDeserializer{color} classes as well as support for the new 
classes into the Serdes class. This will allow using a List Serde of type_ 
{color:#4c9aff}_Serde<List<Inner>>_{color} _directly from Consumers, Producers 
and Streams._

_{color:#4c9aff}Serde<List<Inner>>{color} serialization and deserialization 
will be done through repeatedly calling a serializer/deserializer for each 
entry provided by the passed generic {color:#4c9aff}Inner{color}'s Serde. For 
example, if you want to create a List of Strings serde, then the 
serializer/deserializer of StringSerde will be used to serialize/deserialize 
each entry in {color:#4c9aff}List<String>{color}._

I believe there are many use cases where List Serde could be used. Ex. 
[https://stackoverflow.com/questions/41427174/aggregate-java-objects-in-a-list-with-kafka-streams-dsl-windows],
 
[https://stackoverflow.com/questions/46365884/issue-with-arraylist-serde-in-kafka-streams-api]

For instance, aggregate grouped (by key) values together in a list to do other 
subsequent operations on the collection.

KIP Link: 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization]

  was:
_This ticket proposes adding new {color:#4c9aff}ListSerializer{color} and 
{color:#4c9aff}ListDeserializer{color} classes as well as support for the new 
classes into the Serdes class. This will allow using a List Serde of type_ 
{color:#4c9aff}_Serde<List<T>, T>_{color} _directly from Consumers, 
Producers and Streams._

_{color:#4c9aff}List<T>{color} serialization and deserialization will be done 
through repeatedly calling a serializer/deserializer for each entry provided by 
the passed generic {color:#4c9aff}T{color}'s Serde. For example, if you want to 
create a List of Strings serde, then the serializer/deserializer of StringSerde 
will be used to serialize/deserialize each entry in 
{color:#4c9aff}List<String>{color}._

I believe there are many use cases where List Serde could be used. Ex. 
[https://stackoverflow.com/questions/41427174/aggregate-java-objects-in-a-list-with-kafka-streams-dsl-windows],
 
[https://stackoverflow.com/questions/46365884/issue-with-arraylist-serde-in-kafka-streams-api]

For instance, aggregate grouped (by key) values together in a list to do other 
subsequent operations on the collection.

KIP Link: 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization]


> Add Serde<List<Inner>> support
> --
>
> Key: KAFKA-8326
> URL: https://issues.apache.org/jira/browse/KAFKA-8326
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, streams
>Reporter: Daniyar Yeralin
>Assignee: Daniyar Yeralin
>Priority: Minor
>  Labels: kip
>
> _This ticket proposes adding new {color:#4c9aff}ListSerializer{color} and 
> {color:#4c9aff}ListDeserializer{color} classes as well as support for the new 
> classes into the Serdes class. This will allow using a List Serde of type_ 
> {color:#4c9aff}_Serde<List<Inner>>_{color} _directly from Consumers, 
> Producers and Streams._
> _{color:#4c9aff}Serde<List<Inner>>{color} serialization and deserialization 
> will be done through repeatedly calling a serializer/deserializer for each 
> entry provided by the passed generic {color:#4c9aff}Inner{color}'s Serde. For 
> example, if you want to create a List of Strings serde, then the 
> serializer/deserializer of StringSerde will be used to serialize/deserialize 
> each entry in {color:#4c9aff}List<String>{color}._
> I believe there are many use cases where List Serde could be used. Ex. 
> [https://stackoverflow.com/questions/41427174/aggregate-java-objects-in-a-list-with-kafka-streams-dsl-windows],
>  
> [https://stackoverflow.com/questions/46365884/issue-with-arraylist-serde-in-kafka-streams-api]
> For instance, aggregate grouped (by key) values together in a list to do 
> other subsequent operations on the collection.
> KIP Link: 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization]
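The delegation the ticket describes (each list entry encoded by the Inner type's serializer) can be sketched with a minimal length-prefixed format. This is a simplified illustration, not the KIP's actual ListSerializer/ListDeserializer classes:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Sketch of a List serde: write the entry count, then each entry as a
// length-prefixed payload produced by the inner serializer. Hypothetical API.
public class ListSerdeSketch {
    public static <T> byte[] serialize(List<T> list, Function<T, byte[]> inner) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            out.write(ByteBuffer.allocate(4).putInt(list.size()).array());
            for (T entry : list) {
                byte[] bytes = inner.apply(entry);
                out.write(ByteBuffer.allocate(4).putInt(bytes.length).array());
                out.write(bytes);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static <T> List<T> deserialize(byte[] data, Function<byte[], T> inner) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int size = buf.getInt();
        List<T> list = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            byte[] bytes = new byte[buf.getInt()];
            buf.get(bytes);
            list.add(inner.apply(bytes));
        }
        return list;
    }
}
```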



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8326) Add Serde<List<Inner>> support

2019-11-18 Thread Daniyar Yeralin (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniyar Yeralin updated KAFKA-8326:
---
Summary: Add Serde<List<Inner>> support  (was: Add List<T> Serde)

> Add Serde<List<Inner>> support
> --
>
> Key: KAFKA-8326
> URL: https://issues.apache.org/jira/browse/KAFKA-8326
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, streams
>Reporter: Daniyar Yeralin
>Assignee: Daniyar Yeralin
>Priority: Minor
>  Labels: kip
>
> _This ticket proposes adding new {color:#4c9aff}ListSerializer{color} and 
> {color:#4c9aff}ListDeserializer{color} classes as well as support for the new 
> classes into the Serdes class. This will allow using a List Serde of type_ 
> {color:#4c9aff}_Serde<List<T>, T>_{color} _directly from Consumers, 
> Producers and Streams._
> _{color:#4c9aff}List<T>{color} serialization and deserialization will be done 
> through repeatedly calling a serializer/deserializer for each entry provided 
> by the passed generic {color:#4c9aff}T{color}'s Serde. For example, if you 
> want to create a List of Strings serde, then the serializer/deserializer of 
> StringSerde will be used to serialize/deserialize each entry in 
> {color:#4c9aff}List<String>{color}._
> I believe there are many use cases where List Serde could be used. Ex. 
> [https://stackoverflow.com/questions/41427174/aggregate-java-objects-in-a-list-with-kafka-streams-dsl-windows],
>  
> [https://stackoverflow.com/questions/46365884/issue-with-arraylist-serde-in-kafka-streams-api]
> For instance, aggregate grouped (by key) values together in a list to do 
> other subsequent operations on the collection.
> KIP Link: 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9205) Add an option to enforce rack-aware partition reassignment

2019-11-18 Thread Vahid Hashemian (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vahid Hashemian updated KAFKA-9205:
---
Description: 
One regularly used healing operation on Kafka clusters is replica reassignment 
for topic partitions. For example, when there is a skew in the inbound/outbound 
traffic of a broker, replica reassignment can be used to move some 
leaders/followers off the broker; or if there is a skew in the disk usage of 
brokers, replica reassignment can move some partitions to other brokers that 
have more disk space available.

In Kafka clusters that span multiple data centers (or availability zones), 
high availability is a priority, in the sense that when a data center goes 
offline the cluster should be able to resume normal operation by guaranteeing 
partition replicas in all data centers.

This guarantee is currently the responsibility of the on-call engineer who 
performs the reassignment, or of the tool that automatically generates the 
reassignment plan for improving cluster health (e.g. by considering the rack 
configuration value of each broker in the cluster). The former is quite 
error-prone, and the latter would lead to duplicate code in all such admin 
tools (which are not error-free either). Not all use cases can make use of the 
default assignment strategy used by the --generate option, and the current 
rack-aware enforcement applies to that option only.

It would be great for the built-in replica assignment API and tool provided by 
Kafka to support a rack-aware verification option for the --execute scenario 
that would simply return an error when [some] brokers in any replica set share 
a common rack. 

  was:
One regularly used healing operation on Kafka clusters is replica reassignments 
for topic partitions. For example, when there is a skew in inbound/outbound 
traffic of a broker replica reassignment can be used to move some 
leaders/followers from the broker; or if there is a skew in disk usage of 
brokers, replica reassignment can more some partitions to other brokers that 
have more disk space available.

In Kafka clusters that span across multiple data centers (or availability 
zones), high availability is a priority; in the sense that when a data center 
goes offline the cluster should be able to resume normal operation by 
guaranteeing partition replicas in all data centers.

This guarantee is currently the responsibility of the on-call engineer that 
performs the reassignment or the tool that automatically generates the 
reassignment plan for improving the cluster health (e.g. by considering the 
rack configuration value of each broker in the cluster). the former, is quite 
error-prone, and the latter, would lead to duplicate code in all such admin 
tools (which are not error free either).

It would be great for the built-in replica assignment API and tool provided by 
Kafka to support a rack aware verification option that would simply return an 
error when [some] brokers in any replica set share a common rack. 


> Add an option to enforce rack-aware partition reassignment
> --
>
> Key: KAFKA-9205
> URL: https://issues.apache.org/jira/browse/KAFKA-9205
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin, tools
>Reporter: Vahid Hashemian
>Priority: Minor
>  Labels: needs-kip
>
> One regularly used healing operation on Kafka clusters is replica 
> reassignments for topic partitions. For example, when there is a skew in 
> inbound/outbound traffic of a broker replica reassignment can be used to move 
> some leaders/followers from the broker; or if there is a skew in disk usage 
> of brokers, replica reassignment can more some partitions to other brokers 
> that have more disk space available.
> In Kafka clusters that span across multiple data centers (or availability 
> zones), high availability is a priority; in the sense that when a data center 
> goes offline the cluster should be able to resume normal operation by 
> guaranteeing partition replicas in all data centers.
> This guarantee is currently the responsibility of the on-call engineer that 
> performs the reassignment or the tool that automatically generates the 
> reassignment plan for improving the cluster health (e.g. by considering the 
> rack configuration value of each broker in the cluster). the former, is quite 
> error-prone, and the latter, would lead to duplicate code in all such admin 
> tools (which are not error free either). Not all use cases can make use the 
> default assignment strategy that is used by --generate option; and current 
> rack aware enforcement applies to this option only.
> It would be great for the built-in replica assignment API and tool provided 
> by Kafka to support a rack aware verification option for --execute scenario 

[jira] [Commented] (KAFKA-9205) Add an option to enforce rack-aware partition reassignment

2019-11-18 Thread Vahid Hashemian (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976888#comment-16976888
 ] 

Vahid Hashemian commented on KAFKA-9205:


Thanks [~sbellapu] for the pointer. KIP-36 and the current implementation 
enforce rack-aware assignment when generating an assignment (using the 
--generate option). If a custom reassignment algorithm is used to generate the 
assignment, or if the reassignment is created manually on an ad-hoc basis, the 
tool does not enforce rack awareness when run with the --execute option. It 
would be great if enforcement could be implemented in the --execute scenario 
too. I have updated the description accordingly. 

> Add an option to enforce rack-aware partition reassignment
> --
>
> Key: KAFKA-9205
> URL: https://issues.apache.org/jira/browse/KAFKA-9205
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin, tools
>Reporter: Vahid Hashemian
>Priority: Minor
>
> One regularly used healing operation on Kafka clusters is replica 
> reassignments for topic partitions. For example, when there is a skew in 
> inbound/outbound traffic of a broker replica reassignment can be used to move 
> some leaders/followers from the broker; or if there is a skew in disk usage 
> of brokers, replica reassignment can more some partitions to other brokers 
> that have more disk space available.
> In Kafka clusters that span across multiple data centers (or availability 
> zones), high availability is a priority; in the sense that when a data center 
> goes offline the cluster should be able to resume normal operation by 
> guaranteeing partition replicas in all data centers.
> This guarantee is currently the responsibility of the on-call engineer that 
> performs the reassignment or the tool that automatically generates the 
> reassignment plan for improving the cluster health (e.g. by considering the 
> rack configuration value of each broker in the cluster). the former, is quite 
> error-prone, and the latter, would lead to duplicate code in all such admin 
> tools (which are not error free either). Not all use cases can make use the 
> default assignment strategy that is used by --generate option; and current 
> rack aware enforcement applies to this option only.
> It would be great for the built-in replica assignment API and tool provided 
> by Kafka to support a rack aware verification option for --execute scenario 
> that would simply return an error when [some] brokers in any replica set 
> share a common rack. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9205) Add an option to enforce rack-aware partition reassignment

2019-11-18 Thread Vahid Hashemian (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vahid Hashemian updated KAFKA-9205:
---
Labels:   (was: needs-kip)

> Add an option to enforce rack-aware partition reassignment
> --
>
> Key: KAFKA-9205
> URL: https://issues.apache.org/jira/browse/KAFKA-9205
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin, tools
>Reporter: Vahid Hashemian
>Priority: Minor
>
> One regularly used healing operation on Kafka clusters is replica 
> reassignments for topic partitions. For example, when there is a skew in 
> inbound/outbound traffic of a broker replica reassignment can be used to move 
> some leaders/followers from the broker; or if there is a skew in disk usage 
> of brokers, replica reassignment can more some partitions to other brokers 
> that have more disk space available.
> In Kafka clusters that span across multiple data centers (or availability 
> zones), high availability is a priority; in the sense that when a data center 
> goes offline the cluster should be able to resume normal operation by 
> guaranteeing partition replicas in all data centers.
> This guarantee is currently the responsibility of the on-call engineer that 
> performs the reassignment or the tool that automatically generates the 
> reassignment plan for improving the cluster health (e.g. by considering the 
> rack configuration value of each broker in the cluster). the former, is quite 
> error-prone, and the latter, would lead to duplicate code in all such admin 
> tools (which are not error free either). Not all use cases can make use the 
> default assignment strategy that is used by --generate option; and current 
> rack aware enforcement applies to this option only.
> It would be great for the built-in replica assignment API and tool provided 
> by Kafka to support a rack aware verification option for --execute scenario 
> that would simply return an error when [some] brokers in any replica set 
> share a common rack. 





[jira] [Commented] (KAFKA-9205) Add an option to enforce rack-aware partition reassignment

2019-11-18 Thread sats (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976855#comment-16976855
 ] 

sats commented on KAFKA-9205:
-

[~vahid] do you have a new KIP, or can this be an extension to 
[KIP-36|https://cwiki.apache.org/confluence/display/KAFKA/KIP-36+Rack+aware+replica+assignment]?
 Please let me know so that I can give the implementation a shot.

> Add an option to enforce rack-aware partition reassignment
> --
>
> Key: KAFKA-9205
> URL: https://issues.apache.org/jira/browse/KAFKA-9205
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin, tools
>Reporter: Vahid Hashemian
>Priority: Minor
>  Labels: needs-kip
>
> One regularly used healing operation on Kafka clusters is replica 
> reassignments for topic partitions. For example, when there is a skew in 
> inbound/outbound traffic of a broker replica reassignment can be used to move 
> some leaders/followers from the broker; or if there is a skew in disk usage 
> of brokers, replica reassignment can more some partitions to other brokers 
> that have more disk space available.
> In Kafka clusters that span across multiple data centers (or availability 
> zones), high availability is a priority; in the sense that when a data center 
> goes offline the cluster should be able to resume normal operation by 
> guaranteeing partition replicas in all data centers.
> This guarantee is currently the responsibility of the on-call engineer that 
> performs the reassignment or the tool that automatically generates the 
> reassignment plan for improving the cluster health (e.g. by considering the 
> rack configuration value of each broker in the cluster). the former, is quite 
> error-prone, and the latter, would lead to duplicate code in all such admin 
> tools (which are not error free either).
> It would be great for the built-in replica assignment API and tool provided 
> by Kafka to support a rack aware verification option that would simply return 
> an error when [some] brokers in any replica set share a common rack. 





[jira] [Commented] (KAFKA-9208) Flaky Test SslAdminClientIntegrationTest.testCreatePartitions

2019-11-18 Thread Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976843#comment-16976843
 ] 

Sophie Blee-Goldman commented on KAFKA-9208:


The test results have already been cleaned up, but I saw this fail again (on a 
different 2.4-targeted PR, also on Java 8).

> Flaky Test SslAdminClientIntegrationTest.testCreatePartitions
> -
>
> Key: KAFKA-9208
> URL: https://issues.apache.org/jira/browse/KAFKA-9208
> Project: Kafka
>  Issue Type: Bug
>  Components: admin
>Affects Versions: 2.4.0
>Reporter: Sophie Blee-Goldman
>Priority: Major
>  Labels: flaky-test
>
> Java 8 build failed on 2.4-targeted PR
> h3. Stacktrace
> java.lang.AssertionError: validateOnly expected:<3> but was:<1> at 
> org.junit.Assert.fail(Assert.java:89) at 
> org.junit.Assert.failNotEquals(Assert.java:835) at 
> org.junit.Assert.assertEquals(Assert.java:647) at 
> kafka.api.AdminClientIntegrationTest$$anonfun$testCreatePartitions$6.apply(AdminClientIntegrationTest.scala:625)
>  at 
> kafka.api.AdminClientIntegrationTest$$anonfun$testCreatePartitions$6.apply(AdminClientIntegrationTest.scala:599)
>  at scala.collection.immutable.List.foreach(List.scala:392) at 
> kafka.api.AdminClientIntegrationTest.testCreatePartitions(AdminClientIntegrationTest.scala:599)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
> at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)





[jira] [Updated] (KAFKA-9207) Replica Out of Sync as ReplicaFetcher thread is dead

2019-11-18 Thread Xue Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xue Liu updated KAFKA-9207:
---
Description: 
We sometimes see that a replica for a partition is out of sync. When the issue 
happens, it seems that we have simply lost that replica (it would never catch 
up) unless we restart that broker.

It appears that the ReplicaFetcher thread for that partition is dead and the 
broker will not restart that thread. We didn't see any exceptions in the server 
or controller logs.

The screen capture is taken from the broker that hosts that replica. The leader 
is 2017.
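The failure mode described here is a worker thread that exits and is never 
replaced. A generic watchdog pattern that restarts a dead thread is sketched 
below (toy Python threading example under assumed behavior; not Kafka's actual 
ReplicaFetcherManager design):

```python
import threading


def supervise(target, join_timeout=0.1, max_restarts=3):
    """Run `target` in a thread and replace the thread whenever it dies,
    up to max_restarts times. Toy watchdog for illustration only."""
    restarts = 0
    thread = threading.Thread(target=target, daemon=True)
    thread.start()
    while restarts < max_restarts:
        thread.join(join_timeout)
        if thread.is_alive():
            continue  # worker still running; keep watching
        restarts += 1  # worker died: start a replacement
        thread = threading.Thread(target=target, daemon=True)
        thread.start()
    return restarts
```

A real broker-side fix would also need to surface why the fetcher thread 
exited, since silently restarting it can hide the root cause.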

 

 

  was:
We sometimes see a replica for a partition is out of sync. When the issue 
happens, it seems that we just lost that replica (would never catch-up), unless 
we restart that broker.

It appears that ReplicaFetcher thread for that partition is dead and broker 
will not restart that thread. We didn't see any exception in server or 
controller logs.

 

 


> Replica Out of Sync as ReplicaFetcher thread is dead
> 
>
> Key: KAFKA-9207
> URL: https://issues.apache.org/jira/browse/KAFKA-9207
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 2.3.0
>Reporter: Xue Liu
>Priority: Major
> Attachments: Capture.PNG
>
>
> We sometimes see a replica for a partition is out of sync. When the issue 
> happens, it seems that we just lost that replica (would never catch-up), 
> unless we restart that broker.
> It appears that ReplicaFetcher thread for that partition is dead and broker 
> will not restart that thread. We didn't see any exception in server or 
> controller logs.
> The screen capture is taken from the broker that has that replica. Leader is 
> 2017.
>  
>  





[jira] [Created] (KAFKA-9208) Flaky Test SslAdminClientIntegrationTest.testCreatePartitions

2019-11-18 Thread Sophie Blee-Goldman (Jira)
Sophie Blee-Goldman created KAFKA-9208:
--

 Summary: Flaky Test 
SslAdminClientIntegrationTest.testCreatePartitions
 Key: KAFKA-9208
 URL: https://issues.apache.org/jira/browse/KAFKA-9208
 Project: Kafka
  Issue Type: Bug
  Components: admin
Affects Versions: 2.4.0
Reporter: Sophie Blee-Goldman


The Java 8 build failed on a 2.4-targeted PR.
h3. Stacktrace

java.lang.AssertionError: validateOnly expected:<3> but was:<1> at 
org.junit.Assert.fail(Assert.java:89) at 
org.junit.Assert.failNotEquals(Assert.java:835) at 
org.junit.Assert.assertEquals(Assert.java:647) at 
kafka.api.AdminClientIntegrationTest$$anonfun$testCreatePartitions$6.apply(AdminClientIntegrationTest.scala:625)
 at 
kafka.api.AdminClientIntegrationTest$$anonfun$testCreatePartitions$6.apply(AdminClientIntegrationTest.scala:599)
 at scala.collection.immutable.List.foreach(List.scala:392) at 
kafka.api.AdminClientIntegrationTest.testCreatePartitions(AdminClientIntegrationTest.scala:599)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
 at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
java.lang.Thread.run(Thread.java:748)





[jira] [Updated] (KAFKA-9207) Replica Out of Sync as ReplicaFetcher thread is dead

2019-11-18 Thread Xue Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xue Liu updated KAFKA-9207:
---
Description: 
We sometimes see that a replica for a partition is out of sync. When the issue 
happens, it seems that we have simply lost that replica (it would never catch 
up) unless we restart that broker.

It appears that the ReplicaFetcher thread for that partition is dead and the 
broker will not restart that thread. We didn't see any exceptions in the server 
or controller logs.

 

 

  was:
We sometimes see a replica for a partition is out of sync. When the issue 
happens, it seems that we just lost that replica (would never catch-up), unless 
we restart that broker.

It appears that ReplicaFetcher thread for that partition is dead. We didn't see 
any exception in server or controller logs.

 

 


> Replica Out of Sync as ReplicaFetcher thread is dead
> 
>
> Key: KAFKA-9207
> URL: https://issues.apache.org/jira/browse/KAFKA-9207
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 2.3.0
>Reporter: Xue Liu
>Priority: Major
> Attachments: Capture.PNG
>
>
> We sometimes see a replica for a partition is out of sync. When the issue 
> happens, it seems that we just lost that replica (would never catch-up), 
> unless we restart that broker.
> It appears that ReplicaFetcher thread for that partition is dead and broker 
> will not restart that thread. We didn't see any exception in server or 
> controller logs.
>  
>  





[jira] [Created] (KAFKA-9207) Replica Out of Sync as ReplicaFetcher thread is dead

2019-11-18 Thread Xue Liu (Jira)
Xue Liu created KAFKA-9207:
--

 Summary: Replica Out of Sync as ReplicaFetcher thread is dead
 Key: KAFKA-9207
 URL: https://issues.apache.org/jira/browse/KAFKA-9207
 Project: Kafka
  Issue Type: Bug
  Components: replication
Affects Versions: 2.3.0
Reporter: Xue Liu
 Attachments: Capture.PNG

We sometimes see that a replica for a partition is out of sync. When the issue 
happens, it seems that we have simply lost that replica (it would never catch 
up) unless we restart that broker.

It appears that the ReplicaFetcher thread for that partition is dead.

 

 





[jira] [Updated] (KAFKA-9207) Replica Out of Sync as ReplicaFetcher thread is dead

2019-11-18 Thread Xue Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xue Liu updated KAFKA-9207:
---
Description: 
We sometimes see that a replica for a partition is out of sync. When the issue 
happens, it seems that we have simply lost that replica (it would never catch 
up) unless we restart that broker.

It appears that the ReplicaFetcher thread for that partition is dead. We didn't 
see any exceptions in the server or controller logs.

 

 

  was:
We sometimes see a replica for a partition is out of sync. When the issue 
happens, it seems that we just lost that replica (would never catch-up), unless 
we restart that broker.

It appears that ReplicaFetcher thread for that partition is dead.

 

 


> Replica Out of Sync as ReplicaFetcher thread is dead
> 
>
> Key: KAFKA-9207
> URL: https://issues.apache.org/jira/browse/KAFKA-9207
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 2.3.0
>Reporter: Xue Liu
>Priority: Major
> Attachments: Capture.PNG
>
>
> We sometimes see a replica for a partition is out of sync. When the issue 
> happens, it seems that we just lost that replica (would never catch-up), 
> unless we restart that broker.
> It appears that ReplicaFetcher thread for that partition is dead. We didn't 
> see any exception in server or controller logs.
>  
>  





[jira] [Created] (KAFKA-9206) Consumer should handle `CORRUPT_MESSAGE` error code in fetch response

2019-11-18 Thread Jason Gustafson (Jira)
Jason Gustafson created KAFKA-9206:
--

 Summary: Consumer should handle `CORRUPT_MESSAGE` error code in 
fetch response
 Key: KAFKA-9206
 URL: https://issues.apache.org/jira/browse/KAFKA-9206
 Project: Kafka
  Issue Type: Bug
Reporter: Jason Gustafson


This error code is possible, for example, when the broker scans the log to find 
the fetch offset after the index lookup. Currently, this results in a somewhat 
obscure message such as the following:
{code:java}
java.lang.IllegalStateException: Unexpected error code 2 while fetching 
data{code}
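In the Kafka protocol, error code 2 is CORRUPT_MESSAGE, so the improvement is 
essentially to map the numeric code to a typed, descriptive exception instead 
of the generic IllegalStateException. A minimal sketch of that mapping 
(illustrative names, not the consumer's actual internals):

```python
# Subset of Kafka protocol error codes; 0, 1, and 2 are stable protocol values.
ERROR_CODES = {0: "NONE", 1: "OFFSET_OUT_OF_RANGE", 2: "CORRUPT_MESSAGE"}


class CorruptMessageError(Exception):
    """Raised when a fetch response reports corrupt record data."""


def raise_for_fetch_error(code):
    """Translate a fetch-response error code into a descriptive exception."""
    name = ERROR_CODES.get(code)
    if name == "CORRUPT_MESSAGE":
        raise CorruptMessageError(
            "Fetched data is corrupt; seek past the record to continue")
    if name is None:
        raise RuntimeError(f"Unexpected error code {code} while fetching data")
    if name != "NONE":
        raise RuntimeError(f"Fetch failed: {name}")
```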
 





[jira] [Commented] (KAFKA-9157) logcleaner could generate empty segment files after cleaning

2019-11-18 Thread sats (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976745#comment-16976745
 ] 

sats commented on KAFKA-9157:
-

[~huxi_2b] sure, please go ahead.

> logcleaner could generate empty segment files after cleaning
> 
>
> Key: KAFKA-9157
> URL: https://issues.apache.org/jira/browse/KAFKA-9157
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.3.0
>Reporter: Jun Rao
>Priority: Major
>
> Currently, the log cleaner could only combine segments within a 2-billion 
> offset range. If all records in that range are deleted, an empty segment 
> could be generated. It would be useful to avoid generating such empty 
> segments.
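The improvement described above reduces to one extra check in the cleaning 
pass: after filtering a group of segments within a 2-billion-offset range, skip 
writing the group if no records survive. A rough sketch of that skip logic 
(simplified model, not the LogCleaner's real code):

```python
def clean_segment_groups(groups, is_retained):
    """Rewrite each segment group keeping only retained records, and drop
    groups that would otherwise produce an empty segment file.
    Simplified illustration of the proposed behavior."""
    cleaned = []
    for records in groups:
        kept = [r for r in records if is_retained(r)]
        if kept:  # skip emitting an empty segment
            cleaned.append(kept)
    return cleaned
```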





[jira] [Created] (KAFKA-9205) Add an option to enforce rack-aware partition reassignment

2019-11-18 Thread Vahid Hashemian (Jira)
Vahid Hashemian created KAFKA-9205:
--

 Summary: Add an option to enforce rack-aware partition reassignment
 Key: KAFKA-9205
 URL: https://issues.apache.org/jira/browse/KAFKA-9205
 Project: Kafka
  Issue Type: Improvement
  Components: admin, tools
Reporter: Vahid Hashemian


One regularly used healing operation on Kafka clusters is replica reassignment 
for topic partitions. For example, when there is a skew in the inbound/outbound 
traffic of a broker, replica reassignment can be used to move some 
leaders/followers off the broker; or if there is a skew in the disk usage of 
brokers, replica reassignment can move some partitions to other brokers that 
have more disk space available.

In Kafka clusters that span multiple data centers (or availability 
zones), high availability is a priority, in the sense that when a data center 
goes offline the cluster should be able to resume normal operation by 
guaranteeing partition replicas in all data centers.

This guarantee is currently the responsibility of the on-call engineer who 
performs the reassignment, or of the tool that automatically generates the 
reassignment plan for improving cluster health (e.g. by considering the 
rack configuration value of each broker in the cluster). The former is quite 
error-prone, and the latter leads to duplicate code in all such admin 
tools (which are not error-free either).

It would be great for the built-in replica assignment API and tool provided by 
Kafka to support a rack-aware verification option that would simply return an 
error when [some] brokers in any replica set share a common rack. 





[jira] [Commented] (KAFKA-9203) kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1

2019-11-18 Thread Ismael Juma (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976701#comment-16976701
 ] 

Ismael Juma commented on KAFKA-9203:


We upgraded lz4-java from 1.5.0 to 1.6.0 in Kafka 2.3. I wonder if that could 
be the reason:

[https://github.com/lz4/lz4-java/blob/master/CHANGES.md]

> kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1
> ---
>
> Key: KAFKA-9203
> URL: https://issues.apache.org/jira/browse/KAFKA-9203
> Project: Kafka
>  Issue Type: Bug
>  Components: compression, consumer
>Affects Versions: 2.3.0, 2.3.1
>Reporter: David Watzke
>Priority: Major
>
> I run kafka cluster 2.1.1
> when I upgraded a client app to use kafka-client 2.3.0 (or 2.3.1) instead of 
> 2.2.0, I immediately started getting the following exceptions in a loop when 
> consuming a topic with LZ4-compressed messages:
> {noformat}
> 2019-11-18 11:59:16.888 ERROR [afka-consumer-thread] 
> com.avast.utils2.kafka.consumer.ConsumerThread: Unexpected exception occurred 
> while polling and processing messages: org.apache.kafka.common.KafkaExce
> ption: Received exception when fetching the next record from 
> FILEREP_LONER_REQUEST-4. If needed, please seek past the record to continue 
> consumption. 
> org.apache.kafka.common.KafkaException: Received exception when fetching the 
> next record from FILEREP_LONER_REQUEST-4. If needed, please seek past the 
> record to continue consumption. 
>     at 
> org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.fetchRecords(Fetcher.java:1473)
>  
>     at 
> org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.access$1600(Fetcher.java:1332)
>  
>     at 
> org.apache.kafka.clients.consumer.internals.Fetcher.fetchRecords(Fetcher.java:645)
>  
>     at 
> org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:606)
>  
>     at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1263)
>  
>     at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1225) 
>     at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1201) 
>     at 
> com.avast.utils2.kafka.consumer.ConsumerThread.pollAndProcessMessagesFromKafka(ConsumerThread.java:180)
>  
>     at 
> com.avast.utils2.kafka.consumer.ConsumerThread.run(ConsumerThread.java:149) 
>     at 
> com.avast.filerep.saver.RequestSaver$.$anonfun$main$8(RequestSaver.scala:28) 
>     at 
> com.avast.filerep.saver.RequestSaver$.$anonfun$main$8$adapted(RequestSaver.scala:19)
>  
>     at 
> resource.AbstractManagedResource.$anonfun$acquireFor$1(AbstractManagedResource.scala:88)
>  
>     at 
> scala.util.control.Exception$Catch.$anonfun$either$1(Exception.scala:252) 
>     at scala.util.control.Exception$Catch.apply(Exception.scala:228) 
>     at scala.util.control.Exception$Catch.either(Exception.scala:252) 
>     at 
> resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88) 
>     at 
> resource.ManagedResourceOperations.apply(ManagedResourceOperations.scala:26) 
>     at 
> resource.ManagedResourceOperations.apply$(ManagedResourceOperations.scala:26) 
>     at 
> resource.AbstractManagedResource.apply(AbstractManagedResource.scala:50) 
>     at 
> resource.ManagedResourceOperations.acquireAndGet(ManagedResourceOperations.scala:25)
>  
>     at 
> resource.ManagedResourceOperations.acquireAndGet$(ManagedResourceOperations.scala:25)
>  
>     at 
> resource.AbstractManagedResource.acquireAndGet(AbstractManagedResource.scala:50)
>  
>     at 
> resource.ManagedResourceOperations.foreach(ManagedResourceOperations.scala:53)
>  
>     at 
> resource.ManagedResourceOperations.foreach$(ManagedResourceOperations.scala:53)
>  
>     at 
> resource.AbstractManagedResource.foreach(AbstractManagedResource.scala:50) 
>     at 
> com.avast.filerep.saver.RequestSaver$.$anonfun$main$6(RequestSaver.scala:19) 
>     at 
> com.avast.filerep.saver.RequestSaver$.$anonfun$main$6$adapted(RequestSaver.scala:18)
>  
>     at 
> resource.AbstractManagedResource.$anonfun$acquireFor$1(AbstractManagedResource.scala:88)
>  
>     at 
> scala.util.control.Exception$Catch.$anonfun$either$1(Exception.scala:252) 
>     at scala.util.control.Exception$Catch.apply(Exception.scala:228) 
>     at scala.util.control.Exception$Catch.either(Exception.scala:252) 
>     at 
> resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88) 
>     at 
> resource.ManagedResourceOperations.apply(ManagedResourceOperations.scala:26) 
>     at 
> resource.ManagedResourceOperations.apply$(ManagedResourceOperations.scala:26) 
>    

[jira] [Commented] (KAFKA-9203) kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1

2019-11-18 Thread Ismael Juma (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976693#comment-16976693
 ] 

Ismael Juma commented on KAFKA-9203:


Thanks for the report. We have many tests and workloads running with lz4 and 
this is the first time I am seeing this issue, so it's more subtle than "lz4 
broken with kafka 2.3". Do you know which client was used to compress these 
messages? Was it the Java producer 2.2?

> kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1
> ---
>
> Key: KAFKA-9203
> URL: https://issues.apache.org/jira/browse/KAFKA-9203
> Project: Kafka
>  Issue Type: Bug
>  Components: compression, consumer
>Affects Versions: 2.3.0, 2.3.1
>Reporter: David Watzke
>Priority: Major
>
> I run kafka cluster 2.1.1
> when I upgraded a client app to use kafka-client 2.3.0 (or 2.3.1) instead of 
> 2.2.0, I immediately started getting the following exceptions in a loop when 
> consuming a topic with LZ4-compressed messages:
> {noformat}
> 2019-11-18 11:59:16.888 ERROR [afka-consumer-thread] 
> com.avast.utils2.kafka.consumer.ConsumerThread: Unexpected exception occurred 
> while polling and processing messages: org.apache.kafka.common.KafkaExce
> ption: Received exception when fetching the next record from 
> FILEREP_LONER_REQUEST-4. If needed, please seek past the record to continue 
> consumption. 
> org.apache.kafka.common.KafkaException: Received exception when fetching the 
> next record from FILEREP_LONER_REQUEST-4. If needed, please seek past the 
> record to continue consumption. 
>     at 
> org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.fetchRecords(Fetcher.java:1473)
>  
>     at 
> org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.access$1600(Fetcher.java:1332)
>  
>     at 
> org.apache.kafka.clients.consumer.internals.Fetcher.fetchRecords(Fetcher.java:645)
>  
>     at 
> org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:606)
>  
>     at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1263)
>  
>     at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1225) 
>     at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1201) 
>     at 
> com.avast.utils2.kafka.consumer.ConsumerThread.pollAndProcessMessagesFromKafka(ConsumerThread.java:180)
>  
>     at 
> com.avast.utils2.kafka.consumer.ConsumerThread.run(ConsumerThread.java:149) 
>     at 
> com.avast.filerep.saver.RequestSaver$.$anonfun$main$8(RequestSaver.scala:28) 
>     at 
> com.avast.filerep.saver.RequestSaver$.$anonfun$main$8$adapted(RequestSaver.scala:19)
>  
>     at 
> resource.AbstractManagedResource.$anonfun$acquireFor$1(AbstractManagedResource.scala:88)
>  
>     at 
> scala.util.control.Exception$Catch.$anonfun$either$1(Exception.scala:252) 
>     at scala.util.control.Exception$Catch.apply(Exception.scala:228) 
>     at scala.util.control.Exception$Catch.either(Exception.scala:252) 
>     at 
> resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88) 
>     at 
> resource.ManagedResourceOperations.apply(ManagedResourceOperations.scala:26) 
>     at 
> resource.ManagedResourceOperations.apply$(ManagedResourceOperations.scala:26) 
>     at 
> resource.AbstractManagedResource.apply(AbstractManagedResource.scala:50) 
>     at 
> resource.ManagedResourceOperations.acquireAndGet(ManagedResourceOperations.scala:25)
>  
>     at 
> resource.ManagedResourceOperations.acquireAndGet$(ManagedResourceOperations.scala:25)
>  
>     at 
> resource.AbstractManagedResource.acquireAndGet(AbstractManagedResource.scala:50)
>  
>     at 
> resource.ManagedResourceOperations.foreach(ManagedResourceOperations.scala:53)
>  
>     at 
> resource.ManagedResourceOperations.foreach$(ManagedResourceOperations.scala:53)
>  
>     at 
> resource.AbstractManagedResource.foreach(AbstractManagedResource.scala:50) 
>     at 
> com.avast.filerep.saver.RequestSaver$.$anonfun$main$6(RequestSaver.scala:19) 
>     at 
> com.avast.filerep.saver.RequestSaver$.$anonfun$main$6$adapted(RequestSaver.scala:18)
>  
>     at 
> resource.AbstractManagedResource.$anonfun$acquireFor$1(AbstractManagedResource.scala:88)
>  
>     at 
> scala.util.control.Exception$Catch.$anonfun$either$1(Exception.scala:252) 
>     at scala.util.control.Exception$Catch.apply(Exception.scala:228) 
>     at scala.util.control.Exception$Catch.either(Exception.scala:252) 
>     at 
> resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88) 
>     at 
> 
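The exception message above ("If needed, please seek past the record to continue consumption") suggests a skip-and-continue recovery loop. Below is a minimal, self-contained sketch of that pattern using a hypothetical stand-in consumer (class and method names are illustrative, not the real kafka-clients API), assuming the application can tolerate dropping the offending record:

```java
import java.util.ArrayList;
import java.util.List;

public class SeekPastDemo {
    // Hypothetical stand-in for a consumer whose fetch can fail on a
    // corrupt record; illustrates the "seek past the record" advice only,
    // not the real KafkaConsumer API.
    static class FakeConsumer {
        private final List<String> log;
        private long position = 0;

        FakeConsumer(List<String> log) { this.log = log; }

        long position() { return position; }

        void seek(long offset) { position = offset; }

        String poll() {
            String rec = log.get((int) position);
            if ("CORRUPT".equals(rec)) {
                // Models the KafkaException raised while decoding a fetch
                throw new RuntimeException("fetch failed at offset " + position);
            }
            position++;
            return rec;
        }
    }

    // Consume everything, seeking past records that fail to decode.
    static List<String> consumeAll(List<String> records) {
        FakeConsumer consumer = new FakeConsumer(records);
        List<String> consumed = new ArrayList<>();
        while (consumer.position() < records.size()) {
            try {
                consumed.add(consumer.poll());
            } catch (RuntimeException e) {
                consumer.seek(consumer.position() + 1); // skip the bad record
            }
        }
        return consumed;
    }

    public static void main(String[] args) {
        System.out.println(consumeAll(List.of("a", "CORRUPT", "b"))); // prints [a, b]
    }
}
```

With the real consumer the equivalent step would be seeking the affected TopicPartition to its current position plus one, at the cost of losing that record.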

[jira] [Commented] (KAFKA-9180) Broker won't start with empty log dir

2019-11-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976691#comment-16976691
 ] 

ASF GitHub Bot commented on KAFKA-9180:
---

ijuma commented on pull request #7700: KAFKA-9180: Introduce 
BrokerMetadataCheckpointTest
URL: https://github.com/apache/kafka/pull/7700
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Broker won't start with empty log dir
> -
>
> Key: KAFKA-9180
> URL: https://issues.apache.org/jira/browse/KAFKA-9180
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.4.0
>Reporter: Magnus Edenhill
>Assignee: Ismael Juma
>Priority: Blocker
>
> On kafka trunk at commit 1675115ec193acf4c7d44e68a57272edfec0b455:
>  
> Attempting to start the broker with an existing but empty log dir yields the 
> following error and terminates the process:
> {code:java}
> [2019-11-13 10:42:16,922] ERROR Failed to read meta.properties file under dir 
> /Users/magnus/src/librdkafka/tests/tmp/LibrdkafkaTestCluster/73638126/KafkaBrokerApp/2/logs/meta.properties
>  due to 
> /Users/magnus/src/librdkafka/tests/tmp/LibrdkafkaTestCluster/73638126/KafkaBrokerApp/2/logs/meta.properties
>  (No such file or directory) 
> (kafka.server.BrokerMetadataCheckpoint)[2019-11-13 10:42:16,924] ERROR Fail 
> to read meta.properties under log directory 
> /Users/magnus/src/librdkafka/tests/tmp/LibrdkafkaTestCluster/73638126/KafkaBrokerApp/2/logs
>  (kafka.server.KafkaServer)java.io.FileNotFoundException: 
> /Users/magnus/src/librdkafka/tests/tmp/LibrdkafkaTestCluster/73638126/KafkaBrokerApp/2/logs/meta.properties
>  (No such file or directory)at java.io.FileInputStream.open0(Native 
> Method)at java.io.FileInputStream.open(FileInputStream.java:195)  
>   at java.io.FileInputStream.<init>(FileInputStream.java:138)at 
> java.io.FileInputStream.<init>(FileInputStream.java:93)at 
> org.apache.kafka.common.utils.Utils.loadProps(Utils.java:512)at 
> kafka.server.BrokerMetadataCheckpoint.liftedTree2$1(BrokerMetadataCheckpoint.scala:73)
> at 
> kafka.server.BrokerMetadataCheckpoint.read(BrokerMetadataCheckpoint.scala:72) 
>at 
> kafka.server.KafkaServer.$anonfun$getBrokerMetadataAndOfflineDirs$1(KafkaServer.scala:704)
> at 
> kafka.server.KafkaServer.$anonfun$getBrokerMetadataAndOfflineDirs$1$adapted(KafkaServer.scala:702)
> at 
> scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
> at 
> scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)   
>  at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)  
>   at 
> kafka.server.KafkaServer.getBrokerMetadataAndOfflineDirs(KafkaServer.scala:702)
> at kafka.server.KafkaServer.startup(KafkaServer.scala:214)at 
> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:44)  
>   at kafka.Kafka$.main(Kafka.scala:84)at 
> kafka.Kafka.main(Kafka.scala) {code}
>  
>  
> Changing the catch clause to FileNotFoundException fixes the issue, here:
> [https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/BrokerMetadataCheckpoint.scala#L84]
>  
>  
> This is a regression from 2.3.x.
>  
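The fix pattern described above can be sketched in isolation: treat a missing meta.properties in an otherwise valid (empty) log dir as "no checkpoint written yet" rather than a fatal startup error. This is a hypothetical helper, not the actual BrokerMetadataCheckpoint code:

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Properties;

public class MetaPropsDemo {
    // Hypothetical helper illustrating the proposed fix: catch
    // FileNotFoundException and return null ("no metadata yet") instead
    // of letting the missing file terminate broker startup.
    static Properties readMetaProperties(String path) throws IOException {
        try (FileInputStream in = new FileInputStream(path)) {
            Properties props = new Properties();
            props.load(in);
            return props;
        } catch (FileNotFoundException e) {
            return null; // empty/new log dir: nothing written yet
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readMetaProperties("/nonexistent/meta.properties")); // prints null
    }
}
```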



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9170) KafkaStreams constructor fails to read configuration from Properties object created with default values

2019-11-18 Thread Oleg Muravskiy (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976669#comment-16976669
 ] 

Oleg Muravskiy commented on KAFKA-9170:
---

OK, I will need some time to figure out the scope of the changes needed and 
whether it qualifies for a KIP. I don't have any spare time right now, so 
please bear with me.

> KafkaStreams constructor fails to read configuration from Properties object 
> created with default values
> ---
>
> Key: KAFKA-9170
> URL: https://issues.apache.org/jira/browse/KAFKA-9170
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.3.0
>Reporter: Oleg Muravskiy
>Priority: Major
>
> When the *Properties* object passed in to the *KafkaStreams* constructor is 
> created like 
>  
> {code:java}
> new Properties(defaultProperties){code}
>  
> KafkaStreams fails to read properties properly, which in my case results in 
> an error:
>  
> {noformat}
> org.apache.kafka.common.config.ConfigException: Missing required 
> configuration "bootstrap.servers" which has no default 
> value.org.apache.kafka.common.config.ConfigException: Missing required 
> configuration "bootstrap.servers" which has no default value. at 
> org.apache.kafka.common.config.ConfigDef.parseValue(ConfigDef.java:476) at 
> org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:466) at 
> org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:108) 
> at 
> org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:142) 
> at org.apache.kafka.streams.StreamsConfig.<init>(StreamsConfig.java:844) at 
> org.apache.kafka.streams.StreamsConfig.<init>(StreamsConfig.java:839) at 
> org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:544)
> {noformat}
> This is because the constructor that receives the *Properties* object:
>  
> {code:java}
> public KafkaStreams(final Topology topology,
>  final Properties props) {
>  this(topology.internalTopologyBuilder, new StreamsConfig(props), new 
> DefaultKafkaClientSupplier());
> {code}
> passes *props* into *StreamsConfig*, which ignores the *Properties* 
> interface, and only uses the *Map* interface:
>  
> {code:java}
> public StreamsConfig(final Map props) {
>  this(props, true);
> } 
> {code}
> (Note that if you do 
> {{props.getProperty(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG)}}, it returns the 
> correct value).
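The mismatch described above can be reproduced with plain java.util.Properties: values supplied via the defaults constructor are visible through getProperty() but not through the inherited Hashtable/Map view, which is what StreamsConfig reads. A minimal sketch (class name hypothetical), including the usual workaround of flattening the defaults first:

```java
import java.util.Properties;

public class PropsDefaultsDemo {
    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("bootstrap.servers", "localhost:9092");

        Properties props = new Properties(defaults); // defaults-backed, as in the report

        // The Properties API consults the defaults chain...
        System.out.println(props.getProperty("bootstrap.servers")); // prints localhost:9092
        // ...but the Hashtable/Map view does not, hence the
        // "Missing required configuration bootstrap.servers" error.
        System.out.println(props.get("bootstrap.servers")); // prints null

        // Workaround: flatten defaults into a plain Properties before
        // handing it to KafkaStreams.
        Properties flat = new Properties();
        for (String name : props.stringPropertyNames()) {
            flat.setProperty(name, props.getProperty(name));
        }
        System.out.println(flat.get("bootstrap.servers")); // prints localhost:9092
    }
}
```

stringPropertyNames() walks the whole defaults chain, so the flattened copy is safe to pass through any Map-based configuration API.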





[jira] [Commented] (KAFKA-9184) Redundant task creation after worker fails to join a specific group generation

2019-11-18 Thread Timur Rubeko (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976650#comment-16976650
 ] 

Timur Rubeko commented on KAFKA-9184:
-

Please let me know if you need any additional information.

> Redundant task creation after worker fails to join a specific group generation
> --
>
> Key: KAFKA-9184
> URL: https://issues.apache.org/jira/browse/KAFKA-9184
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.3.2
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Blocker
> Fix For: 2.3.2
>
>
> First reported here: 
> https://stackoverflow.com/questions/58631092/kafka-connect-assigns-same-task-to-multiple-workers
> There seems to be an issue with task reassignment when a worker rejoins after 
> an unsuccessful join request. The worker appears to be outside the group for a 
> generation, but when it rejoins, the same task ends up running on more than 
> one worker.





[jira] [Comment Edited] (KAFKA-9184) Redundant task creation after worker fails to join a specific group generation

2019-11-18 Thread Timur Rubeko (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976648#comment-16976648
 ] 

Timur Rubeko edited comment on KAFKA-9184 at 11/18/19 3:54 PM:
---

Hello. SO question author here.

Following is an example sequence of events that typically leads to redundant 
task creation. Setup: three workers and three connectors. Relevant logs:

 

*Worker A*:
{code:java}
[2019-11-03 11:07:26,912] INFO [Worker clientId=connect-1, 
groupId=ingest-sources-cluster] Joined group at generation 639 and got 
assignment: Assignment{error=0, 
leader='connect-1-7d69fcf2-1025-418b-9091-4054b984d18f', 
leaderUrl='http://10.16.0.18:8083/', offset=250, connectorIds=[mqtt-source], 
taskIds=[mqtt-source-0], revokedConnectorIds=[], revokedTaskIds=[], 
delay=30} 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1397)
[2019-11-03 11:12:06,192] INFO [Worker clientId=connect-1, 
groupId=ingest-sources-cluster] Joined group at generation 640 and got 
assignment: Assignment{error=0, 
leader='connect-1-7d69fcf2-1025-418b-9091-4054b984d18f', 
leaderUrl='http://10.16.0.18:8083/', offset=250, connectorIds=[hdfs-sink, 
another-hdfs-sink, mqtt-source], taskIds=[hdfs-sink-0, another-hdfs-sink-0, 
mqtt-source-0], revokedConnectorIds=[], revokedTaskIds=[], delay=0} 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1397)
[2019-11-03 11:12:09,247] INFO [Worker clientId=connect-1, 
groupId=ingest-sources-cluster] Joined group at generation 641 and got 
assignment: Assignment{error=0, 
leader='connect-1-7d69fcf2-1025-418b-9091-4054b984d18f', 
leaderUrl='http://10.16.0.18:8083/', offset=250, connectorIds=[hdfs-sink, 
another-hdfs-sink, mqtt-source], taskIds=[hdfs-sink-0, another-hdfs-sink-0, 
mqtt-source-0], revokedConnectorIds=[], revokedTaskIds=[], delay=0} 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1397)
[2019-11-03 12:49:05,632] INFO [Worker clientId=connect-1, 
groupId=ingest-sources-cluster] Joined group at generation 642 and got 
assignment: Assignment{error=0, 
leader='connect-1-7d69fcf2-1025-418b-9091-4054b984d18f', 
leaderUrl='http://10.16.0.18:8083/', offset=250, connectorIds=[hdfs-sink, 
another-hdfs-sink, mqtt-source], taskIds=[hdfs-sink-0, another-hdfs-sink-0, 
mqtt-source-0], revokedConnectorIds=[], revokedTaskIds=[], delay=0} 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1397)
{code}
 

*Worker B*:
{code:java}
[2019-11-03 11:07:26,911] INFO [Worker clientId=connect-1, 
groupId=ingest-sources-cluster] Joined group at generation 639 and got 
assignment: Assignment{error=0, 
leader='connect-1-7d69fcf2-1025-418b-9091-4054b984d18f', 
leaderUrl='http://10.16.0.18:8083/', offset=250, 
connectorIds=[another-hdfs-sink], taskIds=[another-hdfs-sink-0], 
revokedConnectorIds=[], revokedTaskIds=[], delay=30} 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1397)
[2019-11-03 11:12:06,041] INFO [Worker clientId=connect-1, 
groupId=ingest-sources-cluster] Attempt to heartbeat failed for since member id 
connect-1-bf534716-be2f-4cb3-9f26-521023c6b504 is not valid. 
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:947)
[2019-11-03 11:12:09,251] INFO [Worker clientId=connect-1, 
groupId=ingest-sources-cluster] Joined group at generation 641 and got 
assignment: Assignment{error=0, 
leader='connect-1-7d69fcf2-1025-418b-9091-4054b984d18f', 
leaderUrl='http://10.16.0.18:8083/', offset=250, 
connectorIds=[another-hdfs-sink], taskIds=[another-hdfs-sink-0], 
revokedConnectorIds=[], revokedTaskIds=[], delay=0} 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1397)
[2019-11-03 12:49:03,150] INFO [Worker clientId=connect-1, 
groupId=ingest-sources-cluster] Attempt to heartbeat failed for since member id 
connect-1-c930bdb9-eedf-4313-95e0-4a6927836094 is not valid. 
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:947)
[2019-11-03 12:49:05,632] INFO [Worker clientId=connect-1, 
groupId=ingest-sources-cluster] Joined group at generation 642 and got 
assignment: Assignment{error=0, 
leader='connect-1-7d69fcf2-1025-418b-9091-4054b984d18f', 
leaderUrl='http://10.16.0.18:8083/', offset=250, 
connectorIds=[another-hdfs-sink], taskIds=[another-hdfs-sink-0], 
revokedConnectorIds=[], revokedTaskIds=[], delay=0} 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1397)
{code}
 

*Worker C*:
{code:java}
[2019-11-03 11:07:26,911] INFO [Worker clientId=connect-1, 
groupId=ingest-sources-cluster] Joined group at generation 639 and got 
assignment: Assignment{error=0, 
leader='connect-1-7d69fcf2-1025-418b-9091-4054b984d18f', 
leaderUrl='http://10.16.0.18:8083/', offset=250, connectorIds=[hdfs-sink], 
taskIds=[hdfs-sink-0], revokedConnectorIds=[], revokedTaskIds=[], delay=30} 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1397)
[2019-11-03 11:12:06,006] INFO 



[jira] [Updated] (KAFKA-9204) ReplaceField transformation fails when encountering tombstone event

2019-11-18 Thread Georgios Kalogiros (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgios Kalogiros updated KAFKA-9204:
--
Fix Version/s: (was: 2.3.0)

> ReplaceField transformation fails when encountering tombstone event
> ---
>
> Key: KAFKA-9204
> URL: https://issues.apache.org/jira/browse/KAFKA-9204
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Georgios Kalogiros
>Priority: Major
>
> When applying the {{ReplaceField}} transformation to a tombstone event, an 
> exception is raised:
>  
> {code:java}
> org.apache.kafka.connect.errors.ConnectException: Tolerance exceeded in error 
> handler
>   at 
> org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:178)
>   at 
> org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:104)
>   at 
> org.apache.kafka.connect.runtime.TransformationChain.apply(TransformationChain.java:50)
>   at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.convertAndTransformRecord(WorkerSinkTask.java:506)
>   at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:464)
>   at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:320)
>   at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:224)
>   at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:192)
>   at 
> org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)
>   at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kafka.connect.errors.DataException: Only Map objects 
> supported in absence of schema for [field replacement], found: null
>   at 
> org.apache.kafka.connect.transforms.util.Requirements.requireMap(Requirements.java:38)
>   at 
> org.apache.kafka.connect.transforms.ReplaceField.applySchemaless(ReplaceField.java:134)
>   at 
> org.apache.kafka.connect.transforms.ReplaceField.apply(ReplaceField.java:127)
>   at 
> org.apache.kafka.connect.runtime.TransformationChain.lambda$apply$0(TransformationChain.java:50)
>   at 
> org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)
>   at 
> org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)
>   ... 14 more
> {code}
> A similar bug in the InsertField transformation was fixed recently:
> https://issues.apache.org/jira/browse/KAFKA-8523
>  
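The usual fix pattern (the one applied to InsertField in KAFKA-8523) is to pass tombstones through untouched instead of handing the null value to requireMap. The sketch below is a hypothetical stand-in for a schemaless SMT apply path, not the actual Connect ReplaceField code:

```java
import java.util.HashMap;
import java.util.Map;

public class TombstoneGuardDemo {
    // Hypothetical stand-in for a Connect SMT's schemaless apply(): the
    // real ReplaceField calls Requirements.requireMap(value, ...), which
    // throws DataException when value is null. The guard below models the
    // fix: short-circuit on tombstones (null values).
    static Map<String, Object> applySchemaless(Map<String, Object> value,
                                               Map<String, String> renames) {
        if (value == null) {
            return null; // tombstone: nothing to replace, forward as-is
        }
        Map<String, Object> updated = new HashMap<>();
        for (Map.Entry<String, Object> e : value.entrySet()) {
            updated.put(renames.getOrDefault(e.getKey(), e.getKey()), e.getValue());
        }
        return updated;
    }

    public static void main(String[] args) {
        Map<String, String> renames = Map.of("old", "new");
        System.out.println(applySchemaless(null, renames)); // prints null
        Map<String, Object> v = new HashMap<>();
        v.put("old", 1);
        System.out.println(applySchemaless(v, renames)); // prints {new=1}
    }
}
```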





[jira] [Updated] (KAFKA-9204) ReplaceField transformation fails when encountering tombstone event

2019-11-18 Thread Georgios Kalogiros (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgios Kalogiros updated KAFKA-9204:
--
Affects Version/s: 2.3.0

> ReplaceField transformation fails when encountering tombstone event
> ---
>
> Key: KAFKA-9204
> URL: https://issues.apache.org/jira/browse/KAFKA-9204
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.3.0
>Reporter: Georgios Kalogiros
>Priority: Major
>





[jira] [Created] (KAFKA-9204) ReplaceField transformation fails when encountering tombstone event

2019-11-18 Thread Georgios Kalogiros (Jira)
Georgios Kalogiros created KAFKA-9204:
-

 Summary: ReplaceField transformation fails when encountering 
tombstone event
 Key: KAFKA-9204
 URL: https://issues.apache.org/jira/browse/KAFKA-9204
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: Georgios Kalogiros
 Fix For: 2.3.0


When applying the {{ReplaceField}} transformation to a tombstone event, an 
exception is raised:

 
{code:java}
org.apache.kafka.connect.errors.ConnectException: Tolerance exceeded in error 
handler
at 
org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:178)
at 
org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:104)
at 
org.apache.kafka.connect.runtime.TransformationChain.apply(TransformationChain.java:50)
at 
org.apache.kafka.connect.runtime.WorkerSinkTask.convertAndTransformRecord(WorkerSinkTask.java:506)
at 
org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:464)
at 
org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:320)
at 
org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:224)
at 
org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:192)
at 
org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.connect.errors.DataException: Only Map objects 
supported in absence of schema for [field replacement], found: null
at 
org.apache.kafka.connect.transforms.util.Requirements.requireMap(Requirements.java:38)
at 
org.apache.kafka.connect.transforms.ReplaceField.applySchemaless(ReplaceField.java:134)
at 
org.apache.kafka.connect.transforms.ReplaceField.apply(ReplaceField.java:127)
at 
org.apache.kafka.connect.runtime.TransformationChain.lambda$apply$0(TransformationChain.java:50)
at 
org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)
at 
org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)
... 14 more
{code}
There was a similar bug for the InsertField transformation that got merged in 
recently:
https://issues.apache.org/jira/browse/KAFKA-8523
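The fix the report implies (mirroring the InsertField patch linked above) is a null guard before `requireMap` is reached. The following is a minimal runnable sketch of that guard, not Connect's actual ReplaceField code; `apply` and `applySchemaless` here are simplified stand-ins.

```java
import java.util.HashMap;
import java.util.Map;

public class TombstoneGuardSketch {
    // Simplified stand-in for ReplaceField.applySchemaless: drops one
    // field from a schemaless (Map) record value.
    static Map<String, Object> applySchemaless(Map<String, Object> value, String dropField) {
        Map<String, Object> out = new HashMap<>(value);
        out.remove(dropField);
        return out;
    }

    // The guard the report asks for: a null value (tombstone) is passed
    // through untouched instead of being fed to a Map requirement check,
    // which is what currently throws the DataException above.
    static Map<String, Object> apply(Map<String, Object> value, String dropField) {
        if (value == null) {
            return null; // tombstone: nothing to replace, forward as-is
        }
        return applySchemaless(value, dropField);
    }

    public static void main(String[] args) {
        Map<String, Object> v = new HashMap<>();
        v.put("keep", 1);
        v.put("drop", 2);
        System.out.println(apply(v, "drop"));
        System.out.println(apply(null, "drop"));
    }
}
```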

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KAFKA-9124) KIP-497: ISR changes should be propagated via Kafka protocol

2019-11-18 Thread Viktor Somogyi-Vass (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976446#comment-16976446
 ] 

Viktor Somogyi-Vass edited comment on KAFKA-9124 at 11/18/19 11:06 AM:
---

[~cmccabe] sure. Do you think we should mark this or the other jira as a 
duplicate?


was (Author: viktorsomogyi):
[~cmccabe] sure. Do you think we should mark this as a duplicate?

> KIP-497: ISR changes should be propagated via Kafka protocol
> 
>
> Key: KAFKA-9124
> URL: https://issues.apache.org/jira/browse/KAFKA-9124
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Viktor Somogyi-Vass
>Assignee: Colin McCabe
>Priority: Major
>
> Currently {{Partition.expandIsr}} and {{Partition.shrinkIsr}} update
> ZooKeeper, which the controller watches; that is how it notices ISR changes
> and sends out metadata requests.
> Instead, the brokers should use Kafka protocol messages to send out ISR
> change notifications.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9125) GroupMetadataManager and TransactionStateManager should query the controller instead of zkClient

2019-11-18 Thread Viktor Somogyi-Vass (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viktor Somogyi-Vass updated KAFKA-9125:
---
Summary: GroupMetadataManager and TransactionStateManager should query the 
controller instead of zkClient  (was: GroupMetadataManager and 
TransactionStateManager should use metadata cache instead of zkClient)

> GroupMetadataManager and TransactionStateManager should query the controller 
> instead of zkClient
> 
>
> Key: KAFKA-9125
> URL: https://issues.apache.org/jira/browse/KAFKA-9125
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Viktor Somogyi-Vass
>Assignee: Viktor Somogyi-Vass
>Priority: Major
>
> Both classes query their respective topic's partition count via the zkClient. 
> However, this could instead be served by the broker's local metadata cache.
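The change described above can be pictured with a toy sketch: serve a topic's partition count from a broker-local metadata cache instead of a ZooKeeper round trip. The class and field names here are illustrative, not Kafka's actual code.

```java
import java.util.Map;

public class PartitionCountLookup {
    // Stand-in for the broker's local metadata cache: topic -> partition count.
    private final Map<String, Integer> metadataCache;

    public PartitionCountLookup(Map<String, Integer> metadataCache) {
        this.metadataCache = metadataCache;
    }

    // Returns the cached partition count, or -1 if the topic is unknown
    // locally (a real broker would then fall back or refresh its cache).
    public int partitionCountFor(String topic) {
        return metadataCache.getOrDefault(topic, -1);
    }

    public static void main(String[] args) {
        PartitionCountLookup lookup =
            new PartitionCountLookup(Map.of("__consumer_offsets", 50));
        System.out.println(lookup.partitionCountFor("__consumer_offsets"));
    }
}
```

The point of the Jira is only that the lookup source changes; the call sites in GroupMetadataManager and TransactionStateManager would stay shaped like `partitionCountFor(topic)`.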



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9203) kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1

2019-11-18 Thread David Watzke (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Watzke updated KAFKA-9203:

Description: 
I run a Kafka 2.1.1 cluster.

When I upgraded a client app to kafka-client 2.3.0 (or 2.3.1) from 2.2.0, I
immediately started getting the following exceptions in a loop when consuming
a topic with LZ4-compressed messages:
{noformat}
2019-11-18 11:59:16.888 ERROR [afka-consumer-thread] com.avast.utils2.kafka.consumer.ConsumerThread: Unexpected exception occurred while polling and processing messages: org.apache.kafka.common.KafkaException: Received exception when fetching the next record from FILEREP_LONER_REQUEST-4. If needed, please seek past the record to continue consumption.
org.apache.kafka.common.KafkaException: Received exception when fetching the next record from FILEREP_LONER_REQUEST-4. If needed, please seek past the record to continue consumption.
    at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.fetchRecords(Fetcher.java:1473)
    at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.access$1600(Fetcher.java:1332)
    at org.apache.kafka.clients.consumer.internals.Fetcher.fetchRecords(Fetcher.java:645)
    at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:606)
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1263)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1225)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1201)
    at com.avast.utils2.kafka.consumer.ConsumerThread.pollAndProcessMessagesFromKafka(ConsumerThread.java:180)
    at com.avast.utils2.kafka.consumer.ConsumerThread.run(ConsumerThread.java:149)
    at com.avast.filerep.saver.RequestSaver$.$anonfun$main$8(RequestSaver.scala:28)
    at com.avast.filerep.saver.RequestSaver$.$anonfun$main$8$adapted(RequestSaver.scala:19)
    at resource.AbstractManagedResource.$anonfun$acquireFor$1(AbstractManagedResource.scala:88)
    at scala.util.control.Exception$Catch.$anonfun$either$1(Exception.scala:252)
    at scala.util.control.Exception$Catch.apply(Exception.scala:228)
    at scala.util.control.Exception$Catch.either(Exception.scala:252)
    at resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88)
    at resource.ManagedResourceOperations.apply(ManagedResourceOperations.scala:26)
    at resource.ManagedResourceOperations.apply$(ManagedResourceOperations.scala:26)
    at resource.AbstractManagedResource.apply(AbstractManagedResource.scala:50)
    at resource.ManagedResourceOperations.acquireAndGet(ManagedResourceOperations.scala:25)
    at resource.ManagedResourceOperations.acquireAndGet$(ManagedResourceOperations.scala:25)
    at resource.AbstractManagedResource.acquireAndGet(AbstractManagedResource.scala:50)
    at resource.ManagedResourceOperations.foreach(ManagedResourceOperations.scala:53)
    at resource.ManagedResourceOperations.foreach$(ManagedResourceOperations.scala:53)
    at resource.AbstractManagedResource.foreach(AbstractManagedResource.scala:50)
    at com.avast.filerep.saver.RequestSaver$.$anonfun$main$6(RequestSaver.scala:19)
    at com.avast.filerep.saver.RequestSaver$.$anonfun$main$6$adapted(RequestSaver.scala:18)
    at resource.AbstractManagedResource.$anonfun$acquireFor$1(AbstractManagedResource.scala:88)
    at scala.util.control.Exception$Catch.$anonfun$either$1(Exception.scala:252)
    at scala.util.control.Exception$Catch.apply(Exception.scala:228)
    at scala.util.control.Exception$Catch.either(Exception.scala:252)
    at resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88)
    at resource.ManagedResourceOperations.apply(ManagedResourceOperations.scala:26)
    at resource.ManagedResourceOperations.apply$(ManagedResourceOperations.scala:26)
    at resource.AbstractManagedResource.apply(AbstractManagedResource.scala:50)
    at resource.ManagedResourceOperations.acquireAndGet(ManagedResourceOperations.scala:25)
    at resource.ManagedResourceOperations.acquireAndGet$(ManagedResourceOperations.scala:25)
    at resource.AbstractManagedResource.acquireAndGet(AbstractManagedResource.scala:50)
    at resource.ManagedResourceOperations.foreach(ManagedResourceOperations.scala:53)
    at resource.ManagedResourceOperations.foreach$(ManagedResourceOperations.scala:53)
    at resource.AbstractManagedResource.foreach(AbstractManagedResource.scala:50)
    at com.avast.filerep.saver.RequestSaver$.$anonfun$main$4(RequestSaver.scala:18)
    at 
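The exception text above names a workaround: "seek past the record to continue consumption." As a toy illustration of that pattern (not the kafka-clients API; `records` stands in for a partition log and a `null` entry simulates an undecodable record), the consumer loop advances the offset by one on a fetch failure and keeps going. Note this silently skips data, which is why the reporter treats the decompression failure itself as the bug rather than applying this workaround.

```java
import java.util.ArrayList;
import java.util.List;

public class SeekPastBadRecord {
    // Consume every record, seeking past any that fails to decode.
    public static List<String> consumeAll(List<String> records) {
        List<String> out = new ArrayList<>();
        int offset = 0;
        while (offset < records.size()) {
            try {
                String r = records.get(offset);
                if (r == null) {
                    throw new RuntimeException("corrupt record at offset " + offset);
                }
                out.add(r);
                offset++;
            } catch (RuntimeException e) {
                offset++; // "seek past the record to continue consumption"
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(consumeAll(java.util.Arrays.asList("a", null, "b")));
    }
}
```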

[jira] [Commented] (KAFKA-9124) KIP-497: ISR changes should be propagated via Kafka protocol

2019-11-18 Thread Viktor Somogyi-Vass (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976446#comment-16976446
 ] 

Viktor Somogyi-Vass commented on KAFKA-9124:


[~cmccabe] sure. Do you think we should mark this as a duplicate?

> KIP-497: ISR changes should be propagated via Kafka protocol
> 
>
> Key: KAFKA-9124
> URL: https://issues.apache.org/jira/browse/KAFKA-9124
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Viktor Somogyi-Vass
>Assignee: Colin McCabe
>Priority: Major
>
> Currently {{Partition.expandIsr}} and {{Partition.shrinkIsr}} update
> ZooKeeper, which the controller watches; that is how it notices ISR changes
> and sends out metadata requests.
> Instead, the brokers should use Kafka protocol messages to send out ISR
> change notifications.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9203) kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1

2019-11-18 Thread David Watzke (Jira)
David Watzke created KAFKA-9203:
---

 Summary: kafka-client 2.3.1 fails to consume lz4 compressed topic 
in kafka 2.1.1
 Key: KAFKA-9203
 URL: https://issues.apache.org/jira/browse/KAFKA-9203
 Project: Kafka
  Issue Type: Bug
  Components: compression, consumer
Affects Versions: 2.3.0, 2.3.1
Reporter: David Watzke


I run a Kafka 2.1.1 cluster.

When I upgraded a client app to kafka-client 2.3.0 (or 2.3.1) from 2.2.0, I
immediately started getting the following exceptions in a loop:

{noformat}

2019-11-18 11:59:16.888 ERROR [afka-consumer-thread] com.avast.utils2.kafka.consumer.ConsumerThread: Unexpected exception occurred while polling and processing messages: org.apache.kafka.common.KafkaException: Received exception when fetching the next record from FILEREP_LONER_REQUEST-4. If needed, please seek past the record to continue consumption.
org.apache.kafka.common.KafkaException: Received exception when fetching the next record from FILEREP_LONER_REQUEST-4. If needed, please seek past the record to continue consumption.
    at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.fetchRecords(Fetcher.java:1473)
    at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.access$1600(Fetcher.java:1332)
    at org.apache.kafka.clients.consumer.internals.Fetcher.fetchRecords(Fetcher.java:645)
    at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:606)
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1263)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1225)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1201)
    at com.avast.utils2.kafka.consumer.ConsumerThread.pollAndProcessMessagesFromKafka(ConsumerThread.java:180)
    at com.avast.utils2.kafka.consumer.ConsumerThread.run(ConsumerThread.java:149)
    at com.avast.filerep.saver.RequestSaver$.$anonfun$main$8(RequestSaver.scala:28)
    at com.avast.filerep.saver.RequestSaver$.$anonfun$main$8$adapted(RequestSaver.scala:19)
    at resource.AbstractManagedResource.$anonfun$acquireFor$1(AbstractManagedResource.scala:88)
    at scala.util.control.Exception$Catch.$anonfun$either$1(Exception.scala:252)
    at scala.util.control.Exception$Catch.apply(Exception.scala:228)
    at scala.util.control.Exception$Catch.either(Exception.scala:252)
    at resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88)
    at resource.ManagedResourceOperations.apply(ManagedResourceOperations.scala:26)
    at resource.ManagedResourceOperations.apply$(ManagedResourceOperations.scala:26)
    at resource.AbstractManagedResource.apply(AbstractManagedResource.scala:50)
    at resource.ManagedResourceOperations.acquireAndGet(ManagedResourceOperations.scala:25)
    at resource.ManagedResourceOperations.acquireAndGet$(ManagedResourceOperations.scala:25)
    at resource.AbstractManagedResource.acquireAndGet(AbstractManagedResource.scala:50)
    at resource.ManagedResourceOperations.foreach(ManagedResourceOperations.scala:53)
    at resource.ManagedResourceOperations.foreach$(ManagedResourceOperations.scala:53)
    at resource.AbstractManagedResource.foreach(AbstractManagedResource.scala:50)
    at com.avast.filerep.saver.RequestSaver$.$anonfun$main$6(RequestSaver.scala:19)
    at com.avast.filerep.saver.RequestSaver$.$anonfun$main$6$adapted(RequestSaver.scala:18)
    at resource.AbstractManagedResource.$anonfun$acquireFor$1(AbstractManagedResource.scala:88)
    at scala.util.control.Exception$Catch.$anonfun$either$1(Exception.scala:252)
    at scala.util.control.Exception$Catch.apply(Exception.scala:228)
    at scala.util.control.Exception$Catch.either(Exception.scala:252)
    at resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88)
    at resource.ManagedResourceOperations.apply(ManagedResourceOperations.scala:26)
    at resource.ManagedResourceOperations.apply$(ManagedResourceOperations.scala:26)
    at resource.AbstractManagedResource.apply(AbstractManagedResource.scala:50)
    at resource.ManagedResourceOperations.acquireAndGet(ManagedResourceOperations.scala:25)
    at resource.ManagedResourceOperations.acquireAndGet$(ManagedResourceOperations.scala:25)
    at resource.AbstractManagedResource.acquireAndGet(AbstractManagedResource.scala:50)
    at resource.ManagedResourceOperations.foreach(ManagedResourceOperations.scala:53)
    at resource.ManagedResourceOperations.foreach$(ManagedResourceOperations.scala:53)
    at 

[jira] [Assigned] (KAFKA-9167) Implement a broker to controller request channel

2019-11-18 Thread Viktor Somogyi-Vass (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viktor Somogyi-Vass reassigned KAFKA-9167:
--

Assignee: Viktor Somogyi-Vass  (was: Dhruvil Shah)

> Implement a broker to controller request channel
> 
>
> Key: KAFKA-9167
> URL: https://issues.apache.org/jira/browse/KAFKA-9167
> Project: Kafka
>  Issue Type: Sub-task
>  Components: controller, core
>Reporter: Viktor Somogyi-Vass
>Assignee: Viktor Somogyi-Vass
>Priority: Major
>
> In some cases, we will need to create a new API to replace an operation that
> was formerly done via ZooKeeper. One example: when the leader of a partition
> wants to modify the in-sync replica set, it currently modifies ZooKeeper
> directly. In the post-ZK world, the leader will make an RPC to the active
> controller instead.
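The two paths described above can be contrasted in a small sketch. The interface and class names here are hypothetical illustrations, not Kafka's actual API; the real request type and channel were still being designed under this KIP.

```java
public class IsrChangePropagation {
    // A leader needs some way to announce an ISR change.
    interface IsrChangeNotifier {
        String notifyIsrChange(String topicPartition, String newIsr);
    }

    // Pre-KIP path: the leader writes to ZooKeeper and relies on the
    // controller's watch on that path to notice the change.
    static class ZkNotifier implements IsrChangeNotifier {
        public String notifyIsrChange(String tp, String newIsr) {
            return "zk-write:/brokers/topics/" + tp + " isr=" + newIsr;
        }
    }

    // Post-ZK path: the leader sends an RPC straight to the active
    // controller over a broker-to-controller channel.
    static class ControllerRpcNotifier implements IsrChangeNotifier {
        public String notifyIsrChange(String tp, String newIsr) {
            return "rpc->controller: AlterIsr(" + tp + ", " + newIsr + ")";
        }
    }

    public static void main(String[] args) {
        IsrChangeNotifier notifier = new ControllerRpcNotifier();
        System.out.println(notifier.notifyIsrChange("topic-0", "[1,2,3]"));
    }
}
```

The design win is that the controller hears about the change directly and synchronously, instead of indirectly through a ZooKeeper watch.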



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9167) Implement a broker to controller request channel

2019-11-18 Thread Viktor Somogyi-Vass (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976440#comment-16976440
 ] 

Viktor Somogyi-Vass commented on KAFKA-9167:


Ok, let's catch up about your ideas too, don't want to leave them out.

> Implement a broker to controller request channel
> 
>
> Key: KAFKA-9167
> URL: https://issues.apache.org/jira/browse/KAFKA-9167
> Project: Kafka
>  Issue Type: Sub-task
>  Components: controller, core
>Reporter: Viktor Somogyi-Vass
>Assignee: Dhruvil Shah
>Priority: Major
>
> In some cases, we will need to create a new API to replace an operation that
> was formerly done via ZooKeeper. One example: when the leader of a partition
> wants to modify the in-sync replica set, it currently modifies ZooKeeper
> directly. In the post-ZK world, the leader will make an RPC to the active
> controller instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-8958) Fix Kafka Streams JavaDocs with regard to used Serdes

2019-11-18 Thread bibin sebastian (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bibin sebastian reassigned KAFKA-8958:
--

Assignee: (was: bibin sebastian)

> Fix Kafka Streams JavaDocs with regard to used Serdes
> -
>
> Key: KAFKA-8958
> URL: https://issues.apache.org/jira/browse/KAFKA-8958
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Priority: Minor
>  Labels: beginner, newbie
>
> In older releases, Kafka Streams applied operator-specific Serde overwrites 
> as in-place overwrites. In newer releases, Kafka Streams tries to re-use 
> Serdes more "aggressively" by pushing serde information downstream if 
> the key and/or value did not change.
> However, we never updated the JavaDocs accordingly. For example, the 
> `KStream#through(String topic)` JavaDocs say:
> {code:java}
> Materialize this stream to a topic and creates a new {@code KStream} from the 
> topic using default serializers, deserializers, and producer's {@link 
> DefaultPartitioner}.
> {code}
> The JavaDocs do not take into account that Serdes might have been set further 
> upstream, in which case the defaults from the config would not be used.
> `KStream#through()` is just one example. We should address this throughout the 
> JavaDocs of all operators (i.e., KStream, KGroupedStream, 
> TimeWindowedKStream, SessionWindowedKStream, KTable, and KGroupedTable).
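The serde "pushdown" behavior described above can be modeled in a few lines. This is a toy model, not the Streams implementation: each operator either sets an explicit serde (non-null) or inherits the nearest upstream one, and the configured default applies only when nothing upstream overrode it; this is exactly why `through()` does not always use the default serdes.

```java
import java.util.Arrays;
import java.util.List;

public class SerdePushdownSketch {
    // Walk the topology upstream-to-downstream; each non-null entry is an
    // operator-level serde overwrite that is pushed downstream until the
    // next overwrite. The configured default is only the starting point.
    public static String effectiveSerde(List<String> upstreamOverrides,
                                        String configuredDefault) {
        String current = configuredDefault;
        for (String override : upstreamOverrides) {
            if (override != null) {
                current = override;
            }
        }
        return current;
    }

    public static void main(String[] args) {
        // An upstream operator set JsonSerde, so a later through() would
        // use it, despite what the current JavaDocs claim.
        System.out.println(
            effectiveSerde(Arrays.asList(null, "JsonSerde", null), "StringSerde"));
    }
}
```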



--
This message was sent by Atlassian Jira
(v8.3.4#803005)