[jira] [Created] (KAFKA-12456) Log "Listeners are not identical across brokers" message at WARN/INFO instead of ERROR

2021-03-11 Thread Michael Bingham (Jira)
Michael Bingham created KAFKA-12456:
---

 Summary: Log "Listeners are not identical across brokers" message 
at WARN/INFO instead of ERROR
 Key: KAFKA-12456
 URL: https://issues.apache.org/jira/browse/KAFKA-12456
 Project: Kafka
  Issue Type: Improvement
  Components: core, security
Affects Versions: 2.6.1, 2.7.0
Reporter: Michael Bingham


Currently, when listeners are not identical across brokers, an {{ERROR}}-level 
log message is generated, for example:
{code:java}
 [2021-03-11 20:25:18,881] ERROR [MetadataCache brokerId=0] Listeners are not 
identical across brokers: LongMap(0 -> Map(ListenerName(PLAINTEXT) -> 
192.168.111.99:9092 (id: 0 rack: null), ListenerName(SSL) -> 
192.168.111.99:9094 (id: 0 rack: null), ListenerName(SASL_PLAINTEXT) -> 
192.168.111.99:9093 (id: 0 rack: null)), 2 -> Map(ListenerName(PLAINTEXT) -> 
192.168.116.148:9092 (id: 2 rack: null), ListenerName(SASL_PLAINTEXT) -> 
192.168.116.148:9093 (id: 2 rack: null)), 1 -> Map(ListenerName(PLAINTEXT) -> 
192.168.111.100:9092 (id: 1 rack: null), ListenerName(SASL_PLAINTEXT) -> 
192.168.111.100:9093 (id: 1 rack: null))) (kafka.server.MetadataCache) {code}
When adding security incrementally with a rolling update, this event is 
normal and expected, so it would be better to lower the log level of this 
message to {{WARN}} or {{INFO}} to avoid confusion.

This log message was originally added in 
https://issues.apache.org/jira/browse/KAFKA-6501.





[jira] [Created] (KAFKA-10564) Continuous logging about deleting obsolete state directories

2020-10-01 Thread Michael Bingham (Jira)
Michael Bingham created KAFKA-10564:
---

 Summary: Continuous logging about deleting obsolete state 
directories
 Key: KAFKA-10564
 URL: https://issues.apache.org/jira/browse/KAFKA-10564
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 2.6.0
Reporter: Michael Bingham


The internal process which automatically cleans up obsolete task state 
directories was modified in https://issues.apache.org/jira/browse/KAFKA-6647. 
The current logic in 2.6 is to remove all files from the task directory except 
the {{.lock}} file:

[https://github.com/apache/kafka/blob/2.6/streams/src/main/java/org/apache/kafka/streams/processor/internals/StateDirectory.java#L335]

However, the directory is only removed in its entirety for a manual cleanup:

[https://github.com/apache/kafka/blob/2.6/streams/src/main/java/org/apache/kafka/streams/processor/internals/StateDirectory.java#L349-L353]

The result is that Streams will keep revisiting this directory and trying to 
clean it up, since it decides what to clean based on the last-modification 
time of the task directory (which is no longer deleted during the automatic 
cleanup). So a user will see log messages like:
{code}
stream-thread [app-c2d773e6-8ac3-4435-9777-378e0ec0ab82-CleanupThread] Deleting 
obsolete state directory 0_8 for task 0_8 as 6061ms has elapsed (cleanup 
delay is 60ms)
{code}
repeated again and again.
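To illustrate the mechanics, here is a simplified, hypothetical sketch of the 
automatic cleanup pass described above (not the actual {{StateDirectory}} 
code; class, method, and helper names are made up):
{code:java}
import java.io.File;

// Simplified, hypothetical sketch of the 2.6 automatic cleanup pass.
public class ObsoleteTaskDirSketch {

    // Called periodically by the cleanup thread for each task directory.
    public static void cleanIfObsolete(File taskDir, long cleanupDelayMs) {
        long now = System.currentTimeMillis();
        // Eligibility is based on the task directory's last-modified time.
        if (now - taskDir.lastModified() > cleanupDelayMs) {
            File[] files = taskDir.listFiles();
            if (files != null) {
                for (File f : files) {
                    // Automatic pass: delete everything except the .lock file,
                    // but leave the task directory itself in place.
                    if (!f.getName().equals(".lock")) {
                        deleteRecursively(f);
                    }
                }
            }
            // Because the (now nearly empty) directory still exists and nothing
            // writes to it, it becomes eligible again once the delay elapses,
            // so the "Deleting obsolete state directory" message repeats.
        }
    }

    private static void deleteRecursively(File f) {
        File[] children = f.listFiles();
        if (children != null) {
            for (File c : children) {
                deleteRecursively(c);
            }
        }
        f.delete();
    }
}
{code}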

This issue doesn't seem to break anything - it's more about avoiding 
unnecessary logging and cleaning up empty task directories.





[jira] [Created] (KAFKA-9965) Uneven distribution with RoundRobinPartitioner in AK 2.4+

2020-05-06 Thread Michael Bingham (Jira)
Michael Bingham created KAFKA-9965:
--

 Summary: Uneven distribution with RoundRobinPartitioner in AK 2.4+
 Key: KAFKA-9965
 URL: https://issues.apache.org/jira/browse/KAFKA-9965
 Project: Kafka
  Issue Type: Bug
  Components: producer 
Affects Versions: 2.4.1, 2.5.0, 2.4.0
Reporter: Michael Bingham


{{RoundRobinPartitioner}} states that it will provide equal distribution of 
records across partitions. However, with the enhancements made in KIP-480, it 
may not. In some cases, when a new batch is started, the partitioner may be 
called a second time for the same record:

[https://github.com/apache/kafka/blob/2.4/clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java#L909]

[https://github.com/apache/kafka/blob/2.4/clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java#L934]

Each time the partitioner is called, it increments a counter in 
{{RoundRobinPartitioner}}, so this can result in unequal distribution.

The easiest fix might be to decrement the counter in 
{{RoundRobinPartitioner#onNewBatch}}.
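As a rough sketch of that idea (an illustrative partitioner, not the shipped 
{{RoundRobinPartitioner}}; the class and field names are made up), the 
compensation in {{onNewBatch}} could look like:
{code:java}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;

// Illustrative sketch only: a round-robin partitioner that rolls its counter
// back when the producer calls the partitioner again for the same record.
public class CompensatingRoundRobinPartitioner implements Partitioner {

    private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
        int next = counters.computeIfAbsent(topic, t -> new AtomicInteger(0)).getAndIncrement();
        return partitions.get((next & Integer.MAX_VALUE) % partitions.size()).partition();
    }

    @Override
    public void onNewBatch(String topic, Cluster cluster, int prevPartition) {
        // The producer will call partition() a second time for this record
        // (KIP-480), so undo the extra increment to keep distribution even.
        AtomicInteger counter = counters.get(topic);
        if (counter != null) {
            counter.decrementAndGet();
        }
    }

    @Override
    public void close() {}

    @Override
    public void configure(Map<String, ?> configs) {}
}
{code}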

 





[jira] [Resolved] (KAFKA-9964) Better description of RoundRobinPartitioner behavior for AK 2.4+

2020-05-06 Thread Michael Bingham (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Bingham resolved KAFKA-9964.

Resolution: Invalid

> Better description of RoundRobinPartitioner behavior for AK 2.4+
> 
>
> Key: KAFKA-9964
> URL: https://issues.apache.org/jira/browse/KAFKA-9964
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Affects Versions: 2.4.0, 2.5.0, 2.4.1
>Reporter: Michael Bingham
>Priority: Minor
>
> The Javadocs for {{RoundRobinPartitioner}} currently state:
> {quote}This partitioning strategy can be used when user wants to distribute 
> the writes to all partitions equally
> {quote}
> In AK 2.4+, equal distribution is not guaranteed, even with this partitioner. 
> The enhancements made in 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner]
>  to take batching into account affect this partitioner as well.
> So it would be useful to add some additional Javadocs to explain that unless 
> batching is disabled, even distribution of records is not guaranteed.





[jira] [Created] (KAFKA-9964) Better description of RoundRobinPartitioner behavior for AK 2.4+

2020-05-06 Thread Michael Bingham (Jira)
Michael Bingham created KAFKA-9964:
--

 Summary: Better description of RoundRobinPartitioner behavior for 
AK 2.4+
 Key: KAFKA-9964
 URL: https://issues.apache.org/jira/browse/KAFKA-9964
 Project: Kafka
  Issue Type: Improvement
  Components: producer 
Affects Versions: 2.4.1, 2.5.0, 2.4.0
Reporter: Michael Bingham


The Javadocs for {{RoundRobinPartitioner}} currently state:
{quote}This partitioning strategy can be used when user wants to distribute the 
writes to all partitions equally
{quote}
In AK 2.4+, equal distribution is not guaranteed, even with this partitioner. 
The enhancements made in 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner]
 to take batching into account affect this partitioner as well.


So it would be useful to add some additional Javadocs to explain that unless 
batching is disabled, even distribution of records is not guaranteed.





[jira] [Created] (KAFKA-9524) Default window retention does not consider grace period

2020-02-07 Thread Michael Bingham (Jira)
Michael Bingham created KAFKA-9524:
--

 Summary: Default window retention does not consider grace period
 Key: KAFKA-9524
 URL: https://issues.apache.org/jira/browse/KAFKA-9524
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 2.4.0
Reporter: Michael Bingham


In a windowed aggregation, if you specify a window size larger than the default 
window retention (1 day), Streams will implicitly set retention accordingly to 
accommodate windows of that size. For example,

 
{code:java}
.windowedBy(TimeWindows.of(Duration.ofDays(20))) 
{code}
In this case, Streams will implicitly set window retention to 20 days, and no 
exceptions will occur.

However, if you also include a non-zero grace period on the window, such as:

 
{code:java}
.windowedBy(TimeWindows.of(Duration.ofDays(20)).grace(Duration.ofMinutes(5)))
{code}
In this case, Streams will still implicitly set the window retention to 20 days 
(not 20 days plus the 5-minute grace period), and an exception will be thrown:
{code}
Exception in thread "main" java.lang.IllegalArgumentException: The retention 
period of the window store KSTREAM-KEY-SELECT-02 must be no smaller 
than its window size plus the grace period. Got size=[172800], 
grace=[30], retention=[172800]
{code}
Ideally, Streams should include grace period when implicitly setting window 
retention.
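Until then, one possible workaround is to state the retention explicitly so it 
covers the window size plus the grace period. A minimal sketch (topic names, 
serdes, and store name are illustrative):
{code:java}
import java.time.Duration;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.state.WindowStore;

public class ExplicitRetentionExample {
    public static void main(String[] args) {
        Duration size = Duration.ofDays(20);
        Duration grace = Duration.ofMinutes(5);

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input", Consumed.with(Serdes.String(), Serdes.String()))
               .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
               .windowedBy(TimeWindows.of(size).grace(grace))
               // Set retention explicitly to size + grace so the implicit
               // (size-only) default described above is never used.
               .count(Materialized.<String, Long, WindowStore<Bytes, byte[]>>as("counts")
                       .withRetention(size.plus(grace)));
    }
}
{code}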





[jira] [Created] (KAFKA-9485) Dynamic updates to num.recovery.threads.per.data.dir are not applied right away

2020-01-30 Thread Michael Bingham (Jira)
Michael Bingham created KAFKA-9485:
--

 Summary: Dynamic updates to num.recovery.threads.per.data.dir are 
not applied right away
 Key: KAFKA-9485
 URL: https://issues.apache.org/jira/browse/KAFKA-9485
 Project: Kafka
  Issue Type: Improvement
  Components: core, log
Affects Versions: 2.4.0
Reporter: Michael Bingham


The {{num.recovery.threads.per.data.dir}} broker property is a {{cluster-wide}} 
dynamically configurable setting, but it does not appear that it would have any 
dynamic effect on actual broker behavior.

The recovery thread pool is currently only created once when the {{LogManager}} 
is started and the {{loadLogs()}} method is called. If this property is later 
changed dynamically, it would have no effect until the broker is restarted.

This might be confusing to someone modifying this property, so perhaps it 
should be made clearer in the documentation, or perhaps the property should be 
changed to {{read-only}}. The only benefit I see to having it be a dynamic 
config property is that it can be applied once for the entire cluster, instead 
of individually specified in each broker's {{server.properties}} file.
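For reference, applying it once for the entire cluster would look something 
like the following with the Java {{Admin}} client (a sketch only; the 
bootstrap address and thread count are placeholders):
{code:java}
import java.util.Collection;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class ClusterWideRecoveryThreads {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (Admin admin = Admin.create(props)) {
            // An empty resource name targets the cluster-wide broker default.
            ConfigResource clusterDefault = new ConfigResource(ConfigResource.Type.BROKER, "");
            Collection<AlterConfigOp> ops = Collections.singleton(
                    new AlterConfigOp(new ConfigEntry("num.recovery.threads.per.data.dir", "8"),
                                      AlterConfigOp.OpType.SET));
            admin.incrementalAlterConfigs(Collections.singletonMap(clusterDefault, ops)).all().get();
            // Per this ticket, the new value only takes effect on the next
            // broker restart, since the recovery pool is created in loadLogs().
        }
    }
}
{code}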





[jira] [Created] (KAFKA-9331) Add option to terminate application when StreamThread(s) die

2019-12-23 Thread Michael Bingham (Jira)
Michael Bingham created KAFKA-9331:
--

 Summary: Add option to terminate application when StreamThread(s) 
die
 Key: KAFKA-9331
 URL: https://issues.apache.org/jira/browse/KAFKA-9331
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Affects Versions: 2.4.0
Reporter: Michael Bingham


Currently, if a {{StreamThread}} dies due to an unexpected exception, the 
Streams application continues running. Even if all {{StreamThread}}s die, the 
application will continue running, but will be in an {{ERROR}} state. 

Many users want or expect the application to terminate in the event of a fatal 
exception that kills one or more {{StreamThread}}s. Currently, this requires 
extra work from the developer to register an uncaught exception handler on the 
{{KafkaStreams}} object and trigger a shutdown as needed.
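For reference, a rough sketch of that manual workaround (topology, config 
values, and timeouts are illustrative):
{code:java}
import java.time.Duration;
import java.util.Properties;

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class ShutdownOnThreadDeath {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "example-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input").to("output");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);

        // The extra work described above: register a handler before start()
        // and shut the whole application down when any StreamThread dies.
        // Closing from a separate thread avoids blocking the dying thread.
        streams.setUncaughtExceptionHandler((thread, throwable) ->
                new Thread(() -> {
                    streams.close(Duration.ofSeconds(10));
                    Runtime.getRuntime().exit(1);
                }).start());

        streams.start();
    }
}
{code}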

It would be useful to provide a configurable option for the Streams application 
to have it automatically terminate with an exception if one or more 
{{StreamThread}}(s) die.

 

 





[jira] [Created] (KAFKA-9209) Avoid sending unnecessary offset updates from consumer after KIP-211

2019-11-18 Thread Michael Bingham (Jira)
Michael Bingham created KAFKA-9209:
--

 Summary: Avoid sending unnecessary offset updates from consumer 
after KIP-211
 Key: KAFKA-9209
 URL: https://issues.apache.org/jira/browse/KAFKA-9209
 Project: Kafka
  Issue Type: Improvement
  Components: consumer
Affects Versions: 2.3.0
Reporter: Michael Bingham


With KIP-211 
([https://cwiki.apache.org/confluence/display/KAFKA/KIP-211%3A+Revise+Expiration+Semantics+of+Consumer+Group+Offsets]),
 offsets will no longer expire as long as the consumer group is active.

If the consumer has {{enable.auto.commit=true}}, and if no new events are 
arriving on subscribed partition(s), the consumer still sends offsets 
(unchanged) to the group coordinator just to keep them from expiring. This is 
no longer necessary, and an optimization could potentially be implemented to 
only send offsets with auto commit when there are actual updates to be made 
(i.e., when new events have been processed). 

This would require detecting whether the broker supports the new expiration 
semantics in KIP-211, and only applying the optimization when it does.
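Conceptually, the optimization could boil down to something like the following 
sketch (made-up names, not actual {{ConsumerCoordinator}} code):
{code:java}
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

// Hypothetical sketch of the proposed auto-commit filtering.
public class AutoCommitFilterSketch {

    static Map<TopicPartition, OffsetAndMetadata> offsetsToAutoCommit(
            Map<TopicPartition, OffsetAndMetadata> positions,
            Map<TopicPartition, OffsetAndMetadata> lastCommitted,
            boolean brokerSupportsKip211) {

        if (!brokerSupportsKip211) {
            // Old expiration semantics: keep committing everything so offsets
            // do not expire while the group is alive but idle.
            return positions;
        }

        // New semantics: only commit partitions whose position actually moved.
        Map<TopicPartition, OffsetAndMetadata> changed = new HashMap<>();
        for (Map.Entry<TopicPartition, OffsetAndMetadata> e : positions.entrySet()) {
            if (!e.getValue().equals(lastCommitted.get(e.getKey()))) {
                changed.put(e.getKey(), e.getValue());
            }
        }
        return changed; // empty map -> skip the OffsetCommit request entirely
    }
}
{code}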





[jira] [Created] (KAFKA-7487) DumpLogSegments reports mismatches for indexed offsets which are not at the start of a record batch

2018-10-05 Thread Michael Bingham (JIRA)
Michael Bingham created KAFKA-7487:
--

 Summary: DumpLogSegments reports mismatches for indexed offsets 
which are not at the start of a record batch
 Key: KAFKA-7487
 URL: https://issues.apache.org/jira/browse/KAFKA-7487
 Project: Kafka
  Issue Type: Improvement
  Components: core
Affects Versions: 2.0.0
Reporter: Michael Bingham


When running {{DumpLogSegments}} against an {{.index}} file, mismatches may be 
reported when the indexed message offset is not the first record in a batch. 
For example:

{code}
 Mismatches in 
:/var/lib/kafka/data/replicated-topic-0/.index
 Index offset: 968, log offset: 966
{code}

And looking at the corresponding {{.log}} file:

{code}
baseOffset: 966 lastOffset: 968 count: 3 baseSequence: -1 lastSequence: -1 
producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 0 isTransactional: false 
position: 3952771 CreateTime: 1538768639065 isvalid: true size: 12166 magic: 2 
compresscodec: NONE crc: 294402254 
{code}

In this case, the last offset in the batch (968) was indexed instead of the 
first, but the index entry's physical position still points to the start of 
the batch (base offset 966), which is what produces the reported mismatch.

It seems like {{DumpLogSegments}} should not report these cases as mismatches, 
since users might interpret them as an error or problem.
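Conceptually, the check could be relaxed to treat an index entry as consistent 
whenever the indexed offset falls anywhere within the batch at the indexed 
position, for example (an illustrative sketch, not the actual 
{{DumpLogSegments}} code, which lives in Scala):
{code:java}
// Hypothetical sketch of a relaxed index-consistency check.
public class IndexCheckSketch {

    // Report a mismatch only if the indexed offset is outside the batch
    // located at the indexed physical position.
    static boolean isMismatch(long indexedOffset, long batchBaseOffset, long batchLastOffset) {
        return indexedOffset < batchBaseOffset || indexedOffset > batchLastOffset;
    }

    public static void main(String[] args) {
        // Values from the example above: index offset 968, batch 966..968.
        System.out.println(isMismatch(968L, 966L, 968L)); // prints false -> not a mismatch
    }
}
{code}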


