[jira] [Created] (KAFKA-16252) Maligned Metrics formatting

2024-02-13 Thread James (Jira)
James created KAFKA-16252:
-

 Summary: Maligned Metrics formatting
 Key: KAFKA-16252
 URL: https://issues.apache.org/jira/browse/KAFKA-16252
 Project: Kafka
  Issue Type: Bug
  Components: documentation
Affects Versions: 3.6.1
Reporter: James
 Attachments: image-2024-02-13-22-34-31-371.png

There are some inconsistencies, and I believe some genuinely buggy content, in 
the documentation for monitoring Kafka.

1. Some MBean documentation is rendered as a TR with a colspan of 3 instead of 
in its normal location in the third column

2. There seems to be some erroneous data posted in the headings of a handful 
of documentation sections, e.g.
{code:java}
[2023-09-15 00:40:42,725] INFO Metrics scheduler closed (org.apache.kafka.common.metrics.Metrics:693)
[2023-09-15 00:40:42,729] INFO Metrics reporters closed (org.apache.kafka.common.metrics.Metrics:703)
{code}
Links to erroneous content (not permalinks)

 
 * [https://kafka.apache.org/documentation/#producer_sender_monitoring]
 * [https://kafka.apache.org/documentation/#consumer_fetch_monitoring]
 * [https://kafka.apache.org/documentation/#connect_monitoring]

This image demonstrates both issues

!image-2024-02-13-22-34-31-371.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2643

2024-02-13 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-932: Queues for Kafka

2024-02-13 Thread Jun Rao
Hi, Andrew,

Thanks for the reply.

10. The impact from doing server side filtering is that we lose zero-copy
transfer, which provides two potential benefits: (1) more direct/efficient
transfer from disk to network; (2) less memory usage in heap since we don't
need to copy the records to heap to send a fetch response. (1) may already
be lost if SSL is used. However, it would be useful to outline the impact
of (2). For example, do we need to use memory_records in fetch responses?
How much additional heap memory will the server use? Do we need to cache
records in heap? If so, is the cache bounded?

12. If the group is configured with share and a client tries to join as a
consumer, what error do we return in the RPC and in the client API? Ditto
for the reverse case where a share client tries to join a group configured
with consumer.

17. "What is the client going to do with the exception?" Well, the reason
that we added this option was for the case where the same data could be
obtained from some other place. Suppose that we use CDC to get
database changes into a Kafka topic. If the consumer is slow
and unconsumed data is deleted in Kafka because of retention, by receiving
an exception, the consumer will know that there is potential missing data
in Kafka and could bootstrap all data from the database first before
resuming the consumption from Kafka.

18.2 So, to serve a ShareAcknowledgeRequest, the leader needs to write
share records for the acknowledgement and wait for the records to be fully
replicated before sending a response? It would be useful to document that
in the section of handling ShareAcknowledgeRequest.
18.3 Could we document when the leader writes SHARE_CHECKPOINT vs
SHARE_DELTA?

19. Could we document the difference in delivery guarantee between
share and regular consumer?

21. Do we lose zero-copy transfer for all consumers because of that (since
we don't know which topics contain control records)?

24. Could we document the index layout?

25. "The callback tells you whether the acknowledgements for the entire
topic-partition succeeded or failed, rather than each record individually."
The issue is that with implicit acknowledgement, it's not clear which
records are in the batch. For example, if the client just keeps calling
poll() with no explicit commits, when a callback is received, does the
client know which records are covered by the callback?

30. ListGroupsOptions
30.1 public ListGroupsOptions types(Set states);
  Should states be types?
30.2 ListConsumerGroupsOptions supports filtering by state. Should we
support it here too?

31. ConsumerGroupListing includes state. Should we include state in
GroupListing to be consistent?

Jun

On Mon, Feb 12, 2024 at 3:55 AM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Hi Jun
> Thanks for your comments.
>
> 10. For read-uncommitted isolation level, the consumer just reads all
> records.
> For read-committed isolation level, the share-partition leader does the
> filtering to
> enable correct skipping of aborted records. The consumers in a share group
> are not
> aware of the filtering, unlike consumers in consumer groups.
>
> 11. The “classic” type is the pre-KIP 848 consumer groups.
>
> 12. By setting the configuration for a group resource, you are saying
> “when a new group is
> created with this name, it must have this type”. It’s not changing the
> type of an existing
> group.
>
> 13. Good catch. The Server Assignor should be at group level. I will
> change it.
>
> 14. That is true. I have maintained it to keep similarity with consumer
> groups,
> but it is not currently exposed to clients. It might be best to remove it.
>
> 15. I had intended that SimpleAssignor implements
> org.apache.kafka.clients.consumer.ConsumerPartitionAssignor.
> Actually, I think there’s benefit to using a new interface so that someone
> doesn’t inadvertently
> configure something like the RoundRobinAssignor for a share group. It
> wouldn’t go well. I will
> add a new interface to the KIP.
>
> 16. When an existing member issues a ShareGroupHeartbeatRequest to the new
> coordinator,
> the coordinator returns UNKNOWN_MEMBER_ID. The client then sends another
> ShareGroupHeartbeatRequest
> containing no member ID and epoch 0. The coordinator then returns the
> member ID.
>
> 17. I don’t think so. What is the client going to do with the exception?
> Share groups are
> intentionally removing some of the details of using Kafka offsets from the
> consumers. If the
> SPSO needs to be reset due to retention, it just does that automatically.
>
> 18. The proposed use of control records needs some careful thought.
> 18.1. They’re written by the share-partition leader, not the coordinator.
> 18.2. If the client commits the acknowledgement, it is only confirmed to
> the client
> once it has been replicated to the other replica brokers. So, committing
> an acknowledgement
> is very similar to sending a record to a topic in terms of the behaviour.
>
> 19. You are 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2642

2024-02-13 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-16251) Fenced member should not send heartbeats while waiting for onPartitionsLost to complete

2024-02-13 Thread Lianet Magrans (Jira)
Lianet Magrans created KAFKA-16251:
--

 Summary: Fenced member should not send heartbeats while waiting 
for onPartitionsLost to complete
 Key: KAFKA-16251
 URL: https://issues.apache.org/jira/browse/KAFKA-16251
 Project: Kafka
  Issue Type: Sub-task
  Components: clients, consumer
Reporter: Lianet Magrans
Assignee: Lianet Magrans


When a member gets fenced, it transitions to the FENCED state and triggers the 
onPartitionsLost callback to release its assignment. Members should stop sending 
heartbeats while FENCED, and resume sending them only after completing the 
callback, when the member transitions to JOINING.
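
The intended gating can be sketched with a simplified state model (the enum and helper below are illustrative only, not the actual consumer membership code):

```java
class HeartbeatGate {
    // Simplified membership states from the ticket: a member is FENCED after
    // being fenced by the coordinator, and JOINING once onPartitionsLost completes.
    enum MemberState { STABLE, FENCED, JOINING }

    // Heartbeats must pause while FENCED and resume only after the
    // onPartitionsLost callback finishes and the member rejoins.
    static boolean shouldSendHeartbeat(MemberState state) {
        return state != MemberState.FENCED;
    }
}
```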






[jira] [Created] (KAFKA-16250) Consumer group coordinator should perform sanity check on the offset commits.

2024-02-13 Thread Calvin Liu (Jira)
Calvin Liu created KAFKA-16250:
--

 Summary: Consumer group coordinator should perform sanity check on 
the offset commits.
 Key: KAFKA-16250
 URL: https://issues.apache.org/jira/browse/KAFKA-16250
 Project: Kafka
  Issue Type: Improvement
Reporter: Calvin Liu


The current coordinator does not validate offset commits before persisting 
them in the record.

In a real case, though it is unclear why, a consumer generated offset commits 
with a consumer offset valued at -2. This "illegal" consumer offset value 
caused confusion in the admin CLI when describing the consumer group: the 
consumer offset field is displayed as "-".
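
The suggested sanity check is essentially a bounds check before the commit is persisted. A hypothetical sketch (not the coordinator's actual code; whether sentinel values should be permitted is a design question for the fix):

```java
class OffsetCommitValidator {
    // Reject clearly illegal committed offsets (such as the -2 seen in this
    // report) before the coordinator persists them in the offsets record.
    static boolean isValidCommittedOffset(long offset) {
        return offset >= 0;
    }
}
```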

 

 





[jira] [Resolved] (KAFKA-14576) Move ConsoleConsumer to tools

2024-02-13 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-14576.

Fix Version/s: 3.8.0
   Resolution: Fixed

> Move ConsoleConsumer to tools
> -
>
> Key: KAFKA-14576
> URL: https://issues.apache.org/jira/browse/KAFKA-14576
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Mickael Maison
>Assignee: Mickael Maison
>Priority: Major
> Fix For: 3.8.0
>
>






Re: [DISCUSS] KIP-996: Pre-Vote

2024-02-13 Thread Alyssa Huang
Thank you Ziming, I guess it's misleading to say "Raft paper" when I'm
actually referring to the extended version of the paper (Ongaro's PhD
thesis). I have that version linked, but I'll update the language to be more
specific!

On Wed, Feb 7, 2024 at 7:16 PM ziming deng  wrote:

> Hi Alyssa,
>
> I have a minor question about the description in motivation section
>
> > Pre-Vote (as originally detailed in the Raft paper and in KIP-650)
>
> It seems Pre-Vote is not mentioned in the Raft paper; could you check it
> again and rectify it? That would be helpful, thank you!
>
> -
> Thanks,
> Ziming
>
>
> > On Dec 8, 2023, at 16:13, Luke Chen  wrote:
> >
> > Hi Alyssa,
> >
> > Thanks for the update.
> > LGTM now.
> >
> > Luke
> >
> > On Fri, Dec 8, 2023 at 10:03 AM José Armando García Sancio
> >  wrote:
> >
> >> Hi Alyssa,
> >>
> >> Thanks for the answers and the updates to the KIP. I took a look at
> >> the latest version and it looks good to me.
> >>
> >> --
> >> -José
> >>
>
>


[jira] [Resolved] (KAFKA-14822) Allow restricting File and Directory ConfigProviders to specific paths

2024-02-13 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-14822.

Fix Version/s: 3.8.0
 Assignee: Gantigmaa Selenge  (was: Mickael Maison)
   Resolution: Fixed

> Allow restricting File and Directory ConfigProviders to specific paths
> --
>
> Key: KAFKA-14822
> URL: https://issues.apache.org/jira/browse/KAFKA-14822
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Mickael Maison
>Assignee: Gantigmaa Selenge
>Priority: Major
>  Labels: need-kip
> Fix For: 3.8.0
>
>
> In sensitive environments, it would be interesting to be able to restrict the 
> files that can be accessed by the built-in configuration providers.
> For example:
> config.providers=directory
> config.providers.directory.class=org.apache.kafka.connect.configs.DirectoryConfigProvider
> config.providers.directory.path=/var/run
> Then if a caller tries to access another path, for example
> ssl.keystore.password=${directory:/etc/passwd:keystore-password}
> it would be rejected.
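
Conceptually, the restriction is a containment check on the normalized path. A hypothetical sketch using java.nio.file (not the actual DirectoryConfigProvider code); normalization matters so that `..` segments cannot escape the allowed root:

```java
import java.nio.file.Path;

class PathRestriction {
    // True only if 'requested' resolves to a location under 'allowedRoot'.
    static boolean isAllowed(String allowedRoot, String requested) {
        Path root = Path.of(allowedRoot).toAbsolutePath().normalize();
        Path target = Path.of(requested).toAbsolutePath().normalize();
        return target.startsWith(root);
    }
}
```

With path=/var/run configured, a reference to /etc/passwd, or to /var/run/../../etc/passwd, would fail this check.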





Re: [PR] MINIOR: Mark JBOD as an early release [kafka-site]

2024-02-13 Thread via GitHub


stanislavkozlovski commented on PR #579:
URL: https://github.com/apache/kafka-site/pull/579#issuecomment-1941921252

   Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIOR: Mark JBOD as an early release [kafka-site]

2024-02-13 Thread via GitHub


stanislavkozlovski merged PR #579:
URL: https://github.com/apache/kafka-site/pull/579





[jira] [Created] (KAFKA-16249) Improve reconciliation state machine

2024-02-13 Thread David Jacot (Jira)
David Jacot created KAFKA-16249:
---

 Summary: Improve reconciliation state machine
 Key: KAFKA-16249
 URL: https://issues.apache.org/jira/browse/KAFKA-16249
 Project: Kafka
  Issue Type: Sub-task
Reporter: David Jacot
Assignee: David Jacot








Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2641

2024-02-13 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-16248) Kafka consumer should cache leader offset ranges

2024-02-13 Thread Lucas Brutschy (Jira)
Lucas Brutschy created KAFKA-16248:
--

 Summary: Kafka consumer should cache leader offset ranges
 Key: KAFKA-16248
 URL: https://issues.apache.org/jira/browse/KAFKA-16248
 Project: Kafka
  Issue Type: Bug
Reporter: Lucas Brutschy


We noticed a streams application received an OFFSET_OUT_OF_RANGE error 
following a network partition and streams task rebalance and subsequently reset 
its offsets to the beginning.

Inspecting the logs, we saw multiple consumer log messages like: 
{code:java}
Setting offset for partition tp to the committed offset 
FetchPosition{offset=1234, offsetEpoch=Optional.empty...)
{code}
Inspecting the streams code, it looks like Kafka Streams calls `commitSync` 
passing an explicit OffsetAndMetadata object but does not populate the 
offset leader epoch.

The offset leader epoch is required in the offset commit to ensure that all 
consumers in the consumer group have coherent metadata before fetching. 
Otherwise after a consumer group rebalance, a consumer may fetch with a stale 
leader epoch with respect to the committed offset and get an offset out of 
range error from a zombie partition leader.

An example of where this can cause issues:
1. We have a consumer group with consumer 1 and consumer 2. Partition P is 
assigned to consumer 1, which has up-to-date metadata for P. Consumer 2 has 
stale metadata for P.
2. Consumer 1 fetches partition P at offset 50, epoch 8, and commits offset 
50 without an epoch.
3. The consumer group rebalances and P is now assigned to consumer 2. Consumer 
2 has a stale leader epoch for P (let's say leader epoch 7). Consumer 2 will 
now try to fetch with leader epoch 7, offset 50. If we have a zombie leader due 
to a network partition, the zombie leader may accept consumer 2's fetch leader 
epoch and return an OFFSET_OUT_OF_RANGE to consumer 2.

If in step 1, consumer 1 committed the leader epoch for the message, then when 
consumer 2 receives assignment P it would force a metadata refresh to discover 
a sufficiently new leader epoch for the committed offset.

Kafka Streams cannot fully determine the leader epoch of the offsets it wants 
to commit - in EOS mode, streams commits the offset after the last control 
records (to avoid always having a lag of >0), but the leader epoch of the 
control record is not known to streams (since only non-control records are 
returned from Consumer.poll).

A fix discussed with [~hachikuji] is to have the consumer cache leader epoch 
ranges, similar to how the broker maintains a leader epoch cache.
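
Such a cache can be sketched as a sorted map from the start offset of each leader epoch to the epoch itself, so that finding the epoch in effect at a given offset is a floor lookup (an illustrative sketch only, not the proposed consumer change):

```java
import java.util.Map;
import java.util.TreeMap;

class LeaderEpochCache {
    // Maps the start offset of each leader epoch to that epoch number.
    private final TreeMap<Long, Integer> epochByStartOffset = new TreeMap<>();

    // Record that 'epoch' began at 'startOffset'.
    void assign(int epoch, long startOffset) {
        epochByStartOffset.put(startOffset, epoch);
    }

    // Leader epoch in effect at 'offset', or -1 if unknown.
    int epochForOffset(long offset) {
        Map.Entry<Long, Integer> entry = epochByStartOffset.floorEntry(offset);
        return entry == null ? -1 : entry.getValue();
    }
}
```

In the scenario above, a cache populated with epoch 7 starting at offset 0 and epoch 8 starting at offset 40 would report epoch 8 for offset 50, letting the consumer attach the correct epoch to its commit.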





[jira] [Created] (KAFKA-16247) 1 replica keep out-of-sync after migrating broker to KRaft

2024-02-13 Thread Luke Chen (Jira)
Luke Chen created KAFKA-16247:
-

 Summary: 1 replica keep out-of-sync after migrating broker to KRaft
 Key: KAFKA-16247
 URL: https://issues.apache.org/jira/browse/KAFKA-16247
 Project: Kafka
  Issue Type: Bug
Reporter: Luke Chen


We are deploying 3 controllers and 3 brokers, following the steps in the 
[doc|https://kafka.apache.org/documentation/#kraft_zk_migration]. When we move 
from the "Enabling the migration on the brokers" state to the "Migrating brokers 
to KRaft" state, the first rolled broker becomes out-of-sync and never becomes 
in-sync.


From the log, we can see some "reject alterPartition" errors, but they happened 
only twice. Theoretically, the leader should add the follower into the ISR as 
long as the follower is fetching, since we don't have clients writing data. But 
we can't figure out why it didn't fetch.


Logs: https://gist.github.com/showuon/64c4dcecb238a317bdbdec8db17fd494





[jira] [Created] (KAFKA-16246) Cleanups in ConsoleConsumer

2024-02-13 Thread Mickael Maison (Jira)
Mickael Maison created KAFKA-16246:
--

 Summary: Cleanups in ConsoleConsumer
 Key: KAFKA-16246
 URL: https://issues.apache.org/jira/browse/KAFKA-16246
 Project: Kafka
  Issue Type: Improvement
  Components: tools
Reporter: Mickael Maison


When rewriting ConsoleConsumer in Java, we mimicked the logic flow and types 
used in the Scala implementation in order to keep the conversion and review 
process simple.

Once the rewrite is merged, we should refactor some of the logic to make it 
more Java-like. This includes removing Optional where it makes sense and moving 
all the argument-checking logic into ConsoleConsumerOptions.


See https://github.com/apache/kafka/pull/15274 for pointers.

  





Re: [DISCUSS] Kafka-Streams-Scala for Scala 3

2024-02-13 Thread Matthias Berndt
Hey Matthew,

Kafka-Streams-Scala is an entirely separate codebase from Kafka Core, and
the fact that almost three years after the release of Scala 3 there is
still no release of Kafka-Streams-Scala for Scala 3 – even though the
required changes to the codebase are trivial – tells me that these things
should not be coupled in the way that they currently are. I think having
Kafka-Streams-Scala users wait for Kafka 4 would be doing them a
disservice. Are we really going to delay this because of some CI scripts?

All the best
Matthias

Am Di., 13. Feb. 2024 um 11:20 Uhr schrieb Matthew de Detrich
:

> As added info on top of what Josep said, in the Scala space most OS
> software supports Scala 2.12/2.13 and Scala 3, but with Scala 2.12
> specifically, the Scala OS community itself wants people to eventually
> stop supporting it (it's in maintenance mode), so it makes sense to tie
> this into Kafka 4.0.x, which will drop Scala 2.12 support.
>
> We should also take care to pin the Scala 3 version to 3.3.x, since it's
> part of the LTS[1] series; in other words, we should be careful that
> people don't bump the Scala version to 3.4.x.
>
> 1:
>
> https://www.scala-lang.org/blog/2022/08/17/long-term-compatibility-plans.html
>
> On Sat, Feb 10, 2024 at 11:54 AM Matthias Berndt <
> matthias.ber...@ttmzero.com> wrote:
>
> > Hey there,
> >
> > I'd like to discuss a Scala 3 release of the Kafka-Streams-Scala library.
> > As you might have seen already, I have recently created a ticket
> > https://issues.apache.org/jira/browse/KAFKA-16237
> > and a PR
> > https://github.com/apache/kafka/pull/15338
> > to move this forward. The changes required to make Kafka-Streams-Scala
> > compile with Scala 3 are trivial; the trickier part is the build system
> and
> the release process.
> > I have made some changes to the build system (feel free to comment on the
> > above PR about that) that make it possible to test Kafka-Streams-Scala
> and
> > build the jar. What remains to be done is the CI and release process.
> There
> > is a `release.py` file in the Kafka repository's root directory, which
> > assumes that all artifacts are available for all supported Scala
> versions.
> > This is no longer the case with my changes because while porting
> > Kafka-Streams-Scala to Scala 3 is trivial, porting Kafka to Scala 3 is
> less
> > so, and shouldn't hold back a Scala 3 release of Kafka-Streams-Scala. I
> > would appreciate some guidance as to what the release process should look
> > like in the future.
> >
> > Oh and I've made a PR to remove a syntax error from release.py.
> > https://github.com/apache/kafka/pull/15350
> >
> > All the best,
> > Matthias
> >
>
>
> --
>
> Matthew de Detrich
>
> *Aiven Deutschland GmbH*
>
> Immanuelkirchstraße 26, 10405 Berlin
>
> Alexanderufer 3-7, 10117 Berlin
>
> Amtsgericht Charlottenburg, HRB 209739 B
>
> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
>
> *m:* +491603708037
>
> *w:* aiven.io *e:* matthew.dedetr...@aiven.io
>


Re: [DISCUSS] Kafka-Streams-Scala for Scala 3

2024-02-13 Thread Matthew de Detrich
As added info on top of what Josep said, in the Scala space most OS
software supports Scala 2.12/2.13 and Scala 3, but with Scala 2.12
specifically, the Scala OS community itself wants people to eventually
stop supporting it (it's in maintenance mode), so it makes sense to tie
this into Kafka 4.0.x, which will drop Scala 2.12 support.

We should also take care to pin the Scala 3 version to 3.3.x, since it's
part of the LTS[1] series; in other words, we should be careful that
people don't bump the Scala version to 3.4.x.

1:
https://www.scala-lang.org/blog/2022/08/17/long-term-compatibility-plans.html

On Sat, Feb 10, 2024 at 11:54 AM Matthias Berndt <
matthias.ber...@ttmzero.com> wrote:

> Hey there,
>
> I'd like to discuss a Scala 3 release of the Kafka-Streams-Scala library.
> As you might have seen already, I have recently created a ticket
> https://issues.apache.org/jira/browse/KAFKA-16237
> and a PR
> https://github.com/apache/kafka/pull/15338
> to move this forward. The changes required to make Kafka-Streams-Scala
> compile with Scala 3 are trivial; the trickier part is the build system and
> the release process.
> I have made some changes to the build system (feel free to comment on the
> above PR about that) that make it possible to test Kafka-Streams-Scala and
> build the jar. What remains to be done is the CI and release process. There
> is a `release.py` file in the Kafka repository's root directory, which
> assumes that all artifacts are available for all supported Scala versions.
> This is no longer the case with my changes because while porting
> Kafka-Streams-Scala to Scala 3 is trivial, porting Kafka to Scala 3 is less
> so, and shouldn't hold back a Scala 3 release of Kafka-Streams-Scala. I
> would appreciate some guidance as to what the release process should look
> like in the future.
>
> Oh and I've made a PR to remove a syntax error from release.py.
> https://github.com/apache/kafka/pull/15350
>
> All the best,
> Matthias
>


-- 

Matthew de Detrich

*Aiven Deutschland GmbH*

Immanuelkirchstraße 26, 10405 Berlin

Alexanderufer 3-7, 10117 Berlin

Amtsgericht Charlottenburg, HRB 209739 B

Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen

*m:* +491603708037

*w:* aiven.io *e:* matthew.dedetr...@aiven.io


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2640

2024-02-13 Thread Apache Jenkins Server
See