[jira] [Resolved] (KAFKA-16445) PATCH method for connector configuration

2024-05-10 Thread Ivan Yurchenko (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Yurchenko resolved KAFKA-16445.

Resolution: Fixed

> PATCH method for connector configuration
> 
>
> Key: KAFKA-16445
> URL: https://issues.apache.org/jira/browse/KAFKA-16445
> Project: Kafka
>  Issue Type: Improvement
>  Components: connect
>Reporter: Ivan Yurchenko
>Assignee: Ivan Yurchenko
>Priority: Minor
>
> As  [KIP-477: Add PATCH method for connector config in Connect REST 
> API|https://cwiki.apache.org/confluence/display/KAFKA/KIP-477%3A+Add+PATCH+method+for+connector+config+in+Connect+REST+API]
>  suggests, we should introduce the PATCH method for connector configuration.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16445) PATCH method for connecto configuration

2024-03-28 Thread Ivan Yurchenko (Jira)
Ivan Yurchenko created KAFKA-16445:
--

 Summary: PATCH method for connecto configuration
 Key: KAFKA-16445
 URL: https://issues.apache.org/jira/browse/KAFKA-16445
 Project: Kafka
  Issue Type: Improvement
  Components: connect
Reporter: Ivan Yurchenko
Assignee: Ivan Yurchenko


As  [KIP-477: Add PATCH method for connector config in Connect REST 
API|https://cwiki.apache.org/confluence/display/KAFKA/KIP-477%3A+Add+PATCH+method+for+connector+config+in+Connect+REST+API]
 suggests, we should introduce the PATCH method for connector configuration.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15931) Cached transaction index gets closed if tiered storage read is interrupted

2023-11-29 Thread Ivan Yurchenko (Jira)
Ivan Yurchenko created KAFKA-15931:
--

 Summary: Cached transaction index gets closed if tiered storage 
read is interrupted
 Key: KAFKA-15931
 URL: https://issues.apache.org/jira/browse/KAFKA-15931
 Project: Kafka
  Issue Type: Bug
  Components: Tiered-Storage
Affects Versions: 3.6.0
Reporter: Ivan Yurchenko


This reproduces when reading from remote storage with the default 
{{fetch.max.wait.ms}} (500) or lower. This error is logged

 
{noformat}
[2023-11-29 14:01:01,166] ERROR Error occurred while reading the remote data 
for topic1-0 (kafka.log.remote.RemoteLogReader)
org.apache.kafka.common.KafkaException: Failed read position from the 
transaction index 
    at 
org.apache.kafka.storage.internals.log.TransactionIndex$1.hasNext(TransactionIndex.java:235)
    at 
org.apache.kafka.storage.internals.log.TransactionIndex.collectAbortedTxns(TransactionIndex.java:171)
    at 
kafka.log.remote.RemoteLogManager.collectAbortedTransactions(RemoteLogManager.java:1359)
    at 
kafka.log.remote.RemoteLogManager.addAbortedTransactions(RemoteLogManager.java:1341)
    at kafka.log.remote.RemoteLogManager.read(RemoteLogManager.java:1310)
    at kafka.log.remote.RemoteLogReader.call(RemoteLogReader.java:62)
    at kafka.log.remote.RemoteLogReader.call(RemoteLogReader.java:31)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.nio.channels.ClosedChannelException
    at java.base/sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:150)
    at java.base/sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:325)
    at 
org.apache.kafka.storage.internals.log.TransactionIndex$1.hasNext(TransactionIndex.java:233)
    ... 10 more
{noformat}
and after that this txn index becomes unusable until the process is restarted.

I suspect, it's caused by the reading thread being interrupted due to the fetch 
timeout. At least [this 
code|https://github.com/AdoptOpenJDK/openjdk-jdk11/blob/19fb8f93c59dfd791f62d41f332db9e306bc1422/src/java.base/share/classes/java/nio/channels/spi/AbstractInterruptibleChannel.java#L159-L160]
 in {{AbstractInterruptibleChannel}} is called.

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15107) Additional custom metadata for remote log segment

2023-06-19 Thread Ivan Yurchenko (Jira)
Ivan Yurchenko created KAFKA-15107:
--

 Summary: Additional custom metadata for remote log segment
 Key: KAFKA-15107
 URL: https://issues.apache.org/jira/browse/KAFKA-15107
 Project: Kafka
  Issue Type: Improvement
  Components: core
Reporter: Ivan Yurchenko
Assignee: Ivan Yurchenko
 Fix For: 3.6.0


Based on the [KIP-917: Additional custom metadata for remote log 
segment|https://cwiki.apache.org/confluence/display/KAFKA/KIP-917%3A+Additional+custom+metadata+for+remote+log+segment],
 the following needs to be implemented:
 # {{{}{{RemoteLogSegmentMetadata}}{}}}{{{}.{{{}CustomMetadata{}}}{}}}.
 # {{RemoteStorageManager.copyLogSegmentData}} needs to be updated to the new 
return type (+ javadoc).
 # {{RemoteLogSegmentMetadata.customMetadata}} and 
{{RemoteLogSegmentMetadata.{{{}createWithCustomMetadata{} methods. Same in 
{{{}RemoteLogSegmentMetadataSnapshot{}}}.
 # {{RemoteLogSegmentMetadataRecord}} and 
{{RemoteLogSegmentMetadataSnapshotRecord}} definitions need to be updated.
 # Custom metadata should be persisted by {{RemoteLogManager}} if provided.
 # The new config {{remote.log.metadata.custom.metadata.max.size}} needs to be 
introduced.
 # The custom metadata size limit must be applied according to the KIP.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14795) Provide message formatter for RemoteLogMetadata

2023-03-08 Thread Ivan Yurchenko (Jira)
Ivan Yurchenko created KAFKA-14795:
--

 Summary: Provide message formatter for RemoteLogMetadata
 Key: KAFKA-14795
 URL: https://issues.apache.org/jira/browse/KAFKA-14795
 Project: Kafka
  Issue Type: Improvement
  Components: tools
Reporter: Ivan Yurchenko
Assignee: Ivan Yurchenko


For the debugging purpose, it'd be convenient to have a message formatter that 
can display the content of the {{__remote_log_metadata}} topic.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-13376) Allow MirrorMaker producer and consumer customization per replication flow

2021-10-14 Thread Ivan Yurchenko (Jira)
Ivan Yurchenko created KAFKA-13376:
--

 Summary: Allow MirrorMaker producer and consumer customization per 
replication flow
 Key: KAFKA-13376
 URL: https://issues.apache.org/jira/browse/KAFKA-13376
 Project: Kafka
  Issue Type: Bug
  Components: mirrormaker
Reporter: Ivan Yurchenko


Currently, it's possible to set producer and consumer configurations for a 
cluster in MirrorMaker, like this:

{noformat}
{source}.consumer.{consumer_config_name}
{target}.producer.{producer_config_name}
{noformat}

However, in some cases it makes sense to set these configs differently for 
different replication flows (e.g. when they have different latency/throughput 
trade-offs), something like:

{noformat}
{source}->{target}.{source}.consumer.{consumer_config_name}
{source}->{target}.{target}.producer.{producer_config_name}
{noformat}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-12835) Topic IDs can mismatch on brokers (after interbroker protocol version update)

2021-05-21 Thread Ivan Yurchenko (Jira)
Ivan Yurchenko created KAFKA-12835:
--

 Summary: Topic IDs can mismatch on brokers (after interbroker 
protocol version update)
 Key: KAFKA-12835
 URL: https://issues.apache.org/jira/browse/KAFKA-12835
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 2.8.0
Reporter: Ivan Yurchenko


We had a Kafka cluster running 2.8 version with interbroker protocol set to 
2.7. It had a number of topics and everything was fine.
Then we decided to update the interbroker protocol to 2.8 by the following 
procedure:
1. Run new brokers with the interbroker protocol set to 2.8.
2. Move the data from the old brokers to the new ones (normal partition 
reassignment API).
3. Decommission the old brokers.

At the stage 2 we had the problem: old brokers started failing on 
{{LeaderAndIsrRequest}} handling with
{code:java}
ERROR [Broker id=<...>] Topic Id in memory: <...> does not match the topic Id 
for partition <...> provided in the request: <...>. (state.change.logger)
{code}
for multiple topics. Topics were not recreated.

We checked {{partition.metadata}} files and IDs there were indeed different 
from the values in ZooKeeper. It was fixed by deleting the metadata files (and 
letting them be recreated).

 


The logs, unfortunately, didn't show anything that might point to the cause of 
the issue (or it happened longer ago than we store the logs).

We tried to reproduce this also, but no success.

If the community can point out what to check or beware of in future, it will be 
great. We'll be happy to provide additional information if needed. Thank you! 

Sorry for the ticket that might be not very actionable. We hope to at least 
rise awareness of this issue.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-12430) emit.heartbeats.enabled = false should disable heartbeats topic creation

2021-03-05 Thread Ivan Yurchenko (Jira)
Ivan Yurchenko created KAFKA-12430:
--

 Summary: emit.heartbeats.enabled = false should disable heartbeats 
topic creation
 Key: KAFKA-12430
 URL: https://issues.apache.org/jira/browse/KAFKA-12430
 Project: Kafka
  Issue Type: Bug
  Components: mirrormaker
Reporter: Ivan Yurchenko


Currently, MirrorMaker 2's {{MirrorHeartbeatConnector}} emits heartbeats or not 
based on {{emit.heartbeats.enabled}} setting. However, {{heartbeats}} topic is 
created unconditionally. It seems that the same setting should really disable 
the topic creation as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-12235) ZkAdminManager.describeConfigs returns no config when 2+ configuration keys are specified

2021-01-25 Thread Ivan Yurchenko (Jira)
Ivan Yurchenko created KAFKA-12235:
--

 Summary: ZkAdminManager.describeConfigs returns no config when 2+ 
configuration keys are specified
 Key: KAFKA-12235
 URL: https://issues.apache.org/jira/browse/KAFKA-12235
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 2.7.0
Reporter: Ivan Yurchenko
Assignee: Ivan Yurchenko


When {{ZkAdminManager.describeConfigs}} receives {{DescribeConfigsResource}} 
with 2 or more {{configurationKeys}} specified, it returns an empty 
configuration.

Here's a test for {{ZkAdminManagerTest}} that reproduces this issue:
  
{code:scala}
@Test
def testDescribeConfigsWithConfigurationKeys(): Unit = {
  EasyMock.expect(zkClient.getEntityConfigs(ConfigType.Topic, 
topic)).andReturn(TestUtils.createBrokerConfig(brokerId, "zk"))
  EasyMock.expect(metadataCache.contains(topic)).andReturn(true)

  EasyMock.replay(zkClient, metadataCache)

  val resources = List(new DescribeConfigsRequestData.DescribeConfigsResource()
.setResourceName(topic)
.setResourceType(ConfigResource.Type.TOPIC.id)
.setConfigurationKeys(List("retention.ms", "retention.bytes", 
"segment.bytes").asJava)
  )

  val adminManager = createAdminManager()
  val results: List[DescribeConfigsResponseData.DescribeConfigsResult] = 
adminManager.describeConfigs(resources, true, true)
  assertEquals(Errors.NONE.code, results.head.errorCode())
  val resultConfigKeys = results.head.configs().asScala.map(r => r.name()).toSet
  assertEquals(Set("retention.ms", "retention.bytes", "segment.bytes"), 
resultConfigKeys)
}

{code}
Works fine with one configuration key, though.

The patch is following shortly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9672) Dead broker in ISR cause isr-expiration to fail with exception

2020-03-06 Thread Ivan Yurchenko (Jira)
Ivan Yurchenko created KAFKA-9672:
-

 Summary: Dead broker in ISR cause isr-expiration to fail with 
exception
 Key: KAFKA-9672
 URL: https://issues.apache.org/jira/browse/KAFKA-9672
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 2.4.0, 2.4.1
Reporter: Ivan Yurchenko


We're running Kafka 2.4 and facing a pretty strange situation.
 Let's say there were three brokers in the cluster 0, 1, and 2. Then:
 1. Broker 3 was added.
 2. Partitions were reassigned from broker 0 to broker 3.
 3. Broker 0 was shut down (not gracefully) and removed from the cluster.
 4. We see the following state in ZooKeeper:
{code:java}
ls /brokers/ids
[1, 2, 3]

get /brokers/topics/foo
{"version":2,"partitions":{"0":[2,1,3]},"adding_replicas":{},"removing_replicas":{}}

get /brokers/topics/foo/partitions/0/state
{"controller_epoch":123,"leader":1,"version":1,"leader_epoch":42,"isr":[0,2,3,1]}
{code}
It means, the dead broker 0 remains in the partitions's ISR. A big share of the 
partitions in the cluster have this issue.

This is actually causing an errors:
{code:java}
Uncaught exception in scheduled task 'isr-expiration' 
(kafka.utils.KafkaScheduler)
org.apache.kafka.common.errors.ReplicaNotAvailableException: Replica with id 12 
is not available on broker 17
{code}
It means that effectively {{isr-expiration}} task is not working any more.

I have a suspicion that this was introduced by [this commit (line 
selected)|https://github.com/apache/kafka/commit/57baa4079d9fc14103411f790b9a025c9f2146a4#diff-5450baca03f57b9f2030f93a480e6969R856]

Unfortunately, I haven't been able to reproduce this in isolation.

Any hints about how to reproduce (so I can write a patch) or mitigate the issue 
on a running cluster are welcome.

Generally, I assume that not throwing {{ReplicaNotAvailableException}} on a 
dead (i.e. non-existent) broker, considering them out-of-sync and removing from 
the ISR should fix the problem.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9478) Controller may stop react on partition reassignment command in ZooKeeper

2020-01-28 Thread Ivan Yurchenko (Jira)
Ivan Yurchenko created KAFKA-9478:
-

 Summary: Controller may stop react on partition reassignment 
command in ZooKeeper
 Key: KAFKA-9478
 URL: https://issues.apache.org/jira/browse/KAFKA-9478
 Project: Kafka
  Issue Type: Bug
  Components: controller, core
Affects Versions: 2.4.0, 2.4.1
Reporter: Ivan Yurchenko
Assignee: Ivan Yurchenko


Seemingly after 
[bdf2446ccce592f3c000290f11de88520327aa19|https://github.com/apache/kafka/commit/bdf2446ccce592f3c000290f11de88520327aa19],
 the controller may stop watching {{/admin/reassign_partitions}} node in 
ZooKeeper and consequently accept partition reassignment commands via ZooKeeper.

I'm not 100% sure that bdf2446ccce592f3c000290f11de88520327aa19 causes this, 
but it doesn't reproduce on 
[3fe6b5e951db8f7184a4098f8ad8a1afb2b2c1a0|https://github.com/apache/kafka/commit/3fe6b5e951db8f7184a4098f8ad8a1afb2b2c1a0]
 - the one right before it.

Also, reproduces on the trunk HEAD 
[a87decb9e4df5bfa092c26ae4346f65c426f1321|https://github.com/apache/kafka/commit/a87decb9e4df5bfa092c26ae4346f65c426f1321].
h1. How to reproduce

1. Run ZooKeeper and two Kafka brokers.

2. Create a topic with 100 partitions and place them on Broker 0:
{code:bash}
distro/bin/kafka-topics.sh --bootstrap-server localhost:9092,localhost:9093 
--create \
--topic foo \
--replica-assignment $(for i in {0..99}; do echo -n "0,"; done | sed 
's/.$$//')
{code}
3. Add some data:
{code:bash}
seq 1 100 | bin/kafka-console-producer.sh --broker-list 
localhost:9092,localhost:9093 --topic foo
{code}
4. Create the partition reassignment node {{/admin/reassign_partitions}} in Zoo 
and shortly after that update the data in the node (even the same value will 
do). I made a simple Python script for this:
{code:python}
import time
import json
from kazoo.client import KazooClient

zk = KazooClient(hosts='127.0.0.1:2181')
zk.start()

reassign = {
"version": 1,
"partitions":[]
}
for p in range(100):
reassign["partitions"].append({"topic": "foo", "partition": p, 
"replicas": [1]})

zk.create("/admin/reassign_partitions", json.dumps(reassign).encode())

time.sleep(0.05)

zk.set("/admin/reassign_partitions", json.dumps(reassign).encode())
{code}
4. Observe that the controller doesn't react on further updates to 
{{/admin/reassign_partitions}} and doesn't delete the node.

Also, it can be confirmed with
{code:bash}
echo wchc | nc 127.0.0.1 2181
{code}
that there is no watch on the node in ZooKeeper (for this, you should run 
ZooKeeper with {{4lw.commands.whitelist=*}}).

Since it's about timing, it might not work on first attempt, so you might need 
to do 4 a couple of times. However, the reproducibility rate is pretty high.

The data in the topic and the big amount of partitions are not needed per se, 
only to make the timing more favourable.

Controller re-election will solve the issue, but a new controller can be put in 
this state the same way.
h1. Proposed solution

TBD, suggestions are welcome.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9143) DistributedHerder misleadingly log error on connector task reconfiguration

2019-11-05 Thread Ivan Yurchenko (Jira)
Ivan Yurchenko created KAFKA-9143:
-

 Summary: DistributedHerder misleadingly log error on connector 
task reconfiguration
 Key: KAFKA-9143
 URL: https://issues.apache.org/jira/browse/KAFKA-9143
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: Ivan Yurchenko
Assignee: Ivan Yurchenko


In {{DistributedHerder}} in {{reconfigureConnectorTasksWithRetry}} method 
there's a 
[callback|https://github.com/apache/kafka/blob/c552c06aed50b4d4d9a85f73ccc89bc06fa7e094/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/DistributedHerder.java#L1247]:

{code:java}
@Override
public void onCompletion(Throwable error, Void result) {
log.error("Unexpected error during connector task reconfiguration: ", 
error);
log.error("Task reconfiguration for {} failed unexpectedly, this connector 
will not be properly reconfigured unless manually triggered.", connName);
}
{code}

It an error message even when the operation succeeded (i.e., {{error}} is 
{{null}}).

It should include {{if (error != null)}} condition, like in the same class [in 
another 
method|https://github.com/apache/kafka/blob/c552c06aed50b4d4d9a85f73ccc89bc06fa7e094/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/DistributedHerder.java#L792].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9035) Improve

2019-10-13 Thread Ivan Yurchenko (Jira)
Ivan Yurchenko created KAFKA-9035:
-

 Summary: Improve 
 Key: KAFKA-9035
 URL: https://issues.apache.org/jira/browse/KAFKA-9035
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect
Affects Versions: 2.2.1, 2.3.0, 2.1.1, 2.2.0, 2.1.0, 2.0.1, 2.0.0
Reporter: Ivan Yurchenko
Assignee: Ivan Yurchenko


Kafka Connect workers in the distributed mode can be set up so that they have 
the same advertised URL (e.g. 
{{[http://127.0.0.1:8083|http://127.0.0.1:8083/]}}). When a request (e.g., for 
connector creation) lands on a worker that is not the leader, it will be 
forwarded to the leader's advertised URL. However, if advertised URLs are all 
the same, it might never reach the leader (due to the limited number of 
forwards). (See https://issues.apache.org/jira/browse/KAFKA-7121 for an 
example.)

I propose to address this by detecting such a situation and warning the user on 
the log.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)