[GitHub] [kafka-site] anlance closed pull request #492: Fixed typos in introduction documentation
anlance closed pull request #492: Fixed typos in introduction documentation URL: https://github.com/apache/kafka-site/pull/492 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [DISCUSS] KIP-793: Sink Connectors: Support topic-mutating SMTs for async connectors (preCommit users)
Hi Chris, > I was actually envisioning something like `void > open(Collection originalPartitions, > Collection transformedPartitions)` Ah okay, this does make a lot more sense. Sorry, I think I misunderstood you earlier. I do agree with you that this seems better than splitting it off into two new sets of open / close methods from a complexity standpoint. > Plus, if a connector is intentionally designed to use > pre-transformation topic partitions in its open/close > methods, wouldn't we just be trading one form of the > problem for another by making this switch? On thinking about this a bit more, I'm not so convinced that we need to expose the pre-transform / original topic partitions in the new open / close methods. The purpose of the open / close methods is to allow sink tasks to allocate and deallocate resources for each topic partition assigned to the task and the purpose of topic-mutating SMTs is to essentially modify the source topic name from the point of view of the sink connector. Why would a sink connector ever need to or want to allocate resources for pre-transform topic partitions? Is the argument here that since we'll be exposing both the pre-transform and post-transform topic partitions per record, we should also expose the same info via open / close and allow sink connector implementations to disregard topic-mutating SMTs completely if they wanted to? Either way, I've gone ahead and updated the KIP to reflect all of our previous discussion here since it had become quite outdated. I've also updated the KIP title from "Sink Connectors: Support topic-mutating SMTs for async connectors (preCommit users)" to "Allow sink connectors to be used with topic-mutating SMTs" since the improvements to the open / close mechanism doesn't pertain only to asynchronous sink connectors. The new KIP URL is: https://cwiki.apache.org/confluence/display/KAFKA/KIP-793%3A+Allow+sink+connectors+to+be+used+with+topic-mutating+SMTs Thanks, Yash On Tue, Feb 14, 2023 at 11:39 PM Chris Egerton wrote: > Hi Yash, > > I was actually envisioning something like `void > open(Collection > originalPartitions, Collection transformedPartitions)`, > since we already convert and transform each batch of records that we poll > from the sink task's consumer en masse, meaning we could discover several > new transformed partitions in between consecutive calls to SinkTask::put. > > It's also worth noting that we'll probably want to deprecate the existing > open/close methods, at which point keeping one non-deprecated variant of > each seems more appealing and less complex than keeping two. > > Honestly though, I think we're both on the same page enough that I wouldn't > object to either approach. We've probably reached the saturation point for > ROI here and as long as we provide developers a way to get the information > they need from the runtime and take care to add Javadocs and update our > docs page (possibly including the connector development quickstart), it > should be fine. > > At this point, it might be worth updating the KIP based on recent > discussion so that others can see the latest proposal, and we can both take > a look and make sure everything looks good enough before opening a vote > thread. > > Finally, I think you make a convincing case for a time-based eviction > policy. I wasn't thinking about the fairly common SMT pattern of deriving a > topic name from, e.g., a record field or header. > > Cheers, > > Chris > > On Tue, Feb 14, 2023 at 11:42 AM Yash Mayya wrote: > > > Hi Chris, > > > > > Plus, if a connector is intentionally designed to > > > use pre-transformation topic partitions in its > > > open/close methods, wouldn't we just be trading > > > one form of the problem for another by making this > > > switch? > > > > Thanks, this makes sense, and given that the KIP already proposes a way > for > > sink connector implementations to distinguish between pre-transform and > > post-transform topics per record, I think I'm convinced that going with > new > > `open()` / `close()` methods is the right approach. However, I still feel > > like having overloaded methods will make it a lot less unintuitive given > > that the two sets of methods would be different in terms of when they're > > called and what arguments they are passed (also I'm presuming that the > > overloaded methods you're prescribing will only have a single > > `TopicPartition` rather than a `Collection` as their > > parameters). I guess my concern is largely around the fact that it won't > be > > possible to distinguish between the overloaded methods' use cases just > from > > the method signatures. I agree that naming is going to be difficult here, > > but I think that having two sets of `SinkTask::openXyz` / > > `SinkTask::closeXyz` methods will be less complicated to understand from > a > > connector developer perspective (as compared to overloaded methods with > > only differing documentation). Of your suggested options, I think > >
Re: [DISCUSS] KIP-641 An new java interface to replace 'kafka.common.MessageReader'
On 2023/02/18 08:44:05 Tom Bentley wrote: > Hi Chia-Ping, > > To be honest the stateful version, setting an input stream once using the > `readFrom(InputStream)` method and then repeatedly asking for the next > record using a parameterless `readRecord()`, seems a bit more natural to me > than `readRecord(InputStream inputStream)` being called repeatedly with (I > assume) the same input stream. I think the contract is simpler to describe > and understand. I prefer readRecord() also. It is a trade-off between having `Configurable` interface and having a parameterless readRecord(). If the `Configurable` is not required, I'd like to revert to readRecord(). WDYT? > > It's worth thinking about how implementers might have to read bytes from > the stream to discover the end of one record and the start of the next. > Unless we've guaranteed that the input stream supports mark and reset then > they have to buffer the initial bytes of the next record that they've just > read from the stream so that they can use them when called next time. So I > think RecordReaders are (in general) inherently stateful and therefore it > seems harmless for them to also have the input stream itself as some of > that state. you are right. As the input stream is keyboard input, it would be hard to expect the number of bytes for one record.
Re: [VOTE] KIP-641 An new java interface to replace 'kafka.common.MessageReader'
Dear all, There is a change after I start this vote. The package is moved from "org.apache.kafka.common" to "org.apache.kafka.clients.tool". The purpose of the change is to enable tool-related interfaces to access code from "producer", "consumer", and "admin". Please verify the change before voting. thanks! -- chia-ping
Re: [DISCUSS] KIP-641 An new java interface to replace 'kafka.common.MessageReader'
Dear all, I noticed a issue after starting the vote :( The package "org.apache.kafka.common" disallows to import "org.apache.kafka.clients.producer.ProducerRecord". Hence, "org.apache.kafka.common.RecordReader" can't reference "org.apache.kafka.clients.producer.ProducerRecord" directly. It seems to me the tool-related interface should be able to access all public classes from "common", "producer", "consumer", and "admin". Accordingly, I'd like to move the package of "RecordReader" from "org.apache.kafka.common" to ""org.apache.kafka.clients.tool". Please take a look at above change. thanks, chia-ping
[jira] [Created] (KAFKA-14732) Use an exponential backoff retry mechanism while reconfiguring connector tasks
Yash Mayya created KAFKA-14732: -- Summary: Use an exponential backoff retry mechanism while reconfiguring connector tasks Key: KAFKA-14732 URL: https://issues.apache.org/jira/browse/KAFKA-14732 Project: Kafka Issue Type: Improvement Components: KafkaConnect Reporter: Yash Mayya Assignee: Yash Mayya Kafka Connect in distributed mode retries infinitely with a fixed retry backoff (250 ms) in case of errors arising during connector task reconfiguration. Tasks can be "reconfigured" during connector startup (to get the initial task configs from the connector), a connector resume or if a connector explicitly requests it via its context. Task reconfiguration essentially entails requesting a connector instance for its task configs and writing them to the Connect cluster's config storage (in case a change in task configs is detected). A fixed retry backoff of 250 ms leads to very aggressive retries - consider a Debezium connector which attempts to initiate a database connection in its [taskConfigs method|https://github.com/debezium/debezium/blob/bf347da71ad9b0819998a3bc9754b3cc96cc1563/debezium-connector-sqlserver/src/main/java/io/debezium/connector/sqlserver/SqlServerConnector.java#L63]. If the connection fails due to something like an invalid login, the Connect worker will essentially spam connection attempts frequently and indefinitely (until the connector config / database side configs are fixed). An exponential backoff retry mechanism seems more well suited for the [DistributedHerder::reconfigureConnectorTasksWithRetry|https://github.com/apache/kafka/blob/a54a34a11c1c867ff62a7234334cad5139547fd7/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/DistributedHerder.java#L1873-L1898] method. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [VOTE] KIP-641 An new java interface to replace 'kafka.common.MessageReader'
+ 1 (non binding) Thanks On Sat, Feb 18, 2023 at 9:36 AM Chia-Ping Tsai wrote: > > Hi, > > I'd like to start the vote on KIP-614: An new java interface to replace > 'kafka.common.MessageReader' > > KIP-614: > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866569 > > thread: > https://lists.apache.org/thread.html/r6db6708f64345bb8fe0d573e05014fb790e69d501f21f855ca65619a%40%3Cdev.kafka.apache.org%3E > > Cheers, > Chia-Ping
Re: [DISCUSS] KIP-641 An new java interface to replace 'kafka.common.MessageReader'
Hi Chia-Ping, To be honest the stateful version, setting an input stream once using the `readFrom(InputStream)` method and then repeatedly asking for the next record using a parameterless `readRecord()`, seems a bit more natural to me than `readRecord(InputStream inputStream)` being called repeatedly with (I assume) the same input stream. I think the contract is simpler to describe and understand. It's worth thinking about how implementers might have to read bytes from the stream to discover the end of one record and the start of the next. Unless we've guaranteed that the input stream supports mark and reset then they have to buffer the initial bytes of the next record that they've just read from the stream so that they can use them when called next time. So I think RecordReaders are (in general) inherently stateful and therefore it seems harmless for them to also have the input stream itself as some of that state. Cheers, Tom On Sat, 18 Feb 2023 at 08:25, Chia-Ping Tsai wrote: > > > On 2023/02/17 06:47:18 Luke Chen wrote: > > Hi Chia-Ping, > > > > Thanks for the KIP! > > > > Overall LGTM, just one minor comment: > > Could we log warning messages to users when using deprecated > MessageReader? > > Sure. I will address it when implementing the KIP. > >
[VOTE] KIP-641 An new java interface to replace 'kafka.common.MessageReader'
Hi, I'd like to start the vote on KIP-614: An new java interface to replace 'kafka.common.MessageReader' KIP-614: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866569 thread: https://lists.apache.org/thread.html/r6db6708f64345bb8fe0d573e05014fb790e69d501f21f855ca65619a%40%3Cdev.kafka.apache.org%3E Cheers, Chia-Ping
Re: [DISCUSS] KIP-641 An new java interface to replace 'kafka.common.MessageReader'
On 2023/02/17 06:47:18 Luke Chen wrote: > Hi Chia-Ping, > > Thanks for the KIP! > > Overall LGTM, just one minor comment: > Could we log warning messages to users when using deprecated MessageReader? Sure. I will address it when implementing the KIP.