[jira] [Resolved] (KAFKA-16685) RLMTask warning logs do not include parent exception trace
[ https://issues.apache.org/jira/browse/KAFKA-16685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-16685. -- Resolution: Fixed Merged https://github.com/apache/kafka/pull/15880 > RLMTask warning logs do not include parent exception trace > -- > > Key: KAFKA-16685 > URL: https://issues.apache.org/jira/browse/KAFKA-16685 > Project: Kafka > Issue Type: Improvement > Components: Tiered-Storage > Reporter: Jorge Esteban Quilcate Otoya > Assignee: Jorge Esteban Quilcate Otoya > Priority: Major > > When RLMTask warning exceptions are logged, only the exception message is included and the stack trace is lost. > See > [https://github.com/apache/kafka/blob/70b8c5ae8e9336dbf4792e2d3cf100bbd70d6480/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L821-L831] > This makes it difficult to troubleshoot issues. -- This message was sent by Atlassian Jira (v8.20.10#820010)
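The fix comes down to passing the exception object to the logger instead of only its message, so the handler can render the full stack trace. A minimal sketch with java.util.logging (the actual Kafka code uses SLF4J; the class name and messages here are illustrative):

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.logging.Level;
import java.util.logging.Logger;

public class RlmTaskLogging {
    private static final Logger LOG = Logger.getLogger(RlmTaskLogging.class.getName());

    // Render a throwable's full stack trace to a string, as a log handler would.
    static String fullTrace(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw));
        return sw.toString();
    }

    public static void main(String[] args) {
        Exception cause = new IllegalStateException("copying segment failed");
        // Lossy: only the message survives; the stack trace is gone.
        LOG.warning("Error in RLMTask: " + cause.getMessage());
        // Better: pass the throwable itself so handlers log the full trace.
        LOG.log(Level.WARNING, "Error in RLMTask", cause);
        System.out.println(fullTrace(cause).contains("IllegalStateException")); // prints "true"
    }
}
```

The same principle applies with SLF4J: `log.warn("Error in RLMTask", e)` logs the trace, while `log.warn("Error in RLMTask: " + e.getMessage())` discards it.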
[jira] [Created] (KAFKA-16691) Support for nested structures: TimestampConverter
Jorge Esteban Quilcate Otoya created KAFKA-16691: Summary: Support for nested structures: TimestampConverter Key: KAFKA-16691 URL: https://issues.apache.org/jira/browse/KAFKA-16691 Project: Kafka Issue Type: Sub-task Reporter: Jorge Esteban Quilcate Otoya -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16690) Support for nested structures: HeaderFrom
Jorge Esteban Quilcate Otoya created KAFKA-16690: Summary: Support for nested structures: HeaderFrom Key: KAFKA-16690 URL: https://issues.apache.org/jira/browse/KAFKA-16690 Project: Kafka Issue Type: Sub-task Reporter: Jorge Esteban Quilcate Otoya -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-14226) Introduce support for nested structures
[ https://issues.apache.org/jira/browse/KAFKA-14226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-14226. -- Resolution: Fixed Merged: https://github.com/apache/kafka/pull/15379 > Introduce support for nested structures > --- > > Key: KAFKA-14226 > URL: https://issues.apache.org/jira/browse/KAFKA-14226 > Project: Kafka > Issue Type: Sub-task >Reporter: Jorge Esteban Quilcate Otoya >Assignee: Jorge Esteban Quilcate Otoya >Priority: Major > > Abstraction for FieldPath and initial SMTs: > * ExtractField > * HeaderFrom > * TimestampConverter -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16685) RSM Task warn logs do not include parent exception trace
Jorge Esteban Quilcate Otoya created KAFKA-16685: Summary: RSM Task warn logs do not include parent exception trace Key: KAFKA-16685 URL: https://issues.apache.org/jira/browse/KAFKA-16685 Project: Kafka Issue Type: Improvement Components: Tiered-Storage Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya When RSMTask exceptions are logged, only the exception message is included and the stack trace is lost. See [https://github.com/apache/kafka/blob/70b8c5ae8e9336dbf4792e2d3cf100bbd70d6480/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L821-L831] This makes it difficult to troubleshoot issues. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16264) Expose `producer.id.expiration.check.interval.ms` as dynamic broker configuration
Jorge Esteban Quilcate Otoya created KAFKA-16264: Summary: Expose `producer.id.expiration.check.interval.ms` as dynamic broker configuration Key: KAFKA-16264 URL: https://issues.apache.org/jira/browse/KAFKA-16264 Project: Kafka Issue Type: Improvement Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya Scenarios where too many producer ids lead to issues (e.g. high CPU utilization, see KAFKA-16229) put operators in need of flushing producer ids more promptly than usual. Currently, only the expiration timeout `producer.id.expiration.ms` is exposed as a dynamic config. This is helpful (e.g. by reducing the timeout, fewer producer ids are eventually kept in memory), but not enough if the check interval is not short enough to flush producer ids before they become an issue. Only by tuning both can the issue be worked around. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16229) Slow expiration of Producer IDs leading to high CPU usage
Jorge Esteban Quilcate Otoya created KAFKA-16229: Summary: Slow expiration of Producer IDs leading to high CPU usage Key: KAFKA-16229 URL: https://issues.apache.org/jira/browse/KAFKA-16229 Project: Kafka Issue Type: Bug Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya Expiration of ProducerIds is implemented with a slow removal of map keys:
```
producers.keySet().removeAll(keys);
```
This unnecessarily scans all producer ids for each expired key to be removed, leading to quadratic time in the worst case when most/all keys need to be removed:
```
Benchmark                                        (numProducerIds)  Mode  Cnt           Score            Error  Units
ProducerStateManagerBench.testDeleteExpiringIds               100  avgt    3        9164.043 ±      10647.877  ns/op
ProducerStateManagerBench.testDeleteExpiringIds              1000  avgt    3      341561.093 ±      20283.211  ns/op
ProducerStateManagerBench.testDeleteExpiringIds             10000  avgt    3    44957983.550 ±    9389011.290  ns/op
ProducerStateManagerBench.testDeleteExpiringIds            100000  avgt    3  5683374164.167 ± 1446242131.466  ns/op
```
A simple fix is to use map#remove(key) instead, leading to more linear growth:
```
Benchmark                                        (numProducerIds)  Mode  Cnt        Score         Error  Units
ProducerStateManagerBench.testDeleteExpiringIds               100  avgt    3     5779.056 ±     651.389  ns/op
ProducerStateManagerBench.testDeleteExpiringIds              1000  avgt    3    61430.530 ±   21875.644  ns/op
ProducerStateManagerBench.testDeleteExpiringIds             10000  avgt    3   643887.031 ±  600475.302  ns/op
ProducerStateManagerBench.testDeleteExpiringIds            100000  avgt    3  7741689.539 ± 3218317.079  ns/op
```
-- This message was sent by Atlassian Jira (v8.20.10#820010)
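The two removal strategies can be reproduced with plain JDK collections; this is an illustrative sketch, not the actual ProducerStateManager code:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ProducerIdExpiration {
    // Slow path: AbstractSet.removeAll(list) falls back to calling
    // list.contains(key) for every key in the set once the list is at least
    // as large as the key set — a linear scan per key, i.e. O(n * m). That is
    // exactly the "most/all keys expired" worst case from the ticket.
    static void bulkRemove(Map<Long, Object> producers, List<Long> expiredIds) {
        producers.keySet().removeAll(expiredIds);
    }

    // Fixed path: one O(1) hash lookup per expired id, i.e. O(m) overall.
    static void perKeyRemove(Map<Long, Object> producers, List<Long> expiredIds) {
        for (Long id : expiredIds) {
            producers.remove(id);
        }
    }

    public static void main(String[] args) {
        Map<Long, Object> producers = new HashMap<>();
        for (long id = 0; id < 1_000; id++) producers.put(id, new Object());
        List<Long> expired = new ArrayList<>();
        for (long id = 0; id < 900; id++) expired.add(id);
        perKeyRemove(producers, expired);
        System.out.println(producers.size()); // prints 100
    }
}
```

Both methods leave the map in the same state; only the asymptotic cost differs, which is what the benchmark above measures.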
[jira] [Created] (KAFKA-15806) Signal next segment when remote fetching
Jorge Esteban Quilcate Otoya created KAFKA-15806: Summary: Signal next segment when remote fetching Key: KAFKA-15806 URL: https://issues.apache.org/jira/browse/KAFKA-15806 Project: Kafka Issue Type: Improvement Components: Tiered-Storage Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya Improve remote fetching performance when fetching across segments by signaling the next segment, allowing Remote Storage Manager implementations to optimize their pre-fetching. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-15805) Fetch Remote Indexes at once
Jorge Esteban Quilcate Otoya created KAFKA-15805: Summary: Fetch Remote Indexes at once Key: KAFKA-15805 URL: https://issues.apache.org/jira/browse/KAFKA-15805 Project: Kafka Issue Type: Improvement Components: Tiered-Storage Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya Reduce Tiered Storage latency when fetching indexes by allowing many indexes to be fetched at once. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-15314) No Quota applied if client-id is null or empty
Jorge Esteban Quilcate Otoya created KAFKA-15314: Summary: No Quota applied if client-id is null or empty Key: KAFKA-15314 URL: https://issues.apache.org/jira/browse/KAFKA-15314 Project: Kafka Issue Type: Bug Components: core Reporter: Jorge Esteban Quilcate Otoya When Quotas were proposed, KIP-13[1] stated: > In addition, there will be a quota reserved for clients not presenting a client id (for e.g. simple consumers not setting the id). This will default to an empty client id ("") and all such clients will share the quota for that empty id (which should be the default quota). However, it seems that when client-id is null or empty and a default quota for client-id is present, no quota is applied. Even though Java clients set a default value [2][3], the protocol accepts a null client-id[4], and other client implementations could send a null value to bypass a quota. Related code[5][6] shows the metric pair for quotas being prepared with a (potentially null) client-id, and the quota being set to null when both client-id and (sanitized) user are null. Adding some tests to showcase this: [https://github.com/apache/kafka/pull/14165] Is it expected for client-id=null to bypass quotas? If it is, a KIP or documentation should clarify this; otherwise we should fix this behavior, e.g. we could "sanitize" client-id, similar to the user name, to an empty string when the input is null or empty. As a side note, similar behavior could happen with the user. Even though the value defaults to ANONYMOUS, if a client implementation sends an empty value, it may also bypass the default quota – though I need to test this further if this is considered a bug. 
[1]: [https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas] [2]: [https://github.com/apache/kafka/blob/e98508747acc8972ac5ceb921e0fd3a7d7bd5e9c/clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java#L498-L508] [3]: [https://github.com/apache/kafka/blob/ab71c56973518bac8e1868eccdc40b17d7da35c1/clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java#L616-L628] [4]: [https://github.com/apache/kafka/blob/9f26906fcc2fd095b7d27c504e342b9a8d619b4b/clients/src/main/resources/common/message/RequestHeader.json#L34-L40] [5]: [https://github.com/apache/kafka/blob/322ac86ba282f35373382854cc9e790e4b7fb5fc/core/src/main/scala/kafka/server/ClientQuotaManager.scala#L588-L628] [6]: [https://github.com/apache/kafka/blob/322ac86ba282f35373382854cc9e790e4b7fb5fc/core/src/main/scala/kafka/server/ClientQuotaManager.scala#L651-L652] -- This message was sent by Atlassian Jira (v8.20.10#820010)
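The suggested amendment, folding null/empty client ids into the default bucket, can be sketched with a hypothetical helper (this is not the actual ClientQuotaManager code):

```java
public class ClientIdSanitizer {
    // Hypothetical fix sketched in the ticket: treat null and empty client-id
    // as the "" bucket, so the default client-id quota applies instead of
    // being bypassed.
    static String sanitize(String clientId) {
        return (clientId == null || clientId.isEmpty()) ? "" : clientId;
    }

    public static void main(String[] args) {
        // Null and empty client ids land in the same default-quota bucket,
        // matching the KIP-13 wording quoted above.
        System.out.println(sanitize(null).equals(sanitize(""))); // prints "true"
        // Real client ids pass through unchanged.
        System.out.println(sanitize("my-app"));                  // prints "my-app"
    }
}
```

With this in place, a quota lookup keyed on the sanitized id can no longer be bypassed by omitting the client id in the request header.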
[jira] [Created] (KAFKA-15231) Add ability to pause/resume Remote Log Manager tasks
Jorge Esteban Quilcate Otoya created KAFKA-15231: Summary: Add ability to pause/resume Remote Log Manager tasks Key: KAFKA-15231 URL: https://issues.apache.org/jira/browse/KAFKA-15231 Project: Kafka Issue Type: Improvement Components: core Reporter: Jorge Esteban Quilcate Otoya Once Tiered Storage is enabled, there may be situations where uploading tasks to a remote tier needs to be paused, e.g. remote storage maintenance, troubleshooting, etc. An RSM implementation may not be able to do this by itself without throwing exceptions, polluting the logs, etc. Could we consider adding this ability to the Tiered Storage framework? Remote Log Manager seems like a good candidate place for this, though I'm wondering how to expose it. I would be interested to hear whether this sounds like a good idea, and what options we have to include it. We have been considering extending RLM tasks with a pause flag, and having an MBean to switch them on demand. Another option may be to extend the Kafka protocol to expose this – but that seems much more involved. -- This message was sent by Atlassian Jira (v8.20.10#820010)
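The pause-flag idea can be sketched with a plain atomic flag checked by the task loop; the class name, methods, and MBean wiring here are hypothetical:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class PausableUploadTask {
    private final AtomicBoolean paused = new AtomicBoolean(false);

    // These methods would be exposed through an MBean (hypothetical
    // interface) so operators can toggle uploads on demand.
    public void pause() { paused.set(true); }
    public void resume() { paused.set(false); }
    public boolean isPaused() { return paused.get(); }

    // One iteration of the task loop: skip the upload while paused instead
    // of attempting it, failing, and polluting the logs with exceptions.
    public boolean runOnce() {
        if (paused.get()) {
            return false; // nothing uploaded this round
        }
        // ... copy the next eligible segment to remote storage ...
        return true;
    }

    public static void main(String[] args) {
        PausableUploadTask task = new PausableUploadTask();
        task.pause();
        System.out.println(task.runOnce()); // prints "false"
        task.resume();
        System.out.println(task.runOnce()); // prints "true"
    }
}
```

An AtomicBoolean keeps the toggle safe across the MBean thread and the task thread without further locking.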
[jira] [Created] (KAFKA-15181) Race condition on partition assigned to TopicBasedRemoteLogMetadataManager
Jorge Esteban Quilcate Otoya created KAFKA-15181: Summary: Race condition on partition assigned to TopicBasedRemoteLogMetadataManager Key: KAFKA-15181 URL: https://issues.apache.org/jira/browse/KAFKA-15181 Project: Kafka Issue Type: Bug Components: core Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya TopicBasedRemoteLogMetadataManager (TBRLMM) uses a cache that is prepared whenever partitions are assigned. When partitions are assigned to the TBRLMM instance, a consumer is started to keep the cache up to date. If the cache hasn't finished building, TBRLMM fails to return remote metadata for partitions that are stored on the backing topic. TBRLMM may not recover from this failing state. A proposal to fix this issue would be to wait, after a partition is assigned, for the consumer to catch up. A similar logic is used at the moment when TBRLMM writes to the topic, using the send callback to wait for the consumer to catch up. This logic can be reused whenever a partition is assigned, so that when TBRLMM is marked as initialized, the cache is ready to serve requests. Reference: https://github.com/aiven/kafka/issues/33 -- This message was sent by Atlassian Jira (v8.20.10#820010)
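The proposed wait-for-catch-up step can be sketched as a generic polling helper (names and the polling approach are hypothetical, not the actual TBRLMM code):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class CatchUpWaiter {
    // Hypothetical sketch of the proposed fix: after a partition is assigned,
    // block until the metadata consumer's position reaches the end offset
    // observed at assignment time, before marking TBRLMM initialized.
    static void awaitCatchUp(LongSupplier consumedOffset, long targetOffset, long timeoutMs)
            throws InterruptedException, TimeoutException {
        long deadlineMs = System.currentTimeMillis() + timeoutMs;
        while (consumedOffset.getAsLong() < targetOffset) {
            if (System.currentTimeMillis() >= deadlineMs) {
                throw new TimeoutException("consumer did not reach offset " + targetOffset);
            }
            TimeUnit.MILLISECONDS.sleep(10);
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicLong position = new AtomicLong(0);
        // Simulate the metadata consumer advancing in the background.
        Thread consumer = new Thread(() -> {
            for (int i = 0; i < 100; i++) position.incrementAndGet();
        });
        consumer.start();
        awaitCatchUp(position::get, 100, 5_000);
        consumer.join();
        System.out.println(position.get()); // prints "100"
    }
}
```

Completing initialization only after this wait closes the window in which reads hit a half-built cache.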
[jira] [Created] (KAFKA-15147) Measure pending and outstanding Remote Segment operations
Jorge Esteban Quilcate Otoya created KAFKA-15147: Summary: Measure pending and outstanding Remote Segment operations Key: KAFKA-15147 URL: https://issues.apache.org/jira/browse/KAFKA-15147 Project: Kafka Issue Type: Improvement Components: core Reporter: Jorge Esteban Quilcate Otoya Remote Log Segment operations (copy/delete) are executed by the Remote Storage Manager, and recorded by the Remote Log Metadata Manager (e.g. the default TopicBasedRLMM writes remote log segment state changes to an internal Kafka topic). As executions run, fail, and retry, it will be important to know how many operations are pending and outstanding over time to alert operators. Pending operations alone are not enough to alert on, as values can oscillate close to zero. An additional condition needs to apply (running time > threshold) to consider an operation outstanding. Proposal: RemoteLogManager could be extended with 2 concurrent maps (pendingSegmentCopies, pendingSegmentDeletes) of type `Map[Uuid, Long]` to record the time when each segment operation started, and based on this expose 2 metrics per operation: * pendingSegmentCopies: gauge of the pendingSegmentCopies map size * outstandingSegmentCopies: loop over pending ops, and if now - startedTime > timeout, then outstanding++ (maybe on debug level?) Is this a valuable metric to add to Tiered Storage, or is it better solved in a custom RLMM implementation? Also, does it require a KIP? Thanks! -- This message was sent by Atlassian Jira (v8.20.10#820010)
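A minimal sketch of the proposed bookkeeping, using the map and metric names from the ticket (this is illustrative, not actual RemoteLogManager code):

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class SegmentCopyMetrics {
    // segmentId -> wall-clock time (ms) when the copy operation started
    private final Map<UUID, Long> pendingSegmentCopies = new ConcurrentHashMap<>();
    private final long outstandingThresholdMs;

    SegmentCopyMetrics(long outstandingThresholdMs) {
        this.outstandingThresholdMs = outstandingThresholdMs;
    }

    void copyStarted(UUID segmentId, long nowMs) {
        pendingSegmentCopies.put(segmentId, nowMs);
    }

    void copyCompleted(UUID segmentId) {
        pendingSegmentCopies.remove(segmentId);
    }

    // pendingSegmentCopies gauge: how many copies are in flight right now.
    int pending() {
        return pendingSegmentCopies.size();
    }

    // outstandingSegmentCopies gauge: in-flight copies running longer than
    // the threshold (now - startedTime > timeout).
    long outstanding(long nowMs) {
        return pendingSegmentCopies.values().stream()
                .filter(startedMs -> nowMs - startedMs > outstandingThresholdMs)
                .count();
    }

    public static void main(String[] args) {
        SegmentCopyMetrics metrics = new SegmentCopyMetrics(1_000);
        UUID segment = UUID.randomUUID();
        metrics.copyStarted(segment, 0);
        System.out.println(metrics.pending());          // prints "1"
        System.out.println(metrics.outstanding(500));   // prints "0" (under threshold)
        System.out.println(metrics.outstanding(2_000)); // prints "1" (over threshold)
    }
}
```

A symmetric map would track deletes; both gauges are cheap enough to evaluate on each metrics scrape.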
[jira] [Created] (KAFKA-15142) Add Client Metadata to RemoteStorageFetchInfo
Jorge Esteban Quilcate Otoya created KAFKA-15142: Summary: Add Client Metadata to RemoteStorageFetchInfo Key: KAFKA-15142 URL: https://issues.apache.org/jira/browse/KAFKA-15142 Project: Kafka Issue Type: Improvement Components: core Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya Once Tiered Storage is deployed, it will be important to understand how remote data is accessed and what consumption patterns emerge on each deployment. To do this, tiered storage logs/metrics could provide more context about which client is fetching which partition/offset range and when. At the moment, Client metadata is not propagated to the tiered-storage framework. To fix this, {{RemoteStorageFetchInfo}} can be extended with the {{Optional[ClientMetadata]}} available on {{FetchParams}}, and have these bits of data available to improve the logging/metrics when fetching. -- This message was sent by Atlassian Jira (v8.20.10#820010)
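The proposed extension can be sketched as a value object carrying optional client metadata; the field and type names mirror the ticket, but the shapes here are illustrative, not the broker's actual classes:

```java
import java.util.Optional;

public class FetchInfoSketch {
    // Stand-in for the broker's ClientMetadata (client id, address, etc.).
    record ClientMetadata(String clientId, String clientAddress) {}

    // Sketch of RemoteStorageFetchInfo extended with optional client
    // metadata; empty when the fetch does not originate from a client
    // (e.g. follower replication).
    record RemoteFetchInfo(String topic, int partition, long fetchOffset,
                           Optional<ClientMetadata> clientMetadata) {
        // Example of richer log/metric context once the metadata is present.
        String describe() {
            return topic + "-" + partition + "@" + fetchOffset
                    + clientMetadata.map(m -> " by " + m.clientId()).orElse("");
        }
    }

    public static void main(String[] args) {
        RemoteFetchInfo withClient = new RemoteFetchInfo(
                "orders", 0, 42L, Optional.of(new ClientMetadata("consumer-1", "10.0.0.5")));
        RemoteFetchInfo withoutClient = new RemoteFetchInfo(
                "orders", 0, 42L, Optional.empty());
        System.out.println(withClient.describe());    // prints "orders-0@42 by consumer-1"
        System.out.println(withoutClient.describe()); // prints "orders-0@42"
    }
}
```

Using Optional keeps existing call sites (which have no client context) working unchanged.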
[jira] [Resolved] (KAFKA-15131) Improve RemoteStorageManager exception handling documentation
[ https://issues.apache.org/jira/browse/KAFKA-15131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-15131. -- Resolution: Fixed > Improve RemoteStorageManager exception handling documentation > - > > Key: KAFKA-15131 > URL: https://issues.apache.org/jira/browse/KAFKA-15131 > Project: Kafka > Issue Type: Improvement > Components: core > Reporter: Jorge Esteban Quilcate Otoya > Assignee: Jorge Esteban Quilcate Otoya > Priority: Major > Labels: tiered-storage > > As discussed here[1], the RemoteStorageManager javadocs require clarification > regarding error handling: > * Remove ambiguity in the `RemoteResourceNotFoundException` description > * Describe when `RemoteResourceNotFoundException` can/should be thrown > * Describe expectations for idempotent operations when copying/deleting > > [1] > https://issues.apache.org/jira/browse/KAFKA-7739?focusedCommentId=17720936=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17720936 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-15135) RLM listener configurations passed but ignored by RLMM
Jorge Esteban Quilcate Otoya created KAFKA-15135: Summary: RLM listener configurations passed but ignored by RLMM Key: KAFKA-15135 URL: https://issues.apache.org/jira/browse/KAFKA-15135 Project: Kafka Issue Type: Bug Components: core Reporter: Jorge Esteban Quilcate Otoya As described here [1], properties captured from the listener are passed to but ignored by TopicBasedRLMM. [1] https://github.com/apache/kafka/pull/13828#issuecomment-1611155345 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-15131) Improve RemoteStorageManager exception handling
Jorge Esteban Quilcate Otoya created KAFKA-15131: Summary: Improve RemoteStorageManager exception handling Key: KAFKA-15131 URL: https://issues.apache.org/jira/browse/KAFKA-15131 Project: Kafka Issue Type: Improvement Components: core Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya As discussed here[1], the RemoteStorageManager javadocs require clarification regarding error handling: * Remove ambiguity in the `RemoteResourceNotFoundException` description * Describe when `RemoteResourceNotFoundException` can/should be thrown * Describe expectations for idempotent operations when copying/deleting [1] https://issues.apache.org/jira/browse/KAFKA-7739?focusedCommentId=17720936=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17720936 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-15051) docs: add missing connector plugin endpoint to documentation
Jorge Esteban Quilcate Otoya created KAFKA-15051: Summary: docs: add missing connector plugin endpoint to documentation Key: KAFKA-15051 URL: https://issues.apache.org/jira/browse/KAFKA-15051 Project: Kafka Issue Type: Task Components: docs, documentation Reporter: Jorge Esteban Quilcate Otoya GET /plugin/config endpoint added in [KIP-769|https://cwiki.apache.org/confluence/display/KAFKA/KIP-769%3A+Connect+APIs+to+list+all+connector+plugins+and+retrieve+their+configuration+definitions] is not included in the connect documentation page: https://kafka.apache.org/documentation/#connect_rest -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-15014) KIP-935: Extend AlterConfigPolicy with existing configurations
Jorge Esteban Quilcate Otoya created KAFKA-15014: Summary: KIP-935: Extend AlterConfigPolicy with existing configurations Key: KAFKA-15014 URL: https://issues.apache.org/jira/browse/KAFKA-15014 Project: Kafka Issue Type: Improvement Components: core Reporter: Jorge Esteban Quilcate Otoya KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-935%3A+Extend+AlterConfigPolicy+with+existing+configurations -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-15013) KIP-934: Add DeleteTopicPolicy
Jorge Esteban Quilcate Otoya created KAFKA-15013: Summary: KIP-934: Add DeleteTopicPolicy Key: KAFKA-15013 URL: https://issues.apache.org/jira/browse/KAFKA-15013 Project: Kafka Issue Type: New Feature Components: core Reporter: Jorge Esteban Quilcate Otoya KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-934%3A+Add+DeleteTopicPolicy -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14843) Connector plugins config endpoint does not include Common configs
Jorge Esteban Quilcate Otoya created KAFKA-14843: Summary: Connector plugins config endpoint does not include Common configs Key: KAFKA-14843 URL: https://issues.apache.org/jira/browse/KAFKA-14843 Project: Kafka Issue Type: Bug Components: KafkaConnect Affects Versions: 3.2.0 Reporter: Jorge Esteban Quilcate Otoya The connector plugins GET config endpoint introduced in [KIP-769|https://cwiki.apache.org/confluence/display/KAFKA/KIP-769%3A+Connect+APIs+to+list+all+connector+plugins+and+retrieve+their+configuration+definitions] allows getting a plugin's configuration from the REST endpoint. This only includes the plugin's own configuration, but not the base configuration of the Sink/Source connector. For instance, when validating the configuration of a plugin, _all_ configs are returned:
```
curl -s $CONNECT_URL/connector-plugins/io.aiven.kafka.connect.http.HttpSinkConnector/config | jq -r '.[].name' | sort -u | wc -l
21
curl -s $CONNECT_URL/connector-plugins/io.aiven.kafka.connect.http.HttpSinkConnector/config/validate -XPUT -H 'Content-type: application/json' --data "{\"connector.class\": \"io.aiven.kafka.connect.http.HttpSinkConnector\", \"topics\": \"example-topic-name\"}" | jq -r '.configs[].definition.name' | sort -u | wc -l
39
```
and the missing configs are all from the base config:
```
diff validate.txt config.txt
6,14d5
< config.action.reload
< connector.class
< errors.deadletterqueue.context.headers.enable
< errors.deadletterqueue.topic.name
< errors.deadletterqueue.topic.replication.factor
< errors.log.enable
< errors.log.include.messages
< errors.retry.delay.max.ms
< errors.retry.timeout
16d6
< header.converter
24d13
< key.converter
26d14
< name
33d20
< predicates
35,39d21
< tasks.max
< topics
< topics.regex
< transforms
< value.converter
```
It would be great to get the base configs from the same endpoint as well, so we could rely on it instead of using the validate endpoint to get all configs. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14815) Move Kafka documentation to Markdown/Hugo
Jorge Esteban Quilcate Otoya created KAFKA-14815: Summary: Move Kafka documentation to Markdown/Hugo Key: KAFKA-14815 URL: https://issues.apache.org/jira/browse/KAFKA-14815 Project: Kafka Issue Type: Task Components: documentation Reporter: Jorge Esteban Quilcate Otoya Follow up from https://issues.apache.org/jira/browse/KAFKA-2967 Creating this task to discuss the adoption of Markdown and Hugo, and replace the HTML code. The reasons to move away from HTML are outlined in KAFKA-2967. Markdown and Asciidoc are both alternatives, but Markdown has been used, as I found some blockers when trying to migrate to Asciidoc: * Hugo requires Asciidoctor as an additional binary (Markdown is supported out of the box) * To use Asciidoctor, some security policies need to be opened: [https://stackoverflow.com/questions/71058236/hugo-with-asciidoctor] * I haven't managed to use shortcodes in Asciidoctor to inject versions, tables, etc. Given the lower friction of Markdown and the default support from Hugo, I would like to propose using them. Though I'm open to collaborating with someone who has experience with Asciidoc and Hugo to make the migration, as there is existing interest in using Asciidoc if possible. Draft repo: https://github.com/jeqo/ak-docs -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14441) Benchmark performance impact of metrics library
Jorge Esteban Quilcate Otoya created KAFKA-14441: Summary: Benchmark performance impact of metrics library Key: KAFKA-14441 URL: https://issues.apache.org/jira/browse/KAFKA-14441 Project: Kafka Issue Type: Task Components: metrics Reporter: Jorge Esteban Quilcate Otoya While discussing KIP-864 ([https://lists.apache.org/thread/k6rh2mr7pg94935fgpqw8b5fj308f2n7]) a concern was raised about how much impact sampling metric values has, particularly when adding metrics that record values per-record instead of per-batch. By adding benchmarks for sampling values, there will be more confidence in whether to design metrics to be exposed at a DEBUG or INFO level depending on their impact. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14409) Clean ProcessorParameters from casting
Jorge Esteban Quilcate Otoya created KAFKA-14409: Summary: Clean ProcessorParameters from casting Key: KAFKA-14409 URL: https://issues.apache.org/jira/browse/KAFKA-14409 Project: Kafka Issue Type: Improvement Reporter: Jorge Esteban Quilcate Otoya ProcessorParameters currently includes a set of methods to cast to specific supplier types: * kTableSourceSupplier * kTableProcessorSupplier * kTableKTableJoinMergerProcessorSupplier As most of these are used in specific classes, and the usage assumptions may vary (some expect nulls and others don't), this ticket proposes removing these methods and moving the casting into the classes using them. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14408) Consider enabling DEBUG log level on tests for Streams
Jorge Esteban Quilcate Otoya created KAFKA-14408: Summary: Consider enabling DEBUG log level on tests for Streams Key: KAFKA-14408 URL: https://issues.apache.org/jira/browse/KAFKA-14408 Project: Kafka Issue Type: Improvement Components: streams Reporter: Jorge Esteban Quilcate Otoya Ideally logging should not trigger any side effects, though we have found that it did for https://issues.apache.org/jira/browse/KAFKA-14325. This ticket asks whether we should consider enabling more verbose logging levels (currently INFO) during tests to validate these paths. There may be some additional cost in log file sizes and verbosity, so it's open to discussion whether this is worth it or not, and whether to expand this to other components as well. Additional discussion: https://github.com/apache/kafka/pull/12859#discussion_r1027007714 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14232) Support for nested structures: InsertField
Jorge Esteban Quilcate Otoya created KAFKA-14232: Summary: Support for nested structures: InsertField Key: KAFKA-14232 URL: https://issues.apache.org/jira/browse/KAFKA-14232 Project: Kafka Issue Type: Sub-task Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14231) Support for nested structures: ReplaceField
Jorge Esteban Quilcate Otoya created KAFKA-14231: Summary: Support for nested structures: ReplaceField Key: KAFKA-14231 URL: https://issues.apache.org/jira/browse/KAFKA-14231 Project: Kafka Issue Type: Sub-task Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14230) Support for nested structures: Cast
Jorge Esteban Quilcate Otoya created KAFKA-14230: Summary: Support for nested structures: Cast Key: KAFKA-14230 URL: https://issues.apache.org/jira/browse/KAFKA-14230 Project: Kafka Issue Type: Sub-task Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14228) Support for nested structures: ValueToKey
Jorge Esteban Quilcate Otoya created KAFKA-14228: Summary: Support for nested structures: ValueToKey Key: KAFKA-14228 URL: https://issues.apache.org/jira/browse/KAFKA-14228 Project: Kafka Issue Type: Sub-task Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14229) Support for nested structures: HoistValue
Jorge Esteban Quilcate Otoya created KAFKA-14229: Summary: Support for nested structures: HoistValue Key: KAFKA-14229 URL: https://issues.apache.org/jira/browse/KAFKA-14229 Project: Kafka Issue Type: Sub-task Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14227) Support for nested structures: MaskField
Jorge Esteban Quilcate Otoya created KAFKA-14227: Summary: Support for nested structures: MaskField Key: KAFKA-14227 URL: https://issues.apache.org/jira/browse/KAFKA-14227 Project: Kafka Issue Type: Sub-task Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14226) Introduce support for nested structures
Jorge Esteban Quilcate Otoya created KAFKA-14226: Summary: Introduce support for nested structures Key: KAFKA-14226 URL: https://issues.apache.org/jira/browse/KAFKA-14226 Project: Kafka Issue Type: Sub-task Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya Abstraction for FieldPath and initial SMTs: * ExtractField * HeaderFrom * TimestampConverter -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14191) Add end-to-end latency metrics to Connectors
Jorge Esteban Quilcate Otoya created KAFKA-14191: Summary: Add end-to-end latency metrics to Connectors Key: KAFKA-14191 URL: https://issues.apache.org/jira/browse/KAFKA-14191 Project: Kafka Issue Type: Improvement Components: metrics Reporter: Jorge Esteban Quilcate Otoya Request to add latency metrics to connectors to measure transformation latency and e2e latency on the sink side. KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-864%3A+Add+End-To-End+Latency+Metrics+to+Connectors -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-13822) Update Kafka Streams Adjust Thread Count tests to new Processor API
Jorge Esteban Quilcate Otoya created KAFKA-13822: Summary: Update Kafka Streams Adjust Thread Count tests to new Processor API Key: KAFKA-13822 URL: https://issues.apache.org/jira/browse/KAFKA-13822 Project: Kafka Issue Type: Task Reporter: Jorge Esteban Quilcate Otoya Once KIP-820 is merged and released, AdjustStreamThreadCountTest[1] will be using deprecated APIs: [https://github.com/apache/kafka/pull/11993#discussion_r847769618] [1] [https://github.com/apache/kafka/blob/0d518aaed158896ee9ee6949b8f38128d1d73634/streams/src/test/java/org/apache/kafka/streams/integration/AdjustStreamThreadCountTest.java] -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (KAFKA-13821) Update Kafka Streams demo to new Processor API
Jorge Esteban Quilcate Otoya created KAFKA-13821: Summary: Update Kafka Streams demo to new Processor API Key: KAFKA-13821 URL: https://issues.apache.org/jira/browse/KAFKA-13821 Project: Kafka Issue Type: Task Components: streams Affects Versions: 3.3.0 Reporter: Jorge Esteban Quilcate Otoya Once KIP-820 is merged and released, the demo will be using deprecated APIs: https://github.com/apache/kafka/pull/11993#discussion_r847744046 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (KAFKA-13742) Quota byte-rate/request metrics are loaded only when at least one quota is registered
[ https://issues.apache.org/jira/browse/KAFKA-13742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-13742. -- Resolution: Not A Problem Given that I now have a better understanding of how quota metrics work (https://issues.apache.org/jira/browse/KAFKA-13744), I will close this one. The only comment I would add is that there are no metrics at the moment to match clients/users with topics/partitions. This information is only captured on the client side as far as I know. I found quota metrics to be a good proxy to get this mapping, though it is still incomplete as the mapping to the topic is implicit at the moment on the broker side. It would be a nice addition to have this mapping available, but it should be discussed in another issue if there's interest in that. > Quota byte-rate/request metrics are loaded only when at least one quota is > registered > --- > > Key: KAFKA-13742 > URL: https://issues.apache.org/jira/browse/KAFKA-13742 > Project: Kafka > Issue Type: Bug > Components: core, metrics > Reporter: Jorge Esteban Quilcate Otoya > Priority: Major > Labels: quotas > > Quota metrics are loaded only when at least one quota is present: > * Metrics: > [https://github.com/apache/kafka/blob/0b9a8bac36f16b5397e9ec3a0441758e4b60a384/core/src/main/scala/kafka/server/ClientQuotaManager.scala#L552-L563] > * Reporting when quotas are enabled: > [https://github.com/apache/kafka/blob/0b9a8bac36f16b5397e9ec3a0441758e4b60a384/core/src/main/scala/kafka/server/ClientQuotaManager.scala#L249-L256] > * Quotas enabled: `def quotasEnabled: Boolean = quotaTypesEnabled != QuotaTypes.NoQuotas` > Even though throttling is specific to quotas, byte-rate/request per user/client-id is a valid metric for any deployment. > The current workaround is to add _any_ quota, as this will enable metrics for *all* client-ids/users. 
> If these metrics are captured for all clients regardless of the quotas > created, it would be a better experience to have a config to opt in to > these metrics, instead of creating meaningless quotas just to get them. > For threshold metrics, it makes sense to me to enable them only when quotas > are enabled. > -- This message was sent by Atlassian Jira (v8.20.1#820001)
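The workaround described in the issue can be applied with the stock `kafka-configs.sh` tool, e.g. by registering a default client-id quota set so high that it never actually throttles. A hedged sketch (the broker address and quota values are placeholders, and the command needs a running broker):

```shell
# Register a deliberately high default quota for every client-id, so that
# byte-rate/request sensors are created for all clients, per the workaround.
# 1073741824 bytes/s (1 GiB/s) is chosen so it is unlikely to ever throttle.
bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter \
  --entity-type clients --entity-default \
  --add-config 'producer_byte_rate=1073741824,consumer_byte_rate=1073741824'
```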
[jira] [Resolved] (KAFKA-13744) Quota metric tags are inconsistent
[ https://issues.apache.org/jira/browse/KAFKA-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-13744. -- Resolution: Not A Problem > Quota metric tags are inconsistent > -- > > Key: KAFKA-13744 > URL: https://issues.apache.org/jira/browse/KAFKA-13744 > Project: Kafka > Issue Type: Bug > Components: core, metrics >Reporter: Jorge Esteban Quilcate Otoya >Priority: Major > Labels: quotas > Attachments: image-2022-03-15-16-57-12-583.png > > > When enabling metrics for quotas, the metrics apply to _all_ clients (see > https://issues.apache.org/jira/browse/KAFKA-13742). > However, the tags are calculated depending on which quotas are registered, and > applied to all clients: > [https://github.com/apache/kafka/blob/0b9a8bac36f16b5397e9ec3a0441758e4b60a384/core/src/main/scala/kafka/server/ClientQuotaManager.scala#L649-L694] > This causes different metric tags depending on which quota is > registered first. > For instance, if a quota is registered with both userId and clientId, then metrics > are tagged with both; but if a quota is later registered with only > clientId, then all metrics are tagged only by clientId — even though the > user principal is available. > !image-2022-03-15-16-57-12-583.png|width=1034,height=415! > I managed to reproduce this behavior here: > * From 10:30 to 10:45, there was a quota with both client-id and user-id. > * It was removed by 10:45, so no metrics were exposed. > * Afterwards, a quota with client-id was created, and metrics were collected only > with client-id, even though the user was available. > I'd expect metrics to always contain both, if available — and to simplify the > logic here: > [https://github.com/apache/kafka/blob/0b9a8bac36f16b5397e9ec3a0441758e4b60a384/core/src/main/scala/kafka/server/ClientQuotaManager.scala#L649-L694]. -- This message was sent by Atlassian Jira (v8.20.1#820001)
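The fix suggested in the issue can be illustrated with a small self-contained sketch (Python for brevity rather than the broker's Scala; `metric_tags` is a hypothetical helper, not Kafka code): derive the tag map from whatever identity information is available on the request, instead of from whichever quota happened to be registered first.

```python
def metric_tags(user=None, client_id=None):
    """Hypothetical tag derivation: always emit both dimensions when known,
    using empty strings for missing ones (mirroring Kafka's convention of
    empty tag values rather than dropping the tag entirely)."""
    return {"user": user or "", "client-id": client_id or ""}

# Both dimensions known: both tags populated, regardless of quota order.
print(metric_tags(user="alice", client_id="app-1"))
# Only client-id known: the user tag is still present, just empty.
print(metric_tags(client_id="app-1"))
```

With a rule like this, the tag set no longer depends on registration order, which is the inconsistency the screenshot demonstrates.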
[jira] [Created] (KAFKA-13744) Quota metric tags are inconsistent
Jorge Esteban Quilcate Otoya created KAFKA-13744: Summary: Quota metric tags are inconsistent Key: KAFKA-13744 URL: https://issues.apache.org/jira/browse/KAFKA-13744 Project: Kafka Issue Type: Bug Components: core, metrics Reporter: Jorge Esteban Quilcate Otoya Attachments: image-2022-03-15-16-57-12-583.png When enabling metrics for quotas, the metrics apply to _all_ clients (see https://issues.apache.org/jira/browse/KAFKA-13742). However, the tags are calculated depending on which quotas are registered, and applied to all clients: [https://github.com/apache/kafka/blob/0b9a8bac36f16b5397e9ec3a0441758e4b60a384/core/src/main/scala/kafka/server/ClientQuotaManager.scala#L649-L694] This causes different metric tags depending on which quota is registered first. For instance, if a quota is registered with both userId and clientId, then metrics are tagged with both; but if a quota is later registered with only clientId, then all metrics are tagged only by clientId — even though the user principal is available. !image-2022-03-15-16-57-12-583.png! I'd expect metrics to always contain both, if available — and to simplify the logic here: [https://github.com/apache/kafka/blob/0b9a8bac36f16b5397e9ec3a0441758e4b60a384/core/src/main/scala/kafka/server/ClientQuotaManager.scala#L649-L694]. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (KAFKA-13742) Quota byte-rate/request metrics are loaded only when at least one quota is registered
Jorge Esteban Quilcate Otoya created KAFKA-13742: Summary: Quota byte-rate/request metrics are loaded only when at least one quota is registered Key: KAFKA-13742 URL: https://issues.apache.org/jira/browse/KAFKA-13742 Project: Kafka Issue Type: Bug Components: core, metrics Reporter: Jorge Esteban Quilcate Otoya Quota metrics are loaded only when at least one quota is present: * Metrics: [https://github.com/apache/kafka/blob/0b9a8bac36f16b5397e9ec3a0441758e4b60a384/core/src/main/scala/kafka/server/ClientQuotaManager.scala#L552-L563] * Reporting when quotas are enabled: [https://github.com/apache/kafka/blob/0b9a8bac36f16b5397e9ec3a0441758e4b60a384/core/src/main/scala/kafka/server/ClientQuotaManager.scala#L249-L256] * Quotas enabled: `def quotasEnabled: Boolean = quotaTypesEnabled != QuotaTypes.NoQuotas` Even though throttling is specific to quotas, byte-rate/request per user/client-id is a valid metric for any deployment. The current workaround is to add _any_ quota, as this will enable metrics for *all* client-ids/users. If these metrics are captured for all clients regardless of the quotas created, it would be a better experience to have a config to opt in to these metrics, instead of creating meaningless quotas just to get them. For threshold metrics, it makes sense to me to enable them only when quotas are enabled. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (KAFKA-13662) Migrate DeserializationExceptionHandler to latest ProcessorContext API
Jorge Esteban Quilcate Otoya created KAFKA-13662: Summary: Migrate DeserializationExceptionHandler to latest ProcessorContext API Key: KAFKA-13662 URL: https://issues.apache.org/jira/browse/KAFKA-13662 Project: Kafka Issue Type: Task Reporter: Jorge Esteban Quilcate Otoya DeserializationExceptionHandler depends on the old ProcessorContext API. That API has been deprecated, and migrating to the latest one requires a public API that hasn't been considered as part of KIP-478. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (KAFKA-13656) Connect Transforms support for nested structures
Jorge Esteban Quilcate Otoya created KAFKA-13656: Summary: Connect Transforms support for nested structures Key: KAFKA-13656 URL: https://issues.apache.org/jira/browse/KAFKA-13656 Project: Kafka Issue Type: Improvement Reporter: Jorge Esteban Quilcate Otoya Single Message Transforms (SMT), [KIP-66|https://cwiki.apache.org/confluence/display/KAFKA/KIP-66%3A+Single+Message+Transforms+for+Kafka+Connect], have greatly improved connectors' usability by enabling processing of input/output data without the need for additional streaming applications. These benefits have been limited, as most SMT implementations only operate on fields available on the root structure: * https://issues.apache.org/jira/browse/KAFKA-7624 * https://issues.apache.org/jira/browse/KAFKA-10640 Therefore, this KIP aims to add support for nested structures to the existing SMTs — where this makes sense — and to include the abstractions to reuse this in future SMTs. KIP: https://cwiki.apache.org/confluence/x/BafkCw -- This message was sent by Atlassian Jira (v8.20.1#820001)
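As an illustration of what nested-field support could look like in a connector config, a sketch is below. The property names (`field.syntax.version`, the dotted `payload.after.updated_at` path, and the `extract_ts` alias) are assumptions paraphrased from the KIP draft, not a finalized API; see the linked KIP for the authoritative syntax.

```properties
# Hypothetical connector fragment: address a nested field in an SMT.
transforms=extract_ts
transforms.extract_ts.type=org.apache.kafka.connect.transforms.ExtractField$Value
# Opt in to the nested-path syntax, then reach into the structure with dots:
transforms.extract_ts.field.syntax.version=V2
transforms.extract_ts.field=payload.after.updated_at
```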
[jira] [Created] (KAFKA-13654) Extend KStream process with new Processor API
Jorge Esteban Quilcate Otoya created KAFKA-13654: Summary: Extend KStream process with new Processor API Key: KAFKA-13654 URL: https://issues.apache.org/jira/browse/KAFKA-13654 Project: Kafka Issue Type: Improvement Reporter: Jorge Esteban Quilcate Otoya Extending KStream#process to use the latest Processor API adopted here: https://issues.apache.org/jira/browse/KAFKA-8410 The new API allows a typed KStream to be returned, which makes it possible to chain results from processors, becoming a new way to transform records with more control over what's forwarded. KIP: https://cwiki.apache.org/confluence/x/yKbkCw -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (KAFKA-13117) After processors, migrate TupleForwarder and CacheFlushListener
[ https://issues.apache.org/jira/browse/KAFKA-13117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-13117. -- Resolution: Fixed [https://github.com/apache/kafka/pull/11481] > After processors, migrate TupleForwarder and CacheFlushListener > --- > > Key: KAFKA-13117 > URL: https://issues.apache.org/jira/browse/KAFKA-13117 > Project: Kafka > Issue Type: Sub-task >Reporter: John Roesler >Assignee: Jorge Esteban Quilcate Otoya >Priority: Major > > Currently, both of these interfaces take plain values in combination with > timestamps: > CacheFlushListener: > {code:java} > void apply(K key, V newValue, V oldValue, long timestamp) > {code} > TimestampedTupleForwarder > {code:java} > void maybeForward(K key, >V newValue, >V oldValue, >long timestamp){code} > These are internally translated to the new PAPI, but after the processors are > migrated, there won't be a need to have this translation. We should update > both of these APIs to just accept {{Record<K, Change<V>>}}. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (KAFKA-10543) Convert KTable joins to new PAPI
[ https://issues.apache.org/jira/browse/KAFKA-10543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-10543. -- Resolution: Fixed https://github.com/apache/kafka/pull/11412 > Convert KTable joins to new PAPI > > > Key: KAFKA-10543 > URL: https://issues.apache.org/jira/browse/KAFKA-10543 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: John Roesler >Assignee: Jorge Esteban Quilcate Otoya >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (KAFKA-13429) Update gitignore to include new modules
Jorge Esteban Quilcate Otoya created KAFKA-13429: Summary: Update gitignore to include new modules Key: KAFKA-13429 URL: https://issues.apache.org/jira/browse/KAFKA-13429 Project: Kafka Issue Type: Task Reporter: Jorge Esteban Quilcate Otoya Add `/bin/` to `.gitignore` for the following modules: * connect/basic-auth-extension * metadata * raft * tools * trogdor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-10540) Convert KStream aggregations to new PAPI
[ https://issues.apache.org/jira/browse/KAFKA-10540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-10540. -- Resolution: Fixed https://github.com/apache/kafka/pull/11315 > Convert KStream aggregations to new PAPI > > > Key: KAFKA-10540 > URL: https://issues.apache.org/jira/browse/KAFKA-10540 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: John Roesler >Assignee: Jorge Esteban Quilcate Otoya >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-10539) Convert KStreamImpl joins to new PAPI
[ https://issues.apache.org/jira/browse/KAFKA-10539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-10539. -- Resolution: Fixed https://github.com/apache/kafka/pull/11356 > Convert KStreamImpl joins to new PAPI > - > > Key: KAFKA-10539 > URL: https://issues.apache.org/jira/browse/KAFKA-10539 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: John Roesler >Assignee: Jorge Esteban Quilcate Otoya >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-10544) Convert KTable aggregations to new PAPI
[ https://issues.apache.org/jira/browse/KAFKA-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-10544. -- Resolution: Fixed https://github.com/apache/kafka/pull/11316 > Convert KTable aggregations to new PAPI > --- > > Key: KAFKA-10544 > URL: https://issues.apache.org/jira/browse/KAFKA-10544 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: John Roesler >Assignee: Jorge Esteban Quilcate Otoya >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-10542) Convert KTable maps to new PAPI
[ https://issues.apache.org/jira/browse/KAFKA-10542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-10542. -- Resolution: Fixed https://github.com/apache/kafka/pull/11099 > Convert KTable maps to new PAPI > --- > > Key: KAFKA-10542 > URL: https://issues.apache.org/jira/browse/KAFKA-10542 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: John Roesler >Assignee: Jorge Esteban Quilcate Otoya >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-13201) Convert KTable suppress to new PAPI
[ https://issues.apache.org/jira/browse/KAFKA-13201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-13201. -- Resolution: Fixed https://github.com/apache/kafka/pull/11213 > Convert KTable suppress to new PAPI > --- > > Key: KAFKA-13201 > URL: https://issues.apache.org/jira/browse/KAFKA-13201 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Jorge Esteban Quilcate Otoya >Assignee: Jorge Esteban Quilcate Otoya >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-13201) Convert KTable suppress to new PAPI
Jorge Esteban Quilcate Otoya created KAFKA-13201: Summary: Convert KTable suppress to new PAPI Key: KAFKA-13201 URL: https://issues.apache.org/jira/browse/KAFKA-13201 Project: Kafka Issue Type: Sub-task Components: streams Reporter: Jorge Esteban Quilcate Otoya -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-10541) Convert KTable filters to new PAPI
[ https://issues.apache.org/jira/browse/KAFKA-10541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-10541. -- Fix Version/s: 3.0.0 Assignee: John Roesler (was: Jorge Esteban Quilcate Otoya) Resolution: Fixed https://github.com/apache/kafka/pull/10869 > Convert KTable filters to new PAPI > -- > > Key: KAFKA-10541 > URL: https://issues.apache.org/jira/browse/KAFKA-10541 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: John Roesler >Assignee: John Roesler >Priority: Major > Fix For: 3.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-10537) Convert KStreamImpl filters to new PAPI
[ https://issues.apache.org/jira/browse/KAFKA-10537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-10537. -- Resolution: Fixed https://github.com/apache/kafka/pull/10381 > Convert KStreamImpl filters to new PAPI > --- > > Key: KAFKA-10537 > URL: https://issues.apache.org/jira/browse/KAFKA-10537 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: John Roesler >Assignee: Jorge Esteban Quilcate Otoya >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-10538) Convert KStreamImpl maps to new PAPI
[ https://issues.apache.org/jira/browse/KAFKA-10538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-10538. -- Resolution: Fixed https://github.com/apache/kafka/pull/10381 > Convert KStreamImpl maps to new PAPI > > > Key: KAFKA-10538 > URL: https://issues.apache.org/jira/browse/KAFKA-10538 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: John Roesler >Assignee: Jorge Esteban Quilcate Otoya >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-12536) Add Instant-based methods to ReadOnlySessionStore
[ https://issues.apache.org/jira/browse/KAFKA-12536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-12536. -- Fix Version/s: 3.0.0 Resolution: Fixed > Add Instant-based methods to ReadOnlySessionStore > - > > Key: KAFKA-12536 > URL: https://issues.apache.org/jira/browse/KAFKA-12536 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Jorge Esteban Quilcate Otoya >Assignee: Jorge Esteban Quilcate Otoya >Priority: Major > Fix For: 3.0.0 > > > [KIP-666|https://cwiki.apache.org/confluence/display/KAFKA/KIP-666%3A+Add+Instant-based+methods+to+ReadOnlySessionStore] > implementation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-10434) Remove deprecated methods on WindowStore
[ https://issues.apache.org/jira/browse/KAFKA-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-10434. -- Resolution: Fixed > Remove deprecated methods on WindowStore > > > Key: KAFKA-10434 > URL: https://issues.apache.org/jira/browse/KAFKA-10434 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Jorge Esteban Quilcate Otoya >Assignee: Jorge Esteban Quilcate Otoya >Priority: Blocker > Fix For: 3.0.0 > > > From [https://github.com/apache/kafka/pull/9138#discussion_r474985997] and > [https://github.com/apache/kafka/pull/9138#discussion_r474995606] : > WindowStore contains ReadOnlyWindowStore methods. > We could consider: > * Moving read methods from WindowStore to ReadOnlyWindowStore, and/or > * Removing long-based methods -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-12451) Remove deprecation annotation on long-based read operations in WindowStore
[ https://issues.apache.org/jira/browse/KAFKA-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-12451. -- Fix Version/s: 3.0.0 Resolution: Fixed https://github.com/apache/kafka/pull/10296 > Remove deprecation annotation on long-based read operations in WindowStore > --- > > Key: KAFKA-12451 > URL: https://issues.apache.org/jira/browse/KAFKA-12451 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Jorge Esteban Quilcate Otoya >Assignee: Jorge Esteban Quilcate Otoya >Priority: Major > Fix For: 3.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-12450) Remove deprecated methods from ReadOnlyWindowStore
[ https://issues.apache.org/jira/browse/KAFKA-12450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-12450. -- Fix Version/s: 3.0.0 Resolution: Fixed > Remove deprecated methods from ReadOnlyWindowStore > -- > > Key: KAFKA-12450 > URL: https://issues.apache.org/jira/browse/KAFKA-12450 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Jorge Esteban Quilcate Otoya >Assignee: Jorge Esteban Quilcate Otoya >Priority: Major > Fix For: 3.0.0 > > > Related to KIP-358 > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-358%3A+Migrate+Streams+API+to+Duration+instead+of+long+ms+times] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-12533) Migrate KStream stateless operators to new Processor API
[ https://issues.apache.org/jira/browse/KAFKA-12533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-12533. -- Resolution: Duplicate > Migrate KStream stateless operators to new Processor API > > > Key: KAFKA-12533 > URL: https://issues.apache.org/jira/browse/KAFKA-12533 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Jorge Esteban Quilcate Otoya >Assignee: Jorge Esteban Quilcate Otoya >Priority: Major > > Including these operators: > > * KStream#branch > * KStream#filter > * KStream#flatMap > * KStream#flatMapValues > * KStream#map > * KStream#mapValues > * KStream#peek > * KStream#print > * KStream#passthrough > > These operators are left out, waiting for a new Transformer API > (https://issues.apache.org/jira/browse/KAFKA-8396): > * KStream#flatMapTransform > * KStream#flatMapTransformValues -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-12532) Migrate Stream operators to new Processor API
[ https://issues.apache.org/jira/browse/KAFKA-12532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Esteban Quilcate Otoya resolved KAFKA-12532. -- Resolution: Duplicate > Migrate Stream operators to new Processor API > - > > Key: KAFKA-12532 > URL: https://issues.apache.org/jira/browse/KAFKA-12532 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Jorge Esteban Quilcate Otoya >Assignee: Jorge Esteban Quilcate Otoya >Priority: Minor > > To continue adoption of > [KIP-478|https://cwiki.apache.org/confluence/display/KAFKA/KIP-478+-+Strongly+typed+Processor+API], > KStream and KTable operators need to be migrated to the new Processor API. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-12536) Add Instant-based methods to ReadOnlySessionStore
Jorge Esteban Quilcate Otoya created KAFKA-12536: Summary: Add Instant-based methods to ReadOnlySessionStore Key: KAFKA-12536 URL: https://issues.apache.org/jira/browse/KAFKA-12536 Project: Kafka Issue Type: Sub-task Reporter: Jorge Esteban Quilcate Otoya -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-12533) Migrate KStream stateless operators to new Processor API
Jorge Esteban Quilcate Otoya created KAFKA-12533: Summary: Migrate KStream stateless operators to new Processor API Key: KAFKA-12533 URL: https://issues.apache.org/jira/browse/KAFKA-12533 Project: Kafka Issue Type: Sub-task Components: streams Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-12532) Migrate Stream operators to new Processor API
Jorge Esteban Quilcate Otoya created KAFKA-12532: Summary: Migrate Stream operators to new Processor API Key: KAFKA-12532 URL: https://issues.apache.org/jira/browse/KAFKA-12532 Project: Kafka Issue Type: Improvement Components: streams Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya To continue adoption of [KIP-478|https://cwiki.apache.org/confluence/display/KAFKA/KIP-478+-+Strongly+typed+Processor+API], KStream and KTable operators need to be migrated to the new Processor API. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-12450) Remove deprecated methods from ReadOnlyWindowStore
Jorge Esteban Quilcate Otoya created KAFKA-12450: Summary: Remove deprecated methods from ReadOnlyWindowStore Key: KAFKA-12450 URL: https://issues.apache.org/jira/browse/KAFKA-12450 Project: Kafka Issue Type: Sub-task Reporter: Jorge Esteban Quilcate Otoya -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-12451) Remove deprecation annotation on long-based read operations in WindowStore
Jorge Esteban Quilcate Otoya created KAFKA-12451: Summary: Remove deprecation annotation on long-based read operations in WindowStore Key: KAFKA-12451 URL: https://issues.apache.org/jira/browse/KAFKA-12451 Project: Kafka Issue Type: Sub-task Reporter: Jorge Esteban Quilcate Otoya -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-12449) Remove deprecated WindowStore#put
Jorge Esteban Quilcate Otoya created KAFKA-12449: Summary: Remove deprecated WindowStore#put Key: KAFKA-12449 URL: https://issues.apache.org/jira/browse/KAFKA-12449 Project: Kafka Issue Type: Sub-task Reporter: Jorge Esteban Quilcate Otoya -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-12287) Add WARN logging on consumer-groups when reset-offsets by timestamp or duration can't find an offset and defaults to latest.
Jorge Esteban Quilcate Otoya created KAFKA-12287: Summary: Add WARN logging on consumer-groups when reset-offsets by timestamp or duration can't find an offset and defaults to latest. Key: KAFKA-12287 URL: https://issues.apache.org/jira/browse/KAFKA-12287 Project: Kafka Issue Type: Improvement Reporter: Jorge Esteban Quilcate Otoya Assignee: Jorge Esteban Quilcate Otoya From https://issues.apache.org/jira/browse/KAFKA-9527 Warn when resetting offsets by timestamp in a topic partition that does not have records and returns null, to explicitly say that we are resetting to the latest available offset (e.g. zero in the case where no records have been stored in the topic partition yet). -- This message was sent by Atlassian Jira (v8.3.4#803005)
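For context, the reset path in question is exercised with `kafka-consumer-groups.sh`. The flags below are the real tool's, while the broker address, group, topic, and timestamp are placeholders. When no offset exists at the requested timestamp, the tool currently falls back to the latest offset without any warning, which is the log line this issue proposes to add:

```shell
# Reset a group's offsets to the offsets at a given timestamp; requires a
# running broker. If no record exists at or after the timestamp for a
# partition, the reset silently defaults to the latest offset.
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group my-group --topic my-topic \
  --reset-offsets --to-datetime 2021-02-01T00:00:00.000 --execute
```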
[jira] [Created] (KAFKA-10459) Document IQ APIs where order does not hold between stores
Jorge Esteban Quilcate Otoya created KAFKA-10459: Summary: Document IQ APIs where order does not hold between stores Key: KAFKA-10459 URL: https://issues.apache.org/jira/browse/KAFKA-10459 Project: Kafka Issue Type: Improvement Components: streams Reporter: Jorge Esteban Quilcate Otoya From [https://github.com/apache/kafka/pull/9138#discussion_r480469688] : This is out of the scope of this PR, but I'd like to point out that the current IQ does not actually obey the ordering when there are multiple local stores hosted on that instance. For example, if there are two stores from two tasks hosting keys \{1, 3} and \{2,4}, then a range query of key [1,4] would return in the order of {{1,3,2,4}} but not {{1,2,3,4}}, since it is looping over the stores only. This would be the case for either forward or backward fetches on range-key-range-time. For single-key time-range fetch, of course, there's no such issue. I think it's worth documenting this for now until we have a fix (and actually we are going to propose something soon). -- This message was sent by Atlassian Jira (v8.3.4#803005)
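The ordering caveat in the quoted comment can be reproduced with a self-contained sketch (Python for brevity; the store contents are invented). A composite range query loops over each task's store in turn, so keys come back ordered within one store but not globally:

```python
store_a = {1: "v1", 3: "v3"}  # keys hosted by task A's store
store_b = {2: "v2", 4: "v4"}  # keys hosted by task B's store

def composite_range(stores, lo, hi):
    """Concatenate per-store range results, as the composite IQ view does."""
    out = []
    for store in stores:         # loop over the local stores...
        for k in sorted(store):  # ...each of which is ordered internally
            if lo <= k <= hi:
                out.append(k)
    return out

print(composite_range([store_a, store_b], 1, 4))  # [1, 3, 2, 4], not [1, 2, 3, 4]
```

The same keys are all present, so a caller that needs global order must merge across stores itself, which is exactly what this issue asks to document.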
[jira] [Created] (KAFKA-10445) Align IQ SessionStore API with Instant-based methods as ReadOnlyWindowStore
Jorge Esteban Quilcate Otoya created KAFKA-10445: Summary: Align IQ SessionStore API with Instant-based methods as ReadOnlyWindowStore Key: KAFKA-10445 URL: https://issues.apache.org/jira/browse/KAFKA-10445 Project: Kafka Issue Type: Improvement Components: streams Reporter: Jorge Esteban Quilcate Otoya -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-10434) Remove deprecated methods on WindowStore
Jorge Esteban Quilcate Otoya created KAFKA-10434: Summary: Remove deprecated methods on WindowStore Key: KAFKA-10434 URL: https://issues.apache.org/jira/browse/KAFKA-10434 Project: Kafka Issue Type: Improvement Components: streams Reporter: Jorge Esteban Quilcate Otoya From [https://github.com/apache/kafka/pull/9138#discussion_r474985997] and [https://github.com/apache/kafka/pull/9138#discussion_r474995606] : WindowStore contains ReadOnlyWindowStore methods. We could consider: * Moving read methods from WindowStore to ReadOnlyWindowStore, and/or * Removing long-based methods -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-10409) Refactor Kafka Streams RocksDb iterators
Jorge Esteban Quilcate Otoya created KAFKA-10409: Summary: Refactor Kafka Streams RocksDb iterators Key: KAFKA-10409 URL: https://issues.apache.org/jira/browse/KAFKA-10409 Project: Kafka Issue Type: Improvement Reporter: Jorge Esteban Quilcate Otoya From [https://github.com/apache/kafka/pull/9137#discussion_r470345513] : [~ableegoldman] : > Kind of unrelated, but WDYT about renaming {{RocksDBDualCFIterator}} to > {{RocksDBDualCFAllIterator}} or something on the side? I feel like these > iterators could be cleaned up a bit in general to be more understandable -- > for example, it's weird that we do the {{iterator#seek}}-ing in the actual > {{all()}} method but for range queries we do the seeking inside the iterator > constructor. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9929) Support reverse iterator on WindowStore
Jorge Esteban Quilcate Otoya created KAFKA-9929: --- Summary: Support reverse iterator on WindowStore Key: KAFKA-9929 URL: https://issues.apache.org/jira/browse/KAFKA-9929 Project: Kafka Issue Type: Improvement Components: streams Reporter: Jorge Esteban Quilcate Otoya Currently, WindowStore fetch operations return an iterator sorted from earliest to latest result: ``` * For each key, the iterator guarantees ordering of windows, starting from the oldest/earliest * available window to the newest/latest window. ``` We have a use case where traces are stored in a WindowStore, and Kafka Streams is used to create a materialized view of traces. A query request comes with a time range (e.g. now-1h, now) and wants to return the most recent results, i.e. fetch from this period of time, iterate and pattern-match the latest/most recent traces, and, if there are enough results, reply without moving further on the iterator. The same store is used to search for previous traces. In this case, it searches a key over the last day; if traces are found, we would also like to iterate from the most recent. RocksDB seems to support iterating backward and forward: [https://github.com/facebook/rocksdb/wiki/Iterator#iterating-upper-bound-and-lower-bound] For reference, this in some way extracts bits from this previous issue: https://issues.apache.org/jira/browse/KAFKA-4212: > The {{RocksDBWindowsStore}}, a {{WindowsStore}}, can expire items via segment > dropping, but it stores multiple items per key, based on their timestamp. But > this store can be repurposed as a cache by fetching the items in reverse > chronological order and returning the first item found. We would like to know if there is any impediment in RocksDB or WindowStore to supporting this. Adding an argument to reverse direction in the current fetch methods would be great: ``` WindowStore.fetch(from,to,Direction.BACKWARD|FORWARD) ``` -- This message was sent by Atlassian Jira (v8.3.4#803005)
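The requested API shape can be sketched in a few lines (Python for brevity; the names, the direction flag, and the data are illustrative, not the actual WindowStore interface): the same fetch, plus a flag that reverses iteration so callers see the newest windows first and can stop early.

```python
def fetch(entries, time_from, time_to, backward=False):
    """entries: timestamp -> value. Returns (timestamp, value) pairs,
    oldest-first by default, newest-first when backward=True."""
    hits = sorted((ts, v) for ts, v in entries.items()
                  if time_from <= ts <= time_to)
    return list(reversed(hits)) if backward else hits

traces = {100: "a", 200: "b", 300: "c"}
print(fetch(traces, 100, 300))                 # oldest first
print(fetch(traces, 100, 300, backward=True))  # newest first: the caller can
                                               # stop once enough traces match
```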