kamalcph commented on PR #14766: URL: https://github.com/apache/kafka/pull/14766#issuecomment-1823109337
> The first one is that I no longer see a point in segment.bytes and segment.ms (and by extension log.segment.bytes and log.segment.ms) with respect to tiered topics. If a person says "hey, I only want 4GB of data or data from the last 10 minutes around" then why would they ever need to configure how often a segment should be closed? If this is the case shouldn't this be followed by ignoring those two properties for tiered topics?

`segment.bytes` takes effect when the topic has a continuous inflow of data at a high rate. If the user configures `segment.bytes` to 1 GB and `segment.ms` to 1 day, and the partition has a bytes-in load of 1 MB/sec, then the segment fills up in ~17 mins (1 GB / 1 MB/sec ≈ 1024 sec) and gets rotated to passive.

`segment.ms` takes effect when the topic has a continuous inflow of data at a low rate. If the user configures `segment.bytes` to 1 GB and `segment.ms` to 1 day, and the partition has a bytes-in load of 100 bytes/sec, then the segment won't fill up (it receives only ~8.6 MB in a day) and gets rotated to passive after 1 day.

In this patch, we are trying to handle the case where the topic has some data in the active segment but doesn't have a continuous inflow of data. So, both the `segment.ms` and `segment.bytes` configs are applicable to tiered storage topics.

> The second one is that you will be changing the definition of local.retention. Prior to this change it meant that closed segments will be served from local disk for at most this much size or time as long as they have been moved to tiered storage. Now it will mean that anything beyond this size and time will be found only in tiered storage. Isn't this a "public facing change" and thus requiring some announcements?

This was the original plan and is in line with how local-log segments behave. If a topic is deprecated and won't receive any more data in the future, the user will expect all the data in that topic to be removed after the retention time; otherwise it may fail to meet certain compliance requirements.
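The two scenarios in the comment (1 MB/sec versus 100 bytes/sec, with `segment.bytes` at 1 GB and `segment.ms` at 1 day) can be checked with a small back-of-the-envelope model. This is only a sketch: `roll_trigger` is a hypothetical helper, not a Kafka API, and it assumes a constant bytes-in rate, binary units (1 GB = 1024^3 bytes), and that nothing else influences segment rolling. With binary units, 1 GB at 1 MB/sec fills in 1024 seconds, roughly 17 minutes.

```python
from typing import Tuple

GIB = 1024 ** 3            # assumed meaning of "1 GB" in the comment
MIB = 1024 ** 2            # assumed meaning of "1 MB"
DAY_MS = 24 * 60 * 60 * 1000


def roll_trigger(segment_bytes: int, segment_ms: int,
                 bytes_in_per_sec: float) -> Tuple[str, float]:
    """Hypothetical helper: which limit closes the active segment first,
    and after how many seconds, for a constant inflow rate."""
    time_limit_sec = segment_ms / 1000
    if bytes_in_per_sec <= 0:
        # No inflow at all. In this simplified model neither limit is
        # reached by appends; this is the idle case the comment says
        # the patch is meant to handle.
        return ("none", float("inf"))
    fill_sec = segment_bytes / bytes_in_per_sec
    if fill_sec <= time_limit_sec:
        return ("segment.bytes", fill_sec)
    return ("segment.ms", time_limit_sec)


if __name__ == "__main__":
    # High inflow: the size limit is hit first, after 1024 s (~17 min).
    print(roll_trigger(GIB, DAY_MS, MIB))   # ('segment.bytes', 1024.0)
    # Low inflow: the time limit is hit first, after 86400 s (1 day).
    print(roll_trigger(GIB, DAY_MS, 100))   # ('segment.ms', 86400.0)
```

The model only shows why both settings remain meaningful: which one fires depends entirely on the inflow rate, and neither fires on its own when inflow stops.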
