Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-05-20 Thread via GitHub


m1a2st commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2098206077


##
server-common/src/main/java/org/apache/kafka/server/config/ServerTopicConfigSynonyms.java:
##
@@ -47,9 +48,11 @@ public final class ServerTopicConfigSynonyms {
  * both the first and the second synonyms are configured, we will use only 
the value of
  * the first synonym and ignore the second.
  */
-// Topic configs with no mapping to a server config can be found in 
`LogConfig.CONFIGS_WITH_NO_SERVER_DEFAULTS`
 public static final Map> 
ALL_TOPIC_CONFIG_SYNONYMS = Utils.mkMap(
 sameNameWithLogPrefix(TopicConfig.SEGMENT_BYTES_CONFIG),
+// Due to the internal.segment.bytes is in storage module, thus we 
could not use the 
+// LogConfig#INTERNAL_SEGMENT_BYTES_CONFIG directly. We need to use 
the string value instead.
+sameNameWithInternalLogPrefix("internal.segment.bytes"),

Review Comment:
   Yes, you are right.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-05-20 Thread via GitHub


chia7712 commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2097986937


##
server-common/src/main/java/org/apache/kafka/server/config/ServerTopicConfigSynonyms.java:
##
@@ -47,9 +48,11 @@ public final class ServerTopicConfigSynonyms {
  * both the first and the second synonyms are configured, we will use only 
the value of
  * the first synonym and ignore the second.
  */
-// Topic configs with no mapping to a server config can be found in 
`LogConfig.CONFIGS_WITH_NO_SERVER_DEFAULTS`
 public static final Map> 
ALL_TOPIC_CONFIG_SYNONYMS = Utils.mkMap(
 sameNameWithLogPrefix(TopicConfig.SEGMENT_BYTES_CONFIG),
+// Due to the internal.segment.bytes is in storage module, thus we 
could not use the 
+// LogConfig#INTERNAL_SEGMENT_BYTES_CONFIG directly. We need to use 
the string value instead.
+sameNameWithInternalLogPrefix("internal.segment.bytes"),

Review Comment:
   @m1a2st 
   
   Do you mean this PR supports only the topic-level internal config? If so, we can't use a small segment for the KRaft topic, since it takes only the server-level config. Do I understand correctly?






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-05-20 Thread via GitHub


m1a2st commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2097948691


##
server-common/src/main/java/org/apache/kafka/server/config/ServerTopicConfigSynonyms.java:
##
@@ -47,9 +48,11 @@ public final class ServerTopicConfigSynonyms {
  * both the first and the second synonyms are configured, we will use only 
the value of
  * the first synonym and ignore the second.
  */
-// Topic configs with no mapping to a server config can be found in 
`LogConfig.CONFIGS_WITH_NO_SERVER_DEFAULTS`
 public static final Map> 
ALL_TOPIC_CONFIG_SYNONYMS = Utils.mkMap(
 sameNameWithLogPrefix(TopicConfig.SEGMENT_BYTES_CONFIG),
+// Due to the internal.segment.bytes is in storage module, thus we 
could not use the 
+// LogConfig#INTERNAL_SEGMENT_BYTES_CONFIG directly. We need to use 
the string value instead.
+sameNameWithInternalLogPrefix("internal.segment.bytes"),

Review Comment:
   In the previous solution, we required the server-level 
`internal.log.segment.bytes` to configure `MetadataLogConfig`.
   However, since we now always use `internal.segment.bytes` in the 
`KafkaMetadataLog.apply` method, this server-level config is no longer 
necessary and can be removed.
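   As a rough sketch of that idea (names and wiring here are assumptions for illustration, not the PR code), the metadata log properties could always carry the effective segment size under the topic-level internal key:
   ```java
   import java.util.Properties;

   // Sketch only: the effective metadata segment size is passed through the
   // topic-level internal config, so no server-level internal.log.segment.bytes
   // is needed to build MetadataLogConfig.
   public class MetadataLogPropsSketch {
       public static Properties metadataLogProps(int logSegmentBytes, Integer internalSegmentBytes) {
           Properties props = new Properties();
           props.setProperty("segment.bytes", String.valueOf(logSegmentBytes));
           if (internalSegmentBytes != null) {
               props.setProperty("internal.segment.bytes", String.valueOf(internalSegmentBytes));
           }
           return props;
       }
   }
   ```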






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-05-20 Thread via GitHub


chia7712 commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2097427388


##
server-common/src/main/java/org/apache/kafka/server/config/ServerTopicConfigSynonyms.java:
##
@@ -47,9 +48,11 @@ public final class ServerTopicConfigSynonyms {
  * both the first and the second synonyms are configured, we will use only 
the value of
  * the first synonym and ignore the second.
  */
-// Topic configs with no mapping to a server config can be found in 
`LogConfig.CONFIGS_WITH_NO_SERVER_DEFAULTS`
 public static final Map> 
ALL_TOPIC_CONFIG_SYNONYMS = Utils.mkMap(
 sameNameWithLogPrefix(TopicConfig.SEGMENT_BYTES_CONFIG),
+// Due to the internal.segment.bytes is in storage module, thus we 
could not use the 
+// LogConfig#INTERNAL_SEGMENT_BYTES_CONFIG directly. We need to use 
the string value instead.
+sameNameWithInternalLogPrefix("internal.segment.bytes"),

Review Comment:
   We currently don't have small-segment tests for KRaft topics. Therefore, a more effective way to streamline this PR would be to support topic-level configuration only. This means the new internal configuration (`internal.segment.bytes`) would apply exclusively to user topics.
   
   We can revisit this config for KRaft topics in the future if the need arises.
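   As an illustration of topic-level-only usage (a sketch with an assumed topic name, bootstrap address, and segment size, not code from this PR), a test could create a user topic with the internal override:
   ```java
   import java.util.List;
   import java.util.Map;
   import java.util.Properties;

   import org.apache.kafka.clients.admin.Admin;
   import org.apache.kafka.clients.admin.NewTopic;

   // Sketch only: create a user topic whose segments are tiny via the
   // test-only internal.segment.bytes topic config.
   public class SmallSegmentTopicSketch {
       public static void main(String[] args) throws Exception {
           Properties adminProps = new Properties();
           adminProps.setProperty("bootstrap.servers", "localhost:9092");
           try (Admin admin = Admin.create(adminProps)) {
               NewTopic topic = new NewTopic("test-topic", 1, (short) 1)
                   .configs(Map.of("internal.segment.bytes", "1024"));
               admin.createTopics(List.of(topic)).all().get();
           }
       }
   }
   ```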






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-05-19 Thread via GitHub


junrao commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2096192551


##
server-common/src/main/java/org/apache/kafka/server/config/ServerTopicConfigSynonyms.java:
##
@@ -47,9 +48,11 @@ public final class ServerTopicConfigSynonyms {
  * both the first and the second synonyms are configured, we will use only 
the value of
  * the first synonym and ignore the second.
  */
-// Topic configs with no mapping to a server config can be found in 
`LogConfig.CONFIGS_WITH_NO_SERVER_DEFAULTS`
 public static final Map> 
ALL_TOPIC_CONFIG_SYNONYMS = Utils.mkMap(
 sameNameWithLogPrefix(TopicConfig.SEGMENT_BYTES_CONFIG),
+// Due to the internal.segment.bytes is in storage module, thus we 
could not use the 
+// LogConfig#INTERNAL_SEGMENT_BYTES_CONFIG directly. We need to use 
the string value instead.
+sameNameWithInternalLogPrefix("internal.segment.bytes"),

Review Comment:
   Is the intention to support `internal.segment.bytes` at the broker level? If 
so, should we add it to `KafkaConfig.extractLogConfigMap` too?
   
   Currently, if internal.log.segment.bytes is set at the broker, it won't be 
picked up by a new user topic, but it will be picked up by the kraft topic.
   
   Thinking a bit more. Do we need to support internal.segment.bytes at the 
broker level? For testing, is it enough to just support internal.segment.bytes 
at the topic level?






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-05-19 Thread via GitHub


m1a2st commented on PR #19371:
URL: https://github.com/apache/kafka/pull/19371#issuecomment-2890936867

   > It's useful to check if metadata.log.segment.min.bytes is used in any 
ducktape tests.
   
   I’ve checked all the tests in the Kafka project, and none of them use this 
configuration.





Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-05-17 Thread via GitHub


m1a2st commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2094016478


##
core/src/test/scala/unit/kafka/log/LogConfigTest.scala:
##
@@ -52,8 +52,13 @@ class LogConfigTest {
 assertTrue(LogConfig.configNames.asScala
   .filter(config => 
!LogConfig.CONFIGS_WITH_NO_SERVER_DEFAULTS.contains(config))
   .forall { config =>
-val serverConfigOpt = LogConfig.serverConfigName(config)
-serverConfigOpt.isPresent && (serverConfigOpt.get != null)
+// this internal config naming pattern is not as same as a default 
pattern
+if (config.equals(LogConfig.INTERNAL_SEGMENT_BYTES_CONFIG)) {

Review Comment:
   Thanks for the information, I think we can remove this test.






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-05-16 Thread via GitHub


chia7712 commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2093848474


##
core/src/test/scala/unit/kafka/log/LogConfigTest.scala:
##
@@ -52,8 +52,13 @@ class LogConfigTest {
 assertTrue(LogConfig.configNames.asScala
   .filter(config => 
!LogConfig.CONFIGS_WITH_NO_SERVER_DEFAULTS.contains(config))
   .forall { config =>
-val serverConfigOpt = LogConfig.serverConfigName(config)
-serverConfigOpt.isPresent && (serverConfigOpt.get != null)
+// this internal config naming pattern is not as same as a default 
pattern
+if (config.equals(LogConfig.INTERNAL_SEGMENT_BYTES_CONFIG)) {

Review Comment:
   I guess this change is related to the comment 
(https://github.com/apache/kafka/pull/19371#discussion_r2091933863)
   
   Maybe we can remove the test `ensureNoStaticInitializationOrderDependency` 
and `CONFIGS_WITH_NO_SERVER_DEFAULTS`. 
   
   They were used to test the bug caused by the initialization order between 
`KafkaConfig` and `LogConfig` (see 
https://github.com/apache/kafka/commit/0ea540552a52b3265c48aacbb9790ea6e71431a5).
 
   Fortunately, the dependency went away when `LogConfig` was rewritten in Java (see https://github.com/apache/kafka/commit/96d9710c17b34d4e80259f09845d21de66a5efaf) :)






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-05-16 Thread via GitHub


junrao commented on PR #19371:
URL: https://github.com/apache/kafka/pull/19371#issuecomment-2887628228

   It's useful to check if metadata.log.segment.min.bytes is used in any 
ducktape tests.





Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-05-16 Thread via GitHub


junrao commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2093536869


##
core/src/test/scala/unit/kafka/log/LogConfigTest.scala:
##
@@ -52,8 +52,13 @@ class LogConfigTest {
 assertTrue(LogConfig.configNames.asScala
   .filter(config => 
!LogConfig.CONFIGS_WITH_NO_SERVER_DEFAULTS.contains(config))
   .forall { config =>
-val serverConfigOpt = LogConfig.serverConfigName(config)
-serverConfigOpt.isPresent && (serverConfigOpt.get != null)
+// this internal config naming pattern is not as same as a default 
pattern
+if (config.equals(LogConfig.INTERNAL_SEGMENT_BYTES_CONFIG)) {

Review Comment:
   Why do we need this? ALL_TOPIC_CONFIG_SYNONYMS includes 
INTERNAL_SEGMENT_BYTES_CONFIG.



##
server-common/src/main/java/org/apache/kafka/server/config/ServerTopicConfigSynonyms.java:
##
@@ -50,6 +50,7 @@ public final class ServerTopicConfigSynonyms {
 // Topic configs with no mapping to a server config can be found in 
`LogConfig.CONFIGS_WITH_NO_SERVER_DEFAULTS`
 public static final Map> 
ALL_TOPIC_CONFIG_SYNONYMS = Utils.mkMap(
 sameNameWithLogPrefix(TopicConfig.SEGMENT_BYTES_CONFIG),
+sameName("internal.log." + TopicConfig.SEGMENT_BYTES_CONFIG),

Review Comment:
   This is not really the same name. Could we add a method like 
sameNameWithInternalLogPrefix?
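   For illustration, a minimal sketch of what such a helper could look like, modeled on the existing `sameNameWithLogPrefix` style (the prefix handling and synonym wiring here are assumptions, not the PR's actual change):
   ```java
   import java.util.List;
   import java.util.Map;

   import org.apache.kafka.common.utils.Utils;
   import org.apache.kafka.server.config.ConfigSynonym;

   final class SynonymHelperSketch {
       // Sketch: map a topic config such as "internal.segment.bytes" to an assumed
       // server-level synonym "internal.log.segment.bytes".
       static Map.Entry<String, List<ConfigSynonym>> sameNameWithInternalLogPrefix(String topicConfigName) {
           String serverConfigName = topicConfigName.replaceFirst("^internal\\.", "internal.log.");
           return Utils.mkEntry(topicConfigName, List.of(new ConfigSynonym(serverConfigName)));
       }
   }
   ```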






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-05-13 Thread via GitHub


m1a2st commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2086829441


##
raft/src/main/java/org/apache/kafka/raft/MetadataLogConfig.java:
##
@@ -85,14 +82,13 @@ public class MetadataLogConfig {
 .define(METADATA_SNAPSHOT_MAX_INTERVAL_MS_CONFIG, LONG, 
METADATA_SNAPSHOT_MAX_INTERVAL_MS_DEFAULT, atLeast(0), HIGH, 
METADATA_SNAPSHOT_MAX_INTERVAL_MS_DOC)
 .define(METADATA_LOG_DIR_CONFIG, STRING, null, null, HIGH, 
METADATA_LOG_DIR_DOC)
 .define(METADATA_LOG_SEGMENT_BYTES_CONFIG, INT, 
METADATA_LOG_SEGMENT_BYTES_DEFAULT, atLeast(Records.LOG_OVERHEAD), HIGH, 
METADATA_LOG_SEGMENT_BYTES_DOC)
-.defineInternal(METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG, INT, 
METADATA_LOG_SEGMENT_MIN_BYTES_DEFAULT, atLeast(Records.LOG_OVERHEAD), HIGH, 
METADATA_LOG_SEGMENT_MIN_BYTES_DOC)

Review Comment:
   I think we shouldn’t add this unless it’s actually needed by the test. Given 
that it’s an internal configuration, it might be more appropriate to introduce 
it only when a concrete use case arises.






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-05-13 Thread via GitHub


chia7712 commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2086165049


##
raft/src/main/java/org/apache/kafka/raft/MetadataLogConfig.java:
##
@@ -85,14 +82,13 @@ public class MetadataLogConfig {
 .define(METADATA_SNAPSHOT_MAX_INTERVAL_MS_CONFIG, LONG, 
METADATA_SNAPSHOT_MAX_INTERVAL_MS_DEFAULT, atLeast(0), HIGH, 
METADATA_SNAPSHOT_MAX_INTERVAL_MS_DOC)
 .define(METADATA_LOG_DIR_CONFIG, STRING, null, null, HIGH, 
METADATA_LOG_DIR_DOC)
 .define(METADATA_LOG_SEGMENT_BYTES_CONFIG, INT, 
METADATA_LOG_SEGMENT_BYTES_DEFAULT, atLeast(Records.LOG_OVERHEAD), HIGH, 
METADATA_LOG_SEGMENT_BYTES_DOC)
-.defineInternal(METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG, INT, 
METADATA_LOG_SEGMENT_MIN_BYTES_DEFAULT, atLeast(Records.LOG_OVERHEAD), HIGH, 
METADATA_LOG_SEGMENT_MIN_BYTES_DOC)

Review Comment:
   @m1a2st What do you think about the `METADATA_INTERNAL_LOG_SEGMENT_BYTES_CONFIG`? This PR currently adds only `INTERNAL_LOG_SEGMENT_BYTES_CONFIG` to represent the server-level log config. That means users can't configure a "small" metadata log without impacting other logs.






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-05-12 Thread via GitHub


m1a2st commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2084656210


##
streams/src/test/java/org/apache/kafka/streams/StreamsConfigTest.java:
##
@@ -248,7 +247,7 @@ public void 
consumerConfigMustContainStreamPartitionAssignorConfig() {
 props.put(StreamsConfig.PROBING_REBALANCE_INTERVAL_MS_CONFIG, 99_999L);
 
props.put(StreamsConfig.WINDOW_STORE_CHANGE_LOG_ADDITIONAL_RETENTION_MS_CONFIG, 
7L);
 props.put(StreamsConfig.APPLICATION_SERVER_CONFIG, "dummy:host");
-props.put(StreamsConfig.topicPrefix(TopicConfig.SEGMENT_BYTES_CONFIG), 
100);
+props.put(StreamsConfig.topicPrefix("internal.segment.bytes"), 100);

Review Comment:
   We can't add the storage dependency to the Streams module because the 
Streams module targets Java 11, while the Storage module requires Java 17.






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-05-12 Thread via GitHub


m1a2st commented on PR #19371:
URL: https://github.com/apache/kafka/pull/19371#issuecomment-2872391463

   Hello @chia7712, @junrao, sorry for the late reply.
   
   I’ve updated the patch. Developers can still use the internal config at both 
the server and topic levels. I also updated the logic so that if a user 
configures the topic-level config without specifying the internal one, we will 
no longer return the internal config to the user.
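   A minimal sketch of that resolution rule as described (the method shape is an assumption, not the PR code):
   ```java
   final class SegmentBytesResolutionSketch {
       // Prefer the internal, test-only override when it is set; otherwise fall back
       // to the regular segment.bytes value.
       static int effectiveSegmentBytes(Integer internalSegmentBytes, int segmentBytes) {
           return internalSegmentBytes != null ? internalSegmentBytes : segmentBytes;
       }
   }
   ```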
   
   





Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-04-26 Thread via GitHub


chia7712 commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2061411092


##
core/src/test/scala/integration/kafka/api/SaslSslAdminIntegrationTest.scala:
##
@@ -567,7 +568,7 @@ class SaslSslAdminIntegrationTest extends 
BaseAdminIntegrationTest with SaslSetu
 client.createAcls(List(denyAcl).asJava, new 
CreateAclsOptions()).all().get()
 
 val topics = Seq(topic1, topic2)
-val configsOverride = Map(TopicConfig.SEGMENT_BYTES_CONFIG -> 
"10").asJava
+val configsOverride = Map(LogConfig.INTERNAL_SEGMENT_BYTES_CONFIG -> 
"10").asJava

Review Comment:
   This test case is used to verify the custom topic-level config, so we can increase the value to make the test pass.



##
raft/src/main/java/org/apache/kafka/raft/MetadataLogConfig.java:
##
@@ -52,14 +51,14 @@ public class MetadataLogConfig {
 "configuration. The Kafka node will generate a snapshot when 
either the maximum time interval is reached or the " +
 "maximum bytes limit is reached.";
 
-public static final String METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG = 
"metadata.log.segment.min.bytes";
-public static final String METADATA_LOG_SEGMENT_MIN_BYTES_DOC = "Override 
the minimum size for a single metadata log file. This should be used for 
testing only.";
-public static final int METADATA_LOG_SEGMENT_MIN_BYTES_DEFAULT = 8 * 1024 
* 1024;
-
 public static final String METADATA_LOG_SEGMENT_BYTES_CONFIG = 
"metadata.log.segment.bytes";
 public static final String METADATA_LOG_SEGMENT_BYTES_DOC = "The maximum 
size of a single metadata log file.";
 public static final int METADATA_LOG_SEGMENT_BYTES_DEFAULT = 1024 * 1024 * 
1024;
 
+public static final String INTERNAL_METADATA_LOG_SEGMENT_BYTES_CONFIG = 
"internal.metadata.log.segment.bytes";

Review Comment:
   Do we need another internal config? Users can configure `internal.segment.bytes` if they want to create small segments, right?



##
core/src/test/scala/integration/kafka/api/SaslSslAdminIntegrationTest.scala:
##
@@ -601,7 +602,7 @@ class SaslSslAdminIntegrationTest extends 
BaseAdminIntegrationTest with SaslSetu
 assertNotEquals(Uuid.ZERO_UUID, createResult.topicId(topic1).get())
 assertEquals(topicIds(topic1), createResult.topicId(topic1).get())
 assertFutureThrows(classOf[TopicAuthorizationException], 
createResult.topicId(topic2))
-
+

Review Comment:
   Please revert this unrelated change.






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-04-23 Thread via GitHub


chia7712 commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2056483125


##
raft/src/main/java/org/apache/kafka/raft/MetadataLogConfig.java:
##
@@ -85,14 +82,13 @@ public class MetadataLogConfig {
 .define(METADATA_SNAPSHOT_MAX_INTERVAL_MS_CONFIG, LONG, 
METADATA_SNAPSHOT_MAX_INTERVAL_MS_DEFAULT, atLeast(0), HIGH, 
METADATA_SNAPSHOT_MAX_INTERVAL_MS_DOC)
 .define(METADATA_LOG_DIR_CONFIG, STRING, null, null, HIGH, 
METADATA_LOG_DIR_DOC)
 .define(METADATA_LOG_SEGMENT_BYTES_CONFIG, INT, 
METADATA_LOG_SEGMENT_BYTES_DEFAULT, atLeast(Records.LOG_OVERHEAD), HIGH, 
METADATA_LOG_SEGMENT_BYTES_DOC)
-.defineInternal(METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG, INT, 
METADATA_LOG_SEGMENT_MIN_BYTES_DEFAULT, atLeast(Records.LOG_OVERHEAD), HIGH, 
METADATA_LOG_SEGMENT_MIN_BYTES_DOC)

Review Comment:
   @junrao I have discussed with @m1a2st offline, and he will update the PR 
tomorrow






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-04-21 Thread via GitHub


junrao commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2052715308


##
core/src/main/scala/kafka/raft/KafkaMetadataLog.scala:
##
@@ -585,32 +585,62 @@ object KafkaMetadataLog extends Logging {
 config: MetadataLogConfig,
 nodeId: Int
   ): KafkaMetadataLog = {
-val props = new Properties()
-props.setProperty(TopicConfig.MAX_MESSAGE_BYTES_CONFIG, 
config.maxBatchSizeInBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_BYTES_CONFIG, 
config.logSegmentBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_MS_CONFIG, 
config.logSegmentMillis.toString)
-props.setProperty(TopicConfig.FILE_DELETE_DELAY_MS_CONFIG, 
ServerLogConfigs.LOG_DELETE_DELAY_MS_DEFAULT.toString)
-
-// Disable time and byte retention when deleting segments
-props.setProperty(TopicConfig.RETENTION_MS_CONFIG, "-1")
-props.setProperty(TopicConfig.RETENTION_BYTES_CONFIG, "-1")
+val props: Properties = settingLogProperties(config)
 LogConfig.validate(props)
 val defaultLogConfig = new LogConfig(props)
 
-if (config.logSegmentBytes < config.logSegmentMinBytes) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${MetadataLogConfig.METADATA_LOG_SEGMENT_BYTES_CONFIG} 
below ${config.logSegmentMinBytes}: ${config.logSegmentBytes}"
-  )
-} else if (defaultLogConfig.retentionMs >= 0) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${TopicConfig.RETENTION_MS_CONFIG} above -1: 
${defaultLogConfig.retentionMs}."
-  )
-} else if (defaultLogConfig.retentionSize >= 0) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${TopicConfig.RETENTION_BYTES_CONFIG} above -1: 
${defaultLogConfig.retentionSize}."
-  )
+validateConfig(defaultLogConfig)
+
+val metadataLog: KafkaMetadataLog = createMetadataLog(topicPartition, 
topicId, dataDir, time, scheduler, config, nodeId, defaultLogConfig)
+
+// Print a warning if users have overridden the internal config
+if (config.logSegmentBytes() != KafkaRaftClient.MAX_BATCH_SIZE_BYTES) {

Review Comment:
   Hmm,  we don't need this if we add the constraint directly to 
METADATA_LOG_SEGMENT_BYTES_CONFIG, right?



##
raft/src/main/java/org/apache/kafka/raft/MetadataLogConfig.java:
##
@@ -112,15 +108,15 @@ public class MetadataLogConfig {
  * @param deleteDelayMillis The amount of time to wait before deleting a 
file from the filesystem
  */
 public MetadataLogConfig(int logSegmentBytes,
- int logSegmentMinBytes,
+ int internalLogSegmentMinBytes,

Review Comment:
   Hmm, this should be renamed to internalLogSegmentBytes since it's no longer 
the minimum, right?



##
raft/src/main/java/org/apache/kafka/raft/MetadataLogConfig.java:
##
@@ -85,14 +82,13 @@ public class MetadataLogConfig {
 .define(METADATA_SNAPSHOT_MAX_INTERVAL_MS_CONFIG, LONG, 
METADATA_SNAPSHOT_MAX_INTERVAL_MS_DEFAULT, atLeast(0), HIGH, 
METADATA_SNAPSHOT_MAX_INTERVAL_MS_DOC)
 .define(METADATA_LOG_DIR_CONFIG, STRING, null, null, HIGH, 
METADATA_LOG_DIR_DOC)
 .define(METADATA_LOG_SEGMENT_BYTES_CONFIG, INT, 
METADATA_LOG_SEGMENT_BYTES_DEFAULT, atLeast(Records.LOG_OVERHEAD), HIGH, 
METADATA_LOG_SEGMENT_BYTES_DOC)
-.defineInternal(METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG, INT, 
METADATA_LOG_SEGMENT_MIN_BYTES_DEFAULT, atLeast(Records.LOG_OVERHEAD), HIGH, 
METADATA_LOG_SEGMENT_MIN_BYTES_DOC)

Review Comment:
   We need to set the constraint for METADATA_LOG_SEGMENT_BYTES_CONFIG to be at 
least 8MB. 
   
   Also, I thought the plan is to remove METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG, 
but add sth like METADATA_INTERNAL_LOG_SEGMENT_BYTES_CONFIG to match the design 
in LogConfig?
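   As a sketch of the suggested constraint (the default and doc string mirror values quoted in this thread; the wiring is an assumption, not the final PR change):
   ```java
   import org.apache.kafka.common.config.ConfigDef;

   public class MetadataSegmentBytesConstraintSketch {
       public static ConfigDef configDef() {
           return new ConfigDef()
               .define("metadata.log.segment.bytes",
                       ConfigDef.Type.INT,
                       1024 * 1024 * 1024,                       // existing default (1 GiB)
                       ConfigDef.Range.atLeast(8 * 1024 * 1024), // suggested minimum (8 MiB)
                       ConfigDef.Importance.HIGH,
                       "The maximum size of a single metadata log file.");
       }
   }
   ```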



##
core/src/main/scala/kafka/raft/KafkaMetadataLog.scala:
##
@@ -583,35 +584,72 @@ object KafkaMetadataLog extends Logging {
 scheduler: Scheduler,
 config: MetadataLogConfig
   ): KafkaMetadataLog = {
-val props = new Properties()
-props.setProperty(TopicConfig.MAX_MESSAGE_BYTES_CONFIG, 
config.maxBatchSizeInBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_BYTES_CONFIG, 
config.logSegmentBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_MS_CONFIG, 
config.logSegmentMillis.toString)
-props.setProperty(TopicConfig.FILE_DELETE_DELAY_MS_CONFIG, 
ServerLogConfigs.LOG_DELETE_DELAY_MS_DEFAULT.toString)
-
-// Disable time and byte retention when deleting segments
-props.setProperty(TopicConfig.RETENTION_MS_CONFIG, "-1")
-props.setProperty(TopicConfig.RETENTION_BYTES_CONFIG, "-1")
+val props: Properties = settingLogProperties(config)
 LogConfig.validate(props)
 val defaultLogConfig = new LogConfig(props)
 
-if (config.logSegmentBytes < config.logSegmentMinBytes) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${KRaftConfigs.METADATA_LOG_S

Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-04-20 Thread via GitHub


m1a2st commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2051911610


##
core/src/main/scala/kafka/raft/KafkaMetadataLog.scala:
##
@@ -583,35 +584,72 @@ object KafkaMetadataLog extends Logging {
 scheduler: Scheduler,
 config: MetadataLogConfig
   ): KafkaMetadataLog = {
-val props = new Properties()
-props.setProperty(TopicConfig.MAX_MESSAGE_BYTES_CONFIG, 
config.maxBatchSizeInBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_BYTES_CONFIG, 
config.logSegmentBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_MS_CONFIG, 
config.logSegmentMillis.toString)
-props.setProperty(TopicConfig.FILE_DELETE_DELAY_MS_CONFIG, 
ServerLogConfigs.LOG_DELETE_DELAY_MS_DEFAULT.toString)
-
-// Disable time and byte retention when deleting segments
-props.setProperty(TopicConfig.RETENTION_MS_CONFIG, "-1")
-props.setProperty(TopicConfig.RETENTION_BYTES_CONFIG, "-1")
+val props: Properties = settingLogProperties(config)
 LogConfig.validate(props)
 val defaultLogConfig = new LogConfig(props)
 
-if (config.logSegmentBytes < config.logSegmentMinBytes) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${KRaftConfigs.METADATA_LOG_SEGMENT_BYTES_CONFIG} below 
${config.logSegmentMinBytes}: ${config.logSegmentBytes}"
-  )
-} else if (defaultLogConfig.retentionMs >= 0) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${TopicConfig.RETENTION_MS_CONFIG} above -1: 
${defaultLogConfig.retentionMs}."
-  )
-} else if (defaultLogConfig.retentionSize >= 0) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${TopicConfig.RETENTION_BYTES_CONFIG} above -1: 
${defaultLogConfig.retentionSize}."
-  )
+validateConfig(config, defaultLogConfig)
+
+val metadataLog: KafkaMetadataLog = createKafkaMetadataLog(topicPartition, 
topicId, dataDir, time, scheduler, config, defaultLogConfig)
+
+printWarningMessage(config, metadataLog)
+
+// When recovering, truncate fully if the latest snapshot is after the log 
end offset. This can happen to a follower
+// when the follower crashes after downloading a snapshot from the leader 
but before it could truncate the log fully.
+metadataLog.truncateToLatestSnapshot()
+
+metadataLog
+  }
+
+  private def printWarningMessage(config: MetadataLogConfig, metadataLog: 
KafkaMetadataLog): Unit = {
+// Print a warning if users have overridden the internal config
+if (config.logSegmentMinBytes != KafkaRaftClient.MAX_BATCH_SIZE_BYTES) {
+  metadataLog.error(s"Overriding 
${KRaftConfigs.METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG} is only supported for 
testing. Setting " +
+s"this value too low may lead to an inability to write batches of 
metadata records.")
 }
+  }
+
+  // visible for testing
+  def internalApply(

Review Comment:
   Thanks @chia7712 and @junrao for the comments, addressed them :)






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-04-17 Thread via GitHub


chia7712 commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2049095738


##
core/src/main/scala/kafka/raft/KafkaMetadataLog.scala:
##
@@ -583,35 +584,72 @@ object KafkaMetadataLog extends Logging {
 scheduler: Scheduler,
 config: MetadataLogConfig
   ): KafkaMetadataLog = {
-val props = new Properties()
-props.setProperty(TopicConfig.MAX_MESSAGE_BYTES_CONFIG, 
config.maxBatchSizeInBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_BYTES_CONFIG, 
config.logSegmentBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_MS_CONFIG, 
config.logSegmentMillis.toString)
-props.setProperty(TopicConfig.FILE_DELETE_DELAY_MS_CONFIG, 
ServerLogConfigs.LOG_DELETE_DELAY_MS_DEFAULT.toString)
-
-// Disable time and byte retention when deleting segments
-props.setProperty(TopicConfig.RETENTION_MS_CONFIG, "-1")
-props.setProperty(TopicConfig.RETENTION_BYTES_CONFIG, "-1")
+val props: Properties = settingLogProperties(config)
 LogConfig.validate(props)
 val defaultLogConfig = new LogConfig(props)
 
-if (config.logSegmentBytes < config.logSegmentMinBytes) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${KRaftConfigs.METADATA_LOG_SEGMENT_BYTES_CONFIG} below 
${config.logSegmentMinBytes}: ${config.logSegmentBytes}"
-  )
-} else if (defaultLogConfig.retentionMs >= 0) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${TopicConfig.RETENTION_MS_CONFIG} above -1: 
${defaultLogConfig.retentionMs}."
-  )
-} else if (defaultLogConfig.retentionSize >= 0) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${TopicConfig.RETENTION_BYTES_CONFIG} above -1: 
${defaultLogConfig.retentionSize}."
-  )
+validateConfig(config, defaultLogConfig)
+
+val metadataLog: KafkaMetadataLog = createKafkaMetadataLog(topicPartition, 
topicId, dataDir, time, scheduler, config, defaultLogConfig)
+
+printWarningMessage(config, metadataLog)
+
+// When recovering, truncate fully if the latest snapshot is after the log 
end offset. This can happen to a follower
+// when the follower crashes after downloading a snapshot from the leader 
but before it could truncate the log fully.
+metadataLog.truncateToLatestSnapshot()
+
+metadataLog
+  }
+
+  private def printWarningMessage(config: MetadataLogConfig, metadataLog: 
KafkaMetadataLog): Unit = {
+// Print a warning if users have overridden the internal config
+if (config.logSegmentMinBytes != KafkaRaftClient.MAX_BATCH_SIZE_BYTES) {
+  metadataLog.error(s"Overriding 
${KRaftConfigs.METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG} is only supported for 
testing. Setting " +
+s"this value too low may lead to an inability to write batches of 
metadata records.")
 }
+  }
+
+  // visible for testing
+  def internalApply(

Review Comment:
   Personally, I prefer the approach of adding an "internal.xxx" config, as it provides a better user experience for public configs, allowing users to see the "correct" min value. Additionally, we can remove the customized validation logic.
   
   In short, I suggest adding the following changes to this PR.
   
   1. remove `METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG`
   2. remove `MetadataLogConfig#logSegmentMinBytes`
   3. add `internal.metadata.log.segment.bytes`
   4. customize `MetadataLogConfig#logSegmentBytes` as in the following code
   ```java
   public int logSegmentBytes() {
       if (internalLogSegmentBytes != null) return internalLogSegmentBytes;
       return logSegmentBytes;
   }
   ```






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-04-15 Thread via GitHub


junrao commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2046016358


##
core/src/main/scala/kafka/raft/KafkaMetadataLog.scala:
##
@@ -583,35 +584,72 @@ object KafkaMetadataLog extends Logging {
 scheduler: Scheduler,
 config: MetadataLogConfig
   ): KafkaMetadataLog = {
-val props = new Properties()
-props.setProperty(TopicConfig.MAX_MESSAGE_BYTES_CONFIG, 
config.maxBatchSizeInBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_BYTES_CONFIG, 
config.logSegmentBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_MS_CONFIG, 
config.logSegmentMillis.toString)
-props.setProperty(TopicConfig.FILE_DELETE_DELAY_MS_CONFIG, 
ServerLogConfigs.LOG_DELETE_DELAY_MS_DEFAULT.toString)
-
-// Disable time and byte retention when deleting segments
-props.setProperty(TopicConfig.RETENTION_MS_CONFIG, "-1")
-props.setProperty(TopicConfig.RETENTION_BYTES_CONFIG, "-1")
+val props: Properties = settingLogProperties(config)
 LogConfig.validate(props)
 val defaultLogConfig = new LogConfig(props)
 
-if (config.logSegmentBytes < config.logSegmentMinBytes) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${KRaftConfigs.METADATA_LOG_SEGMENT_BYTES_CONFIG} below 
${config.logSegmentMinBytes}: ${config.logSegmentBytes}"
-  )
-} else if (defaultLogConfig.retentionMs >= 0) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${TopicConfig.RETENTION_MS_CONFIG} above -1: 
${defaultLogConfig.retentionMs}."
-  )
-} else if (defaultLogConfig.retentionSize >= 0) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${TopicConfig.RETENTION_BYTES_CONFIG} above -1: 
${defaultLogConfig.retentionSize}."
-  )
+validateConfig(config, defaultLogConfig)
+
+val metadataLog: KafkaMetadataLog = createKafkaMetadataLog(topicPartition, 
topicId, dataDir, time, scheduler, config, defaultLogConfig)
+
+printWarningMessage(config, metadataLog)
+
+// When recovering, truncate fully if the latest snapshot is after the log 
end offset. This can happen to a follower
+// when the follower crashes after downloading a snapshot from the leader 
but before it could truncate the log fully.
+metadataLog.truncateToLatestSnapshot()
+
+metadataLog
+  }
+
+  private def printWarningMessage(config: MetadataLogConfig, metadataLog: 
KafkaMetadataLog): Unit = {
+// Print a warning if users have overridden the internal config
+if (config.logSegmentMinBytes != KafkaRaftClient.MAX_BATCH_SIZE_BYTES) {
+  metadataLog.error(s"Overriding 
${KRaftConfigs.METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG} is only supported for 
testing. Setting " +
+s"this value too low may lead to an inability to write batches of 
metadata records.")
 }
+  }
+
+  // visible for testing
+  def internalApply(

Review Comment:
   I am just saying that we now have two different ways to achieve the same 
goal. In the metadata log approach, you set the desired value through the 
original config, which is segment.bytes. You then set an internal config to 
change the min constraint.
   
   The approach in this PR is to set the desired value through a different 
internal config. 
   
   It would be useful to choose the same approach for both the metadata log and the regular log.
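   To make the contrast concrete, a small sketch of the two styles (the property names are taken from this thread; the values are arbitrary test sizes):
   ```java
   import java.util.Properties;

   public class SegmentOverrideStylesSketch {
       public static void main(String[] args) {
           // Metadata log style: set the public config small and relax the enforced
           // minimum through the internal metadata.log.segment.min.bytes config.
           Properties metadataStyle = new Properties();
           metadataStyle.setProperty("metadata.log.segment.bytes", "1048576");
           metadataStyle.setProperty("metadata.log.segment.min.bytes", "1048576");

           // This PR's style for regular logs: leave segment.bytes untouched and set
           // the desired small value through a separate internal config.
           Properties regularLogStyle = new Properties();
           regularLogStyle.setProperty("internal.segment.bytes", "1048576");

           System.out.println(metadataStyle);
           System.out.println(regularLogStyle);
       }
   }
   ```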






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-04-15 Thread via GitHub


chia7712 commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2045952460


##
core/src/main/scala/kafka/raft/KafkaMetadataLog.scala:
##
@@ -583,35 +584,72 @@ object KafkaMetadataLog extends Logging {
 scheduler: Scheduler,
 config: MetadataLogConfig
   ): KafkaMetadataLog = {
-val props = new Properties()
-props.setProperty(TopicConfig.MAX_MESSAGE_BYTES_CONFIG, 
config.maxBatchSizeInBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_BYTES_CONFIG, 
config.logSegmentBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_MS_CONFIG, 
config.logSegmentMillis.toString)
-props.setProperty(TopicConfig.FILE_DELETE_DELAY_MS_CONFIG, 
ServerLogConfigs.LOG_DELETE_DELAY_MS_DEFAULT.toString)
-
-// Disable time and byte retention when deleting segments
-props.setProperty(TopicConfig.RETENTION_MS_CONFIG, "-1")
-props.setProperty(TopicConfig.RETENTION_BYTES_CONFIG, "-1")
+val props: Properties = settingLogProperties(config)
 LogConfig.validate(props)
 val defaultLogConfig = new LogConfig(props)
 
-if (config.logSegmentBytes < config.logSegmentMinBytes) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${KRaftConfigs.METADATA_LOG_SEGMENT_BYTES_CONFIG} below 
${config.logSegmentMinBytes}: ${config.logSegmentBytes}"
-  )
-} else if (defaultLogConfig.retentionMs >= 0) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${TopicConfig.RETENTION_MS_CONFIG} above -1: 
${defaultLogConfig.retentionMs}."
-  )
-} else if (defaultLogConfig.retentionSize >= 0) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${TopicConfig.RETENTION_BYTES_CONFIG} above -1: 
${defaultLogConfig.retentionSize}."
-  )
+validateConfig(config, defaultLogConfig)
+
+val metadataLog: KafkaMetadataLog = createKafkaMetadataLog(topicPartition, 
topicId, dataDir, time, scheduler, config, defaultLogConfig)
+
+printWarningMessage(config, metadataLog)
+
+// When recovering, truncate fully if the latest snapshot is after the log 
end offset. This can happen to a follower
+// when the follower crashes after downloading a snapshot from the leader 
but before it could truncate the log fully.
+metadataLog.truncateToLatestSnapshot()
+
+metadataLog
+  }
+
+  private def printWarningMessage(config: MetadataLogConfig, metadataLog: 
KafkaMetadataLog): Unit = {
+// Print a warning if users have overridden the internal config
+if (config.logSegmentMinBytes != KafkaRaftClient.MAX_BATCH_SIZE_BYTES) {
+  metadataLog.error(s"Overriding 
${KRaftConfigs.METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG} is only supported for 
testing. Setting " +
+s"this value too low may lead to an inability to write batches of 
metadata records.")
 }
+  }
+
+  // visible for testing
+  def internalApply(

Review Comment:
   Excuse me, the strategy used by the metadata log is to add an "internal" config (`METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG`) to change the (metadata) segment size in testing, and that is what we want to address in this PR - we add an "internal" config for the regular log, so that tests can use a "smaller" segment size.
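   For example, a test-only sketch of that idea (the 1 KiB value and the direct `LogConfig` construction are assumptions for illustration):
   ```java
   import java.util.Properties;

   import org.apache.kafka.storage.internals.log.LogConfig;

   public class SmallSegmentLogConfigSketch {
       // Build a LogConfig whose effective segment size comes from the internal,
       // test-only override rather than the public segment.bytes config.
       public static LogConfig smallSegmentConfig() {
           Properties props = new Properties();
           props.setProperty("internal.segment.bytes", "1024");
           return new LogConfig(props);
       }
   }
   ```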






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-04-15 Thread via GitHub


junrao commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2045280340


##
core/src/main/scala/kafka/raft/KafkaMetadataLog.scala:
##
@@ -583,35 +584,72 @@ object KafkaMetadataLog extends Logging {
 scheduler: Scheduler,
 config: MetadataLogConfig
   ): KafkaMetadataLog = {
-val props = new Properties()
-props.setProperty(TopicConfig.MAX_MESSAGE_BYTES_CONFIG, 
config.maxBatchSizeInBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_BYTES_CONFIG, 
config.logSegmentBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_MS_CONFIG, 
config.logSegmentMillis.toString)
-props.setProperty(TopicConfig.FILE_DELETE_DELAY_MS_CONFIG, 
ServerLogConfigs.LOG_DELETE_DELAY_MS_DEFAULT.toString)
-
-// Disable time and byte retention when deleting segments
-props.setProperty(TopicConfig.RETENTION_MS_CONFIG, "-1")
-props.setProperty(TopicConfig.RETENTION_BYTES_CONFIG, "-1")
+val props: Properties = settingLogProperties(config)
 LogConfig.validate(props)
 val defaultLogConfig = new LogConfig(props)
 
-if (config.logSegmentBytes < config.logSegmentMinBytes) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${KRaftConfigs.METADATA_LOG_SEGMENT_BYTES_CONFIG} below 
${config.logSegmentMinBytes}: ${config.logSegmentBytes}"
-  )
-} else if (defaultLogConfig.retentionMs >= 0) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${TopicConfig.RETENTION_MS_CONFIG} above -1: 
${defaultLogConfig.retentionMs}."
-  )
-} else if (defaultLogConfig.retentionSize >= 0) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${TopicConfig.RETENTION_BYTES_CONFIG} above -1: 
${defaultLogConfig.retentionSize}."
-  )
+validateConfig(config, defaultLogConfig)
+
+val metadataLog: KafkaMetadataLog = createKafkaMetadataLog(topicPartition, 
topicId, dataDir, time, scheduler, config, defaultLogConfig)
+
+printWarningMessage(config, metadataLog)
+
+// When recovering, truncate fully if the latest snapshot is after the log 
end offset. This can happen to a follower
+// when the follower crashes after downloading a snapshot from the leader 
but before it could truncate the log fully.
+metadataLog.truncateToLatestSnapshot()
+
+metadataLog
+  }
+
+  private def printWarningMessage(config: MetadataLogConfig, metadataLog: 
KafkaMetadataLog): Unit = {
+// Print a warning if users have overridden the internal config
+if (config.logSegmentMinBytes != KafkaRaftClient.MAX_BATCH_SIZE_BYTES) {
+  metadataLog.error(s"Overriding 
${KRaftConfigs.METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG} is only supported for 
testing. Setting " +
+s"this value too low may lead to an inability to write batches of 
metadata records.")
 }
+  }
+
+  // visible for testing
+  def internalApply(

Review Comment:
   I just realized that metadata log uses a different approach to allow tests 
to use a smaller segment bytes than allowed in production. That approach 
defines the original segment byte config with a small minimal requirement, but 
adds METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG to enforce the actual minimal 
requirement in production. This new config could be changed in tests to allow 
for smaller minimal bytes. The benefit of this approach is that it allows the 
existing config to be used directly to set a smaller value for tests. The 
downside is that the doc for min value is inaccurate and the validation is done 
through a customized logic.
   
   It would be useful to pick the same strategy for both the metadata log and the regular log. The metadata log approach seems slightly better since it's less intrusive. We could fix the inaccurate min value description for production somehow.






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-04-15 Thread via GitHub


chia7712 commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2045151038


##
core/src/main/scala/kafka/raft/KafkaMetadataLog.scala:
##
@@ -583,35 +584,72 @@ object KafkaMetadataLog extends Logging {
 scheduler: Scheduler,
 config: MetadataLogConfig
   ): KafkaMetadataLog = {
-val props = new Properties()
-props.setProperty(TopicConfig.MAX_MESSAGE_BYTES_CONFIG, 
config.maxBatchSizeInBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_BYTES_CONFIG, 
config.logSegmentBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_MS_CONFIG, 
config.logSegmentMillis.toString)
-props.setProperty(TopicConfig.FILE_DELETE_DELAY_MS_CONFIG, 
ServerLogConfigs.LOG_DELETE_DELAY_MS_DEFAULT.toString)
-
-// Disable time and byte retention when deleting segments
-props.setProperty(TopicConfig.RETENTION_MS_CONFIG, "-1")
-props.setProperty(TopicConfig.RETENTION_BYTES_CONFIG, "-1")
+val props: Properties = settingLogProperties(config)
 LogConfig.validate(props)
 val defaultLogConfig = new LogConfig(props)
 
-if (config.logSegmentBytes < config.logSegmentMinBytes) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${KRaftConfigs.METADATA_LOG_SEGMENT_BYTES_CONFIG} below 
${config.logSegmentMinBytes}: ${config.logSegmentBytes}"
-  )
-} else if (defaultLogConfig.retentionMs >= 0) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${TopicConfig.RETENTION_MS_CONFIG} above -1: 
${defaultLogConfig.retentionMs}."
-  )
-} else if (defaultLogConfig.retentionSize >= 0) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${TopicConfig.RETENTION_BYTES_CONFIG} above -1: 
${defaultLogConfig.retentionSize}."
-  )
+validateConfig(config, defaultLogConfig)
+
+val metadataLog: KafkaMetadataLog = createKafkaMetadataLog(topicPartition, 
topicId, dataDir, time, scheduler, config, defaultLogConfig)
+
+printWarningMessage(config, metadataLog)
+
+// When recovering, truncate fully if the latest snapshot is after the log 
end offset. This can happen to a follower
+// when the follower crashes after downloading a snapshot from the leader 
but before it could truncate the log fully.
+metadataLog.truncateToLatestSnapshot()
+
+metadataLog
+  }
+
+  private def printWarningMessage(config: MetadataLogConfig, metadataLog: 
KafkaMetadataLog): Unit = {
+// Print a warning if users have overridden the internal config
+if (config.logSegmentMinBytes != KafkaRaftClient.MAX_BATCH_SIZE_BYTES) {
+  metadataLog.error(s"Overriding 
${KRaftConfigs.METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG} is only supported for 
testing. Setting " +
+s"this value too low may lead to an inability to write batches of 
metadata records.")
 }
+  }
+
+  // visible for testing
+  def internalApply(

Review Comment:
   +1 to moving the internal config to `MetadataLogConfig`, and it would be better to wait for #19465, which extracts the metadata-related configs from other classes into `MetadataLogConfig`.






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-04-15 Thread via GitHub


junrao commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2043103248


##
storage/src/main/java/org/apache/kafka/storage/internals/log/LogConfig.java:
##
@@ -186,12 +188,13 @@ public Optional serverConfigName(String 
configName) {
 .define(ServerLogConfigs.CREATE_TOPIC_POLICY_CLASS_NAME_CONFIG, 
CLASS, null, LOW, ServerLogConfigs.CREATE_TOPIC_POLICY_CLASS_NAME_DOC)
 .define(ServerLogConfigs.ALTER_CONFIG_POLICY_CLASS_NAME_CONFIG, 
CLASS, null, LOW, ServerLogConfigs.ALTER_CONFIG_POLICY_CLASS_NAME_DOC)
 .define(ServerLogConfigs.LOG_DIR_FAILURE_TIMEOUT_MS_CONFIG, LONG, 
ServerLogConfigs.LOG_DIR_FAILURE_TIMEOUT_MS_DEFAULT, atLeast(1), LOW, 
ServerLogConfigs.LOG_DIR_FAILURE_TIMEOUT_MS_DOC)
-.defineInternal(ServerLogConfigs.LOG_INITIAL_TASK_DELAY_MS_CONFIG, 
LONG, ServerLogConfigs.LOG_INITIAL_TASK_DELAY_MS_DEFAULT, atLeast(0), LOW, 
ServerLogConfigs.LOG_INITIAL_TASK_DELAY_MS_DOC);
+.defineInternal(ServerLogConfigs.LOG_INITIAL_TASK_DELAY_MS_CONFIG, 
LONG, ServerLogConfigs.LOG_INITIAL_TASK_DELAY_MS_DEFAULT, atLeast(0), LOW, 
ServerLogConfigs.LOG_INITIAL_TASK_DELAY_MS_DOC)
+.defineInternal(LogConfig.INTERNAL_SEGMENT_BYTES_CONFIG, INT, 
null, null, LOW, ServerLogConfigs.LOG_SEGMENT_BYTES_DOC);

Review Comment:
   Could we document that this config is for testing?



##
core/src/main/scala/kafka/raft/KafkaMetadataLog.scala:
##
@@ -583,35 +584,72 @@ object KafkaMetadataLog extends Logging {
 scheduler: Scheduler,
 config: MetadataLogConfig
   ): KafkaMetadataLog = {
-val props = new Properties()
-props.setProperty(TopicConfig.MAX_MESSAGE_BYTES_CONFIG, 
config.maxBatchSizeInBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_BYTES_CONFIG, 
config.logSegmentBytes.toString)
-props.setProperty(TopicConfig.SEGMENT_MS_CONFIG, 
config.logSegmentMillis.toString)
-props.setProperty(TopicConfig.FILE_DELETE_DELAY_MS_CONFIG, 
ServerLogConfigs.LOG_DELETE_DELAY_MS_DEFAULT.toString)
-
-// Disable time and byte retention when deleting segments
-props.setProperty(TopicConfig.RETENTION_MS_CONFIG, "-1")
-props.setProperty(TopicConfig.RETENTION_BYTES_CONFIG, "-1")
+val props: Properties = settingLogProperties(config)
 LogConfig.validate(props)
 val defaultLogConfig = new LogConfig(props)
 
-if (config.logSegmentBytes < config.logSegmentMinBytes) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${KRaftConfigs.METADATA_LOG_SEGMENT_BYTES_CONFIG} below 
${config.logSegmentMinBytes}: ${config.logSegmentBytes}"
-  )
-} else if (defaultLogConfig.retentionMs >= 0) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${TopicConfig.RETENTION_MS_CONFIG} above -1: 
${defaultLogConfig.retentionMs}."
-  )
-} else if (defaultLogConfig.retentionSize >= 0) {
-  throw new InvalidConfigurationException(
-s"Cannot set ${TopicConfig.RETENTION_BYTES_CONFIG} above -1: 
${defaultLogConfig.retentionSize}."
-  )
+validateConfig(config, defaultLogConfig)
+
+val metadataLog: KafkaMetadataLog = createKafkaMetadataLog(topicPartition, 
topicId, dataDir, time, scheduler, config, defaultLogConfig)
+
+printWarningMessage(config, metadataLog)
+
+// When recovering, truncate fully if the latest snapshot is after the log 
end offset. This can happen to a follower
+// when the follower crashes after downloading a snapshot from the leader 
but before it could truncate the log fully.
+metadataLog.truncateToLatestSnapshot()
+
+metadataLog
+  }
+
+  private def printWarningMessage(config: MetadataLogConfig, metadataLog: 
KafkaMetadataLog): Unit = {
+// Print a warning if users have overridden the internal config
+if (config.logSegmentMinBytes != KafkaRaftClient.MAX_BATCH_SIZE_BYTES) {
+  metadataLog.error(s"Overriding 
${KRaftConfigs.METADATA_LOG_SEGMENT_MIN_BYTES_CONFIG} is only supported for 
testing. Setting " +
+s"this value too low may lead to an inability to write batches of 
metadata records.")
 }
+  }
+
+  // visible for testing
+  def internalApply(

Review Comment:
   Instead of having an `internalApply`, could we just use the existing `apply` 
and add `INTERNAL_SEGMENT_BYTES_CONFIG` to `MetadataLogConfig`? 






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-04-15 Thread via GitHub


chia7712 commented on PR #19371:
URL: https://github.com/apache/kafka/pull/19371#issuecomment-2806799851

   @m1a2st could you please check the failed tests?





Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-04-14 Thread via GitHub


chia7712 commented on code in PR #19371:
URL: https://github.com/apache/kafka/pull/19371#discussion_r2042115495


##
clients/src/main/java/org/apache/kafka/common/config/TopicConfig.java:
##
@@ -214,6 +214,9 @@ public class TopicConfig {
 "configuration. If message.timestamp.type=CreateTime, the message will 
be rejected if the difference in " +
 "timestamps exceeds this specified threshold. This configuration is 
ignored if message.timestamp.type=LogAppendTime.";
 
+// visible for testing 
+public static final String INTERNAL_SEGMENT_BYTES_CONFIG = 
"internal.segment.bytes";

Review Comment:
   `TopicConfig` is a public class, so could you please move it to `LogConfig`?






Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-04-11 Thread via GitHub


github-actions[bot] commented on PR #19371:
URL: https://github.com/apache/kafka/pull/19371#issuecomment-2798418146

   A label of 'needs-attention' was automatically added to this PR in order to 
raise the
   attention of the committers. Once this issue has been triaged, the `triage` 
label
   should be removed to prevent this automation from happening again.





Re: [PR] KAFKA-19080 The constraint on segment.ms is not enforced at topic level [kafka]

2025-04-11 Thread via GitHub


m1a2st commented on PR #19371:
URL: https://github.com/apache/kafka/pull/19371#issuecomment-2796373720

   Thanks @junrao and @chia7712 for the comments. This PR is ready for review, PTAL.

