Hi,

Thanks for taking this initiative forward. I'd like to propose that the table also list the current default values and min/max constraints that remain unchanged. In my opinion this would improve the readability of the whole document: with those values listed, even when unchanged, readers can check whether the defaults and the min/max constraints are consistent with each other.
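To illustrate the kind of check such a table would make possible, here is a minimal sketch. The class and method names are hypothetical and not part of Kafka. The segment.index.bytes row uses the proposed minimum of 8 and the 10 megabyte default; its maximum is my assumption (the int range), and the second row is made up purely to show a failing case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Kafka: flags table rows whose default
// value falls outside the listed min/max constraints.
final class ConstraintTableCheck {

    // One row of the proposed table: setting name, min, max, default.
    static final class Row {
        final String name;
        final long min;
        final long max;
        final long defaultValue;

        Row(String name, long min, long max, long defaultValue) {
            this.name = name;
            this.min = min;
            this.max = max;
            this.defaultValue = defaultValue;
        }

        // A row is in sync when min <= default <= max.
        boolean inSync() {
            return min <= defaultValue && defaultValue <= max;
        }
    }

    // Returns the names of all rows whose default violates its own limits.
    static List<String> outOfSync(List<Row> rows) {
        List<String> names = new ArrayList<>();
        for (Row row : rows) {
            if (!row.inSync()) {
                names.add(row.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
            // Proposed minimum of 8 and the 10 MB default; the max is an
            // assumption (int range), since no explicit max is listed.
            new Row("segment.index.bytes", 8L, Integer.MAX_VALUE, 10L * 1024 * 1024),
            // Made-up row whose default lies below its minimum.
            new Row("example.setting", 100L, 1000L, 50L));
        System.out.println(outOfSync(rows));
    }
}
```

With these two rows, only the made-up setting is reported. The same loop could be run over every row of the table once the unchanged values are listed.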
For example, the default value of the segment.index.bytes setting mentioned earlier is 10 megabytes. The newly proposed minimum is 8, because of the code I pointed out. However, in my opinion the new constraints should also be meaningful in practice. I'd support a segment.index.bytes minimum of at least several kilobytes, maybe even 1 megabyte. With current hardware, such values are much better aligned with real-world use; a segment index of 8 bytes is not meaningful.

For segment.bytes, I believe the new minimum must in any case be strictly greater than max.message.bytes, which is 1 megabyte by default. This avoids replication issues during rebalancing or when replicating after the loss of a node: a new broker won't be able to digest messages written to a topic if segment.bytes is smaller than the largest existing message in a partition. I've tried to describe such problems separately in the context of RecordBatchTooLargeException.

On Wed, Oct 30, 2024 at 5:21 PM Divij Vaidya <[email protected]> wrote:

> Thanks Tommi.
>
> I have again started documenting these changes in
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1030%3A+Change+constraints+and+default+values+for+various+configurations
> and will try to get this KIP out of draft stage next week (so that we can
> hit the 4.0 KIP freeze timeline of 20th Nov).
>
> --
> Divij Vaidya
>
>
>
> On Wed, Oct 30, 2024 at 12:46 PM Tommi Vainikainen
> <[email protected]> wrote:
>
> > Hi,
> >
> > I've noticed that similar to these already mentioned settings also
> > segment.index.bytes has a minimum value of 4. This conflicts with
> > OffsetIndex, which throws `java.lang.IllegalArgumentException: Invalid max
> > index size: 4` with such settings, because hard-coded entry size in
> > OffsetIndex is 8. Setting segment.index.bytes to less than 8 leads to
> > errors.
> >
> > On Mon, Mar 11, 2024 at 7:33 PM Divij Vaidya <[email protected]>
> > wrote:
> >
> > > Hey folks
> > >
> > > Before I file a KIP to change this in 4.0, I wanted to understand the
> > > historical context for the value of the following setting.
> > >
> > > Currently, segment.ms minimum threshold is set to 1ms [1].
> > >
> > > Segments are expensive. Every segment uses multiple file descriptors and
> > > it's easy to run out of OS limits when creating a large number of segments.
> > > Large number of segments also delays log loading on startup because of
> > > expensive operations such as iterating through all directories &
> > > conditionally loading all producer state.
> > >
> > > I am currently not aware of a reason as to why someone might want to work
> > > with a segment.ms of less than ~10s (number chosen arbitrary that looks
> > > sane)
> > >
> > > What was the historical context of setting the minimum threshold to 1ms for
> > > this setting?
> > >
> > > [1] https://kafka.apache.org/documentation.html#topicconfigs_segment.ms
> > >
> > > --
> > > Divij Vaidya
