Hey guys,

Regarding num.recovery.threads.per.data.dir: I agree. At our company we set
it to the number of vCPUs, since log recovery runs before the broker is
ready and therefore does not compete with live cluster traffic.
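
As a rough sketch of that approach, the snippet below derives the value from
the host's CPU count and prints the matching server.properties line. The
helper name `recovery_threads_override` is hypothetical and only for
illustration, and it assumes `os.cpu_count()` reflects the vCPUs actually
available to the broker.

```python
import os


def recovery_threads_override(vcpus=None):
    """Build a server.properties line setting recovery threads to the vCPU count.

    Hypothetical helper for illustration only. If vcpus is not given, it
    falls back to os.cpu_count(), which is an assumption about how the
    broker host reports its vCPUs.
    """
    if vcpus is None:
        vcpus = os.cpu_count() or 1  # cpu_count() may return None
    if vcpus < 1:
        raise ValueError("vcpus must be >= 1")
    # The setting applies per data directory, as its name says.
    return f"num.recovery.threads.per.data.dir={vcpus}"


if __name__ == "__main__":
    print(recovery_threads_override())
```

Since the setting is per data directory, a broker with several log
directories would start that many threads for each of them.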


On Wed, 13 Mar 2024 at 09:29, Luke Chen <show...@gmail.com> wrote:

> Hi Divij,
>
> Thanks for raising this.
> The valid minimum value 1 for `segment.ms` is completely unreasonable.
> Similarly for `segment.bytes`, `metadata.log.segment.ms`,
> `metadata.log.segment.bytes`.
>
> In addition to that, there are also some config default values we'd like to
> propose changing in v4.0.
> We can collect more comments from the community, and come up with a KIP
> for them.
>
> 1. num.recovery.threads.per.data.dir:
> The current default value is 1. But log recovery happens before brokers
> are in the ready state, which means we should use all the available
> resources to speed up log recovery and bring the broker to the ready state
> sooner. Default value should be... maybe 4 (to be decided)?
>
> 2. Other configs whose defaults we might consider changing, but open
> for comments:
>    2.1. num.replica.fetchers: default is 1, but that's not enough when
> there are multiple partitions in the cluster
>    2.2. `socket.send.buffer.bytes`/`socket.receive.buffer.bytes`:
> Currently, we set 100 KB as the default value, but that's not enough for
> high-speed networks.
>
> Thank you.
> Luke
>
>
> On Tue, Mar 12, 2024 at 1:32 AM Divij Vaidya <divijvaidy...@gmail.com>
> wrote:
>
> > Hey folks
> >
> > Before I file a KIP to change this in 4.0, I wanted to understand the
> > historical context for the value of the following setting.
> >
> > Currently, segment.ms minimum threshold is set to 1ms [1].
> >
> > Segments are expensive. Every segment uses multiple file descriptors and
> > it's easy to run out of OS limits when creating a large number of
> > segments.
> > A large number of segments also delays log loading on startup because of
> > expensive operations such as iterating through all directories &
> > conditionally loading all producer state.
> >
> > I am currently not aware of a reason why someone might want to work
> > with a segment.ms of less than ~10s (a number chosen arbitrarily that
> > looks sane).
> >
> > What was the historical context of setting the minimum threshold to 1ms
> > for this setting?
> >
> > [1] https://kafka.apache.org/documentation.html#topicconfigs_segment.ms
> >
> > --
> > Divij Vaidya
> >
>
