Hi Ying,

> Because only inactive segments can be shipped to remote storage, to be
> able to ship log data as soon as possible, we will roll log segments very
> fast (e.g. every half hour).
>

So that means a consumer which gets behind by half an hour will find its
reads being served from remote storage. And, if I understand the proposed
algorithm correctly, each such consumer fetch could result in a separate
fetch request to the remote storage. That is, there's no mechanism to
amortize the cost of the fetching between multiple consumers fetching
similar ranges?
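To make the concern concrete, one way a broker could amortize these reads is request coalescing: concurrent fetches for the same remote segment range share a single remote read. This is purely a sketch of the idea, not anything from the KIP; the class and method names are my own invention.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch: coalesce concurrent consumer fetches for the same
// remote segment/offset into one remote-storage read. Illustrative only.
class CoalescingRemoteReader {
    // Keyed by segment + start offset; value is the single in-flight fetch.
    private final Map<String, CompletableFuture<byte[]>> inFlight =
            new ConcurrentHashMap<>();

    byte[] read(String segment, long startOffset, Supplier<byte[]> remoteFetch) {
        String key = segment + "@" + startOffset;
        // computeIfAbsent is atomic, so concurrent callers for the same
        // range all attach to one CompletableFuture.
        CompletableFuture<byte[]> f = inFlight.computeIfAbsent(
                key, k -> CompletableFuture.supplyAsync(remoteFetch));
        try {
            return f.join(); // every waiter shares the single fetch result
        } finally {
            inFlight.remove(key, f); // later fetches trigger a fresh read
        }
    }
}
```

With something like this, ten consumers lagging by the same half hour would cost one remote read per segment range rather than ten.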

(Actually the doc for RemoteStorageManager.read() says "It will read at
least one batch, if the 1st batch size is larger than maxBytes.". Does that
mean the broker might have to retry with increased maxBytes if the first
request fails to read a batch? If so, how does it know how much to increase
maxBytes by?)

Thanks,

Tom
