> Offset and Transaction indexes are probably the only ones that make sense
> to cache, as they are used on every fetch.

I do not think (correct me if I am wrong) that the transaction index is
used on every fetch. It is only used when the fetch response has to include
the aborted transactions [1], i.e. when consumers use the "read_committed"
isolation level. Also, note that in such a situation, we retrieve the
transaction index possibly for all log segments past the fetchOffset until
the end offset (or until LSO) on every fetch [2]. Hence, fetching the
transaction index for the first segment efficiently is nice, but it is not
going to make a major difference, since the overall latency will be
dominated by the sequential calls to RSM to fetch the trx index for the
other segments.

IMO the best path forward is to implement an "intelligent index fetch from
remote" which determines which indexes to fetch, and how much of each, based
on signals such as the fetch request args. For example, if the
read_committed isolation level is required, we can fetch multiple trx
indexes in parallel instead of sequentially (as is done today). We can also
choose to fetch the time index and the offset index in parallel. But this
approach assumes that the RSM can support parallel fetches and that they are
not expensive, which might not be true depending on the plugin. That is why
I think it's best if we leave it up to the RSM to determine how much and
which index to fetch based on heuristics.
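
To make the parallel-fetch idea concrete, here is a rough sketch of what it
could look like. It is not how RemoteLogManager works today, and the names
ParallelTxnIndexFetchSketch, TxnIndexFetcher and fetchAll are hypothetical
(they are not Kafka or RSM APIs). It assumes the plugin tolerates concurrent
calls, which, as said above, may not hold for every plugin.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: none of these types exist in Kafka.
final class ParallelTxnIndexFetchSketch {

    /** Hypothetical stand-in for a per-segment remote transaction index lookup. */
    interface TxnIndexFetcher {
        byte[] fetchTxnIndex(long segmentBaseOffset);
    }

    /**
     * Submits one fetch per segment to the pool, then waits for all of them.
     * Results are returned in the same order as the input offsets.
     */
    static List<byte[]> fetchAll(List<Long> segmentBaseOffsets,
                                 TxnIndexFetcher fetcher,
                                 ExecutorService pool) {
        List<CompletableFuture<byte[]>> futures = new ArrayList<>();
        for (long base : segmentBaseOffsets) {
            // All fetches are started before any result is awaited.
            futures.add(CompletableFuture.supplyAsync(() -> fetcher.fetchTxnIndex(base), pool));
        }
        List<byte[]> results = new ArrayList<>();
        for (CompletableFuture<byte[]> future : futures) {
            // join() rethrows a failed fetch as an unchecked CompletionException.
            results.add(future.join());
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Fake fetcher standing in for a remote call to the storage plugin.
            TxnIndexFetcher fake = base -> ("txn-index-" + base).getBytes(StandardCharsets.UTF_8);
            List<byte[]> indexes = fetchAll(List.of(0L, 1000L, 2000L), fake, pool);
            for (byte[] index : indexes) {
                System.out.println(new String(index, StandardCharsets.UTF_8));
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

With N segments between the fetchOffset and the LSO, the wall-clock cost in
this sketch is roughly that of the slowest single fetch (given enough
threads) rather than the sum of N sequential fetches.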

[1]
https://github.com/apache/kafka/blob/832627fc78484fdc7c8d6da8a2d20e7691dbf882/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L1310

[2]
https://github.com/apache/kafka/blob/832627fc78484fdc7c8d6da8a2d20e7691dbf882/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L1358


--
Divij Vaidya



On Tue, Nov 14, 2023 at 8:30 AM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Divij, thanks for your prompt feedback!
>
> 1. Agree, caching at the plugin level was my initial idea as well; though,
> keeping two caches for the same data both at the broker and at the plugin
> seems wasteful. (added this as a rejected alternative in the meantime)
>
> 2. Not necessarily. The API allows requesting a set of indexes. In the
> case of the `RemoteIndexCache`, as it's currently implemented, it would be
> using: [offset, time, transaction] index types.
>
> However, I see your point that there may be scenarios where only 1 of the 3
> indexes is used:
> - The time index is used mostly once, when seeking an offset by time before
> fetching sequentially.
> - Offset and Transaction indexes are probably the only ones that make sense
> to cache, as they are used on every fetch.
> Arguably, Transaction indexes are not as common, reducing the benefits of
> the proposed approach:
> from initially expecting to fetch 3 indexes at once, to potentially
> fetching only 2 (offset, txn), but most probably fetching 1 (offset).
>
> If there's value perceived from fetching Offset and Transaction together,
> we can keep discussing this KIP. In the meantime, I will look into the
> approach to lazily fetch indexes while waiting for additional feedback.
>
> Cheers,
> Jorge.
>
> On Mon, 13 Nov 2023 at 16:51, Divij Vaidya <divijvaidy...@gmail.com>
> wrote:
>
> > Hi Jorge
> >
> > 1. I don't think we need a new API here because alternative solutions
> > exist even with the current API. As an example, when the first index is
> > fetched, the RSM plugin can choose to download all indexes and cache them
> > locally. On the next call to fetch an index from the remote tier, we will
> > hit the cache and retrieve the index from there.
> >
> > 2. The KIP assumes that all indexes are required at all times. However,
> > indexes such as transaction indexes are only required for read_committed
> > fetches and time index is only required when a fetch call wants to search
> > offset by timestamp. As a future step in Tiered Storage, I would actually
> > prefer to move towards a direction where we are lazily fetching indexes
> > on-demand instead of fetching them together as proposed in the KIP.
> >
> > --
> > Divij Vaidya
> >
> >
> >
> > On Fri, Nov 10, 2023 at 4:00 PM Jorge Esteban Quilcate Otoya <
> > quilcate.jo...@gmail.com> wrote:
> >
> > > Hello everyone,
> > >
> > > I would like to start the discussion on a KIP for Tiered Storage. It's
> > > about improving cross-segment latencies by reducing calls to fetch
> > > indexes individually.
> > > Have a look:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1002%3A+Fetch+remote+segment+indexes+at+once
> > >
> > > Cheers,
> > > Jorge
> > >
> >
>
