[
https://issues.apache.org/jira/browse/KAFKA-15169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769119#comment-17769119
]
Divij Vaidya commented on KAFKA-15169:
--------------------------------------
First, a correction to what I said above.
> The case you mention assumes that file sitting on disk may get corrupted but
> that is a risk we choose to accept in Kafka,
Files sitting on disk do actually get corrupted. We know of cases where a full
disk sometimes leaves the indexes in an inconsistent state. We restart the
broker when the disk fills up, so we can assume that files sitting on disk will
not get corrupted during the lifecycle of a broker. On restart, however, we
should definitely perform a check.
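To make the restart check concrete, here is a rough sketch of what I have in
mind. It is not the RemoteIndexCache code: the class name, the ".index" glob
and the 8-byte entry size are assumptions for illustration, and the "size is a
multiple of the entry size" rule is only a stand-in for the real sanity check.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical startup check, not part of Kafka. */
final class StartupIndexCheck {
    // Assumed fixed entry size in bytes; illustrative only.
    static final int ENTRY_SIZE = 8;

    /** Deletes every *.index file whose size is not a multiple of ENTRY_SIZE. */
    static List<Path> removeCorruptIndexes(Path cacheDir) {
        List<Path> corrupt = new ArrayList<>();
        try {
            // First pass: collect the files that fail the stand-in sanity check.
            try (DirectoryStream<Path> files = Files.newDirectoryStream(cacheDir, "*.index")) {
                for (Path file : files) {
                    if (Files.size(file) % ENTRY_SIZE != 0) {
                        corrupt.add(file);
                    }
                }
            }
            // Second pass: delete them so a later lookup refetches from remote.
            for (Path file : corrupt) {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return corrupt;
    }
}
```

The idea is that this runs once at broker start, before the cache is
populated from the directory.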
Next, test case 1 validates recovery when the index fetched from remote storage
was corrupted during network transfer, i.e.
1. We call getIndexEntry.
2. It throws a corrupt index exception (thrown after fetching from remote
storage) at
"index.sanityCheck();" (line 361).
3. I haven't looked at how we are handling this, but ideally the system should
retry the fetch from remote, and this time it should succeed (no corruption
during transfer). The test should validate that a retry occurs and that it is
successful.
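As an illustration of the behaviour the test would assert, here is a minimal
sketch of "fetch, sanity-check, retry on corruption". Everything in it is
hypothetical: the class, the fetchWithRetry helper, the nested exception and
the maxAttempts parameter are mine, and I have not checked whether the real
code retries at all.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical helper, not part of Kafka. */
final class RetryingIndexFetch {
    /** Stand-in for the corrupt index exception mentioned above. */
    static final class CorruptIndexException extends RuntimeException {
        CorruptIndexException(String message) {
            super(message);
        }
    }

    /**
     * Calls remoteFetch until the result passes isSane, at most maxAttempts times.
     * Throws CorruptIndexException if every attempt returns a corrupt index.
     */
    static <T> T fetchWithRetry(Supplier<T> remoteFetch, Predicate<T> isSane, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T index = remoteFetch.get();
            if (isSane.test(index)) {
                return index;
            }
        }
        throw new CorruptIndexException("index still corrupt after " + maxAttempts + " attempts");
    }
}
```

A test along these lines would stub the remote fetch to return a corrupt
index on the first call and a valid one on the second, then assert that the
fetch was invoked exactly twice and that the valid index was returned.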
Next, for test case 2, the test you mentioned sounds like a nice addition. It
validates the situation where we have a file on disk but it is not in the
cache. In that case, we should add a cache entry from the file if it is valid,
or else try to fetch from remote. You are right in assuming that this case
should never occur (because ideally, if a file exists on disk, it should
already have a corresponding entry in the cache), but this code is a fail-safe
in case we are accidentally left with an inconsistency between the file on disk
and the in-memory cache.
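A simplified sketch of that fail-safe path, to show what the test would
exercise. This is not the RemoteIndexCache implementation: the class, the
byte[] representation of an index, the ".index" file naming and the 8-byte
entry size are all assumptions, and the sketch is single-threaded.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache, not part of Kafka; not thread-safe. */
final class DiskBackedIndexCache {
    // Assumed fixed entry size in bytes; stands in for the real sanity check.
    static final int ENTRY_SIZE = 8;

    private final Map<String, byte[]> cache = new HashMap<>();
    private final Path dir;
    private final Function<String, byte[]> remoteFetch;

    DiskBackedIndexCache(Path dir, Function<String, byte[]> remoteFetch) {
        this.dir = dir;
        this.remoteFetch = remoteFetch;
    }

    static boolean isSane(byte[] bytes) {
        return bytes.length % ENTRY_SIZE == 0;
    }

    byte[] get(String segmentId) {
        byte[] cached = cache.get(segmentId);
        if (cached != null) {
            return cached;
        }
        Path file = dir.resolve(segmentId + ".index");
        try {
            byte[] bytes = null;
            if (Files.exists(file)) {
                // File on disk without a cache entry: reuse it only if it is valid.
                byte[] onDisk = Files.readAllBytes(file);
                if (isSane(onDisk)) {
                    bytes = onDisk;
                } else {
                    Files.delete(file);
                }
            }
            if (bytes == null) {
                // No usable local file: fall back to remote storage.
                bytes = remoteFetch.apply(segmentId);
                if (!isSane(bytes)) {
                    throw new IllegalStateException("corrupt index fetched for " + segmentId);
                }
                Files.write(file, bytes);
            }
            cache.put(segmentId, bytes);
            return bytes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The test for case 2 would then place a valid file in the directory before
the first lookup and assert that no remote fetch happens, and repeat with a
corrupt file and assert that exactly one remote fetch happens.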
> Add tests for RemoteIndexCache
> ------------------------------
>
> Key: KAFKA-15169
> URL: https://issues.apache.org/jira/browse/KAFKA-15169
> Project: Kafka
> Issue Type: Test
> Reporter: Satish Duggana
> Assignee: Arpit Goyal
> Priority: Major
> Labels: KIP-405
> Fix For: 3.7.0
>
>
> Follow-up from
> https://github.com/apache/kafka/pull/13275#discussion_r1257490978
--
This message was sent by Atlassian Jira
(v8.20.10#820010)