somandal opened a new pull request, #10260:
URL: https://github.com/apache/pinot/pull/10260
This PR relaxes the current constraint that a dictionary and inverted index
must exist to allow disabling the forward index. The following changes have
been made as part of this PR:
- Relax the constraint in the `TableConfigUtils` and other places such as
segment creator and default index handler
- Update the reload code path to handle toggling the dictionary for forward
index disabled columns
- Update the `NoOpForwardIndexCreator` to work for both dictionary based and
non-dictionary based columns
- Add tests
Adding a new index on a column with forward index disabled along with
inverted index or dictionary disabled will result in an error now in reload
since the inverted index + dictionary are needed to generate a temporary
forward index.
The reload code path needs some significant updates in the
`ForwardIndexHandler` to support this change and allow toggling the dictionary
for forward index disabled columns. The approach taken to support this can be
summarized as follows:
- Forward index -> No Forward Index
- Sorted columns are ignored (no-op) [current code path]
- Existing column has dictionary
- Dictionary being removed
- Regenerate temporary forward index to be raw based (post
updateIndex cleanup will remove it)
- Remove dictionary
- Update segment metadata to indicate dictionary disabled
- Remove all dictionary indexes: inverted index and fst (as they
need dictionary) and range (needs to be regenerated and is possible to
regenerate on this path)
- Dictionary being kept [current code path]
- Remove forward index (post updateIndex cleanup)
- Existing column doesn't have dictionary
- Dictionary remains disabled
- Remove forward index (post updateIndex cleanup)
- Dictionary to be enabled [current code path]
- Regenerate temporary forward index to be dictionary based
(post updateIndex cleanup will remove it)
- Update segment metadata to indicate dictionary disabled
- Remove all dictionary indexes: inverted index and fst (they
shouldn't even exist as they need dictionary) and range (needs to be
regenerated and is possible to regenerate on this path)
- No Forward Index -> Forward Index
- Sorted columns are ignored (no-op) [current code path]
- If either dictionary or inverted index is disabled on the existing
column, log a warning and treat this as a no-op
- Otherwise regenerate the forward index if dictionary enabled or
disabled [current code path]
- No Forward Index -> No Forward Index (this is not getting toggled but
maybe something else is)
- Existing column has dictionary
- Dictionary being removed
- Inverted index exists
- Temporary forward index can be regenerated as raw format
from dictionary + inverted index
- Update segment metadata to indicate dictionary disabled
- Remove all dictionary indexes: inverted index and fst (as
they need dictionary) and range (needs to be regenerated and is possible to
regenerate on this path)
- Inverted Index does not exist
- Cannot regenerate the forward index in raw format: SHOULD
FAIL if RANGE INDEX exists since range index needs to be regenerated
- Just remove dictionary
- Update segment metadata to indicate dictionary disabled
- Remove all dictionary indexes: inverted index and fst (as
they need dictionary) and range (needs to be regenerated and cannot be
regenerate on this path)
- Dictionary being kept [current code path]
- no-op
- Existing column doesn't have dictionary
- Dictionary remains removed
- no-op
- Dictionary being enabled
- Cannot rebuild dictionary if forward index is not available.
Throw an error
cc @siddharthteotia @Jackie-Jiang @vvivekiyer
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]