lnbest0707-uber opened a new pull request, #14546:
URL: https://github.com/apache/pinot/pull/14546
`feature` `bugfix` `backward-incompat`
This PR converges the open source SchemaConformingTransformerV2 with Uber's
internal version. The latter has been running in production for a long time
on large-scale data. The convergence removes some functionality that we found
not useful and adds some functionality that we found to be required.
A complete user manual will also be released to support broader public
usage.
Clean up:
- Remove shingling from merged text index generation
New functionalities:
- Enhance case-insensitive search by adding extra values to the merged text
index when `optimizeCaseInsensitiveSearch = true`.
- Allow the merged text index to be searched in either key:value order or
value:key order via `reverseTextIndexKeyValueOrder`.
- Allow customizing the document begin anchor, end anchor, and key/value
separator via `mergedTextIndexBeginOfDocAnchor`,
`mergedTextIndexEndOfDocAnchor`, and `jsonKeyValueSeparator`. This can
optimize prefix and suffix matching and avoids confusion when searching
for ":".
- Add the ability to skip indexing some special fields via
`fieldPathsToSkipStorage`.
- Add the ability to index document keys that contain ".". Previously,
{"a.b": 1} could only be put into json_data even when a dedicated column
"a.b" existed, because the column name "a.b" was always interpreted as the
nested path {"a": {"b": 1}}. With `useAnonymousDotInFieldNames` enabled, both
forms end up in the same "a.b" column.
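
To make the options above concrete, here is a minimal, self-contained Java sketch of the ideas. It is not the transformer's actual code: the class `MergedTextIndexSketch`, its methods, and the anchor/separator characters (`^`, `$`, `|`) are hypothetical, chosen only for readability. It also assumes that the "extra value" for case-insensitive search is a lowercased copy, and it takes a plain `valueFirst` flag rather than guessing which order `reverseTextIndexKeyValueOrder = true` selects.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only; names and values here are assumptions, not Pinot APIs.
final class MergedTextIndexSketch {
  // Hypothetical stand-ins for the configurable anchors and separator.
  static final String BEGIN_OF_DOC = "^";
  static final String END_OF_DOC = "$";
  static final String KEY_VALUE_SEPARATOR = "|";

  // Flattens nested maps into dot-joined paths, so that {"a": {"b": 1}}
  // and {"a.b": 1} both produce the single entry "a.b" -> 1.
  static Map<String, Object> flatten(Map<String, ?> doc) {
    Map<String, Object> out = new LinkedHashMap<>();
    flattenInto("", doc, out);
    return out;
  }

  private static void flattenInto(String prefix, Map<String, ?> node, Map<String, Object> out) {
    for (Map.Entry<String, ?> e : node.entrySet()) {
      String path = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
      if (e.getValue() instanceof Map) {
        @SuppressWarnings("unchecked")
        Map<String, ?> child = (Map<String, ?>) e.getValue();
        flattenInto(path, child, out);
      } else {
        out.put(path, e.getValue());
      }
    }
  }

  // Builds one anchored text document per flattened field. When
  // caseInsensitive is set, a lowercased copy is added as an extra document
  // (an assumption about what the "extra values" are).
  static List<String> mergedTextDocs(Map<String, ?> flat, boolean valueFirst, boolean caseInsensitive) {
    List<String> docs = new ArrayList<>();
    for (Map.Entry<String, ?> e : flat.entrySet()) {
      String key = e.getKey();
      String value = String.valueOf(e.getValue());
      docs.add(format(key, value, valueFirst));
      String lower = value.toLowerCase(Locale.ROOT);
      if (caseInsensitive && !lower.equals(value)) {
        docs.add(format(key, lower, valueFirst));
      }
    }
    return docs;
  }

  // Wraps "key<sep>value" (or "value<sep>key") in the begin/end anchors.
  private static String format(String key, String value, boolean valueFirst) {
    String body = valueFirst
        ? value + KEY_VALUE_SEPARATOR + key
        : key + KEY_VALUE_SEPARATOR + value;
    return BEGIN_OF_DOC + body + END_OF_DOC;
  }

  public static void main(String[] args) {
    // Both shapes of the document flatten to the same "a.b" path.
    System.out.println(flatten(Map.of("a", Map.of("b", 1))));
    System.out.println(flatten(Map.of("a.b", 1)));
    // Key-first order with the extra lowercased document.
    System.out.println(mergedTextDocs(Map.of("level", "ERROR"), false, true));
  }
}
```

In this sketch, a prefix match on a key can be anchored at `^` and a suffix match on a value at `$`, and a separator other than ":" keeps values that themselves contain ":" unambiguous. That is the motivation the list above gives for making the anchors and separator configurable.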